Official Code Repository for "Learning Encoding-Decoding Direction Pairs to Unveil Concepts of Influence in Deep Vision Networks" (Accepted at TMLR)
This repository provides the core implementation for our work on recovering the encoding-decoding mechanism of concepts in deep vision models.
The codebase is modularly split into core utility components (/comp) and training recipes/pipelines (/litnn), allowing for clear reproduction of both synthetic and real-world experimental setups.
We have organized the code into two main directories to clearly separate reusable building blocks from the primary training workflows.
The /comp directory contains modular, foundational utilities that can be reused across different parts of the research (e.g., data loading, metric calculation, parameter handling).

| File | Description | Purpose |
|---|---|---|
| `basis.py` | Unsupervised Interpretable Basis Extraction (UIBE) / EDDP Core | Handles core implementation components for both UIBE and the general EDDP framework. |
| `cache.py` | GPU Cache Utilities | Helper class to efficiently load and manage small datasets directly in GPU memory, accelerating training. |
| `datagen.py` | Synthetic Data Generation | Contains functions and classes used to generate controlled synthetic concept representations based on linear representation hypotheses. |
| `dataset.py` | Dataset Implementations | Defines the core datasets, including the image class and concept datasets and the helper MapDataset. |
| `eval.py` | Evaluation Metrics | Implements the evaluation metrics: direction labeling using IoU and signal value regression. |
| `filepath.py` | Utility Functions | Helper functions for reliably constructing artifact filepaths (ensuring consistent data handling). |
| `model.py` | Classifier Module | Defines the simple classifier head that operates as the top layer of a potentially large network. |
| `param.py` | Parameterization Utilities | Parametrizations for positive scalars (separation margin) and a hypersphere parametrization for unit-norm vectors. |
| `utils.py` | General Helpers | Simple utility functions for general training loop management and evaluation within the Lit module framework. |

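
To illustrate what "direction labeling using IoU" can look like, here is a minimal, framework-free sketch. It assumes a labeling scheme in which per-pixel features are projected onto a direction, thresholded into a binary mask, and compared against concept annotation masks. The function names (`project`, `iou`, `label_direction`), the fixed threshold, and the toy data are all hypothetical and are not taken from `eval.py`, which may work differently.

```python
from typing import Dict, List, Sequence, Tuple


def project(pixels: Sequence[Sequence[float]], direction: Sequence[float]) -> List[float]:
    """Dot product of every per-pixel feature vector with the direction."""
    return [sum(p_i * d_i for p_i, d_i in zip(p, direction)) for p in pixels]


def iou(pred: Sequence[bool], target: Sequence[bool]) -> float:
    """Intersection over union of two binary masks (0.0 if both are empty)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return intersection / union if union else 0.0


def label_direction(
    pixels: Sequence[Sequence[float]],
    direction: Sequence[float],
    threshold: float,
    concept_masks: Dict[str, Sequence[bool]],
) -> Tuple[str, float]:
    """Return the concept whose mask best overlaps the thresholded projection."""
    activations = project(pixels, direction)
    pred = [a > threshold for a in activations]
    scores = {name: iou(pred, mask) for name, mask in concept_masks.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data (assumed): four pixels with two-channel features, two concepts.
pixels = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]
masks = {
    "stripes": [True, True, False, False],
    "dots": [False, True, True, True],
}
label, score = label_direction(pixels, [1.0, 0.0], 0.5, masks)
print(label, score)
```

In a real setup the pixels would come from a `batch x channels x height x width` activation tensor flattened over the batch and spatial dimensions, and the threshold would typically be chosen per direction rather than fixed.
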
The /litnn directory contains complete, executable scripts that represent specific research pipelines or experiments.

| File | Description | Core Functionality |
|---|---|---|
| `model_lm.py` | Classifier Training | Trains the classifier to accurately distinguish synthetic image representations based on their intended concept content. |
| `uibe_lm.py` | UIBE Pipeline | Executes Unsupervised Interpretable Basis Extraction (UIBE), incorporating Augmented Lagrangian Loss and optional Uncertainty Region Alignment for robust basis learning. |
| `eddp_lm.py` | EDDP Pipeline (Core) | Learns the Encoding-Decoding Direction Pairs, integrating both Uncertainty Region Alignment and Augmented Lagrangian Loss to optimize concept influence of directions. |

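
For readers unfamiliar with the augmented Lagrangian idea mentioned in the pipelines above, the following is a generic textbook sketch on a toy problem: minimize `x^2 + y^2` subject to `x + y = 1`. It is not the loss used in `uibe_lm.py` or `eddp_lm.py`; the function name, the constraint, and the hyperparameters (`rho`, step counts, learning rate) are assumptions chosen only to show the alternation between minimizing the penalized objective and updating the multiplier.

```python
from typing import Tuple


def augmented_lagrangian_demo(
    rho: float = 10.0,
    outer_steps: int = 20,
    inner_steps: int = 200,
    lr: float = 0.02,
) -> Tuple[float, float, float]:
    """Minimize x^2 + y^2 subject to x + y = 1 (toy example)."""
    x, y, lam = 0.0, 0.0, 0.0
    for _ in range(outer_steps):
        # Inner loop: gradient descent on the augmented Lagrangian
        #   L(x, y) = x^2 + y^2 + lam * c + (rho / 2) * c^2,  with c = x + y - 1
        for _ in range(inner_steps):
            c = x + y - 1.0
            grad_x = 2.0 * x + lam + rho * c
            grad_y = 2.0 * y + lam + rho * c
            x -= lr * grad_x
            y -= lr * grad_y
        # Outer step: update the multiplier using the remaining constraint violation.
        lam += rho * (x + y - 1.0)
    return x, y, lam


x_opt, y_opt, lam_opt = augmented_lagrangian_demo()
print(x_opt, y_opt, lam_opt)
```

The constrained optimum of this toy problem is `x = y = 0.5` with multiplier `-1`. The quadratic penalty pulls the iterate toward feasibility, while the multiplier update removes the residual bias that a pure penalty method would leave at a finite `rho`.
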
eddp-synthetic.ipynb: The dedicated Jupyter Notebook providing the reproducible code for all experimental results on synthetic data, exactly matching the setups detailed in the paper.
- Requirements: Python 3.11. Install the dependencies with `pip install -r requirements.txt`.
- Execution: The UIBE and EDDP lit modules can be used to learn direction pairs for any real-world network. You will need to create a datamodule/dataset class that provides the feature representations of the chosen layer of the network. The input tensor has shape `batch x channels x height x width`.
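
As a rough illustration of the dataset contract described above, here is a framework-agnostic sketch of a map-style dataset over precomputed layer activations. Everything in it is hypothetical (`FeatureMapDataset`, `feature_shape`, `collate`, and the nested-list storage); in practice the features would be tensors and the class would plug into the datamodule expected by the lit modules, whose exact interface is not shown here.

```python
from typing import List, Sequence, Tuple

# One sample: channels x height x width, stored as nested lists for simplicity.
Feature = List[List[List[float]]]


def feature_shape(feature: Feature) -> Tuple[int, int, int]:
    """Return (channels, height, width), inspecting only the first row and column."""
    return len(feature), len(feature[0]), len(feature[0][0])


class FeatureMapDataset:
    """Hypothetical map-style dataset exposing __len__ and __getitem__."""

    def __init__(self, features: Sequence[Feature]):
        if not features:
            raise ValueError("need at least one feature map")
        expected = feature_shape(features[0])
        for i, f in enumerate(features):
            if feature_shape(f) != expected:
                raise ValueError(f"sample {i} has shape {feature_shape(f)}, expected {expected}")
        self.features = list(features)
        self.shape = expected

    def __len__(self) -> int:
        return len(self.features)

    def __getitem__(self, idx: int) -> Feature:
        return self.features[idx]


def collate(samples: Sequence[Feature]) -> List[Feature]:
    """Stack samples along a new leading axis: batch x channels x height x width."""
    return list(samples)


def batch_shape(batch: Sequence[Feature]) -> Tuple[int, int, int, int]:
    """Shape of a collated batch as (batch, channels, height, width)."""
    return (len(batch),) + feature_shape(batch[0])


# Toy usage (assumed sizes): 5 samples with 3 channels on a 2 x 2 spatial grid.
samples = [[[[float(n)] * 2 for _ in range(2)] for _ in range(3)] for n in range(5)]
dataset = FeatureMapDataset(samples)
batch = collate([dataset[i] for i in range(4)])
print(len(dataset), batch_shape(batch))
```

The shape check at construction time is a design choice for this sketch: it surfaces mismatched layer outputs early instead of failing later inside a training step.
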