
rachtibat / zennit-crp


An eXplainable AI toolkit with Concept Relevance Propagation and Relevance Maximization

Home Page: https://www.nature.com/articles/s42256-023-00711-8

License: Other

Languages: Jupyter Notebook 97.25%, Python 2.75%
activation-maximization attribution crp deep-learning explainable-ai interpretable-machine-learning lrp relevance-maximization zennit

zennit-crp's People

Contributors

frederikpahde, lowlorenz, maxdreyer, rachtibat, sebastian-lapuschkin, zpyuan6


zennit-crp's Issues

"max" Maximization Target Resulting in RuntimeError

Hi all,

setting the maximization target to "max" when calling the run method of FeatureVisualization results in a RuntimeError.

My code:

fv = FeatureVisualization(attribution, imagenet_data, layer_map, preprocess_fn=preprocessing, path="VGG16_ImageNet_MAX", max_target="max")

saved_files = fv.run(composite, 0, 1000, 32, 100)

Error:

File ".../zennit-crp/crp/concepts.py", line 107, in reference_sampling
    rel_l = torch.gather(rel_l, 0, rf_neuron)
RuntimeError: Index tensor must have the same number of dimensions as input tensor

This happens because rel_l has shape (batch_size, channel_number, neurons_number), while rf_neuron has shape (batch_size, channel_number).

This could be fixed by expanding rf_neuron by one dimension, gathering along dimension 2, and then squeezing the extra dimension away again:

rel_l = torch.gather(rel_l, 2, rf_neuron.unsqueeze(2)).squeeze(2)
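
For illustration, a minimal, self-contained reproduction of the dimension mismatch and of the proposed fix (the tensor shapes are arbitrary placeholders):

import torch

rel_l = torch.rand(4, 8, 16)              # (batch_size, channel_number, neurons_number)
rf_neuron = torch.randint(0, 16, (4, 8))  # (batch_size, channel_number)

# torch.gather requires the index tensor to have the same number of
# dimensions as the input, so the original call raises the RuntimeError:
# torch.gather(rel_l, 0, rf_neuron)

# proposed fix: gather along the neuron dimension with an expanded index,
# then squeeze the singleton dimension away again
rel_fixed = torch.gather(rel_l, 2, rf_neuron.unsqueeze(2)).squeeze(2)
print(rel_fixed.shape)  # torch.Size([4, 8])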

AttributeError: module 'zennit' has no attribute '...'

Issue: Common Import Problem with CRP and Zennit 0.4.6

Background

The CRP package currently enforces the use of zennit==0.4.6, while the latest version of zennit is 0.5.1. Users following the latest Zennit documentation may encounter import errors due to changes in how packages are loaded.

Problem

In the update from Zennit 0.4.6 to 0.4.7, a commit modified the __init__.py file to import all modules of the Zennit package into the zennit namespace. This change allows accessing classes like ResNetCanonizer without explicitly importing their submodule:

import zennit
canonizer = zennit.torchvision.ResNetCanonizer()

However, in version 0.4.6, which CRP currently enforces, this import leads to the following error:

import zennit
canonizer = zennit.torchvision.ResNetCanonizer()
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> AttributeError: module 'zennit' has no attribute 'torchvision'

This error occurs because the torchvision module is not loaded in Zennit's __init__.py in version 0.4.6.

Solution

To prevent this error, explicitly import the zennit.torchvision module as follows:

import zennit.torchvision
canonizer = zennit.torchvision.ResNetCanonizer()

For more information on Python namespaces and module imports, refer to the Python tutorial on namespaces.


Suggestions for Resolution

  1. Update CRP Requirements: Consider updating CRP to require zennit>=0.4.7 to ensure compatibility with the latest Zennit practices and documentation.
  2. Documentation Update: Add a note in the CRP documentation to inform users about this import issue and the temporary workaround until the package requirements are updated.

@rachtibat

ReceptiveField calculation misses the last neurons of a layer

Hi all,

when I viewed the top-k most relevant samples of a concept with receptive field "on", I got fewer than k samples as a result.

I spotted a bug in the method analyze_layer of the ReceptiveField class:

def analyze_layer(self, concept: Concept, layer_name: str, c_indices, canonizer=None, batch_size=16, verbose=True):

    composite = AllFlatComposite(canonizer)
    conditions = [{layer_name: [index]} for index in c_indices]

    batch = 0
    for attr in self.attribution.generate(
            self.single_sample, conditions, composite, [], concept.mask_rf,
            layer_name, 1, batch_size, None, verbose):

        heat = self.norm_rf(attr.heatmap, layer_name)

        try:
            rf_array[batch * len(heat): (batch+1) * len(heat)] = heat
        except UnboundLocalError:
            rf_array = torch.zeros((len(c_indices), *heat.shape[1:]), dtype=torch.uint8)
            rf_array[batch * len(heat): (batch+1) * len(heat)] = heat

        batch += 1

    return rf_array

The error occurs in the line

rf_array[batch * len(heat): (batch+1) * len(heat)] = heat

For the last batch, len(heat) might be smaller than the batch size. In that case, the slice start batch * len(heat) no longer points to the correct position, and the last neurons of the layer are skipped.

Possible solution: accumulate len(heat) in a running index for every batch and use that index instead of batch * len(heat), as in the sketch below.
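
A minimal sketch of that fix, keeping the structure of the quoted loop (only the indexing changes):

    count = 0
    for attr in self.attribution.generate(
            self.single_sample, conditions, composite, [], concept.mask_rf,
            layer_name, 1, batch_size, None, verbose):

        heat = self.norm_rf(attr.heatmap, layer_name)

        try:
            rf_array[count: count + len(heat)] = heat
        except UnboundLocalError:
            rf_array = torch.zeros((len(c_indices), *heat.shape[1:]), dtype=torch.uint8)
            rf_array[count: count + len(heat)] = heat

        count += len(heat)  # advance by the actual batch length, not a fixed batch size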

Best,
Max

Initialize CRP with Relevance instead of Activation for RelMax reference samples

We know that activation != relevance. To localize concepts in input space, we initialize CRP with channel activations using the start_layer argument of the CondAttribution class. This works well for ActMax and is highly efficient. For RelMax, however, the heatmap could be inverted or miss parts. Thus, in a future version of zennit-crp, we will localize RelMax samples with a complete backward pass beginning at the output of the model, in order to utilize the intermediate relevances, i.e., with the condition set [{layer: channel, y: class}] instead of the start_layer argument. This is computationally less efficient, but results in better localization.
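
For illustration, a hedged sketch of the two initialization styles (attribution, composite, sample, and the layer, channel, and class ids are placeholders, not a fixed API):

# current, efficient ActMax-style localization: start the backward pass
# at an intermediate layer, initialized with channel activations
conditions = [{"features.40": [channel_id]}]
attr = attribution(sample, conditions, composite, start_layer="features.40")

# planned RelMax-style localization: full backward pass from the model
# output, conditioned on both the concept channel and the output class
conditions = [{"features.40": [channel_id], "y": [class_id]}]
attr = attribution(sample, conditions, composite)  # no start_layer argument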

Incorrect layer name from path

Hi all,

in maximization.py, in line 110 of the method collect_results, the layer name is extracted from the path as follows:

for path in path_list:
    filename = path.split("/")[-1]
    l_name = filename.split("_")[0]

Here, the end of filename consists of a layer's name followed by some numbers, separated by "_".
This, however, assumes that the layer's name itself does not contain "_", which can happen. In that case, the analysis does not work.
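
A hedged sketch of one possible fix, assuming the filename always ends in a fixed number of numeric fields (here two; the exact layout is an assumption): split from the right instead of the left, so underscores inside the layer name are preserved.

for path in path_list:
    filename = path.split("/")[-1]
    # split off the trailing numeric fields from the right; layer names
    # containing "_" (e.g. "features_0") stay intact
    l_name = filename.rsplit("_", 2)[0]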

RelStats and ActStats use wrong (unsorted) targets in FeatureVisualization method

Hi all,

I think the results of the methods analyze_relevance and analyze_activation from the FeatureVisualization class are not as expected.

Let us take the method (line 162 in visualization.py):

def analyze_relevance(self, rel, layer_name, concept, data_indices, targets):
    """
    Finds input samples that maximally activate each neuron in a layer and most relevant samples
    """

    d_c_sorted, rel_c_sorted, rf_c_sorted = self.RelMax.analyze_layer(rel, concept, layer_name, data_indices)
    
    self.RelStats.analyze_layer(d_c_sorted, rel_c_sorted, rf_c_sorted, layer_name, targets)

Here, the relevance values in rel_c_sorted are sorted per channel in descending order, so rel_c_sorted has shape (batch_size, num_channels). However, each channel may be sorted in a different order than the others.

Afterwards, in self.RelStats.analyze_layer(d_c_sorted, rel_c_sorted, rf_c_sorted, layer_name, targets), however, the values are assumed to not have been sorted:

def analyze_layer(self, d_c_sorted, rel_c_sorted, rf_c_sorted, layer_name, targets):

    t_unique = np.unique(targets)
    for t in t_unique:

        t_indices = np.where(targets == t)[0]

        d_c_t = d_c_sorted[t_indices]
        rel_c_t = rel_c_sorted[t_indices]
        rf_c_t = rf_c_sorted[t_indices]

        self.concatenate_with_results(layer_name, t, d_c_t, rel_c_t, rf_c_t)
        self.sort_result_array(layer_name, t)

Here, ultimately, the highest channel values are associated with the first target in t_unique, which is often not correct.

In order to fix the behavior, we should take into account the actual target associated with each channel value. Maybe something like:

def analyze_relevance(self, rel, layer_name, concept, data_indices, targets):
    d_c_sorted, rel_c_sorted, rf_c_sorted, argsort = self.RelMax.analyze_layer(rel, concept, layer_name, data_indices)
    
    targets = torch.take(torch.Tensor(targets).to(argsort), argsort)
    self.RelStats.analyze_layer(d_c_sorted, rel_c_sorted, rf_c_sorted, layer_name, targets)

where argsort describes the sorting order and targets becomes a tensor of shape (batch_size, num_channels) indicating the true target class for each channel value. The method self.RelStats.analyze_layer then needs to handle the different target shape.

run_distributed method does not consider batch size

Hi @rachtibat,

the run_distributed method of the FeatureVisualization class does not take into account the actual batch_size for the multi-target case.

Maybe include something like:

if n_samples > batch_size:
    batches_ = math.ceil(len(conditions) / batch_size)  # requires `import math`
else:
    batches_ = 1

for b_ in range(batches_):
    data_broadcast_ = data_broadcast[b_ * batch_size: (b_ + 1) * batch_size]
    conditions_ = conditions[b_ * batch_size: (b_ + 1) * batch_size]
    # dict_inputs is linked to FeatHooks
    dict_inputs["sample_indices"] = sample_indices[b_ * batch_size: (b_ + 1) * batch_size]
    dict_inputs["targets"] = targets[b_ * batch_size: (b_ + 1) * batch_size]

    # composites are already registered before
    self.attribution(data_broadcast_, conditions_, None, exclude_parallel=False)

This would fix some GPU memory issue of mine.

Best,
Max

Running the analysis in tutorial notebook fails

Hi, today I was trying to run another tutorial notebook and got an error in notebook feature_visualization.ipynb after uncommenting the line:

#saved_files = fv.run(composite, 0, len(imagenet_data), 32, 100)

I changed the parameters from (composite, 0, len(imagenet_data), 32, 100) to (composite, 0, 10, 2, 1) to check whether it works (it does not).

The error is:


RuntimeError Traceback (most recent call last)
t:\studies\py_projects\milab-internship\clean-zennit-crp\tutorials\feature_visualization.ipynb Cell 11 in <cell line: 2>()
1 # it will take approximately 25 min on a Titan RXT
----> 2 saved_files = fv.run(composite, 0, 10, 2, 1)

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\visualization.py:76, in FeatureVisualization.run(self, composite, data_start, data_end, batch_size, checkpoint, on_device)
73 saved_checkpoints = self.run_distributed(composite, data_start, data_end, batch_size, checkpoint, on_device)
75 print("Collecting results...")
---> 76 saved_files = self.collect_results(saved_checkpoints)
78 return saved_files

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\visualization.py:197, in FeatureVisualization.collect_results(self, checkpoints, d_index)
193 def collect_results(self, checkpoints: Dict[str, List[str]], d_index: Tuple[int, int] = None):
195 saved_files = {}
--> 197 saved_files["r_max"] = self.RelMax.collect_results(checkpoints["r_max"], d_index)
198 saved_files["a_max"] = self.ActMax.collect_results(checkpoints["a_max"], d_index)
199 saved_files["r_stats"] = self.RelStats.collect_results(checkpoints["r_stats"], d_index)

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\maximization.py:120, in Maximization.collect_results(self, path_list, d_index)
116 rel_c_sorted = np.load(path + "rel.npy")
118 d_c_sorted, rf_c_sorted, rel_c_sorted = map(torch.from_numpy, [d_c_sorted, rf_c_sorted, rel_c_sorted])
--> 120 self.concatenate_with_results(l_name, d_c_sorted, rel_c_sorted, rf_c_sorted)
121 self.sort_result_array(l_name)
123 pbar.update(1)

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\maximization.py:70, in Maximization.concatenate_with_results(self, layer_name, d_c_sorted, rel_c_sorted, rf_c_sorted)
67 self.rf_c_sorted[layer_name] = rf_c_sorted
69 else:
---> 70 self.d_c_sorted[layer_name] = torch.cat([d_c_sorted, self.d_c_sorted[layer_name]])
71 self.rel_c_sorted[layer_name] = torch.cat([rel_c_sorted, self.rel_c_sorted[layer_name]])
72 self.rf_c_sorted[layer_name] = torch.cat([rf_c_sorted, self.rf_c_sorted[layer_name]])

RuntimeError: Sizes of tensors must match except in dimension 0. Got 4096 and 1000 in dimension 1 (The offending index is 1)

The rest of this tutorial notebook works fine; however, my goal is to explain a different model. When running this line on my ResNet18, which classifies images into 18 classes, I get a similar error saying it got 512 and 18 in dimension 1.

System details:

Platform: Windows 11
Python: Python 3.9.13
Pytorch: torch==1.7.1+cu110
zennit: zennit==0.4.6

Also tried with:

  • Python 3.10 with torch 1.12.0 and CUDA 11.6 (on Windows),
    with the same results.

Relevance of weight values

Hi,

Thanks for the awesome work! Is there any way to extend relevance scores for activations in each layer to their corresponding weight values?

Target list needs to be a numpy array for multi-target datasets

Hi all,

for a multi-target dataset, I get the following error:

File ".../crp/visualization.py", line 180, in analyze_activation
    targets = targets[unique_indices]
TypeError: only integer scalar arrays can be converted to a scalar index

This error occurs because the targets list needs to be a numpy array. For single-target datasets this is the case, since the conversion from list to numpy array takes place in line 118 of ".../crp/visualization.py":

targets_samples = np.array(targets_samples)  # numpy operation needed

Therefore, we also have to add a statement such as

targets = np.array(targets)

after line 133 of ".../crp/visualization.py".
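
A sketch of the placement, assuming the surrounding code matches the quoted lines (unique_indices comes from the surrounding method):

import numpy as np

targets = np.array(targets)        # added conversion, mirroring line 118
targets = targets[unique_indices]  # fancy indexing now also works for multi-target datasets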

`layer_map` variable in Code Example

Hey!
Is it possible that in your Readme, in the Feature Visualization part, the line layer_map = {(name, cc) for name in layer_names} should actually be layer_map = {name: cc for name in layer_names}?

In crp/visualization.py, line 103, the code iterates over self.layer_map.items(), which is not available if layer_map is a set. With the change depicted above, it becomes a dictionary, and dictionaries do have .items() (see the sketch below).
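
For illustration, the difference between the two comprehensions (layer_names and cc are placeholders):

layer_names = ["features.0", "features.3"]        # hypothetical layer names
cc = object()                                     # stand-in for the concept object

layer_map = {(name, cc) for name in layer_names}  # a set of tuples: no .items()
layer_map = {name: cc for name in layer_names}    # a dict: .items() works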

Did I see that correctly or did I miss something?

Thanks!

Conditional Heatmaps ignore parallel connections in, e.g., ResNets

When defining, for example, the condition set
[{"features.40": [0, 2]}]
channels 0 and 2 are passed through and all other channels are masked with zero.
However, in models with several parallel connections (shortcuts), the parallel branches are not set to zero, and relevance still passes through them.

In the future, all concepts in parallel layers should also be set to zero in order to obtain the sole contribution of the masked concept.

Tutorial notebook not working

Hi, I am trying to run code from tutorial notebooks. I got stuck in notebook attributions.ipynb at line:

attr = attribution(sample, conditions, composite, mask_map=cc.mask)

with an error saying:


TypeError Traceback (most recent call last)
t:\studies\py_projects\milab-internship\clean-zennit-crp\tutorials\attributions.ipynb Cell 20 in <cell line: 8>()
6 # zennit requires gradients
7 sample.requires_grad = True
----> 8 attr = attribution(sample, conditions, composite, mask_map=cc.mask)
10 # or use a dictionary for mask_map
11 layer_names = get_layer_names(model, [torch.nn.Conv2d, torch.nn.Linear])

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\attribution.py:154, in CondAttribution.call(self, data, conditions, composite, record_layer, mask_map, start_layer, init_rel, on_device)
152 else:
153 pred = modified(data)
--> 154 self.backward_initialization(pred, y_targets, init_rel, self.MODEL_OUTPUT_NAME)
156 attribution = self.attribution_modifier(data)
157 activations, relevances = {}, {}

File t:\win_programs\python_venvs\ml_39_torch_171\lib\site-packages\crp\attribution.py:64, in CondAttribution.backward_initialization(self, prediction, target_list, init_rel, layer_name, retain_graph)
61 mask[i, targets] = output_selection[i, targets]
62 output_selection = mask
---> 64 torch.autograd.backward((prediction,), (output_selection.to(prediction),),
65 retain_graph=retain_graph)
...
[too long to paste here]
...
40 New Tensor copied from input with values shifted by epsilon.
41 '''
---> 42 return input + ((input == 0.).to(input) + input.sign()) * epsilon

TypeError: only integer tensors of a single element can be converted to an index

System details:

Platform: Windows 11
Python: Python 3.9.13
Pytorch: torch==1.7.1+cu110

Also tried with:
- Python 3.10 with torch 1.12.0 and CUDA 11.6 (on Windows),
- Python 3.8 with torch 1.7 and CUDA 10.2 (on Ubuntu)
with the same results.
