

Home Page: https://open-xai.github.io/

License: MIT License

Topics: benchmark, explainability, explainable-ai, interpretability, leaderboard, reproducibility

openxai's Introduction

OpenXAI : Towards a Transparent Evaluation of Model Explanations


Website | arXiv Paper

OpenXAI is the first general-purpose, lightweight library that provides a comprehensive set of functions to systematically evaluate the quality of explanations generated by feature attribution-based explanation methods. OpenXAI supports the development of new datasets (both synthetic and real-world) and explanation methods, with a strong bent towards promoting systematic, reproducible, and transparent evaluation of explanation methods.

OpenXAI is an open-source initiative that comprises a collection of curated high-stakes datasets, models, and evaluation metrics, and provides a simple, easy-to-use API that enables researchers and practitioners to benchmark explanation methods using just a few lines of code.

Updates

  • 0.0.0: OpenXAI is live! You can now submit results for benchmarking a post-hoc explanation method on an evaluation metric. Check it out here!
  • OpenXAI white paper is on arXiv!

Unique Features of OpenXAI

  • Diverse areas of XAI research: OpenXAI includes ready-to-use API interfaces for seven state-of-the-art feature attribution methods and 22 metrics to quantify their performance. Further, it provides a flexible synthetic data generator to synthesize datasets of varying sizes, complexity, and dimensionality that facilitate the construction of ground truth explanations and a comprehensive collection of real-world datasets.
  • Data functions: OpenXAI provides extensive data functions, including data evaluators, meaningful data splits, explanation methods, and evaluation metrics.
  • Leaderboards: OpenXAI provides the first ever public XAI leaderboards to promote transparency, and to allow users to easily compare the performance of multiple explanation methods.
  • Open-source initiative: OpenXAI is an open-source initiative and easily extensible.

Installation

Using pip

To install the core environment dependencies of OpenXAI, clone the OpenXAI repo into your local environment and install it with pip:

git clone https://github.com/AI4LIFE-GROUP/OpenXAI.git
cd OpenXAI
pip install -e .

Design of OpenXAI

OpenXAI is an open-source ecosystem comprising XAI-ready datasets, implementations of state-of-the-art explanation methods, evaluation metrics, leaderboards and documentation to promote transparency and collaboration around evaluations of post hoc explanations. OpenXAI can readily be used to benchmark new explanation methods as well as incorporate them into our framework and leaderboards. By enabling systematic and efficient evaluation and benchmarking of existing and new explanation methods, OpenXAI can inform and accelerate new research in the emerging field of XAI.

OpenXAI DataLoaders

OpenXAI provides a Dataloader class that can be used to load the aforementioned collection of synthetic and real-world datasets as well as any other custom datasets, and ensures that they are XAI-ready. More specifically, this class takes as input the name of an existing OpenXAI dataset or a new dataset (name of the .csv file), and outputs a train set which can then be used to train a predictive model, a test set which can be used to generate local explanations of the trained model, as well as any ground-truth explanations (if and when available). If the dataset already comes with pre-determined train and test splits, this class loads train and test sets from those pre-determined splits. Otherwise, it divides the entire dataset randomly into train (70%) and test (30%) sets. Users can also customize the percentages of train-test splits.

For a concrete example, the code snippet below shows how to import the Dataloader class and load an existing OpenXAI dataset:

from openxai.dataloader import ReturnLoaders
trainloader, testloader = ReturnLoaders(data_name='german', download=True)
# get an input instance from the test dataset
inputs, labels = next(iter(testloader))

OpenXAI Pre-trained models

We also pre-trained two classes of predictive models (deep neural networks of varying degrees of complexity and logistic regression models) and incorporated them into the OpenXAI framework so that they can readily be used for benchmarking explanation methods. The code snippet below shows how to load OpenXAI's pre-trained models using our LoadModel class.

from openxai import LoadModel
model = LoadModel(data_name='german', ml_model='ann', pretrained=True)

Adding additional pre-trained models to the OpenXAI framework is as simple as uploading a file with details about the model architecture and parameters in a specific template. Users can also submit requests to incorporate custom pre-trained models into the OpenXAI framework by filling out a simple form and providing details about the model architecture and parameters.

OpenXAI Explainers

All the explanation methods included in OpenXAI are readily accessible through the Explainer class; users just specify the method name to invoke the appropriate method and generate explanations, as shown in the code snippet below. Users can also incorporate their own custom explanation methods into the OpenXAI framework by extending the Explainer class and implementing their method in its get_explanations function; a sketch of such an extension follows the snippet.

from openxai import Explainer
exp_method = Explainer(method='lime', model=model)
explanations = exp_method.get_explanations(inputs)
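
As an illustration, a gradient-based custom method could be written along the following lines. This is a minimal sketch: the class shape and the get_explanations signature mirror the usage above, but the exact base-class hooks are assumptions rather than OpenXAI's documented API.

import torch

class VanillaGradients:
    """Hypothetical custom explainer that attributes via input gradients."""
    def __init__(self, model):
        self.model = model

    def get_explanations(self, inputs, targets=None):
        inputs = inputs.clone().detach().requires_grad_(True)
        outputs = self.model(inputs)
        # explain the predicted class when no target is given
        if targets is None:
            targets = outputs.argmax(dim=-1)
        outputs.gather(1, targets.view(-1, 1)).sum().backward()
        return inputs.grad.detach()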

Users can then submit a request to incorporate their custom methods into the OpenXAI library by filling out a form and providing a GitHub link to their code as well as a summary of their explanation method.

OpenXAI Evaluation

Benchmarking an explanation method using evaluation metrics is quite simple, and the code snippet below shows how to invoke the PGI metric. Users can easily incorporate their own custom evaluation metrics into OpenXAI by filling out a form and providing a GitHub link to their code as well as a summary of their metric. Note that the code should take the form of a function which takes as input data instances, the corresponding model predictions and explanations, as well as OpenXAI's model object, and returns a numerical score. Finally, the input_dict is described here.

from openxai import Evaluator
metric_evaluator = Evaluator(model, metric='PGI')
score = metric_evaluator.evaluate(**kwargs)
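# Note: **kwargs unpacks the metric-specific input_dict mentioned above;
# its exact keys (e.g., which inputs and explanations to score) depend on
# the chosen metric, so consult the input_dict description before calling.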

OpenXAI Metrics

Ground-truth Faithfulness

OpenXAI includes the following metrics to compute the agreement between ground-truth explanations (i.e., coefficients of logistic regression models) and explanations generated by state-of-the-art methods; a from-scratch sketch of the first metric follows the list.

  1. Feature Agreement (FA) metric computes the fraction of top-K features that are common between a given post hoc explanation and the corresponding ground truth explanation.
  2. Rank Agreement (RA) metric measures the fraction of top-K features that are not only common between a given post hoc explanation and the corresponding ground truth explanation, but also have the same position in the respective rank orders.
  3. Sign Agreement (SA) metric computes the fraction of top-K features that are not only common between a given post hoc explanation and the corresponding ground truth explanation, but also share the same sign (direction of contribution) in both the explanations.
  4. Signed Rank Agreement (SRA) metric computes the fraction of top-K features that are not only common between a given post hoc explanation and the corresponding ground truth explanation, but also share the same feature attribution sign (direction of contribution) and position (rank) in both the explanations.
  5. Rank Correlation (RC) metric computes the Spearman’s rank correlation coefficient to measure the agreement between feature rankings provided by a given post hoc explanation and the corresponding ground truth explanation.
  6. Pairwise Rank Agreement (PRA) metric captures if the relative ordering of every pair of features is the same for a given post hoc explanation as well as the corresponding ground truth explanation i.e., if feature A is more important than B according to one explanation, then the same should be true for the other explanation. More specifically, this metric computes the fraction of feature pairs for which the relative ordering is the same between the two explanations.
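
To make the top-K construction concrete, here is a from-scratch sketch of the Feature Agreement metric, written directly from the definition above rather than taken from OpenXAI's implementation:

import numpy as np

def feature_agreement(attr_a, attr_b, k):
    # rank features by absolute attribution and compare the top-k index sets
    top_a = set(np.argsort(-np.abs(attr_a))[:k])
    top_b = set(np.argsort(-np.abs(attr_b))[:k])
    return len(top_a & top_b) / k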

Predicted Faithfulness

OpenXAI includes two complementary predictive faithfulness metrics: i) Prediction Gap on Important feature perturbation (PGI) which measures the difference in prediction probability that results from perturbing the features deemed as influential by a given post hoc explanation, and ii) Prediction Gap on Unimportant feature perturbation (PGU) which measures the difference in prediction probability that results from perturbing the features deemed as unimportant by a given post hoc explanation.
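
Schematically, both metrics draw small random perturbations of a feature subset and average the resulting change in the model's prediction. The sketch below follows that definition; the Gaussian noise scale, sample count, and function signature are illustrative assumptions, not OpenXAI's exact procedure.

import torch

def prediction_gap(model, x, attributions, k, perturb_top_k=True,
                   n_samples=50, sigma=0.1):
    # mark the top-k features by absolute attribution
    top_k = torch.argsort(attributions.abs(), descending=True)[:k]
    mask = torch.zeros_like(x, dtype=torch.bool)
    mask[top_k] = True
    if not perturb_top_k:
        mask = ~mask  # PGU perturbs the unimportant features instead
    base = model(x.unsqueeze(0))
    gaps = []
    for _ in range(n_samples):
        x_pert = torch.where(mask, x + sigma * torch.randn_like(x), x)
        gaps.append((model(x_pert.unsqueeze(0)) - base).abs().mean())
    return torch.stack(gaps).mean()  # PGI if perturb_top_k else PGU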

Stability

OpenXAI incorporates three stability metrics: i) Relative Input Stability (RIS), which measures the maximum change in explanation relative to changes in the inputs, ii) Relative Representation Stability (RRS), which measures the maximum change in explanation relative to changes in the internal representation learned by the model, and iii) Relative Output Stability (ROS), which measures the maximum change in explanation relative to changes in output prediction probabilities. A sketch of RIS follows.
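
The following from-scratch sketch of RIS is illustrative only: the helper model_explain, the choice of norm, and how the nearby inputs x' are generated are assumptions, not OpenXAI's implementation. RRS and ROS replace the denominator with the change in internal representations and output probabilities, respectively.

import torch

def relative_input_stability(model_explain, x, x_primes, p=2, eps=1e-6):
    # model_explain(x) is assumed to return the explanation vector for x
    e_x = model_explain(x)
    ratios = []
    for x_p in x_primes:  # nearby inputs with the same predicted label
        num = torch.norm((e_x - model_explain(x_p)) / e_x, p=p)
        den = max(torch.norm((x - x_p) / x, p=p).item(), eps)
        ratios.append(num / den)
    return max(ratios)  # worst-case (maximum) relative instability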

Fairness

We report the average of all faithfulness and stability metric values across instances in the majority and minority subgroups, and then take the absolute difference between the two averages to check for significant disparities; see the sketch below.
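
For instance, given per-instance metric scores and a boolean majority-group mask, the disparity check reduces to a mean gap (a sketch; names are illustrative):

import numpy as np

def subgroup_disparity(scores, majority_mask):
    # absolute gap in mean metric value between majority and minority groups
    scores = np.asarray(scores)
    majority_mask = np.asarray(majority_mask, dtype=bool)
    return abs(scores[majority_mask].mean() - scores[~majority_mask].mean())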

OpenXAI Leaderboards

Every explanation method in OpenXAI constitutes a benchmark, and we provide dataloaders and pre-trained models together with explanation methods and performance evaluation metrics. To participate in the leaderboard for a specific benchmark, follow these steps (an end-to-end sketch follows the list):

  • Use the OpenXAI benchmark dataloader to retrieve a given dataset.

  • Use the OpenXAI LoadModel to load a pre-trained model.

  • Use the OpenXAI Explainer to load a post hoc explanation method.

  • Submit the performance of the explanation method for a given metric.
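
Putting the first three steps together with the snippets from earlier sections (the arguments to evaluate() are metric-specific; input_dict stands for the dictionary described in the evaluation section above):

from openxai.dataloader import ReturnLoaders
from openxai import LoadModel, Explainer, Evaluator

trainloader, testloader = ReturnLoaders(data_name='german', download=True)
inputs, labels = next(iter(testloader))
model = LoadModel(data_name='german', ml_model='ann', pretrained=True)
explainer = Explainer(method='lime', model=model)
explanations = explainer.get_explanations(inputs)
evaluator = Evaluator(model, metric='PGI')
score = evaluator.evaluate(**input_dict)  # input_dict as described above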

Cite Us

If you find the OpenXAI benchmark useful, please cite our paper:

@inproceedings{agarwal2022openxai,
  title={Open{XAI}: Towards a Transparent Evaluation of Model Explanations},
  author={Chirag Agarwal and Satyapriya Krishna and Eshika Saxena and Martin Pawelczyk and Nari Johnson and Isha Puri and Marinka Zitnik and Himabindu Lakkaraju},
  booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2022},
  url={https://openreview.net/forum?id=MU2495w47rz}
}

Contact

Reach us at [email protected] or open a GitHub issue.

License

The OpenXAI codebase is under the MIT license. For individual dataset usage, please refer to the dataset licenses found on the website.

openxai's People

Contributors

chirag126, danwley, ehsankiakojouri, eshika, hnaik, jiaqima, s1682978, y0mingzhang, y12uc231


openxai's Issues

Explicit versions of dependencies necessary?

Hi everyone,
after your fix in #7 for installation, explicit versions are assigned to each of the dependent packages. Is this really necessary? At the moment, some dependent package versions conflict with my existing environments. Please ignore this issue if the explicit versions are really necessary; otherwise, I would recommend using lower bounds for the package versions instead.
Thank you!

Cannot install using the pip command

Hi,

I am trying to use the OpenXAI tool in the Google Colab environment, but when I use the pip command, the following error is shown:

"ERROR: File "setup.py" or "setup.cfg" not found. Directory cannot be installed in editable mode: /content"

Could you please provide a solution to this problem?

`evaluator.evaluate` uses only single sample

Hi, and thank you for your awesome repository!

I recently tried using this library to validate some attribution methods, specifically by utilizing evaluator.evaluate.

However, I discovered that it only uses an attribution map from a single sample (link).

        attrA = self.gt_feature_importances.detach().numpy().reshape(1, -1)
        attrB = self.explanation_x_f.detach().numpy().reshape(1, -1)

Is this behavior expected? The comment mentions evaluator.evaluate using n x m samples, so I am confused.

Fairness evaluation

Thanks for the great repo. It really helps to have the different explanation methods and evaluators at one place.
I do not see the code for fairness evaluation in the repo. Where can I find it?

Issue with pillow library compilation during install with PIP

Hello Team,

I am Anand from the University of Stuttgart, pursuing my doctoral research. While installing OpenXAI using the pip command, a compilation error is reported with respect to Pillow:

writing src/Pillow.egg-info/PKG-INFO
writing dependency_links to src/Pillow.egg-info/dependency_links.txt
writing top-level names to src/Pillow.egg-info/top_level.txt
reading manifest file 'src/Pillow.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*.c'
warning: no files found matching '*.h'
warning: no files found matching '*.sh'
warning: no previously-included files found matching '.appveyor.yml'
warning: no previously-included files found matching '.clang-format'
warning: no previously-included files found matching '.coveragerc'
warning: no previously-included files found matching '.editorconfig'
warning: no previously-included files found matching '.readthedocs.yml'
warning: no previously-included files found matching 'codecov.yml'
warning: no previously-included files matching '.git*' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*.so' found anywhere in distribution
no previously-included directories found matching '.ci'
writing manifest file 'src/Pillow.egg-info/SOURCES.txt'
running build_ext
    
The headers or library files could not be found for jpeg,
a required dependency when compiling Pillow from source.

Traceback (most recent call last):
File "", line 1, in
File "/tmp/pip-build-8_fu5dva/pillow/setup.py", line 1037, in
raise RequiredDependencyException(msg)
main.RequiredDependencyException:

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;file='/tmp/pip-build-8_fu5dva/pillow/setup.py';f=getattr(tokenize, 'open', open)(file);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, file, 'exec'))" install --record /tmp/pip-p4qugs1n-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-8_fu5dva/pillow/

Am I missing something else in the setup? The error message says some header files and other file types are missing.

Your feedback would be of great help!

Thanks in advance

Best regards,
Anand

[FEATURE REQUEST] Support for regression based problems

Are you planning to add support for regression tasks ?
If so, I'd suggest replacing the exact equality condition on the original and perturbed input predictions in the stability metrics (ŷ_x == ŷ_x') with something like |ŷ_x − ŷ_x'| < eps, where eps is a user-defined tolerance.

Does `evaluator.evaluate(metric="PGI")` actually compute PGU?

As the comment in the code snippet below states, the features in the top-K are kept static. For PGI we want the opposite: the features in the top-K should be perturbed and the rest should remain static.

        # keeping features static that are in top-K based on feature mask
        perturbed_samples = original_sample * feature_mask + perturbations * (~feature_mask)

(code snippet from NormalPerturbation.get_perturbed_inputs in explainers/catalog/perturbation_methods.py)

AttributeError: module 'numpy' has no attribute 'float'.

I am using numpy version 1.24.3 (>=1.19.5), but I am getting the error below. I did not see any mention of the Python version, so I used Python 3.10.

AttributeError Traceback (most recent call last)
Cell In[4], line 16
13 from openxai.Explainer import Explainer
15 # Evaluation methods
---> 16 from openxai.evaluator import Evaluator
18 # Perturbation methods required for the computation of the relative stability metrics
19 from openxai.explainers.catalog.perturbation_methods import NormalPerturbation

File ~/OpenXAI/openxai/__init__.py:3
1 from .LoadModel import LoadModel
2 from .Explainer import Explainer
----> 3 from .evaluator import Evaluator

File ~/OpenXAI/openxai/evaluator.py:9
5 from scipy.special import comb
6 import pandas as pd
----> 9 class Evaluator():
10 """ Metrics to evaluate an explanation method.
11 """
13 def __init__(self, input_dict: dict, inputs, labels, model, explainer):

File ~/OpenXAI/openxai/evaluator.py:381, in Evaluator()
377 y_perturbed = self._arr(self.model(x_perturbed)) ###
379 return np.mean(np.abs(y - y_perturbed), axis=0)[0]
...

AttributeError: module 'numpy' has no attribute 'float'.
np.float was a deprecated alias for the builtin float. To avoid this error in existing code, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...

Then I tried removing .float() from the entire evaluator.py, but the error remained the same. I also tried float(<variable_to_be_converted>), but the error did not change.
What else should I try?

[FEATURE REQUEST] Include the installation of dependency packages in setup.py

The current setup.py does not include dependency packages such as captum. Ideally, we should include the installation of dependency packages in setup.py so that 1) the end user does not need to install them manually, and 2) OpenXAI is more self-contained when someone wants to include it as a dependency in their package.

[FEATURE REQUEST] Clean up hard-coded paths

There are some hard-coded paths such as ./openxai/ML_Models/Saved_Models/ANN/gaussian_lr_0.002_acc_0.91.pt. This is not very friendly to an external API call as one may not be running code at the root of OpenXAI folder.

We should refactor the code to clean up such hard-coded paths.

A good solution might be hosting these files on a Google Drive, which will also reduce the repo size.

[FEATURE REQUEST] Refactor `Explainer` API

The current Explainer class requires method, model, and dataset_tensor as required arguments. It takes quite some extra lines of code for the user to construct dataset_tensor. We should try to eliminate this required argument.
