trusted-ai / aif360

A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

Home Page: https://aif360.res.ibm.com/

License: Apache License 2.0

Languages: Python 95.31%, R 3.66%, Java 1.01%, Dockerfile 0.02%
Topics: ai, fairness-ai, fairness, fairness-testing, fairness-awareness-model, bias-detection, bias, bias-correction, bias-reduction, bias-finder

aif360's Introduction

AI Fairness 360 (AIF360)


The AI Fairness 360 toolkit is an extensible open-source library containing techniques developed by the research community to help detect and mitigate bias in machine learning models throughout the AI application lifecycle. The AI Fairness 360 package is available in both Python and R.

The AI Fairness 360 package includes

  1. a comprehensive set of metrics for datasets and models to test for biases,
  2. explanations for these metrics, and
  3. algorithms to mitigate bias in datasets and models.

It is designed to translate algorithmic research from the lab into the actual practice of domains as wide-ranging as finance, human capital management, healthcare, and education. We invite you to use it and improve it.

The AI Fairness 360 interactive experience provides a gentle introduction to the concepts and capabilities. The tutorials and other notebooks offer a deeper, data scientist-oriented introduction. The complete API is also available.

Because the toolkit offers such a comprehensive set of capabilities, it can be difficult to figure out which metrics and algorithms are most appropriate for a given use case. To help, we have created some guidance material that can be consulted.

We have developed the package with extensibility in mind. This library is still in development. We encourage the contribution of your metrics, explainers, and debiasing algorithms.

Get in touch with us on Slack (invitation here)!

Supported bias mitigation algorithms

Supported fairness metrics

  • Comprehensive set of group fairness metrics derived from selection rates and error rates including rich subgroup fairness
  • Comprehensive set of sample distortion metrics
  • Generalized Entropy Index (Speicher et al., 2018) (see the sketch after this list)
  • Differential Fairness and Bias Amplification (Foulds et al., 2018)
  • Bias Scan with Multi-Dimensional Subset Scan (Zhang, Neill, 2017)
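
As an illustration of the Generalized Entropy Index listed above, here is a minimal sketch of the individual-fairness formulation from Speicher et al. (2018); the benefit definition b_i = y_pred_i - y_true_i + 1 is an assumption taken from that paper, and AIF360 exposes an equivalent metric on ClassificationMetric.

# Minimal sketch of the Generalized Entropy Index (Speicher et al., 2018),
# valid for alpha not in {0, 1}; benefits are b_i = y_pred_i - y_true_i + 1.
import numpy as np

def generalized_entropy_index(y_true, y_pred, alpha=2):
    b = np.asarray(y_pred) - np.asarray(y_true) + 1  # per-instance benefit
    mu = b.mean()
    return np.mean((b / mu) ** alpha - 1) / (alpha * (alpha - 1))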

Setup

R

install.packages("aif360")

For more details regarding the R setup, please refer to instructions here.

Python

Supported Python Configurations:

OS Python version
macOS 3.8 – 3.11
Ubuntu 3.8 – 3.11
Windows 3.8 – 3.11

(Optional) Create a virtual environment

AIF360 requires specific versions of many Python packages which may conflict with other projects on your system. A virtual environment manager is strongly recommended to ensure dependencies may be installed safely. If you have trouble installing AIF360, try this first.

Conda

Conda is recommended for all configurations though Virtualenv is generally interchangeable for our purposes. Miniconda is sufficient (see the difference between Anaconda and Miniconda if you are curious) if you do not already have conda installed.

Then, to create a new Python 3.11 environment, run:

conda create --name aif360 python=3.11
conda activate aif360

The shell should now look like (aif360) $. To deactivate the environment, run:

(aif360)$ conda deactivate

The prompt will return to $.

Install with pip

To install the latest stable version from PyPI, run:

pip install aif360

Note: Some algorithms require additional dependencies (although the metrics will all work out-of-the-box). To install with certain algorithm dependencies included, run, e.g.:

pip install 'aif360[LFR,OptimPreproc]'

or, for complete functionality, run:

pip install 'aif360[all]'

The options for available extras are: OptimPreproc, LFR, AdversarialDebiasing, DisparateImpactRemover, LIME, ART, Reductions, FairAdapt, inFairness, LawSchoolGPA, notebooks, tests, docs, all

If you encounter any errors, try the Troubleshooting steps.

Manual installation

Clone the latest version of this repository:

git clone https://github.com/Trusted-AI/AIF360

If you'd like to run the examples, download the datasets now and place them in their respective folders as described in aif360/data/README.md.

Then, navigate to the root directory of the project and run:

pip install --editable '.[all]'

Run the Examples

To run the example notebooks, complete the manual installation steps above. Then, if you did not use the [all] option, install the additional requirements as follows:

pip install -e '.[notebooks]'

Finally, if you did not already, download the datasets as described in aif360/data/README.md.

Troubleshooting

If you encounter any errors during the installation process, look for your issue here and try the solutions.

TensorFlow

See the Install TensorFlow with pip page for detailed instructions.

Note: we require 'tensorflow >= 1.13.1'.

Once tensorflow is installed, try re-running:

pip install 'aif360[AdversarialDebiasing]'

TensorFlow is only required for use with the aif360.algorithms.inprocessing.AdversarialDebiasing class.

CVXPY

On macOS, you may first have to install the Xcode Command Line Tools if you have never done so previously:

xcode-select --install

On Windows, you may need to download the Microsoft C++ Build Tools for Visual Studio 2019. See the CVXPY Install page for up-to-date instructions.

Then, try reinstalling via:

pip install 'aif360[OptimPreproc]'

CVXPY is only required for use with the aif360.algorithms.preprocessing.OptimPreproc class.

Using AIF360

The examples directory contains a diverse collection of Jupyter notebooks that use AI Fairness 360 in various ways. Both tutorials and demos illustrate working code using AIF360. Tutorials provide additional discussion that walks the user through the various steps of the notebook. See the details about tutorials and demos here.
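
For orientation, here is a minimal sketch of the typical workflow: load a dataset, compute a group fairness metric, and apply a pre-processing bias mitigation algorithm. It assumes the German Credit data files have already been downloaded as described in aif360/data/README.md; the age threshold and dropped columns follow the Credit Scoring tutorial.

from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Treat age as the protected attribute and drop the other sensitive columns.
dataset = GermanDataset(protected_attribute_names=['age'],
                        privileged_classes=[lambda x: x >= 25],
                        features_to_drop=['personal_status', 'sex'])
privileged_groups = [{'age': 1}]
unprivileged_groups = [{'age': 0}]

metric = BinaryLabelDatasetMetric(dataset,
                                  unprivileged_groups=unprivileged_groups,
                                  privileged_groups=privileged_groups)
print("Mean difference (original):", metric.mean_difference())

# Reweighing assigns instance weights that remove the dependence between the
# protected attribute and the favorable label in the training data.
rw = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
dataset_transf = rw.fit_transform(dataset)
metric_transf = BinaryLabelDatasetMetric(dataset_transf,
                                         unprivileged_groups=unprivileged_groups,
                                         privileged_groups=privileged_groups)
print("Mean difference (after reweighing):", metric_transf.mean_difference())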

Citing AIF360

A technical description of AI Fairness 360 is available in this paper. Below is the bibtex entry for this paper.

@misc{aif360-oct-2018,
    title = "{AI Fairness} 360:  An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias",
    author = {Rachel K. E. Bellamy and Kuntal Dey and Michael Hind and
	Samuel C. Hoffman and Stephanie Houde and Kalapriya Kannan and
	Pranay Lohia and Jacquelyn Martino and Sameep Mehta and
	Aleksandra Mojsilovic and Seema Nagar and Karthikeyan Natesan Ramamurthy and
	John Richards and Diptikalyan Saha and Prasanna Sattigeri and
	Moninder Singh and Kush R. Varshney and Yunfeng Zhang},
    month = oct,
    year = {2018},
    url = {https://arxiv.org/abs/1810.01943}
}

AIF360 Videos

  • Introductory video to AI Fairness 360 by Kush Varshney, September 20, 2018 (32 mins)

Contributing

The development fork for Rich Subgroup Fairness (inprocessing/gerryfair_classifier.py) is here. Contributions are welcome and a list of potential contributions from the authors can be found here.

aif360's People

Contributors

adebayo-oshingbesan, adrinjalali, animeshsingh, autoih, baraldian, barvek, ckadner, dependabot[bot], gdequeiroz, hirzel, hoffmansc, imgbot[bot], ivesulca, jimbudarz, josue-rodriguez, krvarshney, leenamurgai, mariaborbones, mfeffer, michaelhind, milevavantuyl, mnagired, monindersingh, nrkarthikeyan, pronics2004, romeokienzler, sohiniu, sreeja-g, ssaishruthi, zywind


aif360's Issues

features_to_drop in Class MEPSDataset19 not working

I tried to run the MEPSDataset19 example with features_to_drop. It ran with some features but failed with others, e.g. 'PHQ242'.

Error it gave: KeyError: "['PHQ242'] not in index"

Not sure if I did something wrong, but I found a workaround: by not including it in features_to_keep, the question of dropping it does not arise.

make estimators and scorers sklearn compatible

The inprocessing algorithms are basically like an Estimator. Ideally, it should be possible to replace a classifier in a scikit-learn pipeline, with one from aif360. An example pipeline (taken from this example) looks like:

model = make_pipeline(StandardScaler(),
                      LogisticRegression(solver='liblinear'))
X_train = meps_orig_train.features
y_train = meps_orig_train.labels.ravel()

model_lr = model.fit(X_train, y_train,
                     **{"logisticregression__sample_weight":meps_orig_train.instance_weights})

But when we move to use a model such as PrejudiceRemover, we have to break the pipeline and have the model separate, since it doesn't follow the API requirements to be fit for a pipeline.

model = PrejudiceRemover(sensitive_attr=sens_attr, eta = 25.0)
scale = StandardScaler().fit(tr_dataset.features)

tr_dataset.features = scale.transform(tr_dataset.features)
model.fit(tr_dataset)

Once it can be fit in a pipeline, then we can use all the other mechanisms already available in sklearn, such as GridSearchCV to find best hyperparameters for the problem at hand.

That also brings us to the scorers. AIF360's scorers don't fit into sklearn's scoring mechanism either. Once they do, we could use functions such as make_scorer to create a scoring function and feed it into sklearn's GridSearchCV, for instance.

This may not be a trivial task, and some useful things, such as having multiple scoring functions recorded and reported by the grid search, are still in discussion and not yet available in sklearn. Until then, we can provide an easy way for users to combine anti-bias scoring functions with performance scoring ones and use them to choose their best pipeline.

Another point regarding the API conventions is that if the preprocessing modules also fit into the sklearn Pipeline as transformers, then their selection can be included in a hyperparameter search as well, which is much easier than manually running them one by one and comparing results to find the best solution.

Right now, transformers which would change the number of samples or change the output are not supported in sklearn (AFAIK), but that's also in discussion and this use case may be a good push for it.
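
For concreteness, below is a rough sketch of the scorer half of this request using only standard scikit-learn machinery; the estimator is a plain LogisticRegression stand-in rather than an AIF360 algorithm, and appending the protected attribute as the last feature column is just a convenience so the scorer can recover it from X. A make_scorer-wrapped (y_true, y_pred) metric would not see the protected attribute at all, which is exactly the gap this issue describes, hence the (estimator, X, y) scorer signature here.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def parity_scorer(estimator, X, y, sensitive_col=-1):
    """Higher is better: negative absolute statistical parity difference."""
    y_pred = estimator.predict(X)
    sensitive = X[:, sensitive_col]
    diff = y_pred[sensitive == 0].mean() - y_pred[sensitive == 1].mean()
    return -abs(diff)

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
sensitive = np.random.RandomState(0).randint(0, 2, size=len(y))
X = np.hstack([X, sensitive.reshape(-1, 1)])  # last column = protected attribute

search = GridSearchCV(LogisticRegression(max_iter=1000),
                      param_grid={'C': [0.1, 1.0, 10.0]},
                      scoring=parity_scorer)
search.fit(X, y)
print(search.best_params_, search.best_score_)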

credit score tutorial

I would like to run this tutorial in Watson Studio. How can I access the data files?
How can I change the path to the data files?

Pre-processing for adult, german and compas dataset

I am trying to use the AIF360 tool in one of my projects. I am facing a problem understanding the purpose of pre-processing, say, the German credit dataset as described in the file AIF360/aif360/algorithms/preprocessing/optim_preproc_helpers/data_preproc_functions.py.
Why is the custom preprocessing described here needed for several algorithms, such as optimized pre-processing, the meta classifier, and reject option classification, while several others, e.g. adversarial debiasing, disparate impact remover, and reweighing, do not require it? Can you please help me understand the purpose?

Thank you.

fix deprecation warning in standard dataset

WARNING:root:Missing Data: 3620 rows removed from AdultDataset.
.../python3.7/site-packages/aif360/datasets/standard_dataset.py:121: FutureWarning: outer method for ufunc <ufunc 'equal'> is not implemented on pandas objects. Returning an ndarray, but in the future this will raise a 'NotImplementedError'. Consider explicitly converting the Series to an array with '.array' first.
priv = np.logical_or.reduce(np.equal.outer(vals, df[attr]))

The classification metric of accuracy, precision, and recall differs from scikit-learn

import pandas as pd
import sys
import numpy as np
np.random.seed(0)
from aif360.datasets import StructuredDataset as SD
from aif360.datasets import BinaryLabelDataset as BLD
from aif360.metrics import ClassificationMetric as CM
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing
from sklearn.ensemble import RandomForestClassifier as RF
from sklearn.datasets import make_classification as mc 
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

data, label = mc(n_samples=10000,n_features=30)
bias_feature = label.copy()
np.random.shuffle(bias_feature)
agg_data = np.hstack([data,  bias_feature.reshape(-1,1), label.reshape(-1,1),])
pd_data = pd.DataFrame(agg_data, columns=list(range(1,31)) + ["gender", "labels"])
dataset = BLD(favorable_label=0, unfavorable_label=1,df=pd_data,
              label_names=["labels"], protected_attribute_names=["gender"], 
              privileged_protected_attributes=[2])
dataset_orig_train, dataset_orig_test = dataset.split([0.7], shuffle=True)
dataset_orig_test_pred = dataset_orig_test.copy(deepcopy=True)
privileged_groups = [{'gender': 0}]
unprivileged_groups = [{'gender': 1}]
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig_train, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
clf = RF()
clf.fit(dataset_orig_train.features,dataset_orig_train.labels)

predictions = clf.predict(dataset_orig_test.features)
proba_predictions = clf.predict_proba(dataset_orig_test.features)

dataset_orig_test_pred.scores = proba_predictions[:,0].reshape(-1,1)
dataset_orig_test_pred.labels = predictions.reshape(-1, 1)

cm_pred_valid = CM(dataset_orig_test, dataset_orig_test_pred, unprivileged_groups=unprivileged_groups,
                             privileged_groups=privileged_groups)

cm = ["precision","recall", "accuracy"]


metrics = {}
for c in cm:
    metric = eval("cm_pred_valid." + c + "()")
    metrics[c] =  metric


metrics["recall"], metrics["accuracy"], metrics["precision"]


print("Scikit-learn metrics")
for key,value in {"recall": recall_score,"accuracy": accuracy_score, "precision": precision_score}.items():
    metric = value(dataset_orig_test.labels,predictions)
    print("{} score is: {}".format(key,metric))

print("AIF360 metrics")
for key in ["recall","accuracy", "precision"]:
    print("{} score is: {}".format(key,metrics[key]))

produces the following:

i.e. for scikit-learn

recall score is: 0.8780649436713055
accuracy score is: 0.8856666666666667
precision score is: 0.8928571428571429

and for AIF360

recall score is: 0.8933601609657947
accuracy score is: 0.8856666666666667
precision score is: 0.8786279683377308
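
A possible explanation (an assumption, not confirmed here): the dataset was constructed with favorable_label=0, and AIF360's ClassificationMetric computes precision and recall with respect to the favorable label, whereas scikit-learn's scorers default to pos_label=1. If that is the cause, telling scikit-learn to use the same positive class should reproduce the AIF360 numbers (accuracy is unaffected by the choice, which matches the output above):

# Hypothesis check: align scikit-learn's positive class with the favorable label (0).
print(recall_score(dataset_orig_test.labels, predictions, pos_label=0))
print(precision_score(dataset_orig_test.labels, predictions, pos_label=0))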

Missing dependencies: ModuleNotFoundError: No module named 'numba'

from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing


ModuleNotFoundError Traceback (most recent call last)
in
1 from aif360.datasets import GermanDataset
2 from aif360.metrics import BinaryLabelDatasetMetric
----> 3 from aif360.algorithms.preprocessing import Reweighing

~/.local/lib/python3.6/site-packages/aif360/algorithms/preprocessing/init.py in
1 from aif360.algorithms.preprocessing.disparate_impact_remover import DisparateImpactRemover
----> 2 from aif360.algorithms.preprocessing.lfr import LFR
3 from aif360.algorithms.preprocessing.optim_preproc import OptimPreproc
4 from aif360.algorithms.preprocessing.reweighing import Reweighing

~/.local/lib/python3.6/site-packages/aif360/algorithms/preprocessing/lfr.py in
3
4 from aif360.algorithms import Transformer
----> 5 from aif360.algorithms.preprocessing.lfr_helpers import helpers as lfr_helpers
6
7

~/.local/lib/python3.6/site-packages/aif360/algorithms/preprocessing/lfr_helpers/helpers.py in
1 # Based on code from https://github.com/zjelveh/learning-fair-representations
----> 2 from numba.decorators import jit
3 import numpy as np
4
5 @jit

ModuleNotFoundError: No module named 'numba'
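
A likely fix (an assumption based on the packaging described in the README above): numba is an optional dependency pulled in by the LFR-related extras, so installing it directly, or reinstalling AIF360 with the appropriate extra, should resolve the import error. Note that recent numba releases removed the numba.decorators module, so an older numba may be needed with this AIF360 version.

pip install numba
# or
pip install 'aif360[LFR]'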

Memory issues while opening StandardDataset

import pandas as pd
import sys
import numpy as np
np.random.seed(0)
from aif360.datasets import StructuredDataset as SD
from aif360.datasets import BinaryLabelDataset as BLD
from aif360.metrics import ClassificationMetric as CM
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing
from sklearn.ensemble import RandomForestClassifier as RF
from sklearn.datasets import make_classification as mc 
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

data, label = mc(n_samples=10000,n_features=30)
bias_feature = label.copy()
np.random.shuffle(bias_feature)
agg_data = np.hstack([data,  bias_feature.reshape(-1,1), label.reshape(-1,1),])
pd_data = pd.DataFrame(agg_data, columns=list(range(1,31)) + ["gender", "labels"])
dataset = BLD(favorable_label=0, unfavorable_label=1,df=pd_data,
              label_names=["labels"], protected_attribute_names=["gender"], 
              privileged_protected_attributes=[2])

Running BLD(favorable_label=0, unfavorable_label=1, df=pd_data, label_names=["labels"], protected_attribute_names=["gender"], privileged_protected_attributes=[2])
in a Python Jupyter notebook 3 times results in a MemoryError.

Use sphinx-gallery and `.py` examples instead of ipynb

Using sphinx and sphinx-gallery we can generate docs and ipynb notebooks automatically from python example files.

This would greatly improve traceability of changes in the examples, and make it possible to use them as part of the tests.

Subset of Dataset

I want to train in-processing algorithms like Prejudice Remover, the ART classifier, etc. using a specific subset of a dataset like Adult Income. Is there a way to create a subset based on row numbers, like df.loc[[2,3,4,5]]?
If so, I could then train and predict using the existing API, whose fit and predict functions take a 'dataset' type as input.
Not sure if this functionality is supported out of the box or if there is some workaround.
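
One possible workaround, sketched below under the assumption that round-tripping through pandas is acceptable (this is not an official subset API, and the exact keys returned by convert_to_dataframe should be checked against your version):

# Sketch: convert to a DataFrame, slice by row position, rebuild the dataset.
from aif360.datasets import BinaryLabelDataset

df, attrs = dataset.convert_to_dataframe()   # `dataset` is an existing BinaryLabelDataset
subset_df = df.iloc[[2, 3, 4, 5]]
subset = BinaryLabelDataset(df=subset_df,
                            label_names=attrs['label_names'],
                            protected_attribute_names=attrs['protected_attribute_names'],
                            favorable_label=dataset.favorable_label,
                            unfavorable_label=dataset.unfavorable_label)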

Include additional classification metrics

Include the following metrics:

  1. Equalized odds difference: max(|FPR_unpriv - FPR_priv|, |TPR_unpriv - TPR_priv|) (see the sketch after this list)
  2. Generalized equalized odds difference: max(|GFPR_unpriv - GFPR_priv|, |GTPR_unpriv - GTPR_priv|)
  3. Generalized selection rate: mean score possibly conditioned by the group E[\hat{S}]
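
A minimal sketch of metric 1 in terms of the existing ClassificationMetric rate methods (the generalized variant would use the corresponding generalized rates); cm is assumed to be a ClassificationMetric constructed with privileged and unprivileged groups:

def equalized_odds_difference(cm):
    # max of the absolute FPR and TPR gaps between unprivileged and privileged groups
    fpr_diff = abs(cm.false_positive_rate(privileged=False)
                   - cm.false_positive_rate(privileged=True))
    tpr_diff = abs(cm.true_positive_rate(privileged=False)
                   - cm.true_positive_rate(privileged=True))
    return max(fpr_diff, tpr_diff)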

Upgrade tensorflow>=1.12.1

Upgrade tensorflow to version 1.12.1 or later for security fixes
Details
CVE-2019-9635 More information
moderate severity
Vulnerable versions: >= 1.0.0, < 1.12.1
Patched version: 1.12.1
NULL pointer dereference in Google TensorFlow before 1.12.2 could cause a denial of service via an invalid GIF file.

CVE-2018-7575 More information
critical severity
Vulnerable versions: >= 1.0.0, < 1.7.1
Patched version: 1.7.1
Google TensorFlow 1.7.x and earlier is affected by a Buffer Overflow vulnerability. The type of exploitation is context-dependent.

CVE-2018-7577 More information
high severity
Vulnerable versions: >= 1.1.0, < 1.7.1
Patched version: 1.7.1
Memcpy parameter overlap in Google Snappy library 1.1.4, as used in Google TensorFlow before 1.7.1, could result in a crash or read from other parts of process memory.

CVE-2018-10055 More information
high severity
Vulnerable versions: >= 1.1.0, < 1.7.1
Patched version: 1.7.1
Invalid memory access and/or a heap buffer overflow in the TensorFlow XLA compiler in Google TensorFlow before 1.7.1 could cause a crash or read from other parts of process memory via a crafted configuration file.

CVE-2018-7576 More information
moderate severity
Vulnerable versions: >= 1.0.0, < 1.6.0
Patched version: 1.6.0
Google TensorFlow 1.6.x and earlier is affected by: Null Pointer Dereference. The type of exploitation is: context-dependent.

Typo in the German Credit Scoring tutorial

In step 5 of the Credit Scoring notebook, there is a typo in the section below. The decimal point is off by one place in the text above the cell: "the difference in mean outcomes is now 0.18250". The value should instead be 0.018250, as shown below the cell.


Undefined name: CaldersVerwerTwoNaiveBayes in test_cv2nb.py

This failure only happens on Python 3 and is probably caused by the scoping difference in Python 2 and Python 3.

flake8 testing of https://github.com/IBM/AIF360 on Python 3.6

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./aif360/algorithms/inprocessing/kamfadm-2012ecmlpkdd/fadm/nb/tests/test_cv2nb.py:18:26: F821 undefined name 'CaldersVerwerTwoNaiveBayes'
        self.assertEqual(CaldersVerwerTwoNaiveBayes.N_CLASSES, 2)
                         ^
./aif360/algorithms/inprocessing/kamfadm-2012ecmlpkdd/fadm/nb/tests/test_cv2nb.py:19:26: F821 undefined name 'CaldersVerwerTwoNaiveBayes'
        self.assertEqual(CaldersVerwerTwoNaiveBayes.N_S_VALUES, 2)
                         ^
./aif360/algorithms/inprocessing/kamfadm-2012ecmlpkdd/fadm/nb/tests/test_cv2nb.py:20:13: F821 undefined name 'CaldersVerwerTwoNaiveBayes'
        m = CaldersVerwerTwoNaiveBayes(5, [2, 2, 2, 2, 3], 1.0, 0.8)
            ^
3    F821 undefined name 'CaldersVerwerTwoNaiveBayes'
3

GerryFairClassifier Pre and Postprocessing Equivalent

Are there pre- and post-processing techniques available with respect to rich subgroup fairness? And are there any studies comparing model-agnostic methods with model-specific methods? I just can't imagine none existing; I have looked everywhere. Thanks.

Remove gender recognition example

Automated Gender Recognition has been thoroughly shown to be fundamentally discriminatory against trans people. Given that, it's somewhat bizarre to see a tutorial on how to make it "fairer" in a project aiming to reduce discriminatory outcomes from ML. I would really suggest (and appreciate it!) that the example be removed - it has no place here.

Reconsider use of "bias" in README

There are some issues with the way "bias" is being used in the README that are both inconsistent with what is actually possible with the tool & the current state of social science.

"The AI Fairness 360 toolkit is an open-source library to help detect and remove bias in machine learning models. The AI Fairness 360 Python package includes a comprehensive set of metrics for datasets and models to test for biases, explanations for these metrics, and algorithms to mitigate bias in datasets and models."

While it's clear that we intend bias in this context to mean "prejudice" or a disproportionate skew towards something, we must consider the larger impacts of this perspective. First off, it is impossible not to be prejudiced. In the social sciences, we call this "subjectivity". While some disciplines treat "objectivity" as an ideal, this concept is not transferable to human behavior, which is what we are modeling in our software applications. All humans have a distinct perspective; therefore all humans are prejudiced in some way.

The data & software applications that AIF360 seeks to "de-bias" were made by people based on their assumptions about how the world works & what they think matters. A software application is one group of people's collective opinion about what their stakeholders need. The structure of the software itself is based on these people's experience & is therefore prejudiced. For example, efforts like Gendermag & A11y are helping people working in open source projects address assumptions about how their stakeholders are using their software tools at a very fundamental level.

When training data is both curated & labeled, assumptions are made by the curators & it's clear from the description of the project that this is what AIF360 can help to address. However we must also acknowledge that AIF360 is also prejudiced by the assumptions of its creators & maintainers, & therefore can neither remove bias nor de-bias.

What is "fair" & "correct" is highly situational. What is "fair" in one situation may not be "fair" in another. In some social situations, such as in Law Enforcement, the lack of fairness is indicative of larger social issues & "de-biasing" could potentially further harm disadvantaged stakeholders who have been excluded from having their own voice. We should assume such deviations on "fairness" to be the norm, not an aberration. This assumption of a universal ideal norm (or bias if you will) exists within AIF360 itself.

I suggest we reconsider that bias & prejudice are tied to something inherent in the human condition & are therefore unavoidable. Instead of a definition that dictates our own biased "Truth" through the removal of any non-conforming perspectives, I propose we re-define bias as simply "limited perspective" & re-evaluate our language & explanations from that starting point.

This issue is quite significant because our framing of "bias" presents potential harms to IBM's own credibility & intention. IBM has historically had moments where we were very narrow & short-sighted in our contribution to software projects. The global scale of our Thought-Leadership & influence meant our mistakes significantly impacted people's lives. That said, for all the harms we've done, we achieve great things too! IBM has the advantage of having a much longer-term history than other tech companies to draw upon. Given the scale of our potential impact, we have a corporate social responsibility to thoughtfully consider who is at stake when we bring new ideas into the world.

"To visualize the future of IBM, you must know something of the past" -Thomas J. Watson, Sr.

“Now, thanks to confidential corporate documents and interviews with many of the technologists involved in developing the software, The Intercept and the Investigative Fund have learned that IBM began developing this object identification technology using secret access to NYPD camera footage. With access to images of thousands of unknowing New Yorkers offered up by NYPD officials, as early as 2012, IBM was creating new search features that allow other police departments to search camera footage for images of people by hair color, facial hair, and skin tone.”

Documents - https://www.documentcloud.org/documents/4452844-IBM-SVS-Analytics-4-0-Plan-Update-for-NYPD-6.html

  • That time when IBM helped Indiana kick people off Welfare

“In November 2006, Indiana Governor Mitch Daniels announced a 10-year, $1.16 billion contract with a consortium of tech companies, led by IBM and Affiliated Computer Services (ACS), to modernize and privatize eligibility procedures for the state’s Medicaid, food-stamp, and cash-assistance programs.”

“The design of electronic-governance systems affects our material well-being, the health of our democracy, and equity in our communities. But somehow, when we talk about data-driven government, we conveniently omit the often terrible impacts that these systems have on the poor and working-class people”

Further reading on the "bias" in software itself:

  • Forsythe, Diana E., and David J. Hess. Studying Those Who Study Us: An Anthropologist in the World of Artificial Intelligence. Stanford University Press, 2002
  • Eubanks, Virginia. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. Picador, 2019.

(enhancement) pre/post/in processing classes within aif360.algorithms could take in key word arguments during instantiation

For example, refer to the module aif360.algorithms.preprocessing.optim_preproc.

In order to create an OptimPreproc object, a user may have to pass in a bunch of arguments that are a mix of keyword and positional arguments. In the constructor below, optim_options is a dictionary but the rest are just positional.

class OptimPreproc(Transformer):
    def __init__(self, optimizer, optim_options, unprivileged_groups=None,
                 privileged_groups=None, verbose=False, seed=None):

It may be better to simply require keyword arguments instead.
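
As an illustration of the suggestion, here is a hypothetical keyword-only variant of the constructor (not the current API); the bare * forces callers to name every argument:

class OptimPreprocKeywordOnly:
    def __init__(self, *, optimizer, optim_options, unprivileged_groups=None,
                 privileged_groups=None, verbose=False, seed=None):
        self.optimizer = optimizer
        self.optim_options = optim_options
        self.unprivileged_groups = unprivileged_groups
        self.privileged_groups = privileged_groups
        self.verbose = verbose
        self.seed = seed

# Call sites then read unambiguously, e.g.:
# op = OptimPreprocKeywordOnly(optimizer=my_optimizer, optim_options=optim_options)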

inverse_transform() not implemented for disparate impact remover

I am running the disparate impact demo. The output is encoded data and columns, and there isn't an inverse_transform function implemented in the code, as is often the case elsewhere, or maybe I am missing something?

I need the original 15-16 columns for post-processing analysis, and I would like to avoid doing the recovery manually!

Documentation Typo for average_odds_difference()

Hi AIF360 team,

Thank you for the excellent repo & superb documentation!

I believe there is a typo for both:
https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.average_odds_difference

https://aif360.readthedocs.io/en/latest/modules/metrics.html#aif360.metrics.ClassificationMetric.average_abs_odds_difference

The implementation correctly returns

1/2 [(FPR_unprivileged - FPR_privileged) + (TPR_unprivileged - TPR_privileged)]

vs. the stated

1/2 [(FPR_unprivileged - FPR_privileged) + (TPR_privileged - TPR_unprivileged)]

Docstrings incorrect for generalized FN, FP & TN

In aif360.metrics.ClassificationMetric, the docstrings for generalized FN, FP, and TN all incorrectly describe the quantity as the weighted sum of predicted scores where true labels are 'favorable' and need to be updated to reflect the appropriate labels, e.g.:

        """Return the generalized number of false negatives, :math:`GFN`, the
        weighted sum of predicted scores where true labels are 'favorable',
        optionally conditioned on protected attributes."""
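
For reference, the definitions as I understand them (an assumption based on the generalized confusion matrix, so please verify against the implementation): GTP is the weighted sum of predicted scores where true labels are 'favorable'; GFN is the weighted sum of (1 - predicted score) where true labels are 'favorable'; GFP is the weighted sum of predicted scores where true labels are 'unfavorable'; GTN is the weighted sum of (1 - predicted score) where true labels are 'unfavorable'.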

The german credit dataset's privileged class should be Age > 25, not Age >= 25

https://github.com/IBM/AIF360/blob/c718f1d6cd11f1a7536a1317ea280abd961a7c79/aif360/datasets/german_dataset.py#L31

So I looked at the original paper on this dataset: F. Kamiran and T. Calders, "Classifying without discriminating." They mentioned that 190 account holders were classified as young, which corresponds to classifying people with age <= 25 as Young. If set as age < 25, it would remove about 50 instances from the Young group. This might be why the results on this dataset were unstable. Changing this should also improve the results on the website.
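
In the meantime, the threshold can be overridden at load time, since privileged_classes accepts callables (a sketch; all other GermanDataset arguments keep their defaults):

from aif360.datasets import GermanDataset

# Use age > 25 (rather than >= 25) to mark the privileged group.
dataset = GermanDataset(protected_attribute_names=['age'],
                        privileged_classes=[lambda x: x > 25])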

make flake8 on Travis PR diff aware, and change line length

Right now, .travis.yml checks for some flake8 errors, emitting only warnings, with this line:

# exit-zero treats all errors as warnings.  The GitHub editor is 127 chars wide
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

I know GitHub's editor's max line is 127, but that's far too long for easily reviewing PRs on different systems and screens. I think it may make more sense to change that to 80 (what PEP8 recommends, with convincing arguments), and since we don't want to have to fix what is already in the repo, we can try to enforce that on new code only.

scikit-learn, for instance, uses a nice script which fails only if new flake8 warnings are introduced; it can be found here: https://github.com/scikit-learn/scikit-learn/blob/master/build_tools/travis/flake8_diff.sh

Support for MacOS

What OS was AIF360 built on? I'm attempting to run it on a MacBook Pro with the latest macOS; however, I am getting errors when attempting to install the module 'cvxpy'. During 'pip install cvxpy', I get numerous compilation errors:

  warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
  In file included from cvxpy/cvxcore/src/cvxcore.cpp:15:
  cvxpy/cvxcore/src/cvxcore.hpp:18:10: fatal error: 'vector' file not found
  #include <vector>
           ^~~~~~~~
  1 warning and 1 error generated.
  error: command 'gcc' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for cvxpy

Unable to replicate demo_lfr.ipynb notebook

While trying to replicate the demo_lfr notebook, I am getting the following error.
I also tried the changes mentioned in issue 83.

/AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py:62: NumbaWarning: 
Compilation is falling back to object mode WITH looplifting enabled because Function "LFR_optim_obj" failed type inference due to: Unknown attribute 'iters' of type recursive(type(CPUDispatcher(<function LFR_optim_obj at 0x7f5fea38a2f0>)))

File "AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py", line 66:
def LFR_optim_obj(params, data_sensitive, data_nonsensitive, y_sensitive,
    <source elided>

    LFR_optim_obj.iters += 1
    ^

[1] During: typing of get attribute at /AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py (66)

File "AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py", line 66:
def LFR_optim_obj(params, data_sensitive, data_nonsensitive, y_sensitive,
    <source elided>

    LFR_optim_obj.iters += 1
    ^

@jit
/AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py:62: NumbaWarning: 
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "LFR_optim_obj" failed type inference due to: cannot determine Numba type of <class 'numba.dispatcher.LiftedLoop'>

File "AIF360/aif360/algorithms/preprocessing/lfr_helpers/helpers.py", line 85:
def LFR_optim_obj(params, data_sensitive, data_nonsensitive, y_sensitive,
    <source elided>
    L_z = 0.0
    for j in range(k):
    ^

@jit

unused requirements

Why are there a bunch of packages in this repo that are not used anywhere in the codebase? Specifically:
Orange3
scs
networkx

Furthermore, there are packages only used in the notebooks and not in the library:
all the ipython stuff
lime
tqdm
matplotlib

NotImplementedError in standard dataset

Related to #109. Previously it warned "Returning an ndarray, but in the future this will raise a 'NotImplementedError'", but now it raises the following error:

NotImplementedError                       Traceback (most recent call last)
<ipython-input-3-3996a519ec26> in <module>()
      1 from aif360.datasets import AdultDataset
----> 2 data = AdultDataset()

/Users/staceyro/anaconda/envs/aif360/lib/python3.7/site-packages/aif360/datasets/adult_dataset.py in __init__(self, label_name, favorable_classes, protected_attribute_names, privileged_classes, instance_weights_name, categorical_features, features_to_keep, features_to_drop, na_values, custom_preprocessing, metadata)
    110             features_to_keep=features_to_keep,
    111             features_to_drop=features_to_drop, na_values=na_values,
--> 112             custom_preprocessing=custom_preprocessing, metadata=metadata)

/Users/staceyro/anaconda/envs/aif360/lib/python3.7/site-packages/aif360/datasets/standard_dataset.py in __init__(self, df, label_name, favorable_classes, protected_attribute_names, privileged_classes, instance_weights_name, scores_name, categorical_features, features_to_keep, features_to_drop, na_values, custom_preprocessing, metadata)
    119             else:
    120                 # find all instances which match any of the attribute values
--> 121                 priv = np.logical_or.reduce(np.equal.outer(vals, df[attr]))
    122                 df.loc[priv, attr] = privileged_values[0]
    123                 df.loc[~priv, attr] = unprivileged_values[0]

/Users/staceyro/anaconda/envs/aif360/lib/python3.7/site-packages/pandas/core/series.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    703             return None
    704         else:
--> 705             return construct_return(result)
    706 
    707     def __array__(self, dtype=None) -> np.ndarray:

/Users/staceyro/anaconda/envs/aif360/lib/python3.7/site-packages/pandas/core/series.py in construct_return(result)
    692                 if method == "outer":
    693                     # GH#27198
--> 694                     raise NotImplementedError
    695                 return result
    696             return self._constructor(result, index=index, name=name, copy=False)

NotImplementedError: 

Python: 3.7.2
Pandas: 1.0.3
AIF360: 0.2.3 (built from source)

To replicate:

import aif360
from aif360.datasets import AdultDataset
data = AdultDataset()
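
A possible library-side fix (an assumption that simply follows the advice in the earlier FutureWarning to convert the Series to an array before the outer ufunc call):

# aif360/datasets/standard_dataset.py, around the line shown in the traceback:
priv = np.logical_or.reduce(np.equal.outer(vals, df[attr].to_numpy()))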

AIF360 calculations are off

Hi all,

I am checking the values that AIF360 outputs and have noticed that they seem off. I have calculated them on my own using the lines below, where protected and unprotected are lists of zeros and ones.

sum(protected)/len(protected) = 0.7643312101910829
sum(unprotected)/len(unprotected) = 0.1386481802426343

Disparate impact by AIF360: 3.6549252892407136
Disparate impact by me: 5.512738853503185

Statistical parity difference by AIF360: 0.6256830299484484
Statistical parity difference by me: 0.6256830299484486

What especially strikes me is the difference between the disparate impact "error" and the statistical parity difference "error", as they use practically the same numbers. Has anyone encountered this issue before?
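
One configuration that reproduces these numbers exactly (an assumption, since the dataset setup is not shown): if both the favorable label and the privileged/unprivileged group designation passed to the metric are the complements of the ones used in the hand calculation, the statistical parity difference is unchanged, since (1 - 0.1386) - (1 - 0.7643) = 0.7643 - 0.1386 ≈ 0.6257, but the disparate impact ratio becomes (1 - 0.1386) / (1 - 0.7643) ≈ 0.8614 / 0.2357 ≈ 3.6549, which matches the AIF360 value above. It is therefore worth double-checking favorable_label and the privileged_groups/unprivileged_groups dictionaries used when constructing the metric.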
