Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing

Paper

If you use this code in your research, please cite the following publication: https://arxiv.org/abs/2108.12510

@article{gowda2021pulling,
  title={Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing},
  author={Sindhu C.M. Gowda and Shalmali Joshi and Haoran Zhang and Marzyeh Ghassemi},
  journal={arXiv preprint arXiv:2108.12510},
  year={2021}
}

To replicate the experiments in the paper:

Step 0: Environment and Prerequisites

Run the following commands to clone this repo and create the Conda environment:

git clone git@github.com:MLforHealth/CausalDA.git
cd CausalDA/
conda env create -f environment.yml
conda activate causalda

Step 1: Obtaining the Data

See DataSources.md for detailed instructions on setting up the WILDS and CXR datasets. This step is not necessary for the synthetic experiments.

Step 2: Running Experiments

To train a single model, run, for example:

python train_synthetic.py \
    --type par_back_front \
    --corr-coff 0.75 \
    --test-corr 0.75 \
    --output_dir /path/to/output

or

python train.py \
    --type back \
    --data camelyon \
    --data_type Conf \
    --domains 2 3 \
    --corr-coff 0.95 \
    --seed 0 \
    --output_dir /path/to/output
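
For a handful of runs that differ only in the seed, the train.py command above can be generated programmatically. The following is a minimal sketch, not part of the repository: the helper name build_train_command and the per-seed output subdirectory are assumptions made for illustration, and only flags that appear in the command above are used. It prints the commands rather than executing them.

```python
import shlex


def build_train_command(seed, corr_coff, output_dir):
    """Return the argument list for one train.py run.

    Hypothetical helper: it only reuses the flags shown in the README
    command and writes each seed to its own subdirectory (an assumption).
    """
    return [
        "python", "train.py",
        "--type", "back",
        "--data", "camelyon",
        "--data_type", "Conf",
        "--domains", "2", "3",
        "--corr-coff", str(corr_coff),
        "--seed", str(seed),
        "--output_dir", f"{output_dir}/seed_{seed}",
    ]


# Build one command per seed and print each as a copy-pasteable shell line.
commands = [build_train_command(seed, 0.95, "/path/to/output") for seed in range(3)]
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell or passed to subprocess.run. For the full grids used in the paper, sweep.py remains the intended entry point.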

To reproduce the experiments in the paper by training grids of models, call sweep.py using the class names defined in experiments.py as experiment names, e.g.

python sweep.py launch \
    --experiment CXR \
    --output_dir /my/sweep/output/path \
    --command_launcher "local" 

This command can also be run using launch_scripts/launch_exp.sh. You will likely need to update the launcher to fit your compute environment.
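
The launcher interface is not described in this README. Assuming it follows the DomainBed convention, in which a launcher is a function that receives a list of shell command strings and decides how to execute them, a custom launcher might look like the sketch below. The name serial_launcher and the dry_run option are hypothetical; check the repository's launcher code for the actual signature and registration mechanism before adapting it.

```python
import subprocess


def serial_launcher(commands, dry_run=False):
    """Run shell command strings one after another on the local machine.

    Hypothetical sketch: with dry_run=True the commands are only printed,
    which is useful for inspecting a sweep before committing compute.
    Returns the list of exit codes (0 for every dry-run entry).
    """
    exit_codes = []
    for cmd in commands:
        if dry_run:
            print(f"[dry run] {cmd}")
            exit_codes.append(0)
        else:
            # shell=True because each command arrives as a single string.
            exit_codes.append(subprocess.call(cmd, shell=True))
    return exit_codes
```

On a cluster, the body of the loop would typically be replaced by a call that submits each command to the scheduler instead of running it locally.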

Step 3: Aggregating Results

We provide sample code for creating aggregate results for an experiment in AggResults.ipynb.

Acknowledgements

We make use of code from the WILDS benchmark as well as from the DomainBed framework.

License

This source code is released under the MIT license, included here.
