Conformalized Quantile Regression

Home Page: https://sites.google.com/view/cqr

License: Other


cqr's Introduction

Reliable Predictive Inference

An important factor in the responsible use of data-driven recommendation systems is the ability to communicate their uncertainty to decision makers. This can be accomplished by constructing prediction intervals, which provide an intuitive measure of the limits of predictive performance.

This package contains a Python implementation of the conformalized quantile regression (CQR) [1] methodology for constructing marginal distribution-free prediction intervals. It also implements the equalized coverage framework [2], which builds valid group-conditional prediction intervals.

Conformalized Quantile Regression [1]

CQR is a technique for constructing prediction intervals that attain valid coverage in finite samples, without making distributional assumptions. It combines the statistical efficiency of quantile regression with the distribution-free coverage guarantee of conformal prediction. On one hand, CQR is flexible in that it can wrap around any algorithm for quantile regression, including random forests and deep neural networks. On the other hand, a key strength of CQR is its rigorous control of the miscoverage rate, independent of the underlying regression algorithm.

[1] Yaniv Romano, Evan Patterson, and Emmanuel J. Candès, “Conformalized quantile regression.” 2019.
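The calibration step described above can be illustrated with a minimal sketch. It assumes a split-conformal setup in which lower and upper quantile estimates have already been produced by some fitted quantile regressor on a held-out calibration set. The function names cqr_correction and cqr_interval are hypothetical and are not part of this package's API.

```python
import math

def cqr_correction(y_cal, lo_cal, hi_cal, alpha):
    """Return the conformal correction computed on a calibration set.

    y_cal:  observed responses on the calibration set
    lo_cal: estimated lower quantiles at the same points
    hi_cal: estimated upper quantiles at the same points
    alpha:  target miscoverage rate, e.g. 0.1 for 90% intervals
    """
    # Conformity score: positive when y falls outside [lo, hi],
    # negative when it falls strictly inside.
    scores = sorted(max(lo - y, y - hi)
                    for y, lo, hi in zip(y_cal, lo_cal, hi_cal))
    n = len(scores)
    # Use the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return scores[k - 1]

def cqr_interval(lo_new, hi_new, correction):
    """Widen (or shrink) the estimated quantile band by the correction."""
    return lo_new - correction, hi_new + correction

# Toy calibration data (illustrative values only).
y_cal = [1.0, 2.0, 3.0, 4.0]
lo_cal = [0.5, 2.5, 2.0, 3.0]
hi_cal = [1.5, 3.5, 2.5, 5.0]

q = cqr_correction(y_cal, lo_cal, hi_cal, alpha=0.5)
print(q)                          # 0.5
print(cqr_interval(1.0, 2.0, q))  # (0.5, 2.5)
```

Because the correction depends only on the calibration scores, the same step applies whichever quantile regression algorithm produced lo_cal and hi_cal.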

Equalized Coverage [2]

To support equitable treatment, the equalized coverage methodology requires the prediction intervals to be unbiased, in the sense that their coverage must be equal across all protected groups of interest. Similar to CQR and conformal inference, equalized coverage offers rigorous distribution-free guarantees that hold in finite samples. This methodology can also be viewed as a wrapper around any predictive algorithm.

[2] Y. Romano, R. F. Barber, C. Sabatti and E. J. Candès, “With malice towards none: Assessing uncertainty via equalized coverage.” 2019.
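One way to picture group-conditional calibration is to compute a separate conformal correction for each protected group, so that each group's intervals are calibrated on that group's own data. The sketch below assumes this group-wise split and reuses CQR-style conformity scores. The helper names are hypothetical, and this is not presented as the exact procedure of [2] or as this package's API.

```python
import math
from collections import defaultdict

def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest score, or inf."""
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil((n + 1) * (1 - alpha))
    return ordered[k - 1] if k <= n else math.inf

def groupwise_corrections(y_cal, lo_cal, hi_cal, groups_cal, alpha):
    """Compute one conformal correction per group label."""
    scores_by_group = defaultdict(list)
    for y, lo, hi, g in zip(y_cal, lo_cal, hi_cal, groups_cal):
        # Same conformity score as in CQR, bucketed by group.
        scores_by_group[g].append(max(lo - y, y - hi))
    return {g: conformal_quantile(s, alpha)
            for g, s in scores_by_group.items()}

def group_interval(lo_new, hi_new, group, corrections):
    """Apply the correction that belongs to the test point's group."""
    c = corrections[group]
    return lo_new - c, hi_new + c

# Toy calibration data with two groups (illustrative values only).
y_cal = [1.0, 2.0, 5.0, 6.0, 7.0, 8.0]
lo_cal = [0.0, 2.0, 3.0, 1.0, 1.0, 1.0]
hi_cal = [2.0, 3.0, 4.0, 4.0, 4.0, 4.0]
groups = ["a", "a", "a", "b", "b", "b"]

corrections = groupwise_corrections(y_cal, lo_cal, hi_cal, groups, alpha=0.5)
print(corrections)                                     # {'a': 0.0, 'b': 3.0}
print(group_interval(1.0, 4.0, "b", corrections))      # (-2.0, 7.0)
```

In this toy example the quantile estimates fit group "a" well and group "b" poorly, so group "b" receives a larger correction and therefore wider intervals.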

Getting Started

This package is self-contained and implemented in Python.

Part of the code is taken from the nonconformist package available at https://github.com/donlnz/nonconformist. One may refer to the nonconformist repository to view other applications of conformal prediction.

Prerequisites

  • python
  • numpy
  • scipy
  • scikit-learn
  • scikit-garden
  • pytorch
  • pandas
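A quick way to see which of the prerequisites above are missing from the current environment is a small import check. This is a sketch using only the standard library. The import names listed are assumptions based on how these packages are usually imported (sklearn for scikit-learn, skgarden for scikit-garden, torch for pytorch).

```python
from importlib.util import find_spec

# Import names assumed for the prerequisites listed above.
REQUIRED_MODULES = ["numpy", "scipy", "sklearn", "skgarden", "torch", "pandas"]

def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if find_spec(name) is None]

if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```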

Installing

The development version is available on GitHub:

git clone https://github.com/yromano/cqr.git

Usage

CQR

Please refer to cqr_real_data_example.ipynb for basic usage. Comparisons to competing methods and additional usage examples of this package can be found in cqr_synthetic_data_example_1.ipynb and cqr_synthetic_data_example_2.ipynb.

Equalized Coverage

The notebook detect_prediction_bias_example.ipynb performs a simple data analysis of the MEPS 21 data set and detects bias in the predictions. The notebook equalized_coverage_example.ipynb illustrates how to run the methods proposed in [2] and construct prediction intervals with equal coverage across groups.

Reproducible Research

The code available under /reproducible_experiments/ in the repository replicates the experimental results in [1] and [2].

Publicly Available Datasets

  • Blog: BlogFeedback data set.

  • Bio: Physicochemical properties of protein tertiary structure data set.

  • Bike: Bike sharing data set.

  • Community: Communities and crime data set.

  • STAR: C.M. Achilles, Helen Pate Bain, Fred Bellott, Jayne Boyd-Zaharias, Jeremy Finn, John Folger, John Johnston, and Elizabeth Word. Tennessee’s Student Teacher Achievement Ratio (STAR) project, 2008.

  • Concrete: Concrete compressive strength data set.

  • Facebook Variant 1 and Variant 2: Facebook comment volume data set.

Data subject to copyright/usage rules

The Medical Expenditure Panel Survey (MEPS) data can be downloaded using the code in the folder /get_meps_data/ under this repository. It is based on this explanation (code provided by IBM's AIF360).

  • MEPS_19: Medical expenditure panel survey, panel 19.

  • MEPS_20: Medical expenditure panel survey, panel 20.

  • MEPS_21: Medical expenditure panel survey, panel 21.

License

This project is licensed under the MIT License - see the LICENSE file for details.

cqr's Issues

Building current.

Not sure this is still updated but would be nice to integrate the build with the current packages.

data exchangeably

@yromano In conformal prediction, the regression algorithm must treat the data exchangeably. I want to know whether CQR must treat the data exchangeably.

QuantileRegErrFunc does not exist

In one of the examples there is this line: from nonconformist.nc import QuantileRegErrFunc

This error function does not exist in that library.

A problem about scikit-garden

When I install scikit-garden, it keeps prompting me that the installation has failed, does anyone have the same problem? If solved, can you share the solution?
