salimamoukou / mapie

This project forked from scikit-learn-contrib/mapie

A scikit-learn-compatible module for estimating prediction intervals.

Home Page: https://mapie.readthedocs.io/en/latest/

License: BSD 3-Clause "New" or "Revised" License

Python 8.50% Makefile 0.03% Jupyter Notebook 91.47%

mapie's Introduction

MAPIE - Model Agnostic Prediction Interval Estimator

MAPIE allows you to easily estimate prediction intervals (or prediction sets) using your favourite scikit-learn-compatible model for single-output regression or multi-class classification settings.

Prediction intervals output by MAPIE encompass both aleatoric and epistemic uncertainties and are backed by strong theoretical guarantees thanks to conformal prediction methods [1-7].

🔗 Requirements

Python 3.7+

MAPIE stands on the shoulders of giants.

Its only dependencies are scikit-learn and numpy>=1.21.

🛠 Installation

Install via `pip`:

$ pip install mapie

or via `conda`:

$ conda install -c conda-forge mapie

To install directly from the GitHub repository:

$ pip install git+https://github.com/scikit-learn-contrib/MAPIE

⚡️ Quickstart

Let us start with a basic regression problem. Here, we generate one-dimensional noisy data that we fit with a linear model.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression

regressor = LinearRegression()
X, y = make_regression(n_samples=500, n_features=1, noise=20, random_state=59)

Since MAPIE is compliant with the standard scikit-learn API, we follow the usual sequential fit and predict process, as with any scikit-learn regressor. We set two values for alpha, 0.05 and 0.32, to estimate prediction intervals at approximately two and one standard deviations from the mean, respectively.

from mapie.regression import MapieRegressor
alpha = [0.05, 0.32]
mapie = MapieRegressor(regressor)
mapie.fit(X, y)
y_pred, y_pis = mapie.predict(X, alpha=alpha)

MAPIE returns the point predictions y_pred, an array of shape (n_samples,), together with y_pis, a np.ndarray of shape (n_samples, 2, len(alpha)) giving the lower and upper bounds of the prediction intervals for each desired alpha value.

You can compute the coverage of your prediction intervals.

from mapie.metrics import regression_coverage_score
coverage_scores = [
    regression_coverage_score(y, y_pis[:, 0, i], y_pis[:, 1, i])
    for i, _ in enumerate(alpha)
]
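To make explicit what such a coverage score measures, here is a minimal NumPy sketch. It assumes coverage is simply the fraction of targets that fall inside their interval, bounds included. The helper name `empirical_coverage` and the toy arrays are hypothetical and are not part of MAPIE.

```python
import numpy as np


def empirical_coverage(y_true, y_low, y_up):
    """Fraction of targets that fall inside [y_low, y_up], bounds included.

    Hypothetical helper for illustration only, not a MAPIE function.
    """
    y_true, y_low, y_up = map(np.asarray, (y_true, y_low, y_up))
    return float(np.mean((y_low <= y_true) & (y_true <= y_up)))


# Toy data: 4 of the 5 targets lie inside their interval.
y_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
low_toy = np.array([0.5, 1.5, 3.5, 3.0, 4.0])
up_toy = np.array([1.5, 2.5, 4.5, 5.0, 6.0])

toy_coverage = empirical_coverage(y_toy, low_toy, up_toy)
print(toy_coverage)  # 0.8
```

A value close to 1 - alpha indicates that the intervals reach the coverage they were built for.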

The estimated prediction intervals can then be plotted as follows.

from matplotlib import pyplot as plt
plt.xlabel("x")
plt.ylabel("y")
plt.scatter(X, y, alpha=0.3)
plt.plot(X, y_pred, color="C1")
order = np.argsort(X[:, 0])
plt.plot(X[order], y_pis[order][:, 0, 1], color="C1", ls="--")
plt.plot(X[order], y_pis[order][:, 1, 1], color="C1", ls="--")
plt.fill_between(
    X[order].ravel(),
    y_pis[order][:, 0, 0].ravel(),
    y_pis[order][:, 1, 0].ravel(),
    alpha=0.2
)
plt.title(
    f"Target and effective coverages for "
    f"alpha={alpha[0]:.2f}: ({1-alpha[0]:.3f}, {coverage_scores[0]:.3f})\n"
    f"Target and effective coverages for "
    f"alpha={alpha[1]:.2f}: ({1-alpha[1]:.3f}, {coverage_scores[1]:.3f})"
)
plt.show()

The title of the plot compares the target coverages with the effective coverages. The target coverage, or confidence level, is the fraction of true labels that we aim to capture within the prediction intervals for a given dataset. It equals 1 - alpha, where alpha is the parameter passed to the predict method of MapieRegressor: here alpha is 0.05 and 0.32, giving target coverages of 0.95 and 0.68. The effective coverage is the fraction of true labels that actually lie in the prediction intervals.
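The standard-deviation reading of alpha mentioned in the Quickstart can be checked with the standard library. The sketch below assumes Gaussian noise purely for illustration; conformal methods themselves do not require that assumption. It converts each alpha into the half-width, in standard deviations, of a central interval with coverage 1 - alpha.

```python
from statistics import NormalDist

alphas = [0.05, 0.32]

# For a standard normal variable, a central interval with coverage 1 - alpha
# extends z standard deviations on each side, with z = Phi^{-1}(1 - alpha / 2).
z_scores = [NormalDist().inv_cdf(1 - a / 2) for a in alphas]

for a, z in zip(alphas, z_scores):
    print(f"alpha={a:.2f} -> target coverage {1 - a:.2f}, about {z:.2f} standard deviations")
```

Under this assumption, alpha = 0.05 corresponds to roughly two standard deviations and alpha = 0.32 to roughly one.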

📘 Documentation

The full documentation can be found at https://mapie.readthedocs.io/en/latest/.

How does MAPIE work on regression? It is based on cross-validation and relies on:

  • Conformity scores on the whole training set obtained by cross-validation,
  • Perturbed models generated during the cross-validation.

MAPIE then combines all these elements in a way that provides prediction intervals on new data with strong theoretical guarantees [1-2].
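The two ingredients listed above can be sketched in a few lines of NumPy. This is a simplified illustration in the spirit of the CV+ construction from [1], not MAPIE's actual implementation: the function name `cv_plus_interval`, the least-squares linear model, the fold assignment, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def cv_plus_interval(X, y, X_new, alpha=0.1, n_folds=5, seed=0):
    """CV+-style prediction intervals with a least-squares linear model.

    Illustrative sketch only (hypothetical helper, not MAPIE code).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    fold_of = rng.permutation(n) % n_folds  # random fold label for each sample

    # Add an intercept column so that lstsq fits y ~ X @ w + b.
    A = np.column_stack([X, np.ones(n)])
    A_new = np.column_stack([X_new, np.ones(len(X_new))])

    residuals = np.empty(n)               # out-of-fold conformity scores
    preds_new = np.empty((len(X_new), n))  # perturbed-model predictions on X_new

    for k in range(n_folds):
        held_out = fold_of == k
        # "Perturbed" model: fitted without the samples of fold k.
        coef, *_ = np.linalg.lstsq(A[~held_out], y[~held_out], rcond=None)
        # Conformity score: absolute residual on the held-out samples.
        residuals[held_out] = np.abs(y[held_out] - A[held_out] @ coef)
        # Each held-out sample is paired with this model's prediction on X_new.
        preds_new[:, held_out] = (A_new @ coef)[:, None]

    lower_candidates = np.sort(preds_new - residuals, axis=1)
    upper_candidates = np.sort(preds_new + residuals, axis=1)

    # Order statistics of the candidate bounds; ranks are clipped to [1, n]
    # as a simplification for very small alpha or very small n.
    k_low = max(int(np.floor(alpha * (n + 1))), 1)
    k_up = min(int(np.ceil((1 - alpha) * (n + 1))), n)
    return lower_candidates[:, k_low - 1], upper_candidates[:, k_up - 1]


# Synthetic one-dimensional data (assumed for the demonstration).
rng = np.random.default_rng(0)
X_train = rng.uniform(-3, 3, size=(500, 1))
y_train = 2.0 * X_train[:, 0] + rng.normal(scale=1.0, size=500)
X_test = rng.uniform(-3, 3, size=(500, 1))
y_test = 2.0 * X_test[:, 0] + rng.normal(scale=1.0, size=500)

lower, upper = cv_plus_interval(X_train, y_train, X_test, alpha=0.1)
coverage = np.mean((lower <= y_test) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

Each training point contributes one candidate lower bound and one candidate upper bound, built from the model that never saw that point, which is what lets the out-of-fold residuals stand in for errors on new data.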

How does MAPIE work on classification? It is based on the construction of calibrated conformity scores to estimate prediction sets and relies on:

  • Construction of a conformity score
  • Calibration of the conformity score on a calibration set not seen by the model during training

MAPIE then uses the calibrated conformity scores to estimate sets of labels associated with the desired coverage on new data with strong theoretical guarantees [3-5].
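The two steps above can be illustrated with a minimal NumPy sketch. It assumes the simple conformity score 1 - p(true label), in the spirit of [3], and a held-out calibration set; it is not MAPIE's implementation. The names `prediction_sets` and `fake_model_output` and the simulated probabilities are hypothetical.

```python
import numpy as np


def prediction_sets(proba_calib, y_calib, proba_new, alpha=0.1):
    """Boolean (n_new, n_classes) mask of the labels kept in each prediction set.

    Illustrative sketch only (hypothetical helper, not MAPIE code).
    """
    n = len(y_calib)
    # Step 1: conformity score = 1 - probability given to the true label.
    scores = 1.0 - proba_calib[np.arange(n), y_calib]
    # Step 2: calibrate with the ceil((n + 1) * (1 - alpha))-th smallest score
    # (rank clipped to n as a simplification).
    rank = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    q_hat = np.sort(scores)[rank - 1]
    # Keep every label whose score would not exceed the calibrated threshold.
    return proba_new >= 1.0 - q_hat


rng = np.random.default_rng(0)


def fake_model_output(n, n_classes=3):
    """Simulate class probabilities and labels drawn from those probabilities."""
    logits = rng.normal(scale=2.0, size=(n, n_classes))
    proba = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    labels = (rng.random((n, 1)) < proba.cumsum(axis=1)).argmax(axis=1)
    return proba, labels


proba_calib, y_calib = fake_model_output(1000)
proba_test, y_test_labels = fake_model_output(1000)

sets = prediction_sets(proba_calib, y_calib, proba_test, alpha=0.1)
set_coverage = np.mean(sets[np.arange(len(y_test_labels)), y_test_labels])
print(f"empirical coverage: {set_coverage:.3f}, mean set size: {sets.sum(axis=1).mean():.2f}")
```

The calibration set must not have been used to train the model; otherwise the conformity scores are optimistically small and the coverage guarantee no longer applies.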

📝 Contributing

You are welcome to propose and contribute new ideas. We encourage you to open an issue so that we can align on the work to be done. It is generally a good idea to have a quick discussion before opening a pull request that is potentially out-of-scope. For more information on the contribution process, please go here.

🤝 Affiliations

MAPIE has been developed through a collaboration between Quantmetry, Michelin, and ENS Paris-Saclay, with financial support from Région Ile de France and Confiance.ai.

🔎 References

MAPIE methods belong to the field of conformal inference.

[1] Rina Foygel Barber, Emmanuel J. Candès, Aaditya Ramdas, and Ryan J. Tibshirani. "Predictive inference with the jackknife+." Ann. Statist., 49(1):486–507, February 2021.

[2] Byol Kim, Chen Xu, and Rina Foygel Barber. "Predictive Inference Is Free with the Jackknife+-after-Bootstrap." 34th Conference on Neural Information Processing Systems (NeurIPS 2020).

[3] Mauricio Sadinle, Jing Lei, and Larry Wasserman. "Least Ambiguous Set-Valued Classifiers With Bounded Error Levels." Journal of the American Statistical Association, 114:525, 223-234, 2019.

[4] Yaniv Romano, Matteo Sesia and Emmanuel J. Candès. "Classification with Valid and Adaptive Coverage." NeurIPS 2020 (spotlight).

[5] Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan and Jitendra Malik. "Uncertainty Sets for Image Classifiers using Conformal Prediction." International Conference on Learning Representations 2021.

[6] Yaniv Romano, Evan Patterson, Emmanuel J. Candès. "Conformalized Quantile Regression." Advances in neural information processing systems 32 (2019).

[7] Chen Xu and Yao Xie. "Conformal Prediction Interval for Dynamic Time-Series." International Conference on Machine Learning (ICML, 2021).

[8] Stephen Bates, Anastasios Angelopoulos, Lihua Lei, Jitendra Malik, and Michael I. Jordan. "Distribution-Free, Risk-Controlling Prediction Sets." CoRR, abs/2101.02703, 2021. URL https://arxiv.org/abs/2101.02703

[9] Anastasios N. Angelopoulos, Stephen Bates, Adam Fisch, Lihua Lei, and Tal Schuster. "Conformal Risk Control." (2022).

📝 License

MAPIE is free and open-source software licensed under the 3-clause BSD license.

mapie's People

Contributors

lacombelouis, vtaquet, vincentblot28, gmartinonqm, kapytaine, remiadon, adirthaborgohain, alize-papp, andreapi, cmougan, aagoumbala, tmorzade
