DrWhy.AI - a collection of tools for Visual Exploration, Explanation and Debugging of Predictive Models
It takes a village to raise a child model.
The way we do predictive modeling is very inefficient. We spend too much time on manual, time-consuming and easy-to-automate activities like data cleaning and exploration, crisp modeling, and model validation, instead of focusing on model understanding, productionisation and communication.
Here we gather tools that can be used to make our work more efficient throughout the whole model lifecycle. The unified grammar behind the DrWhy.AI universe is described in the Predictive Models: Visual Exploration, Explanation and Debugging book.
DrWhy is based on a unified Model Development Process inspired by RUP. An overview is given in the diagram below.
Tools that are useful during the model lifetime. The DrWhy.AI logo marks our internal tools.
- dataMaid; A Suite of Checks for Identification of Potential Errors in a Data Frame as Part of the Data Screening Process
- inspectdf; A collection of utilities for columnwise summary, comparison and visualisation of data frames.
- validate; Professional data validation for the R environment
- errorlocate; Find and replace erroneous fields in data using validation rules
- ggplot2; System for declaratively creating graphics, based on The Grammar of Graphics.
- Model-agnostic variable importance scores. Surrogate learning = train an elastic model and measure feature importance in that model. See DALEX and Model Class Reliance (MCR)
- vip Variable importance plots
- SAFE Surrogate learning = Train an elastic model and extract feature transformations.
- xspliner Using surrogate black-boxes to train interpretable spline-based additive models
- factorMerger Set of tools for factor merging (paper)
- ingredients Set of tools for model level feature effects and feature importance.
- auditor Model verification, validation, and error analysis (vignette)
- DALEX Descriptive mAchine Learning EXplanations
- iml; interpretable machine learning R package
- randomForestExplainer A set of tools to understand what is happening inside a Random Forest
- survxai Explanations for survival models (paper)
- breakDown, pyBreakDown and breakDown2 Model Agnostic Explainers for Individual Predictions (with interactions)
- ceterisParibus, pyCeterisParibus, ceterisParibusD3 and ceterisParibus2 Ceteris Paribus Plots (What-If plots) for explanations of a single observation
- localModel and live LIME-like explanations with interpretable features based on Ceteris Paribus curves.
- lime; Local Interpretable Model-Agnostic Explanations (R port of original Python package)
- shapper An R wrapper of SHAP python library
- modelDown Generates a website with HTML summaries for predictive models
- drifter Concept Drift and Concept Shift Detection for Predictive Models
- archivist A set of tools for datasets and plots archiving (paper)
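The model-agnostic variable importance scores mentioned in the list above are typically computed by permutation: shuffle one feature column, re-score the model, and report the drop in performance. A minimal pure-Python sketch of that idea (the toy model and data below are illustrative assumptions, not code from any of the listed packages):

```python
import random

# Toy dataset: y depends on x0 only; x1 is pure noise (illustrative assumption).
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

# Toy "model": a threshold on the first feature.
def predict(rows):
    return [1 if r[0] > 0.5 else 0 for r in rows]

def accuracy(rows, labels):
    preds = predict(rows)
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)

def permutation_importance(rows, labels, feature, n_repeats=10):
    """Average drop in accuracy after shuffling one feature column."""
    base = accuracy(rows, labels)
    drops = []
    for _ in range(n_repeats):
        col = [r[feature] for r in rows]
        random.shuffle(col)
        permuted = [r[:feature] + [v] + r[feature + 1:]
                    for r, v in zip(rows, col)]
        drops.append(base - accuracy(permuted, labels))
    return sum(drops) / n_repeats

print(permutation_importance(X, y, 0))  # large drop: x0 drives the model
print(permutation_importance(X, y, 1))  # near zero: x1 is ignored
```

Because the procedure only needs a predict function and a score, it works for any model, which is exactly what makes it model-agnostic.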
These packages are actively developed and have an active maintainer.
- archivist (maintainer: pbiecek)
- DALEX (maintainer: pbiecek)
- auditor (maintainer: agosiewska)
- survxai (maintainer: agosiewska)
- shapper (maintainer: maksymiuks)
- iBreakDown (maintainer: pbiecek)
- ingredients (maintainer: pbiecek)
- drifter (maintainer: pbiecek)
- localModel (maintainer: mstaniak)
- modelDown (maintainer: magda-tatarynowicz)
- vivo (maintainer: kozaka93)
- EIX (maintainer: ekarbowiak)
- xspliner (maintainer: krystian8207)
- pyDALEX (maintainer: magda-tatarynowicz)
- SAFE for R (maintainer: AnnaGierlak)
- SAFE for Python (maintainer: olagacek)
- pyCeterisParibus (maintainer: kmichael08)
- ceterisParibusD3 (maintainer: flaminka)
These packages contain useful features and are still in use, but we are looking for an active maintainer.
Key features from these packages have been copied to other packages.
- ceterisParibus (development moved to ingredients)
- ceterisParibus2 (development moved to ingredients)
- DALEX2 (development moved to DALEX)
- breakDown (development moved to iBreakDown)
- live (development moved to localModel)
DrWhy works on fully trained predictive models. Models can be created with any tool. DrWhy uses the DALEX2 package to wrap the model with additional metadata required for explanations, like validation data, the predict function, etc.
Explainers for predictive models can be created with model-agnostic or model-specific functions implemented in various packages.
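Conceptually, such an explainer is just the trained model bundled with validation data, true labels, a unified predict function, and a label for plots. A minimal sketch of that wrapping step (the class and toy model below are illustrative assumptions, not the DALEX2 API):

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence

@dataclass
class Explainer:
    """Wraps a trained model with the metadata explanation methods need."""
    model: Any
    data: Sequence              # validation data used by explanation methods
    y: Sequence                 # true labels for the validation data
    predict_function: Callable  # unified interface: (model, rows) -> predictions
    label: str = "model"        # human-readable name used in reports/plots

    def predict(self, new_data):
        return self.predict_function(self.model, new_data)

# Any model works, as long as a predict_function is supplied (toy example).
model = {"threshold": 0.5}

def predict_fn(m, rows):
    return [1 if r[0] > m["threshold"] else 0 for r in rows]

exp = Explainer(model, data=[[0.2], [0.9]], y=[0, 1],
                predict_function=predict_fn, label="toy threshold model")
print(exp.predict([[0.7]]))  # -> [1]
```

The design choice is that explanation methods never call the model directly; they only see the uniform `predict` interface, which is what makes them model-agnostic.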