This curated list contains 900 awesome open-source projects with a total of 3.3M stars grouped into 34 categories. All projects are ranked by a project-quality score, which is calculated from various metrics automatically collected from GitHub and different package managers. If you would like to add or update projects, feel free to open an issue, submit a pull request, or directly edit the projects.yaml. Contributions are very welcome!
Contents
- Machine Learning Frameworks 56 projects
- Data Visualization 50 projects
- Text Data & NLP 96 projects
- Image Data 60 projects
- Graph Data 36 projects
- Audio Data 28 projects
- Geospatial Data 22 projects
- Financial Data 25 projects
- Time Series Data 26 projects
- Medical Data 19 projects
- Tabular Data 5 projects
- Optical Character Recognition 12 projects
- Data Containers & Structures 0 projects
- Data Loading & Extraction 2 projects
- Web Scraping & Crawling 1 project
- Data Pipelines & Streaming 43 projects
- Distributed Machine Learning 33 projects
- Hyperparameter Optimization & AutoML 47 projects
- Reinforcement Learning 23 projects
- Recommender Systems 16 projects
- Privacy Machine Learning 6 projects
- Workflow & Experiment Tracking 39 projects
- Model Serialization & Deployment 16 projects
- Model Interpretability 50 projects
- Vector Similarity Search (ANN) 12 projects
- Probabilistics & Statistics 22 projects
- Adversarial Robustness 9 projects
- GPU Utilities 18 projects
- Tensorflow Utilities 15 projects
- Jax Utilities 2 projects
- Sklearn Utilities 17 projects
- Pytorch Utilities 32 projects
- Database Clients 1 project
- Others 61 projects
Explanation
- 🥇 🥈 🥉 Combined project-quality score
- ⭐️ Star count from GitHub
- 🐣 New project (less than 6 months old)
- 💤 Inactive project (6 months no activity)
- 💀 Dead project (12 months no activity)
- 📈 📉 Project is trending up or down
- ➕ Project was recently added
- ❗️ Warning (e.g. missing/risky license)
- 👨💻 Contributors count from GitHub
- 🔀 Fork count from GitHub
- 📋 Issue count from GitHub
- ⏱️ Last update timestamp on package manager
- 📥 Download count from package manager
- 📦 Number of dependent projects
- Tensorflow related project
- Sklearn related project
- PyTorch related project
- MxNet related project
- Apache Spark related project
- Jupyter related project
- PaddlePaddle related project
- Pandas related project
- Jax related project
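The activity markers in the legend (🐣, 💤, 💀) amount to simple date thresholds. The sketch below shows one way to read them. It is not the list's actual tooling: the function name `activity_marker`, the 30-day month approximation, and the precedence of 💀 over 💤 over 🐣 are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: a "month" is approximated as 30 days; the real generator may count differently.
DAYS_PER_MONTH = 30


def activity_marker(created: date, last_activity: date, today: date) -> str:
    """Return the legend marker for a project, or "" if none applies.

    Hypothetical helper: thresholds follow the legend text
    (new < 6 months old, inactive >= 6 months idle, dead >= 12 months idle).
    """
    idle = today - last_activity
    age = today - created
    if idle >= timedelta(days=12 * DAYS_PER_MONTH):
        return "💀"  # dead: 12 months without activity
    if idle >= timedelta(days=6 * DAYS_PER_MONTH):
        return "💤"  # inactive: 6 months without activity
    if age < timedelta(days=6 * DAYS_PER_MONTH):
        return "🐣"  # new: less than 6 months old
    return ""


if __name__ == "__main__":
    today = date(2022, 5, 5)
    print(activity_marker(date(2015, 1, 1), date(2021, 9, 1), today))
```

Checking the idle time before the project age is a design choice in this sketch, so a project never shows as both new and inactive.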
Machine Learning Frameworks
General-purpose machine learning and deep learning frameworks.
Tensorflow (🥇 55 · ⭐ 170K) - An Open Source Machine Learning Framework for Everyone. Apache-2
- GitHub (👨💻 4K · 🔀 87K · 📦 190K · 📋 35K - 7% open · ⏱️ 05.05.2022): git clone https://github.com/tensorflow/tensorflow
- PyPi (📥 14M / month · 📦 14K · ⏱️ 04.05.2022): pip install tensorflow
- Conda (📥 3.3M · ⏱️ 06.02.2022): conda install -c conda-forge tensorflow
- Docker Hub (📥 65M · ⭐ 2K · ⏱️ 05.05.2022): docker pull tensorflow/tensorflow
scikit-learn (🥇 51 · ⭐ 50K) - scikit-learn: machine learning in Python. BSD-3
XGBoost (🥇 44 · ⭐ 23K) - Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or.. Apache-2
StatsModels (🥇 43 · ⭐ 7.3K) - Statsmodels: statistical modeling and econometrics in Python. BSD-3
pytorch-lightning (🥈 42 · ⭐ 18K) - The lightweight PyTorch wrapper for high-performance.. Apache-2
LightGBM (🥈 42 · ⭐ 14K) - A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT,.. MIT
PaddlePaddle (🥈 41 · ⭐ 18K) - PArallel Distributed Deep LEarning: Machine Learning.. Apache-2
Catboost (🥈 40 · ⭐ 6.5K) - A fast, scalable, high performance Gradient Boosting on Decision.. Apache-2
Jina (🥈 38 · ⭐ 15K) - Cloud-native neural search framework for any kind of data. Apache-2
- GitHub (👨💻 150 · 🔀 1.9K · 📦 270 · 📋 1.4K - 4% open · ⏱️ 05.05.2022): git clone https://github.com/jina-ai/jina
- PyPi (📥 48K / month · ⏱️ 05.05.2022): pip install jina
- Conda (📥 4.3K · ⏱️ 22.04.2022): conda install -c conda-forge jina-core
- Docker Hub (📥 1.1M · ⭐ 7 · ⏱️ 05.05.2022): docker pull jinaai/jina
Theano (🥈 38 · ⭐ 9.6K) - Theano was a Python library that allows you to define, optimize, and.. BSD-3
Thinc (🥈 37 · ⭐ 2.5K) - A refreshing functional take on deep learning, compatible with your favorite.. MIT
Vowpal Wabbit (🥈 34 · ⭐ 7.9K) - Vowpal Wabbit is a machine learning system which pushes the.. BSD-3
tensorflow-upstream (🥈 33 · ⭐ 590) - TensorFlow ROCm port. Apache-2
Turi Create (🥉 32 · ⭐ 11K) - Turi Create simplifies the development of custom machine learning.. BSD-3
tensorpack (🥉 32 · ⭐ 6.2K) - A Neural Net Training Interface on TensorFlow, with focus.. Apache-2
einops (🥉 31 · ⭐ 5K) - Deep learning operations reinvented (for pytorch, tensorflow, jax and others). MIT
Neural Network Libraries (🥉 30 · ⭐ 2.5K) - Neural Network Libraries. Apache-2
Neural Tangents (🥉 26 · ⭐ 1.8K) - Fast and Easy Infinite Neural Networks in Python. Apache-2
mace (🥉 23 · ⭐ 4.6K) - MACE is a deep learning inference framework optimized for mobile.. Apache-2
- GitHub (👨💻 63 · 🔀 790 · 📥 1.4K · 📋 660 - 6% open · ⏱️ 11.02.2022): git clone https://github.com/XiaoMi/mace
Towhee (🥉 23 · ⭐ 430) - A framework that provides a simple API for developing ML-driven data.. Apache-2
ThunderSVM (🥉 20 · ⭐ 1.4K) - ThunderSVM: A Fast SVM Library on GPUs and CPUs. Apache-2
chefboost (🥉 19 · ⭐ 320) - A Lightweight Decision Tree Framework supporting regular algorithms:.. MIT
Show 13 hidden projects...
- dlib (🥈 39 · ⭐ 11K) - A toolkit for making real world machine learning and data analysis.. ❗️BSL-1.0
- TFlearn (🥉 32 · ⭐ 9.6K · 💀 ) - Deep learning library featuring a higher-level API for TensorFlow. MIT
- CNTK (🥉 31 · ⭐ 17K · 💀 ) - Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit. MIT
- Lasagne (🥉 29 · ⭐ 3.8K · 💀 ) - Lightweight library to build and train neural networks in Theano. MIT
- MindsDB (🥉 28 · ⭐ 6.6K) - In-Database Machine Learning. ❗️GPL-3.0
- NuPIC (🥉 28 · ⭐ 6.3K · 💀 ) - Numenta Platform for Intelligent Computing is an implementation.. ❗️AGPL-3.0
- SHOGUN (🥉 26 · ⭐ 2.9K · 💀 ) - Unified and efficient Machine Learning. BSD-3
- xLearn (🥉 25 · ⭐ 3K · 💀 ) - High performance, easy-to-use, and scalable machine learning (ML).. Apache-2
- NeuPy (🥉 25 · ⭐ 710 · 💀 ) - NeuPy is a Tensorflow based python library for prototyping and building.. MIT
- neon (🥉 23 · ⭐ 3.9K · 💀 ) - Intel Nervana reference deep learning framework committed to best.. Apache-2
- Torchbearer (🥉 22 · ⭐ 630 · 💀 ) - torchbearer: A model fitting library for PyTorch. MIT
- ThunderGBM (🥉 16 · ⭐ 620 · 💀 ) - ThunderGBM: Fast GBDTs and Random Forests on GPUs. Apache-2
- StarSpace (🥉 15 · ⭐ 3.8K · 💀 ) - Learning embeddings for classification, retrieval and ranking. MIT
Data Visualization
General-purpose and task-specific data visualization libraries.
Matplotlib (🥇 49 · ⭐ 15K) - matplotlib: plotting with Python. Python-2.0
Plotly (🥇 42 · ⭐ 11K) - The interactive graphing library for Python (includes Plotly Express). MIT
- GitHub (👨💻 200 · 🔀 2.1K · 📦 9 · 📋 2.3K - 49% open · ⏱️ 04.05.2022): git clone https://github.com/plotly/plotly.py
- PyPi (📥 7.3M / month · 📦 4K · ⏱️ 05.04.2022): pip install plotly
- Conda (📥 2.5M · ⏱️ 05.04.2022): conda install -c conda-forge plotly
- npm (📥 44K / month · 📦 4 · ⏱️ 12.01.2021): npm install plotlywidget
dash (🥇 39 · ⭐ 16K) - Analytical Web Apps for Python, R, Julia, and Jupyter. No JavaScript Required. MIT
pandas-profiling (🥈 37 · ⭐ 8.9K · 📈 ) - Create HTML profiling reports from pandas DataFrame.. MIT
HoloViews (🥈 35 · ⭐ 2.2K) - With Holoviews, your data visualizes itself. BSD-3
- GitHub (👨💻 120 · 🔀 350 · 📋 2.8K - 30% open · ⏱️ 05.05.2022): git clone https://github.com/holoviz/holoviews
- PyPi (📥 430K / month · 📦 210 · ⏱️ 04.05.2022): pip install holoviews
- Conda (📥 690K · ⏱️ 16.02.2022): conda install -c conda-forge holoviews
- npm (📥 1.8K / month · ⏱️ 24.05.2020): npm install @pyviz/jupyterlab_pyviz
datashader (🥈 32 · ⭐ 2.8K) - Quickly and accurately render even the largest data. BSD-3
Perspective (🥈 31 · ⭐ 4.5K) - A data visualization and analytics component, especially.. Apache-2
- GitHub (👨💻 68 · 🔀 460 · 📦 240 · 📋 520 - 15% open · ⏱️ 01.05.2022): git clone https://github.com/finos/perspective
- PyPi (📥 1.8K / month · 📦 9 · ⏱️ 14.03.2022): pip install perspective-python
- Conda (📥 52K · ⏱️ 29.04.2022): conda install -c conda-forge perspective
- npm (📥 2.5K / month · ⏱️ 01.05.2022): npm install @finos/perspective-jupyterlab
bqplot (🥈 31 · ⭐ 3.3K) - Plotting library for IPython/Jupyter notebooks. Apache-2
- GitHub (👨💻 56 · 🔀 460 · 📦 30 · 📋 580 - 39% open · ⏱️ 08.04.2022): git clone https://github.com/bqplot/bqplot
- PyPi (📥 69K / month · 📦 92 · ⏱️ 11.02.2022): pip install bqplot
- Conda (📥 950K · ⏱️ 11.02.2022): conda install -c conda-forge bqplot
- npm (📥 22K / month · 📦 10 · ⏱️ 11.02.2022): npm install bqplot
D-Tale (🥉 30 · ⭐ 3.4K) - Visualizer for pandas data structures. ❗️LGPL-2.1
data-validation (🥉 29 · ⭐ 630) - Library for exploring and validating machine learning.. Apache-2
hvPlot (🥉 29 · ⭐ 550) - A high-level plotting API for pandas, dask, xarray, and networkx built on.. BSD-3
Facets Overview (🥉 27 · ⭐ 6.8K · 💤 ) - Visualizations for machine learning datasets. Apache-2
HyperTools (🥉 27 · ⭐ 1.7K) - A Python toolbox for gaining geometric insights into high-dimensional.. MIT
pythreejs (🥉 27 · ⭐ 810) - A Jupyter - Three.js bridge. BSD-3
- GitHub (👨💻 29 · 🔀 180 · 📦 19 · 📋 220 - 33% open · ⏱️ 06.12.2021): git clone https://github.com/jupyter-widgets/pythreejs
- PyPi (📥 50K / month · 📦 38 · ⏱️ 26.02.2021): pip install pythreejs
- Conda (📥 380K · ⏱️ 02.03.2021): conda install -c conda-forge pythreejs
- npm (📥 5.2K / month · 📦 7 · ⏱️ 26.02.2021): npm install jupyter-threejs
Sweetviz (🥉 23 · ⭐ 2K · 💤 ) - Visualize and compare datasets, target values and associations, with.. MIT
AutoViz (🥉 23 · ⭐ 700) - Automatically Visualize any dataset, any size with a single line of.. Apache-2
Pandas-Bokeh (🥉 22 · ⭐ 770) - Bokeh Plotting Backend for Pandas and GeoPandas. MIT
python-ternary (🥉 22 · ⭐ 550) - Ternary plotting library for python with matplotlib. MIT
Show 13 hidden projects...
- cartopy (🥈 31 · ⭐ 1K) - Cartopy - a cartographic python library with matplotlib support. ❗️LGPL-3.0
- Cufflinks (🥉 29 · ⭐ 2.6K · 💀 ) - Productivity Tools for Plotly + Pandas. MIT
- Multicore-TSNE (🥉 25 · ⭐ 1.7K · 💀 ) - Parallel t-SNE implementation with Python and Torch.. BSD-3
- Chartify (🥉 24 · ⭐ 3.1K · 💀 ) - Python library that makes it easy for data scientists to create.. Apache-2
- pivottablejs (🥉 23 · ⭐ 460 · 💀 ) - Drag'n'drop Pivot Tables and Charts for Jupyter/IPython.. MIT
- PandasGUI (🥉 22 · ⭐ 2.6K) - A GUI for Pandas DataFrames. ❗️MIT-0
- PDPbox (🥉 22 · ⭐ 670 · 💀 ) - python partial dependence plot toolbox. MIT
- ivis (🥉 19 · ⭐ 260) - Dimensionality reduction in very large datasets using Siamese.. Apache-2
- animatplot (🥉 17 · ⭐ 390 · 💀 ) - A python package for animating plots build on matplotlib. MIT
- pdvega (🥉 16 · ⭐ 340 · 💀 ) - Interactive plotting for Pandas using Vega-Lite. MIT
- data-describe (🥉 16 · ⭐ 290) - datadescribe: Pythonic EDA Accelerator for Data Science. Apache-2
- nx-altair (🥉 16 · ⭐ 190 · 💀 ) - Draw interactive NetworkX graphs with Altair. MIT
- nptsne (🥉 14 · ⭐ 28 · 💀 ) - nptsne is a numpy compatible python binary package that offers a.. Apache-2
Text Data & NLP
Libraries for processing, cleaning, manipulating, and analyzing text data as well as libraries for NLP tasks such as language detection, fuzzy matching, classification, seq2seq learning, conversational AI, keyword extraction, and translation.
transformers (🥇 49 · ⭐ 62K) - Transformers: State-of-the-art Machine Learning for.. Apache-2
nltk (🥇 44 · ⭐ 11K) - Suite of libraries and programs for symbolic and statistical natural.. Apache-2
gensim (🥇 42 · ⭐ 13K · 📈 ) - Topic Modelling for Humans. ❗️LGPL-2.1
flair (🥇 38 · ⭐ 12K) - A very simple framework for state-of-the-art Natural Language Processing.. MIT
ChatterBot (🥇 36 · ⭐ 12K · 💤 ) - ChatterBot is a machine learning, conversational dialog engine.. BSD-3
sentence-transformers (🥈 34 · ⭐ 7.6K · 📉 ) - Multilingual Sentence & Image Embeddings with BERT. Apache-2
sentencepiece (🥈 34 · ⭐ 5.8K) - Unsupervised text tokenizer for Neural Network-based text.. Apache-2
Tokenizers (🥈 34 · ⭐ 5.6K) - Fast State-of-the-Art Tokenizers optimized for Research and.. Apache-2
TensorFlow Text (🥈 32 · ⭐ 930) - Making text a first-class citizen in TensorFlow. Apache-2
DeepPavlov (🥈 31 · ⭐ 5.7K) - An open source library for deep learning end-to-end dialog.. Apache-2
snowballstemmer (🥈 31 · ⭐ 560) - Snowball compiler and stemming algorithms. BSD-3
haystack (🥈 30 · ⭐ 4.6K) - Haystack is an open source NLP framework that leverages Transformer.. Apache-2
SciSpacy (🥈 30 · ⭐ 1.2K) - A full spaCy pipeline and models for scientific/biomedical documents. Apache-2
vaderSentiment (🥈 28 · ⭐ 3.6K) - VADER Sentiment Analysis. VADER (Valence Aware Dictionary and.. MIT
TextDistance (🥈 28 · ⭐ 2.8K) - Compute distance between sequences. 30+ algorithms, pure python.. MIT
neuralcoref (🥈 28 · ⭐ 2.5K · 💤 ) - Fast Coreference Resolution in spaCy with Neural Networks. MIT
PyTextRank (🥈 28 · ⭐ 1.8K) - Python implementation of TextRank algorithms (textgraphs) for phrase.. MIT
Ciphey (🥉 27 · ⭐ 9.8K) - Automatically decrypt encryptions without knowing the key or cipher,.. MIT
- GitHub (👨💻 46 · 🔀 600 · 📋 290 - 17% open · ⏱️ 03.11.2021): git clone https://github.com/Ciphey/Ciphey
- PyPi (📥 9.5K / month · ⏱️ 06.06.2021): pip install ciphey
- Docker Hub (📥 15K · ⭐ 6 · ⏱️ 14.04.2022): docker pull remnux/ciphey
fastNLP (🥉 27 · ⭐ 2.6K) - fastNLP: A Modularized and Extensible NLP Framework. Currently still.. Apache-2
spacy-transformers (🥉 27 · ⭐ 1.1K) - Use pretrained transformers like BERT, XLNet and GPT-2.. MIT
english-words (🥉 26 · ⭐ 7.1K) - A text file containing 479k English words for all your.. Unlicense
scattertext (🥉 25 · ⭐ 1.8K) - Beautiful visualizations of how language differs among document.. Apache-2
pytorch-nlp (🥉 24 · ⭐ 2.1K · 💤 ) - Basic Utilities for PyTorch Natural Language Processing.. BSD-3
Texthero (🥉 22 · ⭐ 2.5K · 💤 ) - Text preprocessing, representation and visualization from zero to.. MIT
rubrix (🥉 22 · ⭐ 1K) - Rubrix, open-source framework for data-centric NLP. Data annotation and.. Apache-2
DeepMatcher (🥉 21 · ⭐ 4.2K · 💤 ) - Python package for performing Entity and Text Matching using.. BSD-3
NLP Architect (🥉 21 · ⭐ 2.8K · 💤 ) - A model library for exploring state-of-the-art deep.. Apache-2
lightseq (🥉 21 · ⭐ 2.1K) - LightSeq: A High Performance Library for Sequence Processing and.. Apache-2
OpenPrompt (🥉 21 · ⭐ 1.4K) - An Open-Source Framework for Prompt-Learning. Apache-2
gpt-2-simple (🥉 20 · ⭐ 2.9K · 💤 ) - Python package to easily retrain OpenAI's GPT-2 text-.. MIT
qdrant (🥉 20 · ⭐ 1.4K) - Qdrant - vector similarity search engine with extended filtering.. Apache-2
- GitHub (👨💻 24 · 🔀 86 · 📋 190 - 25% open · ⏱️ 05.05.2022): git clone https://github.com/qdrant/qdrant
OpenNRE (🥉 16 · ⭐ 3.6K) - An Open-Source Package for Neural Relation Extraction (NRE). MIT
- GitHub (👨💻 10 · 🔀 930 · 📋 350 - 5% open · ⏱️ 06.04.2022): git clone https://github.com/thunlp/OpenNRE
Show 28 hidden projects...
- fuzzywuzzy (🥈 32 · ⭐ 8.7K · 💤 ) - Fuzzy String Matching in Python. ❗️GPL-2.0
- langid (🥉 27 · ⭐ 2K · 💀 ) - Stand-alone language identification system. BSD-3
- flashtext (🥉 25 · ⭐ 5.2K · 💀 ) - Extract Keywords from sentence or Replace keywords in sentences. MIT
- polyglot (🥉 25 · ⭐ 2K · 💀 ) - Multilingual text (NLP) processing toolkit. ❗️GPL-3.0
- textgenrnn (🥉 24 · ⭐ 4.7K · 💀 ) - Easily train your own text-generating neural network of any.. MIT
- whoosh (🥉 24 · ⭐ 220) - Pure-Python full-text search library. ❗️BSD-1-Clause
- YouTokenToMe (🥉 23 · ⭐ 800 · 💀 ) - Unsupervised text tokenizer focused on computational efficiency. MIT
- pySBD (🥉 23 · ⭐ 440 · 💀 ) - pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence.. MIT
- Texar (🥉 22 · ⭐ 2.3K · 💀 ) - Toolkit for Machine Learning, Natural Language Processing, and.. Apache-2
- happy-transformer (🥉 22 · ⭐ 280) - A package built on top of Hugging Face's transformers.. Apache-2
- DELTA (🥉 21 · ⭐ 1.5K · 💀 ) - DELTA is a deep learning based natural language and speech.. Apache-2
- anaGo (🥉 21 · ⭐ 1.4K · 💀 ) - Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition,.. MIT
- stop-words (🥉 21 · ⭐ 140 · 💀 ) - Get list of common stop words in various languages in Python. BSD-3
- fastT5 (🥉 20 · ⭐ 290) - boost inference speed of T5 models by 5x & reduce the model size by 3x. Apache-2
- pyfasttext (🥉 20 · ⭐ 230 · 💀 ) - Yet another Python binding for fastText. ❗️GPL-3.0
- textpipe (🥉 19 · ⭐ 300 · 💤 ) - Textpipe: clean and extract metadata from text. MIT
- NeuroNER (🥉 17 · ⭐ 1.6K · 💀 ) - Named-entity recognition using neural networks. Easy-to-use and.. MIT
- nboost (🥉 17 · ⭐ 620 · 💀 ) - NBoost is a scalable, search-api-boosting platform for deploying.. Apache-2
- textaugment (🥉 17 · ⭐ 240 · 💤 ) - TextAugment: Text Augmentation Library. MIT
- skift (🥉 16 · ⭐ 230) - scikit-learn wrappers for Python fastText. MIT
- BLINK (🥉 15 · ⭐ 860 · 💀 ) - Entity Linker solution. MIT
- NeuralQA (🥉 15 · ⭐ 220 · 💀 ) - NeuralQA: A Usable Library for Question Answering on Large Datasets.. MIT
- spacy-dbpedia-spotlight (🥉 15 · ⭐ 58) - A spaCy wrapper for DBpedia Spotlight. MIT
- Headliner (🥉 14 · ⭐ 230 · 💀 ) - Easy training and deployment of seq2seq models. MIT
- textvec (🥉 14 · ⭐ 180 · 💀 ) - Text vectorization tool to outperform TFIDF for classification.. MIT
- numerizer (🥉 14 · ⭐ 140) - A Python module to convert natural language numerics into ints and.. MIT
- TransferNLP (🥉 13 · ⭐ 290 · 💀 ) - NLP library designed for reproducible experimentation.. MIT
- ONNX-T5 (🥉 13 · ⭐ 200 · 💀 ) - Summarization, translation, sentiment-analysis, text-generation.. Apache-2
Image Data
Libraries for image & video processing, manipulation, and augmentation as well as libraries for computer vision tasks such as facial recognition, object detection, and classification.
scikit-image (🥇 44 · ⭐ 4.9K) - Image processing in Python. BSD-2
torchvision (🥇 42 · ⭐ 12K) - Datasets, Transforms and Models specific to Computer Vision. BSD-3
MMDetection (🥇 37 · ⭐ 20K) - OpenMMLab Detection Toolbox and Benchmark. Apache-2
PyTorch Image Models (🥇 37 · ⭐ 18K) - PyTorch image models, scripts, pretrained weights --.. Apache-2
InsightFace (🥈 34 · ⭐ 12K) - State-of-the-art 2D and 3D Face Analysis Project. MIT
opencv-python (🥈 34 · ⭐ 2.7K) - Automated CI toolchain to produce precompiled opencv-python,.. MIT
Face Recognition (🥈 33 · ⭐ 44K · 💤 ) - The world's simplest facial recognition api for Python.. MIT
detectron2 (🥈 33 · ⭐ 21K) - Detectron2 is a platform for object detection, segmentation.. Apache-2
Albumentations (🥈 32 · ⭐ 10K) - Fast image augmentation library and an easy-to-use wrapper.. MIT
PaddleDetection (🥈 31 · ⭐ 7.5K) - Object Detection toolkit based on PaddlePaddle. It.. Apache-2
imageai (🥈 30 · ⭐ 7K · 💤 ) - A python library built to empower developers to build applications and.. MIT
vit-pytorch (🥈 29 · ⭐ 9.8K) - Implementation of Vision Transformer, a simple way to achieve.. MIT
Face Alignment (🥉 27 · ⭐ 5.7K · 💤 ) - 2D and 3D Face alignment library build using pytorch. BSD-3
vidgear (🥉 27 · ⭐ 2.2K) - A High-performance cross-platform Video Processing Python framework.. Apache-2
sahi (🥉 27 · ⭐ 1.6K) - A lightweight vision library for performing large scale object detection/.. MIT
layout-parser (🥉 26 · ⭐ 3K) - A Unified Toolkit for Deep Learning Based Document Image.. Apache-2
CellProfiler (🥉 26 · ⭐ 670) - An open-source application for biological image analysis. BSD-3
facenet-pytorch (🥉 25 · ⭐ 2.8K) - Pretrained Pytorch face detection (MTCNN) and facial.. MIT
pytorchvideo (🥉 25 · ⭐ 2.4K) - A deep learning library for video understanding research. Apache-2
icevision (🥉 25 · ⭐ 680) - An Agnostic Computer Vision Framework - Pluggable to any Training.. Apache-2
tensorflow-graphics (🥉 24 · ⭐ 2.6K) - TensorFlow Graphics: Differentiable Graphics Layers.. Apache-2
deep-daze (🥉 23 · ⭐ 4.2K) - Simple command line tool for text to image generation using OpenAI's.. MIT
Image Super-Resolution (🥉 23 · ⭐ 3.6K · 💤 ) - Super-scale your images and run experiments with.. Apache-2
- GitHub (👨💻 10 · 🔀 620 · 📦 82 · 📋 200 - 44% open · ⏱️ 02.06.2021): git clone https://github.com/idealo/image-super-resolution
- PyPi (📥 4.5K / month · 📦 5 · ⏱️ 08.01.2020): pip install ISR
- Docker Hub (📥 210 · ⏱️ 01.04.2019): docker pull idealo/image-super-resolution-gpu
Classy Vision (🥉 23 · ⭐ 1.4K) - An end-to-end PyTorch framework for image and video.. MIT
Norfair (🥉 22 · ⭐ 1.4K) - Lightweight Python library for adding real-time object tracking to any.. BSD-3
image-match (🥉 21 · ⭐ 2.7K · 💤 ) - Quickly search over billions of images. Apache-2
DE⫶TR (🥉 19 · ⭐ 8.8K) - End-to-End Object Detection with Transformers. Apache-2
- GitHub (👨💻 25 · 🔀 1.6K · 📋 430 - 36% open · ⏱️ 07.03.2022): git clone https://github.com/facebookresearch/detr
PySlowFast (🥉 18 · ⭐ 4.8K) - PySlowFast: video understanding codebase from FAIR for.. Apache-2
scenic (🥉 18 · ⭐ 890) - Scenic: A Jax Library for Computer Vision Research and Beyond. Apache-2
- GitHub (👨💻 36 · 🔀 110 · 📦 16 · 📋 37 - 45% open · ⏱️ 04.05.2022): git clone https://github.com/google-research/scenic
Caer (🥉 17 · ⭐ 610 · 💤 ) - A lightweight Computer Vision library. Scale your models, not boilerplate. MIT
Show 12 hidden projects...
- glfw (🥇 36 · ⭐ 9K) - A multi-platform library for OpenGL, OpenGL ES, Vulkan, window and input. ❗️Zlib
- imgaug (🥈 35 · ⭐ 13K · 💀 ) - Image augmentation for machine learning experiments. MIT
- PyTorch3D (🥈 29 · ⭐ 5.9K) - PyTorch3D is FAIR's library of reusable components for.. ❗Unlicensed
- Pillow-SIMD (🥉 28 · ⭐ 1.8K) - The friendly PIL fork. ❗️PIL
- chainercv (🥉 27 · ⭐ 1.5K · 💀 ) - ChainerCV: a Library for Deep Learning in Computer Vision. MIT
- segmentation_models (🥉 24 · ⭐ 3.8K · 💀 ) - Segmentation models with pretrained backbones. Keras.. MIT
- Image Deduplicator (🥉 22 · ⭐ 4K · 💀 ) - Finding duplicate images made easy!. Apache-2
- Luminoth (🥉 22 · ⭐ 2.4K · 💀 ) - Deep Learning toolkit for Computer Vision. BSD-3
- nude.py (🥉 21 · ⭐ 850 · 💀 ) - Nudity detection with Python. MIT
- solt (🥉 16 · ⭐ 250 · 💀 ) - Streaming over lightweight data transformations. MIT
- HugsVision (🥉 14 · ⭐ 160) - HugsVision is an easy to use huggingface wrapper for state-of-the-.. MIT
- Torch Points 3D (🥉 14 · ⭐ 51 · 🐣 ) - Pytorch framework for doing deep learning on point.. BSD-3
Graph Data
Libraries for graph processing, clustering, embedding, and machine learning tasks.
PyTorch Geometric (🥇 36 · ⭐ 15K) - Graph Neural Network Library for PyTorch. MIT
dgl (🥇 36 · ⭐ 9.5K) - Python package built to ease deep learning on graph, on top of existing.. Apache-2
StellarGraph (🥈 28 · ⭐ 2.4K · 💤 ) - StellarGraph - Machine Learning on Graphs. Apache-2
ogb (🥈 28 · ⭐ 1.3K) - Benchmark datasets, data loaders, and evaluators for graph machine learning. MIT
Paddle Graph Learning (🥈 27 · ⭐ 1.3K) - Paddle Graph Learning (PGL) is an efficient and.. Apache-2
pygraphistry (🥈 26 · ⭐ 1.6K) - PyGraphistry is a Python library to quickly load, shape,.. BSD-3
PyTorch-BigGraph (🥈 25 · ⭐ 3.1K) - Generate embeddings from large-scale graph-structured.. BSD-3
pytorch_geometric_temporal (🥈 25 · ⭐ 1.5K) - PyTorch Geometric Temporal: Spatiotemporal Signal.. MIT
AmpliGraph (🥈 23 · ⭐ 1.7K · 💤 ) - Python library for Representation Learning on Knowledge.. Apache-2
torch-cluster (🥉 22 · ⭐ 510) - PyTorch Extension Library of Optimized Graph Cluster.. MIT
Show 15 hidden projects...
- igraph (🥇 31 · ⭐ 960) - Python interface for igraph. ❗️GPL-2.0
- pygal (🥈 28 · ⭐ 2.5K) - PYthon svg GrAph plotting Library. ❗️LGPL-3.0
- Karate Club (🥈 23 · ⭐ 1.6K) - Karate Club: An API Oriented Open-source Python Framework for.. ❗️GPL-3.0
- DeepWalk (🥉 21 · ⭐ 2.4K · 💀 ) - DeepWalk - Deep Learning for Graphs. ❗️GPL-3.0
- DIG (🥉 21 · ⭐ 1.1K) - A library for graph deep learning research. ❗️GPL-3.0
- graph-nets (🥉 20 · ⭐ 5.1K · 💀 ) - Build Graph Nets in Tensorflow. Apache-2
- DeepGraph (🥉 17 · ⭐ 250 · 💤 ) - Analyze Data with Pandas-based Networks. Documentation:. BSD-3
- pyRDF2Vec (🥉 17 · ⭐ 150) - Python Implementation and Extension of RDF2Vec. MIT
- GraphEmbedding (🥉 16 · ⭐ 2.7K · 💀 ) - Implementation and experiments of graph embedding.. MIT
- Sematch (🥉 16 · ⭐ 380 · 💀 ) - semantic similarity framework for knowledge graph. Apache-2
- OpenKE (🥉 15 · ⭐ 3K · 💀 ) - An Open-Source Package for Knowledge Embedding (KE). MIT
- Euler (🥉 15 · ⭐ 2.8K · 💀 ) - A distributed graph deep learning framework. Apache-2
- GraphSAGE (🥉 15 · ⭐ 2.7K · 💀 ) - Representation learning on large graphs using stochastic.. MIT
- OpenNE (🥉 15 · ⭐ 1.6K · 💀 ) - An Open-Source Package for Network Embedding (NE). MIT
- GraphVite (🥉 12 · ⭐ 1K · 💀 ) - GraphVite: A General and High-performance Graph Embedding System. Apache-2
Audio Data
Libraries for audio analysis, manipulation, transformation, and extraction, as well as speech recognition and music generation tasks.
DeepSpeech (🥇 34 · ⭐ 19K) - DeepSpeech is an open source embedded (offline, on-device).. MPL-2.0
torchaudio (🥈 33 · ⭐ 1.7K) - Data manipulation and transformation for audio signal.. BSD-2
speechbrain (🥈 32 · ⭐ 4K) - A PyTorch-based Speech Toolkit. Apache-2
SpeechRecognition (🥈 31 · ⭐ 6.2K) - Speech recognition module for Python, supporting several.. BSD-3
pyAudioAnalysis (🥈 30 · ⭐ 4.7K) - Python Audio Analysis Library: Feature Extraction,.. Apache-2
tinytag (🥈 28 · ⭐ 520) - Read audio and music meta data and duration of MP3, OGG, OPUS, MP4, M4A,.. MIT
audioread (🥈 28 · ⭐ 400) - cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding.. MIT
audiomentations (🥉 25 · ⭐ 960) - A Python library for audio data augmentation. Inspired by.. MIT
python-soundfile (🥉 24 · ⭐ 450) - SoundFile is an audio library based on libsndfile, CFFI, and.. BSD-3
Show 8 hidden projects...
- aubio (🥈 28 · ⭐ 2.7K) - a library for audio and music analysis. ❗️GPL-3.0
- Essentia (🥈 28 · ⭐ 2.1K) - C++ library for audio and music analysis, description and.. ❗️AGPL-3.0
- python_speech_features (🥉 24 · ⭐ 2.1K · 💀 ) - This library provides common speech features for ASR.. MIT
- TTS (🥉 22 · ⭐ 5.9K · 💀 ) - Deep learning for Text to Speech (Discussion forum:.. MPL-2.0
- Dejavu (🥉 22 · ⭐ 5.7K · 💀 ) - Audio fingerprinting and recognition in Python. MIT
- TimeSide (🥉 22 · ⭐ 320) - Scalable audio processing framework written in Python with a.. ❗️AGPL-3.0
- Muda (🥉 18 · ⭐ 210 · 💤 ) - A library for augmenting annotated audio data. ISC
- Julius (🥉 17 · ⭐ 260) - Fast PyTorch based DSP for audio and 1D signals. MIT
Geospatial Data
Libraries to load, process, analyze, and write geographic data as well as libraries for spatial analysis, map visualization, and geocoding.
pydeck (🥇 42 · ⭐ 9.8K) - WebGL2 powered visualization framework. MIT
- GitHub (👨💻 190 · 🔀 1.8K · 📦 4K · 📋 2.4K - 6% open · ⏱️ 05.05.2022): git clone https://github.com/visgl/deck.gl
- PyPi (📥 990K / month · 📦 23 · ⏱️ 25.10.2021): pip install pydeck
- Conda (📥 94K · ⏱️ 26.10.2021): conda install -c conda-forge pydeck
- npm (📥 290K / month · 📦 380 · ⏱️ 22.04.2022): npm install deck.gl
ipyleaflet (🥈 33 · ⭐ 1.3K) - A Jupyter - Leaflet.js bridge. MIT
- GitHub (👨💻 78 · 🔀 330 · 📦 1.7K · 📋 490 - 38% open · ⏱️ 02.05.2022): git clone https://github.com/jupyter-widgets/ipyleaflet
- PyPi (📥 91K / month · 📦 110 · ⏱️ 14.04.2022): pip install ipyleaflet
- Conda (📥 810K · ⏱️ 14.04.2022): conda install -c conda-forge ipyleaflet
- npm (📥 48K / month · 📦 2 · ⏱️ 14.04.2022): npm install jupyter-leaflet
ArcGIS API (🥉 29 · ⭐ 1.3K) - Documentation and samples for ArcGIS API for Python. Apache-2
- GitHub (👨💻 78 · 🔀 880 · 📥 3.1K · 📋 500 - 24% open · ⏱️ 27.04.2022): git clone https://github.com/Esri/arcgis-python-api
- PyPi (📥 55K / month · 📦 22 · ⏱️ 03.02.2022): pip install arcgis
- Docker Hub (📥 6.9K · ⭐ 33 · ⏱️ 04.02.2022): docker pull esridocker/arcgis-api-python-notebook
EarthPy (🥉 26 · ⭐ 350) - A package built to support working with spatial data using open source.. BSD-3
Show 8 hidden projects...
- Geocoder (🥉 31 · ⭐ 1.4K · 💀 ) - Python Geocoder. MIT
- Satpy (🥉 30 · ⭐ 820) - Python package for earth-observing satellite data processing. ❗️GPL-3.0
- Sentinelsat (🥉 27 · ⭐ 750) - Search and download Copernicus Sentinel satellite images. ❗️GPL-3.0
- gmaps (🥉 24 · ⭐ 740 · 💀 ) - Google maps for Jupyter notebooks. BSD-3
- Mapbox GL (🥉 23 · ⭐ 600 · 💀 ) - Use Mapbox GL JS to visualize data in a Python Jupyter notebook. MIT
- pymap3d (🥉 22 · ⭐ 240) - pure-Python (Numpy optional) 3D coordinate conversions for geospace ecef.. BSD-2
- geoplotlib (🥉 21 · ⭐ 960 · 💀 ) - python toolbox for visualizing geographical data and making maps. MIT
- prettymaps (🥉 17 · ⭐ 8K) - A small set of Python functions to draw pretty maps from.. ❗️AGPL-3.0
Financial Data
Libraries for algorithmic stock/crypto trading, risk analytics, backtesting, technical analysis, and other tasks on financial data.
TensorTrade (🥈 27 · ⭐ 3.8K) - An open source reinforcement learning framework for training,.. Apache-2
Alpha Vantage (🥈 27 · ⭐ 3.6K · 💤 ) - A python wrapper for Alpha Vantage API for financial data. MIT
Enigma Catalyst (🥉 25 · ⭐ 2.3K · 💤 ) - An Algorithmic Trading Library for Crypto-Assets in.. Apache-2
stockstats (🥉 24 · ⭐ 990) - Supply a wrapper ``StockDataFrame`` based on the.. BSD-3
Crypto Signals (🥉 22 · ⭐ 4K · 💤 ) - Github.com/CryptoSignal - #1 Quant Trading & Technical.. MIT
- GitHub (👨💻 28 · 🔀 1K · 📋 260 - 20% open · ⏱️ 28.06.2021): git clone https://github.com/CryptoSignal/crypto-signal
- Docker Hub (📥 140K · ⭐ 7 · ⏱️ 03.09.2020): docker pull shadowreaver/crypto-signal
tf-quant-finance (🥉 22 · ⭐ 3.1K) - High-performance TensorFlow library for quantitative.. Apache-2
finmarketpy (🥉 19 · ⭐ 2.9K) - Python library for backtesting trading strategies & analyzing.. Apache-2
Show 12 hidden projects...
- zipline (🥇 32 · ⭐ 15K · 💀 ) - Zipline, a Pythonic Algorithmic Trading Library. Apache-2
- backtrader (🥈 29 · ⭐ 8.7K · 💤 ) - Python Backtesting library for trading strategies. ❗️GPL-3.0
- pyfolio (🥈 29 · ⭐ 4.4K · 💀 ) - Portfolio and risk analytics in Python. Apache-2
- arch (🥈 29 · ⭐ 910) - ARCH models in Python. ❗️NCSA
- Alphalens (🥉 26 · ⭐ 2.3K · 💀 ) - Performance analysis of predictive (alpha) stock factors. Apache-2
- empyrical (🥉 26 · ⭐ 920 · 💀 ) - Common financial risk and performance metrics. Used by zipline.. Apache-2
- PyAlgoTrade (🥉 24 · ⭐ 3.7K · 💀 ) - Python Algorithmic Trading Library. Apache-2
- FinTA (🥉 23 · ⭐ 1.6K · 💤 ) - Common financial technical indicators implemented in Pandas. ❗️LGPL-3.0
- Backtesting.py (🥉 19 · ⭐ 2.4K) - Backtest trading strategies in Python. ❗️AGPL-3.0
- FinQuant (🥉 18 · ⭐ 760 · 💀 ) - A program for financial portfolio management, analysis and.. MIT
- surpriver (🥉 12 · ⭐ 1.4K · 💀 ) - Find big moving stocks before they move using machine.. ❗️GPL-3.0
- pyrtfolio (🥉 7 · ⭐ 110 · 💀 ) - Python package to generate stock portfolios. ❗️GPL-3.0
Time Series Data
Libraries for forecasting, anomaly detection, feature extraction, and machine learning on time-series and sequential data.
Prophet (🥇 33 · ⭐ 14K) - Tool for producing high quality forecasts for time series data that has.. MIT
pmdarima (🥇 32 · ⭐ 1.2K) - A statistical library designed to fill the void in Python's time series.. MIT
Darts (🥈 30 · ⭐ 4K) - A python library for easy manipulation and forecasting of time series. Apache-2
- GitHub (👨💻 53 · 🔀 400 · 📦 53 · 📋 440 - 36% open · ⏱️ 05.05.2022): git clone https://github.com/unit8co/darts
- PyPi (📥 8.6K / month · 📦 2 · ⏱️ 13.04.2022): pip install u8darts
- Conda (📥 3.6K · ⏱️ 14.04.2022): conda install -c conda-forge u8darts-all
- Docker Hub (📥 330 · ⏱️ 13.04.2022): docker pull unit8/darts
NeuralProphet (🥈 30 · ⭐ 2.2K · ➕ ) - NeuralProphet: A simple forecasting package. MIT
STUMPY (🥈 29 · ⭐ 2.3K) - STUMPY is a powerful and scalable Python library for modern time series.. BSD-3
pytorch-forecasting (🥈 29 · ⭐ 1.9K) - Time series forecasting with PyTorch. MIT
StatsForecast (🥈 25 · ⭐ 560 · 🐣 ) - Lightning fast forecasting with statistical and econometric.. MIT
Show 9 hidden projects...
- PyFlux (🥉 24 · ⭐ 2K · 💀 ) - Open source time series library for Python. BSD-3
- luminol (🥉 21 · ⭐ 1K · 💀 ) - Anomaly Detection and Correlation library. Apache-2
- NeuralForecast (🥉 21 · ⭐ 580) - Scalable and user friendly neural forecasting algorithms.. ❗️GPL-3.0
- seglearn (🥉 21 · ⭐ 510 · 💀 ) - Python module for machine learning time series:. BSD-3
- tick (🥉 21 · ⭐ 380 · 💀 ) - Module for statistical learning, with a particular emphasis on time-.. BSD-3
- pydlm (🥉 20 · ⭐ 420 · 💀 ) - A python library for Bayesian time series modeling. BSD-3
- ADTK (🥉 19 · ⭐ 810 · 💀 ) - A Python toolkit for rule-based/unsupervised anomaly detection in time.. MPL-2.0
- matrixprofile-ts (🥉 19 · ⭐ 680 · 💀 ) - A Python library for detecting patterns and anomalies.. Apache-2
- tsaug (🥉 14 · ⭐ 240 · 💀 ) - A Python package for time series augmentation. Apache-2
Medical Data
Libraries for processing and analyzing medical data such as MRIs, EEGs, genomic data, and other medical imaging formats.
MNE (🥇 36 · ⭐ 1.9K · 📉 ) - MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in.. BSD-3
DeepVariant (🥉 25 · ⭐ 2.5K) - DeepVariant is an analysis pipeline that uses a deep neural.. BSD-3
Medical Detection Toolkit (🥉 14 · ⭐ 1.1K) - The Medical Detection Toolkit contains 2D + 3D.. Apache-2
- GitHub (👨💻 3 · 🔀 280 · 📋 120 - 32% open · ⏱️ 04.04.2022): git clone https://github.com/MIC-DKFZ/medicaldetectiontoolkit
Show 9 hidden projects...
- NiftyNet (🥉 25 · ⭐ 1.3K · 💀 ) - [unmaintained] An open-source convolutional neural.. Apache-2
- NIPY (🥉 23 · ⭐ 320 · 💀 ) - Neuroimaging in Python FMRI analysis package. BSD-3
- DLTK (🥉 22 · ⭐ 1.3K · 💀 ) - Deep Learning Toolkit for Medical Image Analysis. Apache-2
- MedPy (🥉 22 · ⭐ 400 · 💀 ) - Medical image processing in Python. ❗️GPL-3.0
- Glow (🥉 22 · ⭐ 200) - An open-source toolkit for large-scale genomic analysis. Apache-2
- Brainiak (🥉 21 · ⭐ 270 · 💤 ) - Brain Imaging Analysis Kit. Apache-2
- MedicalTorch (🥉 17 · ⭐ 770 · 💀 ) - A medical imaging framework for Pytorch. Apache-2
- DeepNeuro (🥉 13 · ⭐ 110 · 💀 ) - A deep learning python package for neuroimaging data. Made by:. MIT
- MedicalNet (🥉 12 · ⭐ 1.4K · 💀 ) - Many studies have shown that the performance on deep learning is.. MIT
Tabular Data
Libraries for processing tabular and structured data.
carefree-learn (🥈 18 · ⭐ 360) - Deep Learning PyTorch. MIT
pytorch_tabular (🥉 17 · ⭐ 590) - A standard framework for modelling Deep Learning Models.. MIT
Show 2 hidden projects...
- miceforest (🥇 23 · ⭐ 150) - Multiple Imputation with Random Forests in Python. MIT
- upgini (🥉 16 · ⭐ 23 · 🐣 ) - Features search library for supervised machine learning searches.. BSD-3
Optical Character Recognition
Libraries for optical character recognition (OCR) and text extraction from images or videos.
EasyOCR (🥇 35 · ⭐ 15K) - Ready-to-use OCR with 80+ supported languages and all popular writing.. Apache-2
Tesseract (🥈 33 · ⭐ 4.2K) - Python-tesseract is an optical character recognition (OCR) tool.. Apache-2
OCRmyPDF (🥈 29 · ⭐ 6.3K) - OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them.. MPL-2.0
attention-ocr (🥉 21 · ⭐ 910 · 💤 ) - A Tensorflow model for text recognition (CNN + seq2seq.. MIT
pdftabextract (🥉 19 · ⭐ 2K) - A set of tools for extracting tables from PDF files helping to.. Apache-2
Mozart (🥉 10 · ⭐ 370 · 💤 ) - An optical music recognition (OMR) system. Converts sheet.. Apache-2
- GitHub (👨💻 5 · 🔀 56 · 📋 11 - 27% open · ⏱️ 05.05.2021): git clone https://github.com/aashrafh/Mozart
Data Containers & Structures
General-purpose data containers & structures as well as utilities & extensions for pandas.
Data Loading & Extraction
Libraries for loading, collecting, and extracting data from a variety of data sources and formats.
Web Scraping & Crawling
Libraries for web scraping, crawling, downloading, and mining.
Data Pipelines & Streaming
Libraries for data batch- and stream-processing, workflow automation, job scheduling, and other data pipeline tasks.
Celery (🥇 46 · ⭐ 19K) - Asynchronous task queue/job queue based on distributed message passing. BSD-3
Airflow (🥇 45 · ⭐ 26K) - Platform to programmatically author, schedule, and monitor workflows. Apache-2
- GitHub (👨💻 2.4K · 🔀 10K · 📥 280K · 📋 5.5K - 16% open · ⏱️ 05.05.2022): git clone https://github.com/apache/airflow
- PyPi (📥 5.6M / month · 📦 460 · ⏱️ 30.04.2022): pip install apache-airflow
- Conda (📥 590K · ⏱️ 02.05.2022): conda install -c conda-forge airflow
- Docker Hub (📥 71M · ⭐ 330 · ⏱️ 30.04.2022): docker pull apache/airflow
luigi (🥇 37 · ⭐ 16K · 📉 ) - Luigi is a Python module that helps you build complex pipelines of.. Apache-2
Great Expectations (🥈 36 · ⭐ 6.5K) - Always know what to expect from your data. Apache-2
dbt (🥈 36 · ⭐ 4.7K) - dbt enables data analysts and engineers to transform their data using the.. Apache-2
Kedro (🥈 35 · ⭐ 7.2K) - A Python framework for creating reproducible, maintainable and modular.. Apache-2
Activeloop (🥈 31 · ⭐ 4.5K) - Dataset format for AI. Build, manage, query & visualize datasets.. MPL-2.0
PyFunctional (🥉 26 · ⭐ 2K) - Python library for creating data pipelines with chain functional.. MIT
streamparse (🥉 26 · ⭐ 1.5K) - Run Python in Apache Storm topologies. Pythonic API, CLI.. Apache-2
whylogs (🥉 24 · ⭐ 1K) - Open standard for end-to-end data and ML monitoring for any scale in.. Apache-2
spark-deep-learning (🥉 19 · ⭐ 1.9K) - Deep Learning Pipelines for Apache Spark. Apache-2
- GitHub (👨💻 17 · 🔀 460 · 📦 21 · 📋 100 - 74% open · ⏱️ 21.03.2022): git clone https://github.com/databricks/spark-deep-learning
Databolt Flow (🥉 18 · ⭐ 940 · 💤 ) - Python library for building highly effective data science.. MIT
Mara Pipelines (🥉 17 · ⭐ 1.9K) - A lightweight opinionated ETL framework, halfway between plain.. MIT
Show 12 hidden projects...
- mrjob (🥈 32 · ⭐ 2.6K · 💀 ) - Run MapReduce jobs on Hadoop or Amazon Web Services. Apache-2
- faust (🥈 30 · ⭐ 6.1K · 💀 ) - Python Stream Processing. BSD-3
- dbnd (🥉 26 · ⭐ 220) - DBND is an agile pipeline framework that helps data engineering teams.. Apache-2
- bonobo (🥉 24 · ⭐ 1.5K · 💀 ) - Extract Transform Load for Python 3.5+. Apache-2
- dpark (🥉 22 · ⭐ 2.7K · 💀 ) - Python clone of Spark, a MapReduce alike framework in Python. BSD-3
- pysparkling (🥉 22 · ⭐ 250 · 💀 ) - A pure Python implementation of Apache Spark's RDD and DStream.. MIT
- BatchFlow (🥉 21 · ⭐ 180) - BatchFlow helps you conveniently work with random or sequential.. Apache-2
- mrq (🥉 20 · ⭐ 860 · 💀 ) - Mr. Queue - A distributed worker task queue in Python using Redis & gevent. MIT
- bodywork-core (🥉 20 · ⭐ 330) - ML pipeline orchestration and model deployments on.. ❗️AGPL-3.0
- flupy (🥉 16 · ⭐ 170) - Fluent data pipelines for python and your shell. MIT
- Botflow (🥉 15 · ⭐ 1.2K · 💀 ) - Python Fast Dataflow programming framework for Data pipeline work(.. BSD-3
- datajob (🥉 13 · ⭐ 88) - Build and deploy a serverless data pipeline on AWS with no effort. Apache-2
Distributed Machine Learning
Libraries that provide capabilities to distribute and parallelize machine learning tasks across large-scale compute infrastructure.
dask.distributed (🥇 41 · ⭐ 1.3K) - A distributed task scheduler for Dask. BSD-3
horovod (🥇 36 · ⭐ 12K) - Distributed training framework for TensorFlow, Keras, PyTorch, and.. Apache-2
H2O-3 (🥈 33 · ⭐ 5.8K) - H2O is an Open Source, Distributed, Fast & Scalable Machine Learning.. Apache-2
BigDL (🥈 33 · ⭐ 3.9K) - Building Large-Scale AI Applications for Distributed Big Data. Apache-2
- GitHub (👨💻 140 · 🔀 990 · 📦 36 · 📋 1.3K - 34% open · ⏱️ 05.05.2022): git clone https://github.com/intel-analytics/BigDL
- PyPi (📥 12K / month · 📦 1 · ⏱️ 05.05.2022): pip install bigdl
- Maven (📦 4 · ⏱️ 20.04.2021): <dependency> <groupId>com.intel.analytics.bigdl</groupId> <artifactId>bigdl-SPARK_2.4</artifactId> <version>[VERSION]</version> </dependency>
ipyparallel (🥈 33 · ⭐ 2.2K) - IPython Parallel: Interactive Parallel Computing in Python. BSD-3
DeepSpeed (🥈 32 · ⭐ 6.7K) - DeepSpeed is a deep learning optimization library that makes.. MIT
- GitHub (👨💻 100 · 🔀 780 · 📦 240 · 📋 900 - 52% open · ⏱️ 03.05.2022): git clone https://github.com/microsoft/DeepSpeed
- PyPi (📥 180K / month · 📦 10 · ⏱️ 27.04.2022): pip install deepspeed
- Docker Hub (📥 14K · ⭐ 3 · ⏱️ 09.03.2022): docker pull deepspeed/deepspeed
TensorFlowOnSpark (🥈 28 · ⭐ 3.8K) - TensorFlowOnSpark brings TensorFlow programs to.. Apache-2
petastorm (🥈 28 · ⭐ 1.4K) - Petastorm library enables single machine or distributed training.. Apache-2
analytics-zoo (🥉 27 · ⭐ 2.5K) - Distributed Tensorflow, Keras and PyTorch on Apache.. Apache-2
Hivemind (🥉 21 · ⭐ 1K) - Decentralized deep learning in PyTorch. Built to train models on thousands.. MIT
Apache Singa (🥉 20 · ⭐ 2.6K · 💤 ) - a distributed deep learning platform. Apache-2
- GitHub (👨💻 76 · 🔀 790 · 📦 1 · 📋 97 - 41% open · ⏱️ 10.08.2021): git clone https://github.com/apache/singa
- Conda (📥 450 · ⏱️ 09.08.2021): conda install -c nusdbsystem singa
- Docker Hub (📥 280 · ⭐ 4 · ⏱️ 04.06.2019): docker pull apache/singa
BytePS (🥉 19 · ⭐ 3.2K) - A high performance and generic framework for distributed DNN training. Apache-2
- GitHub (👨💻 19 · 🔀 440 · 📋 260 - 38% open · ⏱️ 10.02.2022): git clone https://github.com/bytedance/byteps
- PyPi (📥 45 / month · ⏱️ 02.08.2021): pip install byteps
- Docker Hub (📥 1.3K · ⏱️ 03.03.2020): docker pull bytepsimage/tensorflow
mesh-transformer-jax (🥉 17 · ⭐ 4.1K) - Model parallel transformers in JAX and Haiku. Apache-2
- GitHub (👨💻 23 · 🔀 510 · 📋 170 - 10% open · ⏱️ 28.01.2022): git clone https://github.com/kingoflolz/mesh-transformer-jax
parallelformers (🥉 16 · ⭐ 460) - Parallelformers: An Efficient Model Parallelization.. Apache-2
Show 8 hidden projects...
- DEAP (🥈 31 · ⭐ 4.7K) - Distributed Evolutionary Algorithms in Python. ❗️LGPL-3.0
- launchpad (🥉 22 · ⭐ 270) - Launchpad is a library that simplifies writing distributed.. Apache-2
- TensorFrames (🥉 20 · ⭐ 760 · 💀 ) - [DEPRECATED] Tensorflow wrapper for DataFrames on.. Apache-2
- sk-dist (🥉 19 · ⭐ 280 · 💤 ) - Distributed scikit-learn meta-estimators in PySpark. Apache-2
- somoclu (🥉 19 · ⭐ 240 · 💤 ) - Massively parallel self-organizing maps: accelerate training on.. MIT
- Fiber (🥉 18 · ⭐ 970 · 💀 ) - Distributed Computing for AI Made Simple. Apache-2
- LazyCluster (🥉 14 · ⭐ 43 · 💤 ) - Distributed machine learning made simple. Apache-2
- autodist (🥉 11 · ⭐ 120 · 💀 ) - Simple Distributed Deep Learning on TensorFlow. Apache-2
Hyperparameter Optimization & AutoML
Libraries for hyperparameter optimization, AutoML, and neural architecture search.
auto-sklearn (🥇 33 · ⭐ 6.2K) - Automated Machine Learning with scikit-learn. BSD-3
Keras Tuner (🥇 33 · ⭐ 2.5K) - Hyperparameter tuning for humans. Apache-2
featuretools (🥈 32 · ⭐ 6.1K) - An open source python library for automated feature engineering. BSD-3
scikit-optimize (🥈 31 · ⭐ 2.3K · 💤 ) - Sequential model-based optimization with a.. BSD-3
mljar-supervised (🥈 28 · ⭐ 1.9K) - Python package for AutoML on Tabular Data with Feature.. MIT
Neuraxle (🥉 24 · ⭐ 520) - The world's cleanest AutoML framework - Do hyperparameter tuning with.. Apache-2
HpBandSter (🥉 22 · ⭐ 540) - a distributed Hyperband implementation on Steroids. BSD-3
Hyperactive (🥉 22 · ⭐ 380) - An optimization and data collection toolbox for convenient and fast.. MIT
lazypredict (🥉 22 · ⭐ 350) - Lazy Predict helps build a lot of basic models without much code.. MIT
Auto ViML (🥉 21 · ⭐ 340) - Automatically Build Multiple ML Models with a Single Line of Code... Apache-2
sklearn-deap (🥉 20 · ⭐ 690 · 💤 ) - Use evolutionary algorithms instead of gridsearch in.. MIT
AlphaPy (🥉 19 · ⭐ 770) - Automated Machine Learning [AutoML] with Python, scikit-learn, Keras,.. Apache-2
model_search (🥉 10 · ⭐ 3.2K) - AutoML algorithms for model architecture search at scale. Apache-2
- GitHub (👨💻 1 · 🔀 360 · 📋 50 - 70% open · ⏱️ 09.02.2022): git clone https://github.com/google/model_search
Show 23 hidden projects...
- TPOT (🥈 32 · ⭐ 8.6K · 💀 ) - A Python Automated Machine Learning tool that optimizes.. ❗️LGPL-3.0
- Bayesian Optimization (🥈 31 · ⭐ 5.9K · 💀 ) - A Python implementation of global optimization with.. MIT
- Orion (🥈 27 · ⭐ 230) - Asynchronous Distributed Hyperparameter Optimization. BSD-3
- GPyOpt (🥈 26 · ⭐ 810 · 💀 ) - Gaussian Process Optimization using GPy. BSD-3
- SMAC3 (🥈 26 · ⭐ 690) - Sequential Model-based Algorithm Configuration. ❗️BSD-1-Clause
- auto_ml (🥉 23 · ⭐ 1.6K · 💀 ) - [UNMAINTAINED] Automated machine learning for analytics & production. MIT
- featurewiz (🥉 23 · ⭐ 230) - Use advanced feature engineering strategies and select best.. Apache-2
- MLBox (🥉 22 · ⭐ 1.3K · 💀 ) - MLBox is a powerful Automated Machine Learning python library. ❗️BSD-1-Clause
- optunity (🥉 22 · ⭐ 380 · 💀 ) - optimization routines for hyperparameter tuning. BSD-3
- Test Tube (🥉 20 · ⭐ 710 · 💀 ) - Python library to easily log experiments and parallelize.. MIT
- Dragonfly (🥉 19 · ⭐ 640 · 💀 ) - An open source python library for scalable Bayesian optimisation. MIT
- Auto Tune Models (🥉 18 · ⭐ 520 · 💀 ) - Auto Tune Models - A multi-tenant, multi-data system for.. MIT
- Sherpa (🥉 18 · ⭐ 310 · 💀 ) - Hyperparameter optimization that enables researchers to.. ❗️GPL-3.0
- Advisor (🥉 17 · ⭐ 1.4K · 💀 ) - Open-source implementation of Google Vizier for hyper parameters.. Apache-2
- automl-gs (🥉 16 · ⭐ 1.8K · 💀 ) - Provide an input CSV and a target field to predict, generate a.. MIT
- Xcessiv (🥉 16 · ⭐ 1.3K · 💀 ) - A web-based application for quick, scalable, and automated.. Apache-2
- HyperparameterHunter (🥉 16 · ⭐ 690 · 💀 ) - Easy hyperparameter optimization and automatic result.. MIT
- Parfit (🥉 16 · ⭐ 200 · 💀 ) - A package for parallelizing the fit and flexibly scoring of.. MIT
- ENAS (🥉 13 · ⭐ 2.5K · 💀 ) - PyTorch implementation of Efficient Neural Architecture Search via.. Apache-2
- Auptimizer (🥉 13 · ⭐ 180 · 💀 ) - An automatic ML model optimization tool. ❗️GPL-3.0
- Devol (🥉 11 · ⭐ 940 · 💀 ) - Genetic neural architecture search with Keras. MIT
- Hypermax (🥉 11 · ⭐ 100 · 💀 ) - Better, faster hyper-parameter optimization. BSD-3
- Hypertunity (🥉 9 · ⭐ 120 · 💀 ) - A toolset for black-box hyperparameter optimisation. Apache-2
Reinforcement Learning
Libraries for building and evaluating reinforcement learning & agent-based systems.
OpenAI Gym (🥇 41 · ⭐ 27K) - A toolkit for developing and comparing reinforcement learning.. MIT
FinRL (🥇 30 · ⭐ 4.8K) - FinRL: The first open-source project for financial reinforcement learning... MIT
TensorLayer (🥈 28 · ⭐ 7K) - Deep Learning and Reinforcement Learning Library for.. Apache-2
PARL (🥉 27 · ⭐ 2.6K) - A high-performance distributed training framework for Reinforcement.. Apache-2
Stable Baselines (🥉 25 · ⭐ 3.5K · 💤 ) - A fork of OpenAI Baselines, implementations of.. MIT
TensorForce (🥉 24 · ⭐ 3.1K) - Tensorforce: a TensorFlow library for applied.. Apache-2
rliable (🥉 11 · ⭐ 400) - [NeurIPS21 Outstanding Paper] Library for reliable evaluation on RL.. Apache-2
Show 6 hidden projects...
- baselines (🥈 29 · ⭐ 13K · 💀 ) - OpenAI Baselines: high-quality implementations of reinforcement.. MIT
- keras-rl (🥈 28 · ⭐ 5.3K · 💀 ) - Deep Reinforcement Learning for Keras. MIT
- ChainerRL (🥉 23 · ⭐ 1K · 💀 ) - ChainerRL is a deep reinforcement learning library built on top of.. MIT
- DeepMind Lab (🥉 19 · ⭐ 6.7K) - A customisable 3D platform for agent-based AI research. ❗️GPL-2.0
- SerpentAI (🥉 18 · ⭐ 6.2K · 💀 ) - Game Agent Framework. Helping you create AIs / Bots that learn to.. MIT
- Maze (🥉 13 · ⭐ 210) - Maze Applied Reinforcement Learning Framework. ❗️Custom
Recommender Systems
Libraries for building and evaluating recommendation systems.
Recommenders (🥇 34 · ⭐ 13K) - Best Practices on Recommendation Systems. MIT
lightfm (🥈 29 · ⭐ 4K) - A Python implementation of LightFM, a hybrid recommendation algorithm. Apache-2
TF Recommenders (🥈 28 · ⭐ 1.3K) - TensorFlow Recommenders is a library for building.. Apache-2
TF Ranking (🥈 27 · ⭐ 2.5K) - Learning to Rank in TensorFlow. Apache-2
recmetrics (🥉 21 · ⭐ 390) - A library of metrics for evaluating recommender systems. MIT
Case Recommender (🥉 18 · ⭐ 400) - Case Recommender: A Flexible and Extensible Python.. MIT
Show 7 hidden projects...
- scikit-surprise (🥈 27 · ⭐ 5.4K · 💀 ) - A Python scikit for building and analyzing recommender.. BSD-3
- tensorrec (🥉 22 · ⭐ 1.2K · 💀 ) - A TensorFlow recommendation algorithm and framework in.. Apache-2
- lkpy (🥉 22 · ⭐ 200) - Python recommendation toolkit. MIT
- fastFM (🥉 21 · ⭐ 980 · 💀 ) - fastFM: A Library for Factorization Machines. BSD-3
- Spotlight (🥉 18 · ⭐ 2.7K · 💀 ) - Deep recommender models using PyTorch. MIT
- Collie (🥉 16 · ⭐ 88) - A library for preparing, training, and evaluating scalable deep.. BSD-3
- OpenRec (🥉 15 · ⭐ 390 · 💀 ) - OpenRec is an open-source and modular library for neural network-.. Apache-2
Privacy Machine Learning
Libraries for encrypted and privacy-preserving machine learning using methods like federated learning & differential privacy.
TensorFlow Privacy (🥈 26 · ⭐ 1.6K) - Library for training machine learning models with.. Apache-2
TFEncrypted (🥉 25 · ⭐ 1K) - A Framework for Encrypted Machine Learning in TensorFlow. Apache-2
Workflow & Experiment Tracking
Libraries to organize, track, and visualize machine learning experiments.
Tensorboard (🥇 43 · ⭐ 5.8K) - TensorFlow's Visualization Toolkit. Apache-2
SageMaker SDK (🥇 37 · ⭐ 1.6K) - A library for training and deploying machine learning.. Apache-2
DVC (🥈 36 · ⭐ 9.7K) - Data Version Control | Git for Data & Models | ML Experiments Management. Apache-2
wandb client (🥈 35 · ⭐ 3.9K) - A tool for visualizing and tracking your machine learning.. MIT
AzureML SDK (🥈 35 · ⭐ 3K) - Python notebooks with ML and deep learning examples with Azure Machine.. MIT
tensorboardX (🥈 33 · ⭐ 7.3K) - tensorboard for pytorch (and chainer, mxnet, numpy, ...). MIT
ClearML (🥈 33 · ⭐ 3.1K) - ClearML - Auto-Magical CI/CD to streamline your ML workflow... Apache-2
- GitHub (👨💻 49 · 🔀 430 · 📥 430 · 📦 240 · 📋 520 - 40% open · ⏱️ 05.05.2022): git clone https://github.com/allegroai/clearml
- PyPi (📥 110K / month · 📦 6 · ⏱️ 26.04.2022): pip install clearml
- Docker Hub (📥 30K · ⏱️ 05.10.2020): docker pull allegroai/trains
ml-metadata (🥉 28 · ⭐ 470) - For recording and retrieving metadata associated with ML.. Apache-2
livelossplot (🥉 26 · ⭐ 1.2K) - Live training loss plot in Jupyter Notebook for Keras,.. MIT
Labml (🥉 24 · ⭐ 1.1K) - Monitor deep learning model training and hardware usage from your mobile.. MIT
Show 17 hidden projects...
- Neptune.ai (🥈 29 · ⭐ 270) - Experiment tracking tool and model registry. Apache-2
- kaggle (🥉 28 · ⭐ 4.7K · 💀 ) - Official Kaggle API. Apache-2
- knockknock (🥉 24 · ⭐ 2.4K · 💀 ) - Knock Knock: Get notified when your training ends with only two.. MIT
- TNT (🥉 22 · ⭐ 1.4K · 💀 ) - Simple tools for logging and visualizing, loading and training. BSD-3
- SKLL (🥉 22 · ⭐ 530) - SciKit-Learn Laboratory (SKLL) makes it easy to run machine.. ❗️BSD-1-Clause
- gokart (🥉 22 · ⭐ 250) - Gokart solves reproducibility, task dependencies, constraints of good code,.. MIT
- TensorWatch (🥉 21 · ⭐ 3.2K · 💀 ) - Debugging, monitoring and visualization for Python Machine.. MIT
- hiddenlayer (🥉 21 · ⭐ 1.6K · 💀 ) - Neural network graphs and training metrics for.. MIT
- quinn (🥉 21 · ⭐ 330 · 💀 ) - pyspark methods to enhance developer productivity. Apache-2
- TensorBoard Logger (🥉 20 · ⭐ 620 · 💀 ) - Log TensorBoard events without touching TensorFlow. MIT
- MXBoard (🥉 19 · ⭐ 330 · 💀 ) - Logging MXNet data for visualization in TensorBoard. Apache-2
- datmo (🥉 17 · ⭐ 340 · 💀 ) - Open source production model management tool for data scientists. MIT
- chitra (🥉 17 · ⭐ 190) - A multi-functional library for full-stack Deep Learning. Simplifies.. Apache-2
- steppy (🥉 15 · ⭐ 130 · 💀 ) - Lightweight, Python library for fast and reproducible experimentation. MIT
- caliban (🥉 14 · ⭐ 420 · 💀 ) - Research workflows made easy, locally and in the Cloud. Apache-2
- ModelChimp (🥉 13 · ⭐ 120 · 💤 ) - Experiment tracking for machine and deep learning projects. BSD-2
- traintool (🥉 8 · ⭐ 9 · 💀 ) - Train off-the-shelf machine learning models in one.. Apache-2
Model Serialization & Deployment
Libraries to serialize models to files, convert between a variety of model formats, and optimize models for deployment.
Core ML Tools (🥇 32 · ⭐ 2.6K) - Core ML tools contain supporting tools for Core ML model.. BSD-3
huggingface_hub (🥈 31 · ⭐ 410) - All the open source things related to the Hugging Face Hub. Apache-2
TorchServe (🥈 30 · ⭐ 2.6K) - Serve, optimize and scale PyTorch models in production. Apache-2
- GitHub (👨💻 110 · 🔀 500 · 📥 1.4K · 📋 910 - 16% open · ⏱️ 05.05.2022): git clone https://github.com/pytorch/serve
- PyPi (📥 14K / month · 📦 8 · ⏱️ 01.03.2022): pip install torchserve
- Conda (📥 22K · ⏱️ 01.03.2022): conda install -c pytorch torchserve
- Docker Hub (📥 1M · ⭐ 11 · ⏱️ 01.03.2022): docker pull pytorch/torchserve
Hummingbird (🥈 27 · ⭐ 2.8K) - Hummingbird compiles trained ML models into tensor computation for.. MIT
m2cgen (🥉 26 · ⭐ 2.1K) - Transform ML models into a native code (Java, C, Python, Go, JavaScript,.. MIT
pytorch2keras (🥉 20 · ⭐ 790 · 💤 ) - PyTorch to Keras model convertor. MIT
nebullvm (🥉 17 · ⭐ 1K · 🐣 ) - Easy-to-use library to boost AI inference leveraging multiple DL.. Apache-2
Show 5 hidden projects...
- mmdnn (🥉 25 · ⭐ 5.6K · 💀 ) - MMdnn is a set of tools to help users inter-operate among different deep.. MIT
- Larq Compute Engine (🥉 21 · ⭐ 190) - Highly optimized inference engine for Binarized.. Apache-2
- sklearn-porter (🥉 19 · ⭐ 1.1K · 💀 ) - Transpile trained scikit-learn estimators to C, Java,.. MIT
- tfdeploy (🥉 16 · ⭐ 350 · 💀 ) - Deploy tensorflow graphs for fast evaluation and export to.. BSD-3
- backprop (🥉 14 · ⭐ 230 · 💤 ) - Backprop makes it simple to use, finetune, and deploy state-of-.. Apache-2
Model Interpretability
Libraries to visualize, explain, debug, evaluate, and interpret machine learning models.
shap (🥇 40 · ⭐ 16K) - A game theoretic approach to explain the output of any machine learning model. MIT
Lime (🥇 31 · ⭐ 9.8K · 💤 ) - Lime: Explaining the predictions of any machine learning classifier. BSD-2
InterpretML (🥇 31 · ⭐ 4.7K) - Fit interpretable models. Explain blackbox machine learning. MIT
Model Analysis (🥇 31 · ⭐ 1.2K) - Model analysis tools for TensorFlow. Apache-2
yellowbrick (🥈 30 · ⭐ 3.6K) - Visual analysis and diagnostic tools to facilitate machine.. Apache-2
dtreeviz (🥈 29 · ⭐ 2.1K) - A python library for decision tree visualization and model interpretation. MIT
explainerdashboard (🥈 27 · ⭐ 1.2K) - Quickly build Explainable AI dashboards that show the inner.. MIT
responsible-ai-widgets (🥈 26 · ⭐ 480) - This project provides responsible AI user interfaces.. MIT
Fairness 360 (🥉 24 · ⭐ 1.7K) - A comprehensive set of fairness metrics for datasets and.. Apache-2
Explainability 360 (🥉 24 · ⭐ 1.1K) - Interpretability and explainability of data and.. Apache-2
imodels (🥉 24 · ⭐ 740) - Interpretable ML package for concise, transparent, and accurate predictive.. MIT
LIT (🥉 23 · ⭐ 2.9K) - The Language Interpretability Tool: Interactively analyze NLP models for.. Apache-2
tf-explain (🥉 21 · ⭐ 920) - Interpretability Methods for tf.keras models with Tensorflow 2.x. MIT
What-If Tool (🥉 21 · ⭐ 670) - Source code/webpage/demos for the What-If Tool. Apache-2
- GitHub (👨💻 20 · 🔀 130 · 📋 100 - 54% open · ⏱️ 05.01.2022): git clone https://github.com/PAIR-code/what-if-tool
- PyPi (📥 7.6K / month · 📦 3 · ⏱️ 12.10.2021): pip install witwidget
- Conda (📥 950K · ⏱️ 06.01.2022): conda install -c conda-forge tensorboard-plugin-wit
- npm (📥 4.4K / month · ⏱️ 12.10.2021): npm install wit-widget
sklearn-evaluation (🥉 21 · ⭐ 320) - Machine learning model evaluation made easy: plots,.. MIT
iNNvestigate (🥉 20 · ⭐ 970 · 💤 ) - A toolbox to iNNvestigate neural networks' predictions! BSD-2
interpret-text (🥉 15 · ⭐ 310) - A library that incorporates state-of-the-art explainers for.. MIT
Show 16 hidden projects...
- pyLDAvis (🥈 29 · ⭐ 1.6K · 💀 ) - Python library for interactive topic model visualization... BSD-3
- eli5 (🥈 28 · ⭐ 2.5K · 💀 ) - A library for debugging/inspecting machine learning classifiers and.. MIT
- Lucid (🥈 27 · ⭐ 4.4K · 💀 ) - A collection of infrastructure and tools for research in.. Apache-2
- scikit-plot (🥈 26 · ⭐ 2.2K · 💀 ) - An intuitive library to add plotting functionality to.. MIT
- keras-vis (🥈 25 · ⭐ 2.9K · 💀 ) - Neural network visualization toolkit for keras. MIT
- DALEX (🥉 23 · ⭐ 1K) - moDel Agnostic Language for Exploration and eXplanation. ❗️GPL-3.0
- TreeInterpreter (🥉 22 · ⭐ 710 · 💀 ) - Package for interpreting scikit-learn's decision tree.. BSD-3
- random-forest-importances (🥉 22 · ⭐ 500 · 💀 ) - Code to compute permutation and drop-column.. MIT
- Skater (🥉 20 · ⭐ 1K) - Python Library for Model Interpretation/Explanations. ❗️UPL-1.0
- model-card-toolkit (🥉 19 · ⭐ 270) - a tool that leverages rich metadata and lineage.. Apache-2
- fairness-indicators (🥉 18 · ⭐ 250) - TensorFlow's Fairness Evaluation and Visualization.. Apache-2
- FlashTorch (🥉 16 · ⭐ 660 · 💀 ) - Visualization toolkit for neural networks in PyTorch! MIT
- ExplainX.ai (🥉 15 · ⭐ 310 · 💀 ) - Explainable AI framework for data scientists. Explain & debug any.. MIT
- contextual-ai (🥉 14 · ⭐ 80) - Contextual AI adds explainability to different stages of.. Apache-2
- Attribution Priors (🥉 12 · ⭐ 96 · 💀 ) - Tools for training explainable models using.. MIT
- bias-detector (🥉 11 · ⭐ 37) - Bias Detector is a python package for detecting bias in machine.. MIT
Vector Similarity Search (ANN)
Libraries for Approximate Nearest Neighbor Search and Vector Indexing/Similarity Search.
Milvus (🥇 38 · ⭐ 10K) - An open-source vector database for scalable similarity search and AI.. Apache-2
- GitHub (👨💻 200 · 🔀 1.5K · 📥 23K · 📋 5K - 6% open · ⏱️ 05.05.2022): git clone https://github.com/milvus-io/milvus
- PyPi (📥 37K / month · 📦 16 · ⏱️ 02.04.2022): pip install pymilvus
- Docker Hub (📥 930K · ⭐ 18 · ⏱️ 02.04.2022): docker pull milvusdb/milvus
Faiss (🥇 34 · ⭐ 17K) - A library for efficient similarity search and clustering of dense vectors. MIT
Annoy (🥈 32 · ⭐ 9.8K) - Approximate Nearest Neighbors in C++/Python optimized for memory usage.. Apache-2
NMSLIB (🥈 31 · ⭐ 2.8K) - Non-Metric Space Library (NMSLIB): An efficient similarity search.. Apache-2
hnswlib (🥈 30 · ⭐ 2K) - Header-only C++/python library for fast approximate nearest neighbors. Apache-2
PyNNDescent (🥉 28 · ⭐ 610) - A Python nearest neighbor descent for approximate nearest neighbors. BSD-2
N2 (🥉 18 · ⭐ 510 · 💤 ) - TOROS N2 - lightweight approximate Nearest Neighbor library which runs.. Apache-2
Probabilistics & Statistics
Libraries providing capabilities for probabilistic programming/reasoning, Bayesian inference, Gaussian processes, or statistics.
tensorflow-probability (🥇 37 · ⭐ 3.7K) - Probabilistic reasoning and statistical analysis in.. Apache-2
filterpy (🥈 32 · ⭐ 2.2K · 💤 ) - Python Kalman filtering and optimal estimation library. Implements.. MIT
pomegranate (🥉 28 · ⭐ 2.9K) - Fast, flexible and easy to use probabilistic modelling in Python. MIT
SALib (🥉 28 · ⭐ 590) - Sensitivity Analysis Library in Python. Contains Sobol, Morris, FAST, and.. MIT
Orbit (🥉 26 · ⭐ 1.4K) - A Python package for Bayesian forecasting with object-oriented design.. Apache-2
Baal (🥉 20 · ⭐ 580) - Library to enable Bayesian active learning in your research or labeling.. Apache-2
Show 7 hidden projects...
- pingouin (🥈 29 · ⭐ 1K) - Statistical package in Python based on Pandas. ❗️GPL-3.0
- Edward (🥉 28 · ⭐ 4.7K · 💀 ) - A probabilistic programming language in TensorFlow. Deep.. Apache-2
- PyStan (🥉 25 · ⭐ 180) - PyStan, a Python interface to Stan, a platform for statistical modeling... ISC
- pyhsmm (🥉 21 · ⭐ 510 · 💀 ) - Bayesian inference in HSMMs and HMMs. MIT
- scikit-posthocs (🥉 21 · ⭐ 240) - Multiple Pairwise Comparisons (Post Hoc) Tests in Python. MIT
- Funsor (🥉 20 · ⭐ 190) - Functional tensors for probabilistic programming. Apache-2
- ZhuSuan (🥉 15 · ⭐ 2.1K · 💀 ) - A probabilistic programming library for Bayesian deep learning,.. MIT
Adversarial Robustness
Libraries for testing the robustness of machine learning models against attacks with adversarial/malicious examples.
ART (🥇 34 · ⭐ 3K) - Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning.. MIT
- GitHub (👨💻 97 · 🔀 810 · 📦 210 · 📋 670 - 12% open · ⏱️ 05.05.2022): git clone https://github.com/Trusted-AI/adversarial-robustness-toolbox
- PyPi (📥 7.9K / month · 📦 6 · ⏱️ 22.04.2022): pip install adversarial-robustness-toolbox
- Conda (📥 8.9K · ⏱️ 10.01.2022): conda install -c conda-forge adversarial-robustness-toolbox
Foolbox (🥈 30 · ⭐ 2.2K) - A Python toolbox to create adversarial examples that fool neural networks.. MIT
CleverHans (🥈 29 · ⭐ 5.5K · 💤 ) - An adversarial example library for constructing attacks,.. MIT
TextAttack (🥈 29 · ⭐ 2K) - TextAttack is a Python framework for adversarial attacks, data.. MIT
AdvBox (🥉 19 · ⭐ 1.2K · 💤 ) - Advbox is a toolbox to generate adversarial examples that fool.. Apache-2
robustness (🥉 18 · ⭐ 690) - A library for experimenting with, training and evaluating neural.. MIT
GPU Utilities
Libraries that require and make use of CUDA/GPU system capabilities to optimize data handling and machine learning tasks.
CuPy (🥇 38 · ⭐ 6K) - NumPy & SciPy for GPU. MIT
- GitHub (👨💻 300 · 🔀 590 · 📥 30K · 📦 1K · 📋 1.7K - 22% open · ⏱️ 27.04.2022): git clone https://github.com/cupy/cupy
- PyPi (📥 26K / month · 📦 160 · ⏱️ 27.04.2022): pip install cupy
- Conda (📥 1.3M · ⏱️ 04.05.2022): conda install -c conda-forge cupy
- Docker Hub (📥 55K · ⭐ 7 · ⏱️ 27.04.2022): docker pull cupy/cupy
DALI (🥉 24 · ⭐ 3.8K) - A GPU-accelerated library containing highly optimized building blocks.. Apache-2
- GitHub (👨💻 74 · 🔀 480 · 📋 1.1K - 14% open · ⏱️ 04.05.2022): git clone https://github.com/NVIDIA/DALI
scikit-cuda (🥉 24 · ⭐ 890) - Python interface to GPU-powered libraries. BSD-3
BlazingSQL (🥉 22 · ⭐ 1.7K · 💤 ) - BlazingSQL is a lightweight, GPU accelerated, SQL engine for.. Apache-2
Vulkan Kompute (🥉 20 · ⭐ 850) - General purpose GPU compute framework built on Vulkan to.. Apache-2
cuSignal (🥉 19 · ⭐ 590) - GPU accelerated signal processing. Apache-2
- GitHub (👨💻 38 · 🔀 91 · 📋 130 - 11% open · ⏱️ 22.04.2022): git clone https://github.com/rapidsai/cusignal
Show 5 hidden projects...
- GPUtil (🥉 23 · ⭐ 850 · 💀 ) - A Python module for getting the GPU status from NVIDIA GPUs using.. MIT
- py3nvml (🥉 21 · ⭐ 200) - Python 3 Bindings for NVML library. Get NVIDIA GPU status inside your.. BSD-3
- nvidia-ml-py3 (🥉 20 · ⭐ 77 · 💀 ) - Python 3 Bindings for the NVIDIA Management Library. BSD-3
- SpeedTorch (🥉 15 · ⭐ 650 · 💀 ) - Library for faster pinned CPU - GPU transfer in Pytorch. MIT
- ipyexperiments (🥉 14 · ⭐ 150) - jupyter/ipython experiment containers for GPU and.. Apache-2
Tensorflow Utilities
Libraries that extend TensorFlow with additional capabilities.
TensorFlow Datasets (🥇 35 · ⭐ 3.2K) - TFDS is a collection of datasets ready to use with.. Apache-2
tensorflow-hub (🥇 35 · ⭐ 3.1K) - A library for transfer learning by reusing parts of.. Apache-2
tensor2tensor (🥈 34 · ⭐ 12K) - Library of deep learning models and datasets designed to.. Apache-2
TF Model Optimization (🥈 32 · ⭐ 1.2K) - A toolkit to optimize ML models for deployment for.. Apache-2
TensorFlow Transform (🥈 32 · ⭐ 920) - Input pipeline framework. Apache-2
Keras-Preprocessing (🥉 29 · ⭐ 1K) - Utilities for working with image data, text data, and.. MIT
TensorFlow I/O (🥉 29 · ⭐ 550) - Dataset, streaming, and file system extensions.. Apache-2
efficientnet (🥉 26 · ⭐ 2K · 💤 ) - Implementation of EfficientNet model. Keras and.. Apache-2
Neural Structured Learning (🥉 26 · ⭐ 920) - Training neural models with structured signals. Apache-2
TensorFlow Cloud (🥉 24 · ⭐ 320) - The TensorFlow Cloud repository provides APIs that.. Apache-2
TF Compression (🥉 21 · ⭐ 600) - Data compression in TensorFlow. Apache-2
Show 1 hidden projects...
- TensorNets (🥉 20 · ⭐ 1K · 💀 ) - High level network definitions with pre-trained weights in.. MIT
Jax Utilities
Libraries that extend Jax with additional capabilities.
Show 1 hidden projects...
- jaxdf (🥉 8 · ⭐ 48) - A JAX-based research framework for writing differentiable.. ❗️LGPL-3.0
Sklearn Utilities
Libraries that extend scikit-learn with additional capabilities.
imbalanced-learn (🥇 33 · ⭐ 5.8K) - A Python Package to Tackle the Curse of Imbalanced.. MIT
category_encoders (🥇 33 · ⭐ 1.9K) - A library of sklearn compatible categorical variable.. BSD-3
scikit-opt (🥈 25 · ⭐ 3.2K) - Genetic Algorithm, Particle Swarm Optimization, Simulated.. MIT
fancyimpute (🥈 25 · ⭐ 1.1K · 💤 ) - Multivariate imputation and matrix completion.. Apache-2
scikit-lego (🥈 25 · ⭐ 810) - Extra blocks for scikit-learn pipelines. MIT
sklearn-contrib-lightning (🥉 23 · ⭐ 1.6K) - Large-scale linear classification, regression and.. BSD-3
- GitHub (👨💻 17 · 🔀 200 · 📥 230 · 📦 100 · 📋 93 - 54% open · ⏱️ 30.01.2022): git clone https://github.com/scikit-learn-contrib/lightning
- PyPi (📥 1.8K / month · 📦 6 · ⏱️ 30.01.2022): pip install sklearn-contrib-lightning
- Conda (📥 160K · ⏱️ 13.11.2021): conda install -c conda-forge sklearn-contrib-lightning
iterative-stratification (🥉 22 · ⭐ 650) - scikit-learn cross validators for iterative.. BSD-3
scikit-tda (🥉 17 · ⭐ 340) - Topological Data Analysis for Python. MIT
Show 6 hidden projects...
- sklearn-crfsuite (🥈 26 · ⭐ 400 · 💀 ) - scikit-learn inspired API for CRFsuite. MIT
- scikit-multilearn (🥉 24 · ⭐ 740 · 💀 ) - A scikit-learn based module for multi-label et. al... BSD-2
- skope-rules (🥉 21 · ⭐ 460 · 💀 ) - machine learning with logical rules in Python. ❗️BSD-1-Clause
- celer (🥉 19 · ⭐ 140) - Fast solver for L1-type problems: Lasso, sparse Logistic regression,.. BSD-3
- skggm (🥉 17 · ⭐ 200) - Scikit-learn compatible estimation of general graphical models. MIT
- dabl (🥉 16 · ⭐ 110 · 💤 ) - Data Analysis Baseline Library. BSD-3
Pytorch Utilities
Libraries that extend PyTorch with additional capabilities.
PML (🥇 33 · ⭐ 4.4K) - The easiest way to use deep metric learning in your application. Modular,.. MIT
- GitHub (👨💻 24 · 🔀 540 · 📦 260 · 📋 350 - 13% open · ⏱️ 30.03.2022): git clone https://github.com/KevinMusgrave/pytorch-metric-learning
- PyPi (📥 910K / month · 📦 9 · ⏱️ 02.04.2022): pip install pytorch-metric-learning
- Conda (📥 6.7K · ⏱️ 30.03.2022): conda install -c metric-learning pytorch-metric-learning
accelerate (🥇 32 · ⭐ 2.4K) - A simple way to train and use PyTorch models with multi-.. Apache-2
lightning-flash (🥇 29 · ⭐ 1.5K) - Your PyTorch AI Factory - Flash enables you to easily.. Apache-2
pytorch-summary (🥈 28 · ⭐ 3.5K · 💤 ) - Model summary in PyTorch similar to `model.summary()`.. MIT
pytorch-optimizer (🥈 28 · ⭐ 2.4K) - torch-optimizer -- collection of optimizers for.. Apache-2
torchdiffeq (🥈 25 · ⭐ 4.1K) - Differentiable ODE solvers with full GPU support and.. MIT
torch-scatter (🥈 24 · ⭐ 970) - PyTorch Extension Library of Optimized Scatter Operations. MIT
PyTorch Sparse (🥈 24 · ⭐ 620) - PyTorch Extension Library of Optimized Autograd Sparse.. MIT
SRU (🥈 23 · ⭐ 2K · 💤 ) - Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755). MIT
EfficientNets (🥈 23 · ⭐ 1.5K · 💤 ) - Pretrained EfficientNet, EfficientNet-Lite, MixNet,.. Apache-2
Pytorch Toolbelt (🥈 23 · ⭐ 1.2K) - PyTorch extensions for fast R&D prototyping and Kaggle.. MIT
Performer Pytorch (🥉 21 · ⭐ 820) - An implementation of Performer, a linear attention-based.. MIT
reformer-pytorch (🥉 20 · ⭐ 1.7K · 📉 ) - Reformer, the efficient Transformer, in Pytorch. MIT
Torch-Struct (🥉 19 · ⭐ 1K) - Fast, general, and tested differentiable structured prediction.. MIT
pytorchviz (🥉 18 · ⭐ 2.2K · 💤 ) - A small package to create visualizations of PyTorch execution.. MIT
- GitHub (👨💻 6 · 🔀 220 · 📦 650 · 📋 54 - 35% open · ⏱️ 15.06.2021): git clone https://github.com/szagoruyko/pytorchviz
tinygrad (🥉 17 · ⭐ 6K) - You like pytorch? You like micrograd? You love tinygrad!. MIT
- GitHub (👨💻 58 · 🔀 600 · 📦 2 · 📋 110 - 19% open · ⏱️ 05.04.2022): git clone https://github.com/geohot/tinygrad
Tensor Sensor (🥉 17 · ⭐ 640) - The goal of this library is to generate more helpful.. MIT
Show 8 hidden projects...
- pretrainedmodels (🥇 32 · ⭐ 8.5K · 💀 ) - Pretrained ConvNets for pytorch: NASNet, ResNeXt,.. BSD-3
- EfficientNet-PyTorch (🥈 27 · ⭐ 6.9K · 💀 ) - A PyTorch implementation of EfficientNet and.. Apache-2
- Poutyne (🥉 22 · ⭐ 520) - A simplified framework and utilities for PyTorch. ❗️LGPL-3.0
- AdaBound (🥉 20 · ⭐ 2.9K · 💀 ) - An optimizer that trains as fast as Adam and as good as SGD. Apache-2
- Antialiased CNNs (🥉 20 · ⭐ 1.5K · 💤 ) - pip install antialiased-cnns to improve stability and.. ❗️CC BY-NC-SA 4.0
- Lambda Networks (🥉 17 · ⭐ 1.5K · 💀 ) - Implementation of LambdaNetworks, a new approach to.. MIT
- micrograd (🥉 16 · ⭐ 2K · 💀 ) - A tiny scalar-valued autograd engine and a neural net library.. MIT
- TorchDrift (🥉 13 · ⭐ 210 · 💤 ) - Drift Detection for your PyTorch Models. Apache-2
Database Clients
Libraries for connecting to, operating, and querying databases.
Others
scipy (🥇 49 · ⭐ 9.5K) - Ecosystem of open-source software for mathematics, science, and engineering. BSD-3
PyOD (🥇 35 · ⭐ 5.6K) - A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly.. BSD-2
PennyLane (🥈 30 · ⭐ 1.2K) - PennyLane is a cross-platform Python library for differentiable.. Apache-2
agate (🥈 30 · ⭐ 1.1K · 💤 ) - A Python data analysis library that is optimized for humans instead of.. MIT
pyjanitor (🥈 30 · ⭐ 910 · 📈 ) - Clean APIs for data cleaning. Python implementation of R package.. MIT
adapter-transformers (🥈 30 · ⭐ 790) - Huggingface Transformers + Adapters =. Apache-2 huggingface
causalml (🥈 29 · ⭐ 3K) - Uplift modeling and causal inference with machine learning algorithms. Apache-2
TabPy (🥈 29 · ⭐ 1.2K) - Execute Python code on the fly and display results in Tableau visualizations:. MIT
alibi-detect (🥉 27 · ⭐ 1.3K) - Algorithms for outlier, adversarial and drift detection. Apache-2
metric-learn (🥉 26 · ⭐ 1.2K) - Metric learning algorithms in Python. MIT
avalanche (🥉 26 · ⭐ 890) - Avalanche: an End-to-End Library for Continual Learning based on PyTorch. MIT
Feature Engine (🥉 23 · ⭐ 860 · 💤 ) - Feature engineering package with sklearn like functionality. BSD-3
StreamAlert (🥉 21 · ⭐ 2.7K) - StreamAlert is a serverless, realtime data analysis framework.. Apache-2
- GitHub (👨💻 33 · 🔀 320 · 📋 340 - 24% open · ⏱️ 04.11.2021): git clone https://github.com/airbnb/streamalert
opyrator (🥉 20 · ⭐ 2.6K · 💤 ) - Turns your machine learning code into microservices with web API,.. MIT
SUOD (🥉 19 · ⭐ 320) - (MLSys 21) An Acceleration System for Large-scale Unsupervised Heterogeneous.. BSD-2
apricot (🥉 17 · ⭐ 420) - apricot implements submodular optimization for the purpose of selecting.. MIT
Show 17 hidden projects...
- Cython BLIS (🥈 30 · ⭐ 190) - Fast matrix-multiplication as a self-contained Python library no.. BSD-3
- pysc2 (🥈 28 · ⭐ 7.5K · 💀 ) - StarCraft II Learning Environment. Apache-2
- minisom (🥉 27 · ⭐ 1.1K) - MiniSom is a minimalistic implementation of the Self Organizing.. ❗️CC-BY-3.0
- pyclustering (🥉 27 · ⭐ 950 · 💀 ) - pyclustering is a Python, C++ data mining library. BSD-3
- cleanlab (🥉 26 · ⭐ 3.3K) - The standard data-centric AI package for data quality and machine.. ❗️AGPL-3.0
- modAL (🥉 24 · ⭐ 1.7K · 💀 ) - A modular active learning framework for Python. MIT
- MONAILabel (🥉 23 · ⭐ 230) - MONAI Label is an intelligent open source image labeling and.. Apache-2
- vecstack (🥉 22 · ⭐ 660 · 💀 ) - Python package for stacking (machine learning technique). MIT
- mlens (🥉 21 · ⭐ 730 · 💀 ) - ML-Ensemble high performance ensemble learning. MIT
- scikit-rebate (🥉 19 · ⭐ 360 · 💀 ) - A scikit-learn-compatible Python implementation of.. MIT
- baikal (🥉 18 · ⭐ 590 · 💀 ) - A graph-based functional API for building complex scikit-learn.. BSD-3
- rrcf (🥉 18 · ⭐ 370 · 💀 ) - Implementation of the Robust Random Cut Forest algorithm for anomaly.. MIT
- pandas-ml (🥉 18 · ⭐ 290 · 💀 ) - pandas, scikit-learn, xgboost and seaborn integration. BSD-3
- NeuralCompression (🥉 15 · ⭐ 250) - A collection of tools for neural compression enthusiasts. MIT
- traingenerator (🥉 13 · ⭐ 1.2K · 💀 ) - A web app to generate template code for machine learning. MIT
- nylon (🥉 11 · ⭐ 77 · 💤 ) - An intelligent, flexible grammar of machine learning. MIT
- dstack (🥉 11 · ⭐ 34 · 🐣 ) - dstack: the modern CI/CD made for training models. ❗️GPL-3.0
Related Resources
- Papers With Code: Discover ML papers, code, and evaluation tables.
- Sotabench: Discover & compare open-source ML models.
- Google Dataset Search: Dataset search engine by Google.
- Dataset List: List of the biggest ML datasets from across the web.
- Awesome Public Datasets: A topic-centric list of open datasets.
- Best-of lists: Discover other best-of lists with awesome open-source projects on all kinds of topics.
- best-of-python-dev: A ranked list of awesome python developer tools and libraries.
- best-of-web-python: A ranked list of awesome python libraries for web development.
Contribution
Contributions are encouraged and always welcome! If you would like to add or update projects, choose one of the following ways:
- Open an issue by selecting one of the provided categories from the issue page and filling in the requested information.
- Modify the projects.yaml with your additions or changes, and submit a pull request. This can also be done directly via the GitHub UI.
If you would like to contribute to or share suggestions regarding the project metadata collection or markdown generation, please refer to the best-of-generator repository. If you would like to create your own best-of list, we recommend following this guide.
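As a rough sketch of the second option, a new entry in projects.yaml might look like the fragment below. The field names (`name`, `github_id`, `pypi_id`, `conda_id`, `category`) and the category identifier are assumptions based on best-of-generator conventions and are not confirmed by this README. The project values are taken from the sklearn-contrib-lightning entry listed above. Please check the contribution guidelines for the exact schema before submitting.

```yaml
# Hypothetical projects.yaml entry; field names and the category id are assumptions.
projects:
  - name: sklearn-contrib-lightning
    github_id: scikit-learn-contrib/lightning        # owner/repo on GitHub
    pypi_id: sklearn-contrib-lightning               # package name on PyPI
    conda_id: conda-forge/sklearn-contrib-lightning  # channel/package on Conda (assumed format)
    category: sklearn-utils                          # assumed id for "Sklearn Utilities"
```

Metrics such as stars, downloads, and the quality score are described in the introduction as collected automatically, so an entry like this would presumably only need identifiers, not numbers.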
For more information on how to add or update projects, please read the contribution guidelines. By participating in this project, you agree to abide by its Code of Conduct.