An opinionated Cookiecutter template for ML Python packages, based on the Hypermodern Python Cookiecutter.
It is designed with reproducibility, distribution, and easy data wrangling and exploration in mind. ML repositories vary widely in data structures, training loops, visualizations, and deployment, so the template standardizes on the following tools.
- fiftyone is used to standardize dataset formats. Datasets for the same task often come in different formats; fiftyone addresses this with its many integrations for importing and exporting data, which makes data loading and visualization much easier.
- lightning is used to give machine learning code structure. Its standard control flow is easy to learn, and its many integrations and abstractions make training more efficient and scalable.
- wandb is used to visualize the training process. It supports powerful custom visualizations and experiment comparison.
- Lastly, environment management is one of the biggest issues in ML. Too many machine learning repos ship without a Docker container, which makes cloning and using the project harder. Everything should be run in containers unless there is a specific reason not to.
$ cookiecutter https://github.com/ChickenTarm/cookiecutter-python-ml-project.git
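For scripted setups, the same command can be driven from Python. The following is a minimal sketch, not part of the template itself: the helper names `build_command` and `generate_project` are hypothetical, and it assumes the `cookiecutter` executable is installed and on `PATH`. The `--output-dir` and `--no-input` flags are standard Cookiecutter CLI options.

```python
import shutil
import subprocess

# Template URL taken from the usage line above.
TEMPLATE_URL = "https://github.com/ChickenTarm/cookiecutter-python-ml-project.git"


def build_command(template=TEMPLATE_URL, output_dir=None, no_input=False):
    """Build the cookiecutter argument list (hypothetical helper)."""
    cmd = ["cookiecutter", template]
    if output_dir is not None:
        # Write the generated project into this directory instead of the cwd.
        cmd += ["--output-dir", output_dir]
    if no_input:
        # Skip the interactive prompts and accept the template defaults.
        cmd.append("--no-input")
    return cmd


def generate_project(**kwargs):
    """Run cookiecutter with the given options (hypothetical helper)."""
    if shutil.which("cookiecutter") is None:
        raise RuntimeError("cookiecutter is not installed or not on PATH")
    # check=True raises CalledProcessError if generation fails.
    subprocess.run(build_command(**kwargs), check=True)
```

Calling `generate_project(output_dir="projects")` would then prompt for the template variables and write the generated project under `projects/`; passing `no_input=True` accepts the template defaults instead.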
- Containerization and templated deployment services with Docker
- Data management and visualization with fiftyone and mongodb
- Packaging and dependency management with Poetry
- Test automation with Nox
- Linting with pre-commit and Flake8
- Continuous integration with GitHub Actions
- Documentation with Sphinx, MyST, and Read the Docs using the furo theme
- Automated uploads to PyPI and TestPyPI
- Automated dependency updates with Dependabot
- Code formatting with Black and Prettier
- Import sorting with isort
- Testing with pytest
- Code coverage with Coverage.py
- Coverage reporting with Codecov
- Command-line interface with Click
- Static type-checking with mypy
- Runtime type-checking with Typeguard
- Automated Python syntax upgrades with pyupgrade
- Security audit with Bandit and Safety
- Check documentation examples with xdoctest
- Generate API documentation with autodoc and napoleon
- Generate command-line reference with sphinx-click
The template supports Python 3.7, 3.8, 3.9, and 3.10.