Scikit-learn package template

This repository provides a package for training a scikit-learn model (pipeline) and using it to generate predictions. It contains:

  • A machine learning package: scikit-learn-template/model_package
  • A CLI to train the model and generate predictions from the command line
  • Example scripts to train the model and generate predictions

Installation

This project requires Python 3.9.

To install the required packages, go to the scikit-learn-template directory and run:

poetry install

Usage

For all of the following, make sure you are in the repository root (sklearn-package-template), which contains the demo scripts and the scikit-learn-template directory.

Training

Demo script

Run the following in a terminal:

./demo-train.sh  

CLI

Input arguments:

--train             Path to the CSV file for training.
--label_column      Name of the label column. Default is 'label'.
--user_id_column    Name of the user id column. Default is 'UserId'. This column will not be used in training.
--model_path        Path to store the model or to load it from.
--evaluation_folds  Number of folds to use for evaluation. If not provided, no evaluation is performed.
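
The flags listed above could be declared with argparse roughly as follows. This is a minimal sketch, not the project's actual __main__.py: the build_parser name and the help strings are assumptions, while the flag names and defaults are taken from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented training flags."""
    parser = argparse.ArgumentParser(prog="scikit-learn-template")
    parser.add_argument("--train", help="Path to the CSV file for training")
    parser.add_argument("--label_column", default="label")
    parser.add_argument("--user_id_column", default="UserId")
    parser.add_argument("--model_path", help="Where to store or load the model")
    # No default value: when the flag is omitted, evaluation is skipped.
    parser.add_argument("--evaluation_folds", type=int, default=None)
    return parser


args = build_parser().parse_args(
    ["--train", "your_file.csv", "--model_path", "saved_model.joblib",
     "--evaluation_folds", "4"]
)
print(args.label_column, args.evaluation_folds)
```

Leaving --evaluation_folds without a default lets the caller distinguish "no evaluation requested" (None) from an explicit fold count.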

Example:

python scikit-learn-template --train your_file.csv --model_path saved_model.joblib --evaluation_folds 4
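
To give an idea of what an evaluation with 4 folds may look like, here is a minimal sketch using scikit-learn's cross_val_score. The pipeline steps and the synthetic data are assumptions made for illustration; the actual pipeline is defined in model_package/model.py and may evaluate differently.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the features and labels read from the CSV file.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Placeholder pipeline (assumption); the real one lives in model_package/model.py.
pipeline = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# One score per fold, matching --evaluation_folds 4.
scores = cross_val_score(pipeline, X, y, cv=4)
print(len(scores), scores.mean())
```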

Predictions

Demo script

Run the following in a terminal:

./demo-predict.sh  

CLI

Input arguments:

--predict           Path to the CSV file for prediction.
--user_id_column    Name of the user id column. Default is 'UserId'. This column is not used as a feature.
--model_path        Path to store the model or to load it from.
--predictions_path  Path to the CSV file where the predictions are written.
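
Since the user id column is not fed to the model, it has to be set aside before the remaining columns are used as features. The sketch below shows one way to do this with the standard csv module. The split_columns helper and the f1 and f2 column names are hypothetical and not part of the package.

```python
import csv
import io


def split_columns(rows, user_id_column="UserId", label_column=None):
    """Separate ids (and optionally labels) from the feature columns."""
    ids, labels, features = [], [], []
    for row in rows:
        row = dict(row)  # copy so the caller's row is left untouched
        ids.append(row.pop(user_id_column))
        if label_column is not None:
            labels.append(row.pop(label_column))
        features.append(row)
    return ids, labels, features


# Tiny in-memory CSV; the column names f1 and f2 are made up.
sample = io.StringIO("UserId,f1,f2\nu1,0.5,1.0\nu2,0.1,0.3\n")
ids, _, features = split_columns(csv.DictReader(sample))
print(ids)       # ['u1', 'u2']
print(features)  # [{'f1': '0.5', 'f2': '1.0'}, {'f1': '0.1', 'f2': '0.3'}]
```

Keeping the ids alongside the features makes it possible to attach each prediction back to its user afterwards.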

You can also train a model and generate predictions in a single command.

Example:

python scikit-learn-template --train your_file.csv --predict your_file_no_label.csv --predictions_path predictions.csv --evaluation_folds 4

Code structure

sklearn-package-template
│   README.md: this file documenting the project
│   demo-train.sh: runnable demo shell script showing how to use the CLI to train and save a model
│   demo-predict.sh: demo shell script to generate predictions
│
└───scikit-learn-template: contains the model package and the CLI
│   │   __main__.py: command line interface (CLI)
│   │   api_exceptions.py: error classes for the CLI
│   │   poetry.lock: Poetry lock file pinning the dependency versions
│   │   pyproject.toml: Poetry configuration file
│   │
│   └───model_package: model package
│   │   │   __init__.py: exposes main classes, methods and variables
│   │   │   load_model.py: contains a function to reload saved models
│   │   │   model.py: model definition
│   │   │   version.py: contains the __version__ variable.
│   │
│   └───tests: tests for the model package (mirrors its structure)
│   │   │   test_load_model.py: tests model loading
│   │   │   test_model.py: tests model building
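
The .joblib extension used in the examples suggests that models are persisted with joblib, although this README does not state it. Under that assumption, saving and reloading a fitted pipeline could look like the sketch below. The pipeline steps and the toy data are placeholders, not the contents of model.py or load_model.py.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data: one feature, two classes.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# Placeholder pipeline (assumption); the real definition is in model.py.
pipeline = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])
pipeline.fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "saved_model.joblib")
    joblib.dump(pipeline, model_path)   # what saving to --model_path might do
    reloaded = joblib.load(model_path)  # what load_model.py might wrap

# The reloaded pipeline should predict exactly like the original one.
same_predictions = bool((reloaded.predict(X) == pipeline.predict(X)).all())
print(same_predictions)
```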

Running the tests

Test coverage for model_package is 100% of files and 100% of lines. The API, however, is untested.

To run the tests, go to the scikit-learn-template directory and run:

export PYTHONPATH=$(pwd):${PYTHONPATH}  # Make sure it finds the package
pytest tests

Contributors

gtregoat
