A comparative study on the use and applicability of automated machine learning pipelines using scikit-learn and Keras.
This project was built using Anaconda Distribution 4.5.10 with Python 3.6.5.
Run pip install -r requirements.txt to install the correct version of every required package (a virtual environment is recommended).
Ensemble Model
The ensemble pipeline model is configured and trained in the runEnsembleBaseline.py file. Edit the configurations in that file, then run it with python runEnsembleBaseline.py.
Deep Learning Model
The deep learning pipeline model is configured and trained in the runDeepLearningCPU.py and runDeepLearningGPU.py files. Edit the configurations in the relevant file, then run python runDeepLearningCPU.py for a CPU run or python runDeepLearningGPU.py for a GPU run.
- Input data must be within the data/input/ directory. Download the data from the Kaggle website (https://www.kaggle.com/c/allstate-claims-severity/data).
- Predictions are written to the data/predictions/ directory.
- Models, driver rankings and other helpful files are stored in the models/ensemble/ and models/deep_learning/ directories.
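
The directory layout above can be checked before training with a short script. This is only a sketch: the helper prepare_directories is hypothetical and not part of this project, and it assumes the run scripts expect the output directories to exist already, which the project may handle on its own.

```python
from pathlib import Path

# Directory names come from the list above. The helper itself is a
# hypothetical convenience and is not part of the project's code.
INPUT_DIR = Path("data/input")
OUTPUT_DIRS = [
    Path("data/predictions"),
    Path("models/ensemble"),
    Path("models/deep_learning"),
]


def prepare_directories(root="."):
    """Create the output directories and return the file names in data/input/."""
    root = Path(root)
    input_dir = root / INPUT_DIR
    if not input_dir.is_dir():
        raise FileNotFoundError(
            f"{input_dir} does not exist; download the Kaggle data into it first."
        )
    for out_dir in OUTPUT_DIRS:
        # exist_ok avoids an error when the directory is already present.
        (root / out_dir).mkdir(parents=True, exist_ok=True)
    # Return the input file names so the caller can confirm the download.
    return sorted(p.name for p in input_dir.iterdir() if p.is_file())
```

Calling prepare_directories() from the repository root returns the names of the files found in data/input/, or raises FileNotFoundError if the Kaggle data has not been downloaded yet.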