
ml_fundamentals_predict_bike_sharing_demand_autogluon's Introduction

Predict Bike Sharing Demand with AutoGluon

Introduction to AWS Machine Learning Final Project

Overview

In this project, students will apply the knowledge and methods they learned in the Introduction to Machine Learning course to compete in a Kaggle competition using the AutoGluon library.

Students will create a Kaggle account if they do not already have one, download the Bike Sharing Demand dataset, and train a model using AutoGluon. They will then submit their initial results for a ranking.
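The first workflow described above could look roughly like the following sketch. It assumes the `kaggle` and `autogluon` packages are installed, that Kaggle API credentials are already configured, and that the competition slug is `bike-sharing-demand` with a `train.csv` containing `datetime`, `count`, `casual`, and `registered` columns. The helper names (`download_data`, `train_baseline`, `clip_negative`, `write_submission`) are hypothetical, and the heavy imports are deferred into the functions so the helpers can be loaded without those libraries.

```python
import csv
import zipfile
from pathlib import Path

COMPETITION = "bike-sharing-demand"  # assumed Kaggle competition slug


def download_data(dest="data"):
    """Download and unzip the competition files (needs Kaggle API credentials)."""
    # Imported lazily: the kaggle package tries to authenticate on import.
    from kaggle.api.kaggle_api_extended import KaggleApi

    api = KaggleApi()
    api.authenticate()
    api.competition_download_files(COMPETITION, path=dest)
    # Assumption: the archive is saved as <dest>/<competition>.zip.
    with zipfile.ZipFile(Path(dest) / f"{COMPETITION}.zip") as archive:
        archive.extractall(dest)


def train_baseline(train_csv="data/train.csv", time_limit=600):
    """Fit an AutoGluon TabularPredictor on the raw training data."""
    import pandas as pd
    from autogluon.tabular import TabularPredictor

    train = pd.read_csv(train_csv, parse_dates=["datetime"])
    # casual and registered are not available at prediction time, so drop them.
    train = train.drop(columns=["casual", "registered"])
    predictor = TabularPredictor(label="count", eval_metric="root_mean_squared_error")
    predictor.fit(train_data=train, time_limit=time_limit, presets="best_quality")
    return predictor


def clip_negative(predictions):
    """Replace negative predictions with 0, since rental counts cannot be negative."""
    return [max(0.0, float(p)) for p in predictions]


def write_submission(datetimes, predictions, out_csv="submission.csv"):
    """Write a two-column submission file with the header datetime,count."""
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["datetime", "count"])
        writer.writerows(zip(datetimes, clip_negative(predictions)))
```

A run might then call `download_data()`, fit with `train_baseline()`, predict on the test set with `predictor.predict(...)`, and pass the test datetimes and predictions to `write_submission` before uploading the file to Kaggle. The time limit and preset are illustrative values, not recommendations.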

After they complete the first workflow, they will iterate on the process by trying to improve their score. This will be accomplished by adding more features to the dataset and tuning some of the hyperparameters available with AutoGluon.
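As one hedged illustration of adding features, the sketch below derives time-based columns from the dataset's `datetime` column with Pandas and marks the `season` and `weather` codes as categorical. The function name `add_datetime_features` is hypothetical, the two sample rows are made up for demonstration, and which features actually help is something to verify against your own Kaggle scores.

```python
import pandas as pd


def add_datetime_features(df):
    """Return a copy of df with hour, dayofweek, month, and year columns."""
    out = df.copy()
    out["datetime"] = pd.to_datetime(out["datetime"])
    out["hour"] = out["datetime"].dt.hour
    out["dayofweek"] = out["datetime"].dt.dayofweek  # Monday=0 ... Sunday=6
    out["month"] = out["datetime"].dt.month
    out["year"] = out["datetime"].dt.year
    # season and weather are integer codes, so treat them as categories.
    for col in ("season", "weather"):
        if col in out.columns:
            out[col] = out[col].astype("category")
    return out


# Made-up rows that mimic the dataset's column layout.
sample = pd.DataFrame(
    {
        "datetime": ["2011-01-01 05:00:00", "2012-07-15 17:00:00"],
        "season": [1, 3],
        "weather": [2, 1],
    }
)
features = add_datetime_features(sample)
print(features[["hour", "dayofweek", "month", "year"]])
```

Applying the same function to both the train and test frames keeps their columns aligned before the new versions are saved and the model is refit.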

Finally, they will submit all their work and write a report detailing which methods provided the best score improvement and why. A template of the report can be found here.

To meet specifications, the project will require at least these files:

  • Jupyter notebook with code run to completion
  • HTML export of the Jupyter notebook
  • Markdown or PDF file of the report

Images or additional files needed to make your notebook or report complete can also be added.

Getting Started

  • Clone this template repository into AWS SageMaker Studio (or a local development environment): git clone git@github.com:udacity/nd009t-c1-intro-to-ml-project-starter.git

(Screenshots: sagemaker-studio-git1.png, sagemaker-studio-git2.png)

  • Proceed with the project within the Jupyter notebook.
  • Visit the Kaggle Bike Sharing Demand Competition page. There you will see the overall details about the competition, including the overview, data, code, discussion, leaderboard, and rules. You will primarily focus on the data and leaderboard sections.

Dependencies

Python 3.7
MXNet 1.8
Pandas >= 1.2.4
AutoGluon 0.2.0 

Installation

For this project, it is highly recommended to use SageMaker Studio from the course-provided AWS workspace. This will simplify much of the installation needed to get started.

For local development, you will need to set up a JupyterLab instance.

  • Follow the Jupyter install link for best practices to install and start a JupyterLab instance.
  • If you already have a Python virtual environment, you can install JupyterLab with pip:
pip install jupyterlab

Project Instructions

  1. Create an account with Kaggle.
  2. Download the Kaggle dataset using the kaggle Python library.
  3. Train a model using AutoGluon’s Tabular Prediction and submit predictions to Kaggle for ranking.
  4. Use Pandas to do some exploratory analysis and create a new feature, saving new versions of the train and test dataset.
  5. Rerun the model and submit the new predictions for ranking.
  6. Tune at least 3 different hyperparameters from AutoGluon and resubmit predictions to rank higher on Kaggle.
  7. Write a report on which changes improved the score (or did not), whether from creating additional features or from tuning hyperparameters, and explain which of the two approaches you think is more worth further time investment.
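For step 6, one possible way to keep tuning runs comparable is to store each set of fit arguments under a name and record what changed relative to the initial run. This is only a sketch: `EXPERIMENTS`, `changed_settings`, and `run_experiment` are hypothetical names, every value shown is illustrative rather than recommended, and the accepted `fit` arguments may differ between AutoGluon versions.

```python
# Named fit settings to compare; all values are illustrative only.
EXPERIMENTS = {
    "initial": {"time_limit": 600, "presets": "best_quality"},
    "hpo": {
        "time_limit": 600,
        "presets": "best_quality",
        "num_bag_folds": 5,
        # Per-model settings: LightGBM ("GBM") and random forest ("RF").
        "hyperparameters": {
            "GBM": {"num_boost_round": 200, "learning_rate": 0.05},
            "RF": {"n_estimators": 300},
        },
    },
}


def changed_settings(baseline, candidate):
    """Return the fit arguments in candidate that differ from baseline."""
    keys = sorted(set(baseline) | set(candidate))
    return {k: candidate.get(k) for k in keys if baseline.get(k) != candidate.get(k)}


def run_experiment(train, name):
    """Fit a predictor with the named settings (requires autogluon)."""
    # Imported lazily so the settings above can be inspected without autogluon.
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label="count", eval_metric="root_mean_squared_error")
    predictor.fit(train_data=train, **EXPERIMENTS[name])
    return predictor


print(changed_settings(EXPERIMENTS["initial"], EXPERIMENTS["hpo"]))
```

Listing the changed settings next to each Kaggle score gives the report in step 7 a direct record of which adjustment preceded which result.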

License

Contributors

  • kunal-garg-12
