Machine Learning using scikit-learn

NGCM Summer Academy 2017

Instructors

Christopher Fonnesbeck (Vanderbilt University Medical Center)
Skipper Seabold (Civis Analytics)

Outline

Thursday, June 29

09:30 - 10:45 (Chris Fonnesbeck)

Introduction to machine learning with scikit-learn

11:00 - 13:15 (Chris Fonnesbeck)

Unsupervised learning

13:15 - 14:15 Lunch

14:15 - 16:00 (Skipper Seabold)

Supervised Learning

16:15 - 17:30 (Skipper Seabold)

Model selection

Friday, June 30

09:30 - 10:45 (Chris Fonnesbeck)

Supervised Learning

11:00 - 13:15 (Chris Fonnesbeck)

Ensemble Supervised Learning

13:15 - 14:15 Lunch

14:15 - 16:00 (Skipper Seabold)

Advanced topics

16:15 - 17:30 (Skipper Seabold)

Advanced topics

Prerequisites

This is an intermediate-level computing course, so some previous experience with Python is required. Some undergraduate-level statistics is also recommended.

Software Requirements

Python 3.5 or 3.6. We recommend installing the free Anaconda distribution of Python, available from Continuum Analytics.

The following packages should be installed on your system:

  • jupyter
  • ipython>=4.0
  • numpy>=1.10
  • pandas>=0.18
  • scipy
  • matplotlib
  • scikit-learn
  • dask

If you have installed Anaconda, most of these may already be available to you.
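One way to confirm that the packages listed above are present is a short script like the one below. It is only a sketch: the import names (for example sklearn for scikit-learn, and jupyter_core standing in for the jupyter metapackage) and the helper names are assumptions made for illustration, not part of the course materials. The minimum versions are the ones stated in the list.

```python
import importlib
import re

# Import name -> minimum version from the package list (None = no minimum stated).
# Using "jupyter_core" to detect jupyter is an assumption of this sketch.
REQUIRED = {
    "jupyter_core": None,
    "IPython": "4.0",
    "numpy": "1.10",
    "pandas": "0.18",
    "scipy": None,
    "matplotlib": None,
    "sklearn": None,  # scikit-learn is imported as sklearn
    "dask": None,
}


def version_tuple(version):
    """Turn '1.10.4' or '0.18.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that 1.10 counts as newer than 1.9."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_packages(required):
    """Return {import_name: (version_or_None, ok)} for each requirement."""
    report = {}
    for name, minimum in required.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (None, False)
            continue
        version = getattr(module, "__version__", None)
        ok = minimum is None or (
            version is not None and meets_minimum(version, minimum)
        )
        report[name] = (version, ok)
    return report


if __name__ == "__main__":
    for name, (version, ok) in sorted(check_packages(REQUIRED).items()):
        status = "ok" if ok else "MISSING or too old"
        print("{:<14} {:<12} {}".format(name, version or "-", status))
```

Versions are compared as tuples of integers rather than as strings because a plain string comparison would rank 1.9 above 1.10.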

Getting this repository

git clone https://github.com/fonnesbeck/ngcm_sklearn_2017.git

If you are not familiar with Git and GitHub, you can simply download the zip file of the repository at the top of the main repository page.

Then, move to the directory created by the clone/zip file:

cd ngcm_sklearn_2017

and install everything using conda:

conda config --add channels conda-forge
conda env create -f environment.yml

This will create an environment called sklearn that includes the packages required for the course.

If you are not using the Anaconda Python distribution, you will need to manually install the packages listed in environment.yml using pip, which you probably don't want to do. So install Anaconda.

To use the environment, activate it (with conda 4.4 or later, conda activate sklearn is the equivalent command):

source activate sklearn
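To check from inside Python or a notebook that the environment is actually active, a small sketch like the following can help. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets on activation. The function name and the optional environ argument are hypothetical, introduced only for this example.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is set.

    `environ` defaults to os.environ; passing a dict makes the function easy
    to try out without activating anything.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    name = active_conda_env()
    if name == "sklearn":
        print("The sklearn environment is active.")
    else:
        print("Active conda environment: {!r} (expected 'sklearn')".format(name))
```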

ngcm_sklearn_2017's People

Contributors

fonnesbeck, jseabold

ngcm_sklearn_2017's Issues

Finalize examples and hands-on exercises

We need to identify datasets and examples for use in the class. I've added ones that I've used in my graduate class, but they may not be optimal.

A good recurring case study or two might be a good idea.

Add attribution and license

I've taken quite a few examples from Andy and the sklearn user guide. Should note this somewhere and include appropriate licenses.

Add additional course content

Here is an outline with a todo list for various sections. Please edit as needed and update status. They can be checked off if they are mostly there, as I'm sure we will tinker until the course starts.

Thursday morning

  • Introduction to sklearn API
  • Data preprocessing
  • Dimensionality reduction
  • Clustering

Thursday afternoon

  • Support vector machines
  • Decision trees
  • Random forests

Friday morning

  • Linear regression
  • Logistic regression
  • Model selection
  • Non-linear regression
  • Boosting
  • Regression trees

Friday afternoon

  • Pipelining
  • Feature selection
  • Text feature extraction
  • Parallelism with Dask
  • Capstone project
