Instructor: | Alessandro Gagliardi |
EiRs: | Ramesh Sampath, Otto Stegmaier, Alex Chao |
Classes: | 6:30pm-9:30pm, Tuesdays and Thursdays, January 15 – March 31 |
Office Hours: | Alex Chao, 5:30 - 6:30 before class at GA; Otto Stegmaier, 9:30 - 10:00 after class at GA; Ramesh Sampath, 4:00 - 6:00 Saturdays, remote; can also be set by appointment |
Submit homework by posting it to your own GitHub repo, then post the URL and folder where the homework lives here.
- Intro to Data Science, Relational Databases & SQL
- Getting started with IPython & Git
- APIs and semi-structured data
- IPython.parallel & StarCluster
- Hadoop Distributed File System and Spark
- Intro to ML: k-Nearest Neighbor Classification
  - Nearest Neighbor Methods (PDF Slides)
  - k-Nearest Neighbor Classification Algorithm (YouTube Video)
  - K Nearest Neighbors (Coursera Video)
  - KNN for Humans
  - Intuitive Classification using KNN and Python
  - Nearest Neighbors Classification (scikit-learn documentation)
  - Should I normalize/standardize/rescale the data?
- Clustering: Hierarchical and K-Means
  - A Tutorial on Clustering Algorithms (web tutorial)
  - Hierarchical Clustering in Action
  - K-means (scikit-learn documentation)
  - Clustering: k-means (PDF Slides)
  - Clustering: Hierarchical Clustering (PDF Slides)
- Probability, A/B Tests & Statistical Significance
  - Probability and Statistics (Khan Academy Course)
  - What’s a good value for R-squared?
  - Visualizing Distributions of Data
- Multiple Linear Regression and ANOVA
- Logistic Regression and Generalized Linear Models
- Project Elevator Pitches
  - See Student Project Repos below
- Naïve Bayes, Cross Validation, ROC, AUC & Midterm Review - Part I
  - Bayes' Theorem with Lego
  - Probabilistic Programming and Bayesian Methods for Hackers
  - Doing Naive Bayes Classification
  - Receiver operating characteristic (Wikipedia article)
  - Receiver Operating Characteristic (ROC) (scikit-learn documentation)
- Naïve Bayes, Cross Validation, ROC, AUC - Part II
- Principal Components Analysis
- Decision Trees and Forests
  - Decision Tree Learning (Wikipedia article)
  - Decision Trees
  - How to construct a tree
  - Information gain
- Support Vector Machines
  - Support Vector Machines (scikit-learn documentation)
  - A User's Guide to Support Vector Machines
- Scaling Out
- Recommendation Systems
- Visualization
- Final Project Presentations (12 min. each)
- Final Project Presentations (12 min. each)
- Future Directions
Date | Due | Returned |
---|---|---|
1/22 | Preliminary Project Proposals Due (3-4 sentences) | |
1/27 | Homework 1 | |
1/29 | EiR Feedback on Project Proposals | |
2/3 | EiR Feedback on Homework 1 | |
2/5 | Formal Proposals (including data and methods chosen) | |
2/10 | Homework 2 Assigned | |
2/12 | EiR Feedback on Formal Proposals | |
2/17 | Homework 2 Due | |
2/19 | Project Elevator Pitch in class (4 minutes each) | Project Live on GitHub |
2/24 | Homework 3 Assigned | |
2/26 | Peer Feedback of Projects | Peer Feedback on Project |
3/3 | Midterm Assessment Posted | |
3/12 | Midterm Assessment Due | |
3/17 | At least one working model | |
3/24-26 | Final Presentations (12 minutes each) | Midterm Graded |
Student | Repo |
---|---|
Ajay Anand | sryballin/GeneralAssembly-DS |
Zachary Cousens | zfcousens/DAT_SF_12/tree/gh-pages/Project |
Carmen Diaz Echauri | cde/? |
Deepthi Duddempudi | DeepthiGA/Project |
Vijay Duraipalam | coolcalguy/DAT_SF_12/tree/gh-pages/Project |
Cheong-tseng Eng | ctteng/GA-Proj-GPSAnomalyDetection.git |
David Feng | selwyth/neighborhood |
Isabel Friedman | isabitz/whales |
Dave Halvorson | git-halvorson/DAT_SF_12/tree/gh-pages/FinalProject |
Alison Harmon | alharmon13/DAT_SF_12/tree/gh-pages/project |
Markus Huber | mbhuber/USconsumers |
Ryan Hughes | cryhughes/AVS-Kaggle |
Tania Ibanez | positiveepsilon/GA_Project |
Roxana Ordonez | rockyroxana/bike-share-forecast.git |
Justin Peterson | justinrpeterson/? |
April Song | khsong92/ga_ds |
India Swearingen | iswearingen/DAT_SF_12/blob/gh-pages/Homework/Project-IS-load-data.ipynb |
Bing Wang | bingbingboo/DAT_SF_12/blob/gh-pages/Homework/2014flightdatalab.ipynb |
Jaime Williams | jawilliams3000/OaklandCrime |
David Yerrington | dyerrington/Rapstats |
Matt Jones | jonesmatt415/NCAA-Prediction-Project- |