data-science-tutorials

This repository contains Jupyter notebooks that accompany the tutorials for our data science lectures. The following topics are covered (each in a separate folder).

  1. Dataset Visualization (Boston Housing without the linear regression; also other datasets such as Flower, MNIST digits, 20newsgroups): working with and visualizing one dataset (incl. Matplotlib; the .describe method; box plots; min-max normalization; Boston Housing; linear regression c/o dsP)
  2. Clustering
  3. Association Rule Learning (dataset yet to be determined; preferably from scikit-learn)
  4. Regression (linear regression from Boston Housing and Car Prices)
  5. Bayes Learning (for spam filtering/text classification)
  6. Classification with Decision Trees (start with small 5-line dataset)
  7. Neural Networks (use keras.io to build a neural network for MNIST digit classification); optionally gensim (for word2vec; pick a dataset from TensorFlow); then an auto-encoder for representation learning
  8. OPTIONAL MapReduce
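
Topic 1 mentions min-max normalization. As a rough illustration, here is a minimal sketch in plain Python. The function name `min_max_normalize` and the sample values are assumptions made for this example and are not taken from the notebooks, which may well use pandas or scikit-learn instead.

```python
def min_max_normalize(values):
    """Rescale a sequence of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0
        # (one possible convention, chosen here to avoid division by zero).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column (made-up numbers, not from any dataset)
rooms = [4.0, 6.0, 5.0, 8.0]
print(min_max_normalize(rooms))  # [0.0, 0.5, 0.25, 1.0]
```

After rescaling, the smallest value maps to 0 and the largest to 1, so features measured on different scales become directly comparable in plots and distance-based methods.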

Packages

See our python-tutorials for instructions on how to set this up on your machine.

required

optional

  • Pandas; [documentation] also as pdf

Table of contents

  • 0-Intro

    • Scikit-learn-overview.ipynb
    • Web Mining Project .ipynb
  • 1-Datasets_Visualization_and_preprocessing

    • 1-IRIS.ipynb
    • 2-Boston_house_dataset.ipynb
    • 3-MNIST.ipynb
    • 4-UCI_CAR.ipynb
    • 5-20newsgroups.ipynb
    • 6-KDD_cup_2000_data_set.ipynb
    • Crawling_twitter_with_python.ipynb
    • MDS_projection.ipynb (IRIS)
    • PCA_projection.ipynb (IRIS)
    • scikit-learn-overview-and-preprocessing.ipynb (IRIS)
    • VA-InformationVisualisation-with-JavaScript-and-3DJs.ipynb
    • TODO try visualization with Orange (available through the conda-forge channel)
  • 2-Clustering

    • Clustering_overview.ipynb (IRIS) (MNIST)
    • Tutorial_clustering_for_outlier_detection_3D.ipynb (Kddcup 1999)
    • Tutorial_clustering_for_outlier_detection.ipynb (Kddcup 1999)
  • 3-Association-Rules

    • Apriori_asaini.ipynb (MBE_dataset)
    • Apriori.ipynb (Boston house)
    • Apriori_server.ipynb (Mango_dataset)
    • Assignment_Association_rule_learning.ipynb
    • Tutorial_association_rule_learning_shopping_basket.ipynb (KDDcup 2000)
  • 4-Linear_regression_and_logistic_regression

    • Assignment_Linear_Regression.ipynb
    • Assignment_Logistic_regression.ipynb (UCI_car)
    • Boston_house_Linear_Regression.ipynb (Boston house)
    • Linear_regression_diabetes_dataset.ipynb
    • Linear-Regression.ipynb (Boston house)
    • Logistic_regression.ipynb (IRIS)
    • Small_scale_linear_regression.ipynb (KDDcup)
    • Supervised_Learning_with_Linear_Models.ipynb (Boston house)
  • 5-KNN_classification

    • KNN_classification.ipynb (IRIS)
    • Metrics.ipynb (IRIS)
  • 6-Bayes-Learning

  • 7-Decision-Trees.ipynb (UCI_car)

  • 8-Neural-Networks

    • keras-mnist.ipynb (MNIST)
    • Simple-NN.ipynb (make_moons)
    • Stacked-Denoising-Autoencoders.ipynb
    • INFO Software Comparison
      • keras.io (high-level, running on top of TensorFlow (default) or Theano) c/o François Chollet (written in Python)
      • Theano c/o Université de Montréal (written in Python; tightly integrated with NumPy)
      • TensorFlow c/o Google Brain (written in Python/C++)
  • 9-SVM

    • Assignment_SVM_for_OCR.ipynb (MNIST)
    • Support_Vector_Machines.ipynb (IRIS)
  • A-Advanced_modules

    • NLP-with-NLTK-Short-Intro.ipynb
  • B-Scripts

Links

Cheat Sheets

Other Collections

Module Specific

(should be listed at the module)

Contributors

nadimmajed, zieglerk
