ds-masterclass-hands-on

Folders in this repository

  1. session-1 : contains the code for the session on intrusion detection, split into R and Python code folders.
  2. session-2 : contains the code for the session on text classification, split into R and Python code folders.

Public Dropbox folder containing the data, problem descriptions, and data dictionaries:

https://goo.gl/PEug5P

For participants who will be using Python

  1. Anaconda distribution for Python : Go to https://www.continuum.io/downloads and download the latest Anaconda distribution. Please choose the Python 2.7 installer.

  2. Run conda install -c anaconda seaborn

  3. Run conda install -c glemaitre imbalanced-learn

  4. Install the libraries listed below using pip.

Steps to install a library in Python

  1. Open a terminal (or command prompt on Windows).
  2. Run pip install <library name>
  3. For instance, to install numpy, you’d run pip install numpy
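
When several Python installations coexist, a bare pip may belong to a different interpreter than the Anaconda one. The sketch below shows one way to tie the install to the interpreter that is currently running by calling pip as a module. The helper names pip_install_command and pip_install are hypothetical and only illustrate the idea; they are not part of the session code.

```python
from __future__ import print_function

import subprocess
import sys


def pip_install_command(library):
    """Build the pip command for `library`, bound to the running interpreter."""
    # sys.executable is the path of the Python that runs this script,
    # so the package lands in that interpreter's site-packages.
    return [sys.executable, "-m", "pip", "install", library]


def pip_install(library):
    """Run the install; raises subprocess.CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(library))


if __name__ == "__main__":
    # Only prints the command. Call pip_install("numpy") to actually run it.
    print(" ".join(pip_install_command("numpy")))
```

Running the script prints the full command without installing anything, which makes it easy to confirm which interpreter would receive the package.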

List of libraries used in the hands-on session
Session 1 : Intrusion detection

  1. numpy
  2. pandas
  3. matplotlib
  4. seaborn
  5. sklearn (the pip package name is scikit-learn)
  6. imblearn (the package name is imbalanced-learn)
  7. xgboost
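
Once the installs are done, a quick import check can confirm that the Session 1 libraries are usable. The following is a minimal sketch, assuming the import names listed above; the function missing_modules is a hypothetical helper written for this check and is not part of the session code.

```python
from __future__ import print_function

import importlib

# Import names of the Session 1 libraries, as listed above.
SESSION_1_MODULES = [
    "numpy", "pandas", "matplotlib", "seaborn",
    "sklearn", "imblearn", "xgboost",
]


def missing_modules(names):
    """Return the entries of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # Broad on purpose: a broken install can raise errors
            # other than ImportError (for example a missing shared library).
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(SESSION_1_MODULES)
    if missing:
        print("Missing or broken:", ", ".join(missing))
    else:
        print("All Session 1 libraries import correctly.")
```

Any name reported as missing can then be installed with pip or conda as described above.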

Session 2 : News articles recommender

  1. numpy
  2. pandas
  3. sklearn (the pip package name is scikit-learn)
  4. nltk 3.2.4
  5. Install the nltk corpora and models from a Python session:
    > import nltk
    > nltk.download('stopwords')
    > nltk.download('punkt')
    > nltk.download('maxent_ne_chunker')
    > nltk.download('averaged_perceptron_tagger')
    > nltk.download('words')

  6. gensim 0.12.4
    conda install -c anaconda gensim
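
Because Session 2 names specific versions of nltk and gensim, it can help to compare what is installed against those versions. The sketch below assumes each library exposes a __version__ attribute (both nltk and gensim do); the helper names installed_version and version_report are hypothetical and are not part of the session code.

```python
from __future__ import print_function

import importlib

# Versions named in the Session 2 list above.
EXPECTED_VERSIONS = {"nltk": "3.2.4", "gensim": "0.12.4"}


def installed_version(name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(name)
    except Exception:
        return None
    return getattr(module, "__version__", None)


def version_report(expected, lookup=installed_version):
    """Map each library name to a tuple (expected, found, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    report = version_report(EXPECTED_VERSIONS)
    for name in sorted(report):
        wanted, found, ok = report[name]
        status = "ok" if ok else "check"
        print("{0}: expected {1}, found {2} [{3}]".format(name, wanted, found, status))
```

A "check" status only signals a version difference; whether the session code still runs with a different version is not something this sketch can tell.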
    

For participants who will be using R

  1. Set up R : Go to https://cran.rstudio.com/ and download R for your OS. Please download R version 3.4.1 or later.
  2. Set up RStudio : Go to https://www.rstudio.com/products/rstudio/ and download the open source version of RStudio Desktop.
  3. Install the libraries listed below.

Steps to install a library in RStudio

  1. Open RStudio.
  2. In the console, run install.packages("<library name>")
  3. For instance, to install ggplot2, you’d run install.packages("ggplot2")

List of libraries used in the hands-on session
Session 1 : Intrusion detection

  1. ggplot2
  2. randomForest
  3. caret
  4. rpart
  5. plyr
  6. gbm
  7. rpart.plot
  8. reshape2
  9. naivebayes
  10. corrplot
  11. e1071

Session 2 : News articles recommender

  1. tm
  2. topicmodels
  3. lda
  4. MASS
  5. devtools
  6. NLP
  7. R.utils
  8. stringdist
  9. dplyr
  10. openNLP
  11. rJava
  12. RWeka
  13. qdap
  14. magrittr
  15. openNLPmodels.en
  16. data.table
  17. text2vec

Note: If you run into issues with rJava, make sure that both a JDK and a JRE are installed on your system.

For Windows: http://docs.oracle.com/javase/7/docs/webnotes/install/windows/jdk-installation-windows.html

For Linux: https://github.com/hannarud/r-best-practices/wiki/Installing-RJava-(Ubuntu)

If you are not able to set up your machine, please send an email to [email protected]
