tm_bootcamp's Introduction

title: Text Mining Bootcamp
place: 12.0.26 at KU Sønder Campus, Denmark
time: August 14-18, 2017, 9 AM to 3 PM
instructors: Peter Leonard (Yale University Library) & Kristoffer L. Nielbo (Interacting Minds Centre)
contact: [email protected]

Preparation

  1. Install the Anaconda distribution of Python for your OS
  2. Read chapters 1-6 of Automate the Boring Stuff with Python

Literature

  • Sweigart, A. (2015). Automate the Boring Stuff with Python: Practical Programming for Total Beginners. San Francisco: No Starch Press.

Schedule

DAY 1: Programming with Python

Time Content Instructor
09:00-09:30 Welcome & Setup KLN
09:30-10:30 Text Analytics KLN
10:30-11:00 Analyzing Tabular Data KLN
11:00-11:30 Repeating Actions with Loops KLN
11:30-12:00 Storing Multiple Values in Lists KLN
12:00-13:00 Lunch *
13:00-13:30 Analyzing Data from Multiple Files KLN
13:30-14:00 Making Choices KLN
14:00-14:30 Creating Functions KLN
14:30-15:00 Finish KLN

DAY 2: From Print to Probability

Time Content Instructor
09:00-09:30 Welcome KLN
09:30-10:00 Reading Unstructured Data KLN
10:00-10:30 Cleaning & Segmentation KLN
10:30-11:00 Free Play KLN
11:00-11:30 Language Normalization KLN
11:30-12:00 Term Frequencies KLN
12:00-13:00 Lunch *
13:00-13:30 Dispersion and Distributions KLN
13:30-14:00 Vector Space Representations KLN
14:00-14:30 Project hour KLN
14:30-15:00 Project hour KLN

DAY 3: Time, Density, and Information

Time Content Instructor
09:00-09:30 Welcome KLN
09:30-10:00 Beyond Words KLN
10:00-10:30 Lexical Density KLN
10:30-11:00 Free Play KLN
11:00-11:30 Readability KLN
11:30-12:00 Information KLN
12:00-13:00 Lunch *
13:00-13:30 Sentiment vectors KLN
13:30-14:00 Sentiment vectors KLN
14:00-14:30 Project hour PL & KLN
14:30-15:00 Project hour PL & KLN

DAY 4: Latent Variables and (Multiple) Relations

Time Content Instructor
09:00-09:30 Welcome PL
09:30-10:00 Network Analysis: Introduction PL
10:00-10:30 Network Analysis: Textual/Literary Examples PL
10:30-11:00 Free Play: Brainstorming Network Projects PL
11:00-11:30 Network Analysis: Building a Dataset PL
11:30-12:00 Network Analysis: Tools - Gephi PL
12:00-13:00 Lunch *
13:00-13:30 Topic Modeling PL
13:30-14:00 Topic Modeling: Hands-On PL
14:00-14:30 Project hour PL
14:30-15:00 Project hour PL

DAY 5: Classification and Associations
topics: classification, document similarity, and word embedding

Time Content Instructor
09:00-09:30 Statistical learning KLN
09:30-10:00 Classification: Introduction KLN
10:00-10:30 Representation KLN
10:30-11:00 Validation KLN
11:00-11:30 Optimization KLN
11:30-12:00 Free Play KLN
12:00-13:00 Lunch *
13:00-13:30 Topic Modeling: Review PL
13:30-14:00 Word Embedding: Demonstrations PL
14:00-14:30 Word Embedding: Hands-On PL
14:30-15:00 Finish *
