This GitHub repository provides the Jupyter notebooks for the lab sessions of the VU Language-As-Data course, which is part of the Text Mining master's programme of the Faculty of Humanities.
The course is split into five labs:
- How to get texts from various online sources
- How to process texts with toolkits to obtain the linguistic properties of text data sets
- How to get the entities from texts as content
- How to get the events from texts as content
- How to get the perspectives of sources mentioned in the text on what they claim, state, believe, and feel
There are three assignments in which students build their own data sets of texts from two different sources, process these texts, and extract the content and the perspectives. Besides writing the code, they need to critically assess the output and the systems that produced it. They should also assess the differences between the two sources used to obtain the data.
These notebooks were created by Piek Vossen, Vrije Universiteit Amsterdam. (c)