uclnlp / stat-nlp-book

Interactive Lecture Notes, Slides and Exercises for Statistical NLP

Home Page: http://uclmr.github.io/stat-nlp-book

Jupyter Notebook 55.31% Python 0.56% Shell 0.01% CSS 0.01% TeX 0.01% HTML 42.76% JavaScript 1.35% Dockerfile 0.01%

stat-nlp-book's Introduction

The Stat-NLP-Book Project

Render Book Statically

The easiest option for reading the book is via the static nbviewer. While this does not allow you to change and execute code, it also doesn't require you to install software locally and only needs a browser.

Docker installation

We assume you have a command line interface (CLI) in your OS (bash, zsh, cygwin, git-bash, power-shell etc.). We assume this CLI expands $(pwd) to the current directory. If it doesn't, replace all mentions of $(pwd) with the current directory you are in.

When using Windows PowerShell, all instances of "$(pwd)" should be replaced with ${PWD}.
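
To see what your shell actually hands to docker, the following minimal sketch prints the volume argument used throughout this README. The function name book_mount is hypothetical and not part of the repository, and the sketch assumes a POSIX-style shell (bash, zsh, git-bash).

```shell
# book_mount is a hypothetical helper, not part of stat-nlp-book.
# It prints the volume mapping used by the docker commands in this README,
# with $(pwd) already expanded by the shell.
book_mount() {
    printf '%s:/home/jovyan/work\n' "$(pwd)"
}

# Show what docker would receive after -v.
book_mount
```

If the printed value starts with the literal text $(pwd) or with an empty path, substitute the directory by hand as described above.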

Install Docker

Go to the docker webpage and follow the instructions for your platform.

Download Stat-NLP-Book Image

Next you can download the stat-nlp-book docker image like so:

docker pull riedelcastro/stat-nlp-book

Get Stat-NLP-Book Repository

You can use the git installation in the docker container to get the repository:

docker run -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book git clone https://github.com/uclmr/stat-nlp-book.git

Note: this will create a new stat-nlp-book directory in your current directory.

Change into Stat-NLP-Book directory

We assume from here on that you are in the top level stat-nlp-book directory:

cd stat-nlp-book

Note: you need to be in the stat-nlp-book directory every time you want to run/update the book.

Run Notebook

docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book

You are now ready to visit the overview page of the installed book.

Usage

Once installed you can always run your notebook server by first changing into your local stat-nlp-book directory, and then executing:

docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book

This is assuming that your docker daemon is running and that you are in the stat-nlp-book directory. How to run the docker daemon depends on your system.
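
Both assumptions can be checked before launching. The sketch below is one possible pre-flight wrapper; the function names in_book_dir, docker_ready and run_book are hypothetical, and it relies on docker info returning a non-zero status when no daemon can be reached.

```shell
# Hypothetical pre-flight helpers; the names are illustrative only.

# Succeeds when the current directory is named stat-nlp-book.
in_book_dir() {
    [ "$(basename "$(pwd)")" = "stat-nlp-book" ]
}

# Succeeds when the docker client can reach a running daemon.
docker_ready() {
    docker info >/dev/null 2>&1
}

# Run both checks, then start the notebook server with the command shown above.
run_book() {
    in_book_dir || { echo "not in the stat-nlp-book directory" >&2; return 1; }
    docker_ready || { echo "docker daemon is not reachable" >&2; return 1; }
    docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book
}
```

After defining these functions in your shell, calling run_book from the stat-nlp-book directory starts the server, and calling it anywhere else stops with a message instead of mounting the wrong directory.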

Update the notebook

We frequently make changes to the book. Before fetching them, discard your local changes to avoid merge conflicts: you may have modified files that we also changed, either by editing the code or simply by running it, and git will then complain when you update. To undo all your changes, execute:

docker run -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book git checkout -- .

If you want to keep your changes, create copies of the changed files. Jupyter has a "Make a copy" option in the "File" menu for this. You can also create a clone of this repository to keep your own changes and merge our changes in a more controlled manner.
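
Copies can also be made from the command line. The sketch below is a minimal example; keep_copy and the "-mine" suffix are hypothetical choices, and it assumes the file name has an extension such as .ipynb.

```shell
# keep_copy is a hypothetical helper, not part of the repository.
# It copies FILE to a sibling named <stem>-mine.<ext> and prints the new path.
# Assumes the file name contains an extension (for example .ipynb).
keep_copy() {
    src="$1"
    dir="$(dirname "$src")"
    base="$(basename "$src")"
    stem="${base%.*}"      # name without the last extension
    ext="${base##*.}"      # last extension
    dest="$dir/$stem-mine.$ext"
    cp -- "$src" "$dest" && printf '%s\n' "$dest"
}
```

The copy is a new, untracked file, so undoing changes to tracked files with git checkout leaves it in place.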

Then, to get the actual updates, run:

docker run -v "$(pwd)":/home/jovyan/work riedelcastro/stat-nlp-book git pull

Access Content

The repository contains a lot of material, some of which may not be ready for consumption yet. This is why you should always access content through the top-level overview page (local-link).

virtualenv installation [BETA]

Install virtualenv

Follow the instructions here. In short:

pip3 install virtualenv

git clone the stat-nlp-book repository

git clone https://github.com/uclmr/stat-nlp-book.git

Create virtual environment

Enter the cloned stat-nlp-book directory:

cd stat-nlp-book

and create the virtual environment:

virtualenv -p python3 venv

Enter the virtual environment

source venv/bin/activate
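
To confirm that the environment is active before installing anything, you can inspect the VIRTUAL_ENV variable, which the activate script exports. The helper name in_venv in this sketch is hypothetical.

```shell
# in_venv is a hypothetical helper. The activate script exports
# VIRTUAL_ENV, so a non-empty value indicates an active environment.
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "virtual environment active: $VIRTUAL_ENV"
else
    echo "no virtual environment active"
fi
```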

Install dependencies

pip3 install --upgrade pip
pip3 install -r requirements.txt
pip3 install git+https://github.com/robjstan/tikzmagic.git
jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix

Run the notebook

jupyter notebook

Installation on the UCL CS cluster

Install virtualenv

When installing virtualenv (full instructions here) on the CS cluster you will likely have to install it with the --user flag. In short:

pip3 install --user virtualenv

At this point the virtualenv command may not yet be found on your path. You can solve this by finding its location via

pip3 show virtualenv

then appending the LOCATION shown (a directory name) to your $PATH variable using

export PATH=$PATH:LOCATION

and giving permission to execute via

chmod u=rwx LOCATION/virtualenv.py

You should then be able to run virtualenv.py. You can check this by running

virtualenv.py --version
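
The lookup and the PATH update above can be combined. The following is a sketch under the assumption that pip3 show prints a line of the form "Location: DIRECTORY"; the helper names show_location and path_append are hypothetical.

```shell
# Hypothetical helpers, not part of the repository.

# Read `pip3 show` output on stdin and print the value of the Location field.
show_location() {
    sed -n 's/^Location: //p'
}

# Append a directory to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

# Intended use on the cluster:
#   path_append "$(pip3 show virtualenv | show_location)"
```

Checking for an existing entry keeps PATH from growing each time the lines are run again, for example from a shell startup file.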

git clone the stat-nlp-book repository

Now we're ready to clone the notebook:

git clone https://github.com/uclmr/stat-nlp-book.git

Create virtual environment

Enter the cloned stat-nlp-book directory via

cd stat-nlp-book

and create the virtual environment:

virtualenv.py -p python3 venv

Enter the virtual environment

source venv/bin/activate

Install dependencies

pip3 install --upgrade pip
pip3 install -r requirements.txt
pip3 install git+https://github.com/robjstan/tikzmagic.git
jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix

Run the notebook

jupyter notebook

Access in local browser

With the notebook running on the UCL CS cluster, you can also access it from your local machine by first setting up an SSH tunnel

# run this on your local machine
ssh -N -f -L localhost:8157:localhost:8888 username@cs_cluster

and accessing it through your local browser by entering

localhost:8157

into the browser address bar.
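
The two port numbers in the tunnel command are independent: the first is the port on your local machine, the second is the port the notebook listens on at the cluster. The sketch below only prints the command so that the pieces can be varied; tunnel_cmd is a hypothetical helper name.

```shell
# tunnel_cmd is a hypothetical helper that only prints the ssh command.
# Arguments: local port, remote port, login (user@host).
#   -N  do not run a remote command
#   -f  go to the background after authentication
#   -L  forward the local port to host:port as seen from the remote side
tunnel_cmd() {
    printf 'ssh -N -f -L localhost:%s:localhost:%s %s\n' "$1" "$2" "$3"
}

# Reproduces the command from the text above.
tunnel_cmd 8157 8888 username@cs_cluster
```

If Jupyter reports a different port on the cluster because 8888 was already taken, pass that port as the second argument and keep using localhost:8157 in your browser.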


stat-nlp-book's Issues

when using mpld3, make sure that d3 libraries can be loaded offline

Right now it seems to require a connection.

Fix "nbformat.current is deprecated" warning

Caused by util.execute_notebook

Add "theory" notebooks that explain and derive concepts a little more generally

MLE
EM Algorithm

Fix relation extraction material

Some code missing, update the dataset if required.

Introduce proper Tree class for parsing instead of using tuples

Unable to sample from interpolated N-gram models.

When trying to sample from an interpolated N-gram model, there is an error saying that the probabilities do not sum to one. This is despite the fact that the normalisation tests sum to 1 and the model gives a valid perplexity. Here's a simple model that demonstrates this issue.

get mathjax latex references to equations to work in word_mt notebook

Student Exercise Environment

When students develop their own code (say, for an exercise or the assignment), they have several options:

clone the repo, set up the virtual env, develop in an IDE
load a docker container with the notebook, and then edit everything within the notebook (and hence within the container)
load docker, edit files within the docker container (using command line editors)
load the docker container, mount a local directory with their code (and/or our code), and edit locally but run in the docker container
etc.

We should decide on a preferred mode, and then only support that mode.

Find a way to share latex macros across notebooks

Currently they are repeated in each notebook, defined in a markdown cell.

Problem of update.

Dear developer,

I have run into a problem when updating; could you help me figure it out? I am currently an MSc student in Web Science and Big Data. I have attached the error message.
Thank you so much

yaowangyideMacBook-Pro:stat-nlp-book yaowangyi$ docker run -v $PWD:/home/jovyan/work riedelcastro/stat-nlp-book git pull
error: The following untracked working tree files would be overwritten by merge:
data/ohhla/dev/www.ohhla.com/YFA_common.html
data/ohhla/dev/www.ohhla.com/anonymous/b_rhymes/backonmy/decision.brm.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/beatnuts/massacre/slam_pit.btn.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/big_sean/detroit/common.bsn.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/big_sean/hallfame/switchup.bsn.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/blackstr/blackstr/respire.blk.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/brnubian/found/maybeone.brn.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/be.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/chi_city.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/corner.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/faithful.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/foodlive.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/go.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/its_your.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/love_is.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/r_people.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/testify.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/be/they_say.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/aquarius.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/between.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/close2me.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/electric.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/gotright.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/heaven.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/hustle.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/iammusic.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/new_wave.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/sl_power.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/circus/star_69.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/a_penny.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/blows_to.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/breaker.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/by_pound.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/charms.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/heidihoe.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/justnick.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/pitchin.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/puppy.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/takeitez.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/tricks.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dollar/twoscoop.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/believer.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/blue_sky.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/celebr8.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/cloth.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/dreamer.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/g_dreams.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/gd_remix.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/gold.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/lovinlst.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/pops_be.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/raw_howu.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/sweet.cms.txt.html
data/ohhla/dev/www.ohhla.com/anonymous/common/dreamer/thebermx.cms.txt.html
data/
Aborting
Updating 8993d56..c52edc8
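
The error above concerns untracked files, which git checkout -- . does not touch. One possible way past it, sketched here and not taken from the project's own instructions, is to move the untracked files out of the repository before pulling. The helper name move_untracked and the backup location are hypothetical, and the sketch assumes git is available in the shell where you run it.

```shell
# move_untracked is a hypothetical helper, not part of the repository.
# It moves every untracked file below DIR into BACKUP, preserving the
# relative paths, so that a following `git pull` no longer finds them.
# Assumes git is available and that the paths contain no newlines.
move_untracked() {
    dir="$1"
    backup="$2"
    git ls-files --others --exclude-standard -- "$dir" |
    while IFS= read -r f; do
        mkdir -p "$backup/$(dirname "$f")"
        mv -- "$f" "$backup/$f"
    done
}

# Intended use from the top-level stat-nlp-book directory:
#   move_untracked data/ohhla ../ohhla-backup
```

Moving the files instead of deleting them keeps a copy in case any of them turn out to be needed; the backup directory should lie outside the repository so it does not show up as untracked itself.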
