Run a pretrained ELMo model to compute the perplexity of individual sentences.
Modified from AllenAI's bilm-tf.
pip install tensorflow-gpu==1.2 h5py
python setup.py install
- Data file format: each line of the file is one sentence whose perplexity will be calculated (see demo).
- Split the data file into pieces, one sentence per piece.
split sents.txt -d -l 1 -a 4 cs
- Run the evaluation script.
sh evaluate.sh
- The perplexity scores are printed to stdout.
...
5946: 129.57085
5947: 1412.2032
5948: 5172.711
5949: 2126.5542
...
Each line shows the sentence line number followed by its perplexity (not normalized by sentence length).
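If you want to post-process these scores, the output can be parsed with a few lines of Python. The following is a minimal sketch, not part of this repository: it assumes the stdout has been captured as text and that score lines have exactly the `number: value` shape shown above. The function name `parse_perplexities` is hypothetical.

```python
import re

# Matches lines such as "5946: 129.57085"; anything else (logs, "...") is skipped.
LINE_RE = re.compile(r"^\s*(\d+):\s*([0-9.eE+-]+)\s*$")


def parse_perplexities(text):
    """Return {line_number: perplexity} from captured evaluation output."""
    scores = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        try:
            scores[int(match.group(1))] = float(match.group(2))
        except ValueError:
            # Looked like a score line but the value was not a valid float.
            continue
    return scores


# The sample lines are copied from the output shown above.
sample = """...
5946: 129.57085
5947: 1412.2032
5948: 5172.711
5949: 2126.5542
..."""

scores = parse_perplexities(sample)
highest = max(scores, key=scores.get)
print(highest, scores[highest])  # prints: 5948 5172.711
```

Because the scores are not normalized by sentence length, a high value may simply reflect a long sentence, so treat a ranking like this with care.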
To finetune ELMo on an additional corpus, use the following script.
sh finetune.sh
After finetuning, run the evaluation again to see the effect of finetuning.
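One way to look at the finetuning effect is to compare the per-sentence scores from the two evaluation runs. The sketch below is an illustration only and is not shipped with this repository: it assumes each run's stdout was saved as text in the `line_number: perplexity` format, and the helper names `read_scores` and `compare` are hypothetical. The numbers in the example are made up and are not results from the model.

```python
def read_scores(text):
    """Parse 'line_number: perplexity' lines, skipping anything else."""
    scores = {}
    for line in text.splitlines():
        left, sep, right = line.partition(":")
        if not sep:
            continue
        try:
            scores[int(left)] = float(right)
        except ValueError:
            # Not a score line (for example a log message or "...").
            continue
    return scores


def compare(before, after):
    """Return {line_number: after - before} for sentences present in both runs."""
    shared = sorted(before.keys() & after.keys())
    return {k: after[k] - before[k] for k in shared}


# Made-up numbers for illustration only; they are not actual model output.
before = read_scores("0: 100.0\n1: 200.0\n2: 50.0")
after = read_scores("0: 80.0\n1: 250.0")

delta = compare(before, after)
improved = [k for k, d in delta.items() if d < 0]  # lower perplexity after finetuning
print(delta)     # prints: {0: -20.0, 1: 50.0}
print(improved)  # prints: [0]
```

In practice you could redirect each run to a file, for example `sh evaluate.sh > before.txt` (the file name is an assumption), and pass the file contents to `read_scores`.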