Exercises inspired by Natural Language Processing with Transformers, by Tunstall, von Werra and Wolf. Also see their official notebook repository.
Contents:
This is the fine-tuning section of Chapter 2 - Text Classification. I found the book's notebook wouldn't run in Kaggle on a P100, so this is a cut-down version that fine-tunes the classifier and does run in Kaggle.
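For orientation, here is a minimal sketch of the kind of fine-tuning loop involved, assuming the book's Chapter 2 setup of DistilBERT on the emotion dataset; the notebook is the authoritative version and its exact arguments may differ.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# On newer hub versions the dataset may be namespaced as "dair-ai/emotion".
emotions = load_dataset("emotion")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Pad each split to its longest sequence so the default collator can batch it.
    return tokenizer(batch["text"], padding=True, truncation=True)

emotions_encoded = emotions.map(tokenize, batched=True, batch_size=None)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6)  # the emotion dataset has 6 labels

args = TrainingArguments(
    output_dir="distilbert-emotion",
    num_train_epochs=2,
    per_device_train_batch_size=16,  # modest enough to fit a P100's 16 GB
)

trainer = Trainer(model=model, args=args,
                  train_dataset=emotions_encoded["train"],
                  eval_dataset=emotions_encoded["validation"])
trainer.train()
```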
To run in Kaggle:
- Click the Open in Kaggle link:
- In the notebook settings select "Accelerator > GPU"; see the check below this list to confirm it took effect
- Run the cells of the notebook
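Once the cells are running, a quick check like this (using PyTorch, which Kaggle's images ship with) confirms the accelerator setting took effect:

```python
import torch

print(torch.cuda.is_available())  # should print True with the GPU accelerator on
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "Tesla P100-PCIE-16GB"
```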
This is a variation that runs the same classification on the IMDB dataset from Learning Word Vectors for Sentiment Analysis, for which there are good benchmarks.
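The dataset is on the Hugging Face Hub, so loading it is a one-liner (assuming the "imdb" dataset id; on newer hub versions it may be namespaced as "stanfordnlp/imdb"):

```python
from datasets import load_dataset

imdb = load_dataset("imdb")  # 25k train / 25k test movie reviews
print(imdb)
print(imdb["train"][0]["label"], imdb["train"][0]["text"][:100])
```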
This is for recognising ingredients in recipes, using the data from A Named Entity Based Approach to Model Recipes, by Diwan, Batra, and Bagler.
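As a rough illustration of the task, each training example pairs the tokens of an ingredient phrase with one entity tag per token in BIO style; the tag names below are hypothetical stand-ins, and the dataset defines the actual tag set:

```python
# Illustrative only: one token-classification example for an ingredient phrase.
tokens = ["2", "cups", "fresh", "basil", "leaves"]
tags   = ["B-QUANTITY", "B-UNIT", "B-STATE", "B-NAME", "I-NAME"]
assert len(tokens) == len(tags)  # one tag per token
```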
Recipe NER with Stanford NLP reproduces the results of the original paper using Stanford NLP's CRF classifier, and shows that seqeval scores it the same way. It does not use transformers, nor does it require a GPU.
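seqeval scores tag sequences at the entity level rather than per token, which is what makes its numbers comparable across toolkits. A minimal sketch, with illustrative tags:

```python
from seqeval.metrics import classification_report, f1_score

# One inner list of BIO tags per sentence: gold labels and predictions.
y_true = [["B-NAME", "I-NAME", "O", "B-UNIT", "O"]]
y_pred = [["B-NAME", "I-NAME", "O", "O", "O"]]

print(f1_score(y_true, y_pred))           # entity-level F1
print(classification_report(y_true, y_pred))
```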
Recipe NER with XLM Roberta follows the NLP with Transformers text, training an XLM-RoBERTa model on this recipe dataset instead. It performs roughly on par with the Stanford NLP CRF model, but makes annotation issues easier to spot and includes some zero-shot cross-language NER.
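At inference time the zero-shot cross-language step looks roughly like this, assuming a hypothetical checkpoint id standing in for whatever the notebook saves after fine-tuning (presumably from "xlm-roberta-base"):

```python
from transformers import pipeline

# "my-user/xlm-roberta-recipe-ner" is a hypothetical fine-tuned checkpoint id.
ner = pipeline("token-classification",
               model="my-user/xlm-roberta-recipe-ner",
               aggregation_strategy="simple")

# XLM-RoBERTa was pretrained on 100 languages, so a model fine-tuned on
# English recipes can often tag ingredient phrases in other languages too.
print(ner("2 Tassen frische Basilikumblätter"))  # German for "2 cups fresh basil leaves"
```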