tom-mumby / language-model-text-generator

A Python project which, when given a file containing some text (e.g. the works of Shakespeare), aims to generate text in a similar style. It relies on the Hugging Face Transformers library to fine-tune the GPT-2 124M parameter model, or to train a model from scratch using the GPT-2 architecture and byte-level BPE tokenization. Text is generated as the model trains, and graphs of the validation loss and accuracy at each epoch are produced.

Prerequisites

To run on an Nvidia GPU, the requisite drivers must be installed; a guide to doing this on Ubuntu systems can be found here.

The Hugging Face transformers, datasets, and evaluate libraries are required. Each can be installed using pip install followed by the package name.
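For example, all three libraries can be installed in one command (a deep-learning backend such as PyTorch must also be present; install it separately if needed):

```shell
pip install transformers datasets evaluate
```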

Usage

Before running LM_text_generator.py, set the location of your text file and the output directory for the model in config.py. In addition, choose whether the text file is used to fine-tune a model or to train one from scratch.
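The README does not show the actual variable names used in config.py, so the following is only a hypothetical sketch of the kind of settings it holds:

```python
# Hypothetical config.py values -- the real variable names in this
# repository may differ.
TEXT_FILE = "data/shakespeare.txt"   # input text the model learns from
OUTPUT_DIR = "output/model"          # where the trained model is saved
TRAIN_FROM_SCRATCH = False           # False = fine-tune GPT-2 124M instead
```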

If fine-tuning for the first time, the GPT-2 124M parameter model will be downloaded and cached by the Hugging Face Transformers library.

The text file will then be split into a training set and a validation set; the split can be configured in preprocess.py.
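The actual split logic lives in preprocess.py; as a rough illustration only (not the repository's code), a deterministic 90/10 line-based split could look like:

```python
def split_lines(lines, validation_fraction=0.1):
    """Split a list of text lines into training and validation sets."""
    n_val = max(1, int(len(lines) * validation_fraction))
    # Hold out the final lines for validation so the split is deterministic.
    return lines[:-n_val], lines[-n_val:]

lines = [f"line {i}" for i in range(100)]
train, val = split_lines(lines)
```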

The number of epochs and other training arguments can be altered in train.py, and the learning-rate scheduler can be changed from the constant rate it is currently set to.
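As a toy illustration of the difference (a sketch for this README, not the trainer's internals): a constant schedule keeps the learning rate fixed for the whole run, while a linear schedule decays it to zero by the final step.

```python
def constant_lr(base_lr, step, total_steps):
    # The rate never changes, matching the repository's current setting.
    return base_lr

def linear_decay_lr(base_lr, step, total_steps):
    # Decays linearly from base_lr at step 0 to 0 at the final step.
    return base_lr * (1 - step / total_steps)
```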

Text output by the model can be configured in generate.py, including the text length, the number of sequences, and a prompt to begin the text. Once the model has been trained, text can be generated by running generate.py directly.
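Those parameters map onto standard generation controls. The toy sketch below (a random character-level stand-in, not the repository's GPT-2 code) only shows how prompt, length, and sequence count fit together:

```python
import random

def generate(prompt, max_length=20, num_sequences=2, seed=0):
    """Toy generator: extends the prompt with random characters.

    Stands in for model.generate-style parameters only; a real model
    would sample tokens from its predicted distribution instead.
    """
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    outputs = []
    for _ in range(num_sequences):
        text = prompt
        while len(text) < max_length:
            text += rng.choice(alphabet)
        outputs.append(text)
    return outputs

for sequence in generate("To be", max_length=16):
    print(sequence)
```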

Output

As training runs, the validation loss and accuracy are calculated at each epoch. These are presented in a graph along with how the training step size changes. To smooth the training loss, a Savitzky–Golay filter is applied.
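Assuming SciPy is available, that smoothing can be reproduced with scipy.signal.savgol_filter; the window length and polynomial order below are illustrative choices, not necessarily the repository's:

```python
import numpy as np
from scipy.signal import savgol_filter

# Noisy stand-in for a training-loss curve: exponential decay plus wiggle.
steps = np.linspace(0, 1, 101)
loss = np.exp(-3 * steps) + 0.05 * np.sin(40 * steps)

# Fit a cubic polynomial over a sliding 11-point window.
smoothed = savgol_filter(loss, window_length=11, polyorder=3)
```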

[Figure: validation loss and accuracy at each epoch, with the training step size]

Above is an example of a graph produced, in this case from fine-tuning on Shakespeare. The best accuracy is found after 4 epochs, which is confirmed by the values shown. Below is an example of some generated text.

[Figure: sample of generated text]

Acknowledgments

Parts of the train.py and preprocess.py scripts are drawn from the run_clm.py script provided as an example by Hugging Face. Where this is the case is shown in the comments.

A tutorial on transformers and language modelling can be found here. Using an obtained model to generate text is explained by Patrick von Platen in this tutorial. Finally, the Hundred-Page Machine Learning Book by Andriy Burkov is very helpful.

Licence

Apache License - Version 2.0

