GoTGeneration

A project to generate a new Game of Thrones book with LSTM neural networks, using the 5 existing books as source material.

Dependencies

This project was tested with Python 2.7.11, Keras, and Theano on Windows 10. Training takes ~40 minutes per epoch on a GTX 970 with CUDA, cuDNN v5, and CNMEM. Follow the instructions at the links below to get a local deep learning environment working on Windows:

  1. http://ankivil.com/installing-keras-theano-and-dependencies-on-windows-10/
  2. http://ankivil.com/making-theano-faster-with-cudnn-and-cnmem-on-windows-10/

The Data

I'm using the first 5 Game of Thrones books concatenated into one large corpus. Unfortunately, I doubt I'm allowed to host the raw text on GitHub (as unpleasant as it would be to read in that format). That said, the books are out there. If you own them already, try googling Game of Thrones.txt. The model will work with any large text corpus (>100k characters), not just Game of Thrones books.

The versions of the 4th and 5th books I found had ASCII encoding issues, which is the reason for some of the pre-processing code.
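As a rough illustration of the kind of cleanup this involves, here is a minimal sketch that forces text down to plain ASCII. It is not the code in preprocess.py; the function name `to_ascii` and the `REPLACEMENTS` table are made up for this example, and it assumes the text has already been decoded to a unicode string.

```python
import unicodedata

# Hypothetical replacement table for common typographic punctuation.
# Extend it with whatever characters show up in your own files.
REPLACEMENTS = {
    u"\u2018": u"'",    # left single quote
    u"\u2019": u"'",    # right single quote / apostrophe
    u"\u201c": u'"',    # left double quote
    u"\u201d": u'"',    # right double quote
    u"\u2013": u"-",    # en dash
    u"\u2014": u"-",    # em dash
    u"\u2026": u"...",  # ellipsis
}


def to_ascii(text):
    """Map typographic punctuation to ASCII, strip accents, drop the rest."""
    for src, dst in REPLACEMENTS.items():
        text = text.replace(src, dst)
    # NFKD splits accented letters into a base letter plus a combining mark,
    # so the ASCII encode below keeps the base letter and drops the mark.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


cleaned = to_ascii(u"\u201cDaenerys\u201d \u2014 caf\u00e9")
```

Characters that have no ASCII equivalent are silently dropped, so it is worth checking which characters disappear before training on the result.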

The Model

This is a character-level language model inspired by Andrej Karpathy's CS231n lectures and his blog post The Unreasonable Effectiveness of Recurrent Neural Networks (http://karpathy.github.io/2015/05/21/rnn-effectiveness/).
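To make "character-level" concrete, here is a small sketch of how training data for this kind of model is commonly prepared: every distinct character gets an integer index, and the corpus is cut into fixed-length windows, each paired with the character that follows it. The helper names (`build_vocab`, `make_windows`, `one_hot`) and the window settings are assumptions for illustration, not the ones used in train.py.

```python
def build_vocab(text):
    """Assign an integer index to every distinct character in the corpus."""
    chars = sorted(set(text))
    char_to_idx = {c: i for i, c in enumerate(chars)}
    idx_to_char = {i: c for c, i in char_to_idx.items()}
    return char_to_idx, idx_to_char


def make_windows(text, seq_len, step=1):
    """Return (input_window, next_char) pairs for next-character prediction."""
    pairs = []
    for start in range(0, len(text) - seq_len, step):
        window = text[start:start + seq_len]
        target = text[start + seq_len]
        pairs.append((window, target))
    return pairs


def one_hot(index, size):
    """Encode an index as a list of zeros with a single 1."""
    vec = [0] * size
    vec[index] = 1
    return vec


text = "winter is coming"
char_to_idx, idx_to_char = build_vocab(text)
pairs = make_windows(text, seq_len=5, step=3)
# One training example: a window of one-hot vectors and a one-hot target.
window, target = pairs[0]
x = [one_hot(char_to_idx[c], len(char_to_idx)) for c in window]
y = one_hot(char_to_idx[target], len(char_to_idx))
```

The network then learns to map each window `x` to a probability distribution over the next character `y`.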

Getting started

  • All configuration is set in config.py.
  • Sample output from a model trained for only 3 epochs (loss of 1.21) is in generated/samplegot.txt.

  1. Place your raw text files in data/raw, then run python preprocess.py (you only have to do this when you add new data).
  2. Run python train.py to train your model.
  3. Once you have a trained model, run python generate.py to generate text with the model you specified in config.py. Generating long sequences can take a long time, since the model has to predict each character one at a time.
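To show why generation is slow, here is a minimal sketch of a character-by-character sampling loop. It is not the code in generate.py: `predict_fn` is a hypothetical stand-in for the trained model (it takes the most recent window of text and returns one probability per character), and the `temperature` parameter is a common addition that may not exist in this project.

```python
import math
import random


def sample_index(probs, temperature=1.0, rng=random):
    """Sample an index from a probability list after temperature rescaling."""
    # Rescale in log space; the small epsilon avoids log(0).
    logits = [math.log(p + 1e-12) / temperature for p in probs]
    peak = max(logits)
    weights = [math.exp(l - peak) for l in logits]
    threshold = rng.random() * sum(weights)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if threshold < running:
            return i
    return len(weights) - 1


def generate(predict_fn, seed, length, idx_to_char, temperature=1.0, rng=random):
    """Grow `seed` by `length` characters, with one model call per character."""
    seq_len = len(seed)
    out = seed
    for _ in range(length):
        # Only the most recent seq_len characters are fed back to the model.
        probs = predict_fn(out[-seq_len:])
        out += idx_to_char[sample_index(probs, temperature, rng)]
    return out


# Stand-in "model" that ignores its input and strongly prefers index 1.
def fake_predict(window):
    return [0.1, 0.9]


sample = generate(fake_predict, "ab", 10, {0: "a", 1: "b"}, temperature=0.5)
```

Each new character requires a full forward pass through the network, so generation time grows linearly with the length of the requested text. Lower temperatures make the output more conservative; higher ones make it more varied and more error-prone.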

Where to go from here

  • Preprocess the text in different ways (for example, keep newline characters and capital letters).

  • Generate a "JON" chapter by combining and training with only prior JON chapters. Game of Thrones chapters start with the name of the character whose POV that chapter follows, in all caps (for example, "JON", "ARYA").

  • Train a model to speak like a specific character (Tyrion) by scraping only their dialogue from the books, or from the books plus the show scripts.

  • Generate a sample hybrid GoT chapter from the prior books and other sci-fi or fantasy text, e.g. The Lord of the Rings.

  • Replace the proper nouns in another corpus with proper nouns from GoT and use the new corpus to generate text (Joffrey as Gollum?).

  • Fun NN stuff! Try other models such as GRUs, or play with regularization, optimizers, new layers, or the dimensionality of the hidden layers.
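As a starting point for the POV-chapter idea in the list above, here is a rough sketch that splits a corpus into chapters by looking for all-caps heading lines and keeps only those for one character. It assumes the text still contains newlines and capital letters, and that a heading is a line made up of capital letters only; the names `split_chapters` and `chapters_for` are hypothetical.

```python
import re

# Assumption: a chapter heading is a line containing only capital letters
# (e.g. "JON", "ARYA"). Headings with spaces, such as multi-word titles,
# would need a looser pattern, and "PROLOGUE" also matches this one.
HEADING = re.compile(r"^[A-Z]{2,}$")


def split_chapters(text):
    """Return a list of (pov_name, chapter_body) tuples in reading order."""
    chapters = []
    name, body = None, []
    for line in text.splitlines():
        stripped = line.strip()
        if HEADING.match(stripped):
            # A new heading closes the chapter collected so far.
            if name is not None:
                chapters.append((name, "\n".join(body).strip()))
            name, body = stripped, []
        elif name is not None:
            body.append(line)
    if name is not None:
        chapters.append((name, "\n".join(body).strip()))
    return chapters


def chapters_for(text, pov):
    """Return the bodies of all chapters told from one character's POV."""
    return [body for name, body in split_chapters(text) if name == pov]


sample = "JON\nThe wall was cold.\nARYA\nShe ran.\nJON\nGhost howled.\n"
jon_corpus = "\n\n".join(chapters_for(sample, "JON"))
```

The joined `jon_corpus` string could then be written to data/raw as its own file and run through the usual preprocess and train steps.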
