Training a charRNN and using the model in ml5js

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using TensorFlow, modified to work with TensorFlow.js and ml5.js.

Based on char-rnn-tensorflow.

Requirements

Usage

Collect data

RNNs work well when you want to predict sequences or patterns from your inputs. Try to gather as much input text data as you can; the more, the better. Compile all of the text data into a single text file and make a note of where the file is stored (its path) on your computer.

(A quick tip to concatenate many small disparate .txt files into one large training file: ls *.txt | xargs -L 1 cat >> input.txt)
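If you would rather do the same concatenation from Python (for example on a system without xargs), the following is a minimal sketch. The function name `concatenate_text_files` and its parameters are made up for this illustration and are not part of this repo. It lists the input files before opening the output, and skips the output file itself so that re-running it in the same folder does not feed input.txt back into itself.

```python
from pathlib import Path


def concatenate_text_files(source_dir, output_path, pattern="*.txt"):
    """Join every file matching `pattern` in `source_dir` into `output_path`.

    Hypothetical helper for illustration; returns the number of files joined.
    """
    source_dir = Path(source_dir)
    output_path = Path(output_path)
    # Collect the inputs first, in a stable order, and leave out the output
    # file in case it lives in the same folder and matches the pattern.
    sources = sorted(
        p for p in source_dir.glob(pattern)
        if p.is_file() and p.resolve() != output_path.resolve()
    )
    with output_path.open("w", encoding="utf-8") as out:
        for path in sources:
            # Undecodable bytes are replaced rather than aborting the run.
            text = path.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            # Keep files from running together when one lacks a final newline.
            if text and not text.endswith("\n"):
                out.write("\n")
    return len(sources)


if __name__ == "__main__":
    count = concatenate_text_files(".", "input.txt")
    print(f"Joined {count} files into input.txt")
```

The resulting input.txt can then be passed to train.py with --data_path.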

Set-up Python Environment

This first step, using a Python "virtual environment" (venv), is recommended but not required.

$ python3 -m venv your_venv_name
$ source your_venv_name/bin/activate

Train model

Note you can also download this repo as an alternative to git clone.

$ git clone https://github.com/ml5js/training-charRNN
$ cd training-charRNN
$ pip install -r requirements.txt
$ python train.py --data_path /path/to/data/file.txt

Optionally, you can specify hyperparameters to suit your training set, the size of your data, and so on:

python train.py --data_path=./data \
--rnn_size 128 \
--num_layers 2 \
--seq_length 50 \
--batch_size 50 \
--num_epochs 50 \
--save_checkpoints ./checkpoints \
--save_model ./models
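Before starting a long run, it can help to check how many batches a given batch_size and seq_length will yield for your text. The sketch below assumes the data loader cuts the text into non-overlapping chunks of batch_size × seq_length characters, which is how char-rnn-tensorflow-style loaders commonly work; train.py in this repo may differ in detail. The function names are hypothetical and exist only for this example.

```python
def batches_per_epoch(num_chars, batch_size, seq_length):
    """Number of full (batch_size x seq_length) chunks that fit in the text.

    Assumption: the text is split into non-overlapping chunks and any
    leftover characters at the end are dropped.
    """
    return num_chars // (batch_size * seq_length)


def check_training_plan(num_chars, batch_size=50, seq_length=50, num_epochs=50):
    """Return (batches per epoch, total steps), or raise if the text is too small."""
    per_epoch = batches_per_epoch(num_chars, batch_size, seq_length)
    if per_epoch == 0:
        raise ValueError(
            "Not enough text for a single batch: "
            "gather more data or lower batch_size / seq_length."
        )
    return per_epoch, per_epoch * num_epochs


if __name__ == "__main__":
    # A text of one million characters with the settings shown above.
    print(check_training_plan(1_000_000))  # (400, 20000)
```

If the first number is very small, the text is probably too short for the chosen batch_size and seq_length.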

When training is complete, a JavaScript version of your model will be available in a folder called ./models (unless you specify a different path).

Once the model is ready, you just need to point to it in your ml5 sketch. For more, visit the charRNN() documentation.

const charRNN = ml5.charRNN('./models/your_new_model');

That's it!

Hyperparameters

Given the size of the training dataset, here are some hyperparameters that might work:

  • 2 MB:
    • rnn_size 256 (or 128)
    • num_layers 2
    • seq_length 64
    • batch_size 32
    • output_keep_prob 0.75
  • 5-8 MB:
    • rnn_size 512
    • num_layers 2 (or 3)
    • seq_length 128
    • batch_size 64
    • output_keep_prob 0.75 (dropout 0.25)
  • 10-20 MB:
    • rnn_size 1024
    • num_layers 2 (or 3)
    • seq_length 128 (or 256)
    • batch_size 128
    • output_keep_prob 0.75
  • 25+ MB:
    • rnn_size 2048
    • num_layers 2 (or 3)
    • seq_length 256 (or 128)
    • batch_size 128
    • output_keep_prob 0.75

Note: output_keep_prob 0.75 is equivalent to dropout probability of 0.25.
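The suggestions above can also be looked up in code. The sketch below is only an illustration: the names `SUGGESTED`, `suggest_hyperparameters`, `dropout_to_keep_prob` and `to_cli_flags` are hypothetical, and so is the way sizes that fall between the listed ranges (for example 3 MB or 9 MB) are handled, which is to use the largest tier whose lower bound has been reached. It also assumes that output_keep_prob can be passed to train.py as a flag in the same way as the other hyperparameters, which you should confirm against the script.

```python
# Suggested settings from the list above, keyed by the smallest dataset
# size (in MB) of each tier. The first value of each "(or ...)" pair is used.
SUGGESTED = [
    (2, {"rnn_size": 256, "num_layers": 2, "seq_length": 64,
         "batch_size": 32, "output_keep_prob": 0.75}),
    (5, {"rnn_size": 512, "num_layers": 2, "seq_length": 128,
         "batch_size": 64, "output_keep_prob": 0.75}),
    (10, {"rnn_size": 1024, "num_layers": 2, "seq_length": 128,
          "batch_size": 128, "output_keep_prob": 0.75}),
    (25, {"rnn_size": 2048, "num_layers": 2, "seq_length": 256,
          "batch_size": 128, "output_keep_prob": 0.75}),
]


def dropout_to_keep_prob(dropout):
    """Convert a dropout probability into the matching keep probability."""
    return 1.0 - dropout


def suggest_hyperparameters(size_mb):
    """Pick the largest tier whose lower bound is <= size_mb.

    Assumption: sizes below 2 MB fall back to the smallest tier, and sizes
    between two listed ranges use the lower of the two tiers.
    """
    chosen = SUGGESTED[0][1]
    for min_mb, params in SUGGESTED:
        if size_mb >= min_mb:
            chosen = params
    return dict(chosen)  # copy, so callers cannot change the table


def to_cli_flags(params):
    """Format a settings dict as '--name value' pairs for the command line."""
    return " ".join(f"--{name} {value}" for name, value in params.items())


if __name__ == "__main__":
    params = suggest_hyperparameters(6)  # a 6 MB text file
    print("python train.py --data_path /path/to/data/file.txt " + to_cli_flags(params))
```

Treat the result as a starting point, as with the list itself: these are values that might work, not tuned settings.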

Additional resources

Contributors

cvalenzuela, dependabot[bot], peilingjiang, rafajak, shiffman, yining1023
