pytorch-bits

Experiments for fun and education. Mostly concerning time-series prediction.

I started my experiments with @osm3000's sequence_generation_pytorch repo, and some of that code still survives in these files.

How to run these experiments

  1. clone/download this repo
  2. pip install -r requirements.txt
  3. python experiment.py [ARGS]

Possible arguments include...

  • --data_fn FN where FN is one of the data generation functions listed below
  • --add_noise to add noise to the generated waveform
  • --length TIMESERIES_LENGTH
  • --batch_size BATCH_SIZE
  • --seq_len SEQ_LEN the subsequence length used in training
  • --epochs MAX_EPOCHS
  • --lr LR
  • --layers LAYERTYPE_SIZE [LAYERTYPE_SIZE ...] see the section on Model generation
  • --sigmoid REPLACEMENT to use an alternative to sigmoid; this must be one of the activations mentioned below, e.g. ISRU_sigmoid, or any function from torch.nn.functional
  • --tanh REPLACEMENT to use an alternative to tanh; this must be one of the activations mentioned below, e.g. ISRU_tanh, or any function from torch.nn.functional
  • --warmup WARMUP do not use the loss from the first WARMUP elements of the series in order to let the hidden state warm up.
  • --verbose
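
As a rough illustration of how the options above fit together, here is a minimal argparse sketch. It is an assumption about how such a command line could be declared, not a copy of experiment.py: the build_parser name, the argument types, and the absence of defaults are all hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the options listed above; the real
    # experiment.py may use different types and defaults.
    parser = argparse.ArgumentParser(prog="experiment.py")
    parser.add_argument("--data_fn", type=str)
    parser.add_argument("--add_noise", action="store_true")
    parser.add_argument("--length", type=int)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--seq_len", type=int)
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--lr", type=float)
    # nargs="+" accepts one or more LAYERTYPE_SIZE items.
    parser.add_argument("--layers", nargs="+")
    parser.add_argument("--sigmoid", type=str)
    parser.add_argument("--tanh", type=str)
    parser.add_argument("--warmup", type=int)
    parser.add_argument("--verbose", action="store_true")
    return parser


# Parse an example command line without touching sys.argv.
args = build_parser().parse_args(
    ["--data_fn", "sine_3", "--layers", "LSTM_50", "GRU_70",
     "--lr", "0.001", "--add_noise"]
)
print(args.data_fn, args.layers, args.lr, args.add_noise)
```

Options that are not given stay at None (or False for the two flags) in this sketch, so it says nothing about the defaults the script actually uses.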

Data generation

  • sine_1 generates a sine wave of wavelength 60 steps
  • sine_2 overlays sine_1 with a sine wave of wavelength 120 steps
  • sine_3 overlays sine_2 with a sine wave of wavelength 180 steps
  • mackey_glass generates a Mackey-Glass chaotic timeseries using the signalz library
  • levy_flight generates a Lévy flight process using the signalz library
  • brownian generates a Brownian random walk using the signalz library

The generator produces a tensor of shape (length, batches, 1) containing batches independently generated series of the required length.
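
The sine generators and the output shape can be sketched as follows. This is an illustration only: the sine_sum helper is hypothetical, and giving each series its own random phase offset is an assumption about what "independently generated" means for a sine wave.

```python
import math

import torch


def sine_sum(length, batches, wavelengths=(60,)):
    # Hypothetical helper, not the repo's own generator.
    # Time index shaped (length, 1) so that it broadcasts over the batches.
    t = torch.arange(length, dtype=torch.float32).unsqueeze(1)
    # Assumption: every series starts at its own random offset (in steps).
    offset = torch.rand(1, batches) * max(wavelengths)
    series = torch.zeros(length, batches)
    for wavelength in wavelengths:
        series = series + torch.sin(2 * math.pi * (t + offset) / wavelength)
    # Add the trailing feature dimension: (length, batches, 1).
    return series.unsqueeze(-1)


sine_1 = sine_sum(1000, 4, wavelengths=(60,))
sine_3 = sine_sum(1000, 4, wavelengths=(60, 120, 180))
print(sine_1.shape, sine_3.shape)
```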

Model generation

The --layers argument takes a simplistic model specification.

For example: --layers LSTM_50 LSTM_60 GRU_70 specifies a three layer network with 50 LSTM units in the first layer, 60 LSTM units in the second layer and 70 GRU units in the third layer.

If the output of the last requested layer doesn't match the number of target values (for these experiments the target size is 1) then the script adds a Linear layer to produce the required number of output values.
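
The specification format and the automatic Linear layer can be sketched in plain Python. The parse_layer_specs and plan_model names are hypothetical, and the sketch only plans layer sizes rather than building PyTorch modules; the input size of 1 is assumed from the shape of the generated data.

```python
def parse_layer_specs(specs):
    # Split each "LAYERTYPE_SIZE" item on its last underscore,
    # e.g. "LSTM_50" -> ("LSTM", 50).
    parsed = []
    for spec in specs:
        layer_type, _, size = spec.rpartition("_")
        if not layer_type or not size.isdigit():
            raise ValueError(f"expected LAYERTYPE_SIZE, got {spec!r}")
        parsed.append((layer_type, int(size)))
    return parsed


def plan_model(specs, input_size=1, target_size=1):
    # Build (layer_type, in_features, out_features) triples, chaining each
    # layer's output size into the next layer's input size.
    plan = []
    in_features = input_size
    for layer_type, size in parse_layer_specs(specs):
        plan.append((layer_type, in_features, size))
        in_features = size
    # Append a Linear layer only when the last size differs from the target.
    if in_features != target_size:
        plan.append(("Linear", in_features, target_size))
    return plan


plan = plan_model(["LSTM_50", "LSTM_60", "GRU_70"])
for step in plan:
    print(step)
```

With the example specification the plan ends in a Linear layer mapping 70 features to 1, whereas a specification ending in a layer of size 1, such as GRU_1, would get no extra layer.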

Layers

All of these recurrent layers keep track of their own hidden state (if needed, the hidden state is accessible via the hidden attribute). They all provide reset_hidden() and detach_hidden() methods.

reset_hidden() should be called before feeding the model the start of a new sequence. detach_hidden() can be called between batches of the same set of sequences in order to truncate backpropagation through time, which avoids the slowdown of backpropagating all the way to the beginning of the sequence.

Moreover they all take input of shape (seq_len, batch_size, features). This allows vectorising any calculations that don't depend on the hidden state.
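
The hidden-state protocol described in this section might look roughly like the sketch below. StatefulGRU is a hypothetical stand-in built on torch.nn.GRUCell, not one of the repo's layers, and it makes no attempt to vectorise the input-side calculations. The training loop shows where reset_hidden() and detach_hidden() would be called for truncated backpropagation through time.

```python
import torch
from torch import nn


class StatefulGRU(nn.Module):
    # Hypothetical layer that keeps its own hidden state between calls.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.hidden = None

    def reset_hidden(self):
        # Forget the state; the next call starts from zeros.
        self.hidden = None

    def detach_hidden(self):
        # Keep the state's values but cut the link to the earlier graph.
        if self.hidden is not None:
            self.hidden = self.hidden.detach()

    def forward(self, x):
        # x has shape (seq_len, batch_size, features).
        outputs = []
        for x_t in x:
            self.hidden = self.cell(x_t, self.hidden)
            outputs.append(self.hidden)
        return torch.stack(outputs)


torch.manual_seed(0)
layer = StatefulGRU(1, 8)
readout = nn.Linear(8, 1)
params = list(layer.parameters()) + list(readout.parameters())
optimiser = torch.optim.SGD(params, lr=0.01)
series = torch.randn(40, 3, 1)  # stand-in data of shape (length, batches, 1)
seq_len = 10

layer.reset_hidden()  # start of a new set of sequences
for start in range(0, series.size(0) - 1, seq_len):
    targets = series[start + 1:start + seq_len + 1]  # next-step targets
    inputs = series[start:start + targets.size(0)]
    layer.detach_hidden()  # truncate backpropagation at the chunk boundary
    optimiser.zero_grad()
    loss = nn.functional.mse_loss(readout(layer(inputs)), targets)
    loss.backward()
    optimiser.step()
```

Without the detach_hidden() call, the second chunk's backward pass would try to reach back into the first chunk's graph, which PyTorch has already freed, so the call is needed for correctness in this sketch as well as for speed.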

Planned

Ideas/research

  • I plan to study the arXiv paper "Unbiased Online Recurrent Optimization", but for the moment it is not clear to me how best to implement it.
  • Optional noisy initial hidden states. Otherwise the model will learn to cope with always starting from a zero hidden state, which may hinder learning the hidden state dynamics later in the sequences. This probably isn't very important if I have only a few sequences that are very long and that are normalised to zero mean.
  • The LSTM class in PyTorch builds a configurable number of identically sized LSTM layers. This architecture allows us to calculate W x h_tm1 for all layers in one single operation. I may try adapting the above layers to take advantage of this.

Optimisers

Planned

Activations

Regularisers

Planned
