
funktional

A minimalistic toolkit for functionally composable neural network layers with Theano.

Installation

If you'll be making changes to the code, it's best to install in development mode, so that your changes are picked up by the Python interpreter. From the directory of the repo, run the following command:

> python setup.py develop --user

Rationale

Conceptually, neural network layers are functions on tensors. In funktional, layers are implemented as callable Python objects which means that they can be easily mixed and matched with regular Python functions operating on Theano tensors. Function-like layers are also easy to compose into more complex networks using function application and function composition idioms familiar from general purpose programming.

For example, suppose we need to implement a recurrent encoder-decoder network. As the encoder we use a Gated Recurrent Unit (GRU) layer which takes one argument (the input sequence) and returns a sequence of hidden states. The decoder is another GRU which takes two arguments (an initial state, and a sequence of outputs), and returns a sequence of hidden states. From these hidden states we then predict the next output element:

                y1  y2  y3  y4
                ^   ^   ^   ^
                |   |   |   |
h0->h1->h2->h3->g1->g2->g3->g4
    ^   ^   ^   ^   ^   ^   ^
    |   |   |   |   |   |   |
    x1  x2  x3  y0  y1  y2  y3

With funktional, you would create the layers and then compose them using function application like so:

from funktional.layer import Layer, Dense, GRU, GRUH0, softmax3d  # assuming these live in funktional's layer module

def last(x):
    """Return the last time step of each sequence in x.

    Assumes x is laid out as (batch, time, features): the dimshuffle
    moves time to the front so that [-1] picks the final step.
    """
    return x.dimshuffle((1, 0, 2))[-1]

class EncoderDecoder(Layer):
    def __init__(self, size_in, size, size_out):
        self.Encode = GRUH0(size_in=size_in, size=size)   # encoder GRU with learned initial state
        self.Decode = GRU(size_in=size_out, size=size)    # decoder GRU, seeded by the encoder
        self.Output = Dense(size_in=size, size=size_out)  # per-step projection to the output space
        self.params = self.Encode.params + self.Decode.params + self.Output.params

    def __call__(self, inp, out_prev):
        # Encode the input, take the final hidden state as the decoder's
        # initial state, decode conditioned on the previous outputs, and
        # project each decoder state to the output space.
        return self.Output(self.Decode(last(self.Encode(inp)), out_prev))

Note that in the definition of __call__ we specify the network connectivity using function application syntax, and mix Layer objects together with regular Python functions such as last. The EncoderDecoder we defined can in turn be composed with other layers or functions:

Encdec = EncoderDecoder(size_in, size, size_out)
output = softmax3d(Encdec(input, output_prev))
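Because output above is an ordinary Theano expression, the composed network drops straight into the usual Theano workflow. Here is a minimal, self-contained sketch of compiling a training function for it; the sizes, the symbolic variable declarations, and the plain-SGD update rule are illustrative assumptions, not part of funktional's API:

import theano
import theano.tensor as T

size_in, size, size_out = 10, 20, 10      # hypothetical dimensions

input = T.ftensor3('input')               # (batch, time, size_in)
output_prev = T.ftensor3('output_prev')   # (batch, time, size_out)
target = T.ftensor3('target')             # (batch, time, size_out), one-hot

Encdec = EncoderDecoder(size_in, size, size_out)
output = softmax3d(Encdec(input, output_prev))

# Mean cross-entropy per time step, flattening batch and time together.
cost = T.nnet.categorical_crossentropy(
    output.reshape((-1, size_out)),
    target.reshape((-1, size_out))).mean()

# Plain SGD over the parameters collected by the composed layer.
grads = T.grad(cost, Encdec.params)
updates = [(p, p - 0.01 * g) for p, g in zip(Encdec.params, grads)]
train = theano.function([input, output_prev, target], cost, updates=updates)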

See layer.py for more examples of layer compositions.

See reimaginet for examples of models defined using funktional.


funktional's Issues

Get rid of negative activations with relu GRUs

The negative values are being propagated from the initial state via the GRU's linear interpolation formula. The initial state is initialized to all zeros, but these parameters are also learnable, so they can become negative.

It's confusing to have these negative numbers pop up. The initial state could also be passed through the layer activation, but if we initialize it to zero and use a clipped relu, the units all start out dead.
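To see concretely how the interpolation lets a negative initial state through, here is a tiny numeric sketch (plain numpy with made-up values, not funktional code): even though the relu candidate state is non-negative, the (1-z) term keeps mixing the negative h0 back in.

import numpy as np

h0 = np.array([-0.5])      # learned initial state that has drifted negative
htilde = np.array([0.3])   # candidate state: non-negative after relu
z = np.array([0.4])        # update gate

h1 = (1 - z) * h0 + z * htilde
print(h1)                  # [-0.18]: still negative despite the relu candidate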

Referring by name

I believe it would be a good idea to make it possible to refer by name to:

  • Network parameters: for regularization or just analysis
  • Return values: RNNs return "hidden states", "r states", "z states" and who knows what in the future

This would make experimentation easier. Functions like "last", or anything to do with prediction, could query for "hidden states"; functions related to regularization could query for "weights" or "activation vectors", and so on. I think this would be nicer than, say, regularize(params[10]) or last(output[0]).
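One possible shape for such an API (purely hypothetical; none of these names exist in funktional):

class NamedLayer:
    """Hypothetical sketch: expose parameters and return values by name."""
    def __init__(self):
        self.named = {}   # e.g. {'weights': [...], 'hidden states': ...}

    def query(self, name):
        return self.named[name]

# The issue's examples would then read:
#   regularize(net.query('weights'))
#   last(net.query('hidden states'))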

Implement Net2DeeperNet for GRU layers

Idea based on "Net2Net: Accelerating Learning via Knowledge Transfer" http://arxiv.org/abs/1511.05641
The current grow function should initialize the newly added layer to one implementing an identity function. For a GRU, assuming non-negative inputs, a relu or clipped relu activation, and the following definition of the layer:

import numpy as np

def sigmoid(x): return 1 / (1 + np.exp(-x))
def rectify(x): return np.maximum(x, 0)

def GRU(W, U, Wz, Uz, Wr, Ur, xt, htm1):
    r = sigmoid(np.dot(xt, Wr) + np.dot(htm1, Ur))         # reset gate
    z = sigmoid(np.dot(xt, Wz) + np.dot(htm1, Uz))         # update gate
    htilde = rectify(np.dot(xt, W) + np.dot(r * htm1, U))  # candidate state
    h = (1 - z) * htm1 + z * htilde                        # interpolation
    return h

we could set (a numeric sanity check follows the list):

  • W - Identity
  • U - Zero
  • Wz - Identity + 2 (or any number ensuring z is close to 1)
  • Uz - Identity + 2
  • Wr - random
  • Ur - random
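As a sanity check, here is a sketch using the numpy GRU above; the factor 10 is an arbitrary stand-in for "Identity + 2", chosen to saturate z near 1, and the input values are made up:

import numpy as np

n = 4
I = np.eye(n)
xt = np.array([0.5, 1.2, 0.3, 0.8])     # non-negative input
htm1 = np.array([0.4, 0.1, 0.9, 0.6])   # non-negative previous state

W, U = I, np.zeros((n, n))              # candidate state copies the input
Wz, Uz = 10 * I, 10 * I                 # saturate the update gate: z ~ 1
Wr, Ur = np.random.randn(n, n), np.random.randn(n, n)

h = GRU(W, U, Wz, Uz, Wr, Ur, xt, htm1)
print(np.allclose(h, xt, atol=1e-2))    # True: the grown layer is near-identity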
