SOMBER

somber (Somber Organizes Maps By Enabling Recurrence) is a collection of numpy/Python implementations of various kinds of Self-Organizing Maps (SOMs), with a focus on SOMs for sequence data.

To the best of my knowledge, the sequential SOM algorithms implemented in this package haven't been open-sourced yet. If you do find examples, please let me know, so I can compare and link to them.

The package currently contains implementations of:

Because these sequential SOMs rely on internal dynamics for convergence, i.e. they are not trained toward an external label the way a regular Recurrent Neural Network is, processing in a sequential SOM is currently strictly online: every example is processed separately, and the weights are updated after every example. Research into batching and/or multi-threading is currently underway.
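
To make "strictly online" concrete, here is a minimal numpy sketch of one update per example for a plain, non-recurrent SOM. It is not SOMBER's code: the helper name `online_som_step`, the Gaussian neighborhood, and the fixed `sigma` are assumptions chosen for illustration.

```python
import numpy as np


def online_som_step(weights, grid, x, learning_rate, sigma):
    """Apply one online update for a single example x (hypothetical helper)."""
    # Best matching unit: the neuron whose weight vector is closest to x.
    distances = np.linalg.norm(weights - x, axis=1)
    bmu = int(np.argmin(distances))
    # Gaussian neighborhood around the BMU, measured on the map grid.
    grid_dist_sq = np.sum((grid - grid[bmu]) ** 2, axis=1)
    influence = np.exp(-grid_dist_sq / (2.0 * sigma ** 2))
    # Move every neuron toward x, scaled by its neighborhood influence.
    weights += learning_rate * influence[:, None] * (x - weights)
    return bmu


rng = np.random.default_rng(0)
width, height, num_features = 5, 5, 3
# Grid coordinates of each neuron, shape (25, 2).
grid = np.array([(i, j) for i in range(width) for j in range(height)],
                dtype=float)
weights = rng.random((width * height, num_features))

X = rng.random((20, num_features))
for x in X:  # strictly online: one example, one weight update
    online_som_step(weights, grid, x, learning_rate=0.3, sigma=1.0)
```

Each pass through the loop depends on the weights left behind by the previous example, which is why this kind of training does not batch trivially.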

If you need a fast regular SOM, check out SOMPY, which is a direct port of the MATLAB SOM Toolbox.

Usage

Care has been taken to make SOMBER easy to use, and to make it work like a drop-in replacement in sklearn-like systems. The non-recurrent SOMs take as input [M * N] arrays, where M is the number of samples and N is the number of features. The recurrent SOMs take as input [M * S * N] arrays, where M is the number of sequences, S is the number of items per sequence, and N is the number of features.
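
As a quick illustration of the two input layouts described above, the following sketch builds arrays of both shapes with plain numpy. The variable names and sizes are made up for the example, and cutting one long stream into equal-length sequences is just one possible way to obtain the three-dimensional layout.

```python
import numpy as np

num_samples, num_features = 15, 3
# Non-recurrent input: one row per sample, shape (M, N).
X_flat = np.zeros((num_samples, num_features))

num_sequences, items_per_sequence = 4, 6
# Recurrent input: S items per sequence, shape (M, S, N).
X_seq = np.zeros((num_sequences, items_per_sequence, num_features))

# One long stream of items can be cut into equal-length sequences,
# provided its length is a multiple of S.
stream = np.arange(24 * num_features, dtype=float).reshape(24, num_features)
X_from_stream = stream.reshape(-1, items_per_sequence, num_features)

print(X_flat.shape, X_seq.shape, X_from_stream.shape)
```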

Examples

Colors

Color clustering is a kind of Hello, World for SOMs, because it nicely demonstrates how SOMs create a continuous mapping. The color dataset comes from this nice blog.

import numpy as np

from somber import Som

X = np.array([[0., 0., 0.],
              [0., 0., 1.],
              [0., 0., 0.5],
              [0.125, 0.529, 1.0],
              [0.33, 0.4, 0.67],
              [0.6, 0.5, 1.0],
              [0., 1., 0.],
              [1., 0., 0.],
              [0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.],
              [1., 1., 1.],
              [.33, .33, .33],
              [.5, .5, .5],
              [.66, .66, .66]])

color_names = ['black', 'blue', 'darkblue', 'skyblue',
               'greyblue', 'lilac', 'green', 'red',
               'cyan', 'violet', 'yellow', 'white',
               'darkgrey', 'mediumgrey', 'lightgrey']

# initialize
s = Som((10, 10), learning_rate=0.3)

# train
# 10 updates per epoch over 10 epochs = 100 updates to the parameters.
s.fit(X, num_epochs=10, updates_epoch=10)

# predict: get the index of each best matching unit.
predictions = s.predict(X)
# quantization error: how well do the best matching units fit?
quantization_error = s.quantization_error(X)
# inversion: associate each node with the exemplar that fits best.
inverted = s.invert_projection(X, color_names)
# Mapping: get weights, mapped to the grid points of the SOM
mapped = s.map_weights()

import matplotlib.pyplot as plt

plt.imshow(mapped)
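
For readers unfamiliar with the terms in the comments above, this is a small numpy sketch of what "best matching unit" and "quantization error" usually mean: the index of the closest weight vector, and the distance to it. The function name is hypothetical, and SOMBER's own `predict` and `quantization_error` may differ in details such as the distance metric or whether the error is reported per sample.

```python
import numpy as np


def bmu_and_quantization_error(X, weights):
    """Return the closest neuron and its distance per sample (hypothetical)."""
    # Pairwise Euclidean distances, shape (num_samples, num_neurons).
    distances = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    bmus = distances.argmin(axis=1)
    errors = distances.min(axis=1)
    return bmus, errors


# Three toy "neurons" and three colors close to them.
toy_weights = np.array([[0., 0., 0.],
                        [1., 1., 1.],
                        [1., 0., 0.]])
toy_X = np.array([[0.1, 0., 0.],
                  [0.9, 1., 1.],
                  [1., 0., 0.]])

bmus, errors = bmu_and_quantization_error(toy_X, toy_weights)
print(bmus, errors)
```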

Sequences

In this example, we will show that the RecursiveSOM is able to memorize short sequences generated by a Markov chain. We will also demonstrate that the RecursiveSOM can generate sequences that are consistent with the sequences on which it has been trained.

import numpy as np

from somber import RecursiveSom
from string import ascii_lowercase

# Dumb sequence generator.
def seq_gen(num_to_gen, probas):

    symbols = ascii_lowercase[:probas.shape[0]]
    identities = np.eye(probas.shape[0])
    seq = []
    ids = []
    r = 0
    choices = np.arange(probas.shape[0])
    for x in range(num_to_gen):
        r = np.random.choice(choices, p=probas[r])
        ids.append(symbols[r])
        seq.append(identities[r])

    return np.array(seq), ids

# Transition probabilities.
# after A, we have a 50% chance of B and a 50% chance of C
# after B, we have a 100% chance of A
# after C, we have a 50% chance of B and a 50% chance of C
# therefore, we never expect two consecutive As or Bs, but we do expect
# consecutive Cs.
probas = np.array(((0.0, 0.5, 0.5),
                   (1.0, 0.0, 0.0),
                   (0.0, 0.5, 0.5)))

X, ids = seq_gen(10000, probas)

# initialize
# alpha = contribution of non-recurrent part to the activation.
# beta = contribution of recurrent part to activation.
# higher alpha to beta ratio
s = RecursiveSom((10, 10),
                 learning_rate=0.3,
                 alpha=1.2,
                 beta=.9)

# train
# show a progressbar.
s.fit(X, num_epochs=100, updates_epoch=10, show_progressbar=True)

# predict: get the index of each best matching unit.
predictions = s.predict(X)
# quantization error: how well do the best matching units fit?
quantization_error = s.quantization_error(X)

# inversion: associate each node with the exemplar that fits best.
inverted = s.invert_projection(X, ids)

# find which sequences are mapped to which neuron.
receptive_field = s.receptive_field(X, ids)

# generate some data by starting from some position.
# the position can be anything, but must have a dimensionality
# equal to the number of neurons (s.num_neurons).
starting_pos = np.ones(s.num_neurons)
generated_indices = s.generate(50, starting_pos)

# turn the generated indices into a sequence of symbols.
generated_seq = inverted[generated_indices]
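
One way to check whether a generated sequence is consistent with the training chain is to count its transitions and confirm that none of the zero-probability ones occur. The sketch below is an illustration only: `transition_counts` is a hypothetical helper, and because it does not call SOMBER, it samples a stand-in sequence directly from the same transition matrix.

```python
import numpy as np
from string import ascii_lowercase


def transition_counts(symbols, num_symbols):
    """Count how often each symbol follows each other symbol (hypothetical)."""
    index = {s: i for i, s in enumerate(ascii_lowercase[:num_symbols])}
    counts = np.zeros((num_symbols, num_symbols), dtype=int)
    for prev, cur in zip(symbols[:-1], symbols[1:]):
        counts[index[prev], index[cur]] += 1
    return counts


probas = np.array(((0.0, 0.5, 0.5),
                   (1.0, 0.0, 0.0),
                   (0.0, 0.5, 0.5)))

# Stand-in for a generated sequence: sample directly from the chain.
rng = np.random.default_rng(0)
state, symbols = 0, []
for _ in range(1000):
    state = rng.choice(3, p=probas[state])
    symbols.append(ascii_lowercase[state])

counts = transition_counts(symbols, 3)
# Transitions with zero probability should never be observed.
forbidden = counts[probas == 0.0]
print(counts)
print("forbidden transitions observed:", int(forbidden.sum()))
```

Assuming `generated_seq` holds single-letter symbols, passing `list(generated_seq)` instead of `symbols` would apply the same check to the SOM's output.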

TODO

See issues for TODOs/enhancements. If you use SOMBER, feel free to send me suggestions!

Contributors

  • Stéphan Tulkens

LICENSE

MIT
