irllabs / ml-lib Goto Github PK

A machine learning library for Max and Pure Data

ml-lib's Introduction

ml-lib

ml-lib is a library of machine learning externals for Max and Pure Data. ml-lib is primarily based on the Gesture Recognition Toolkit by Nick Gillian. ml-lib is designed to work on a variety of platforms including OS X, Windows and Linux, on Intel and ARM architectures.

The goal of ml-lib is to provide a simple, consistent interface to a wide range of machine learning techniques in Max and Pure Data. The canonical NIME 2015 paper on ml-lib can be found here.

Full class documentation can be found here.

Bug Reports and Discussion

Please use the GitHub Issue Tracker for all bug reports and feature requests.

Development Status

The library has been tested on Mac OS X with Max 7 and 8, and with Pure Data on Mac OS X and Linux on i386 and armv6 architectures.

Bugs should be reported via the issues page.

Installation

For Max, install by searching for ml.lib in the Max Package Manager that can be found under File -> Show Package Manager.
For Pd, install by searching for ml.lib in the Deken externals manager that can be found under Help -> Find Externals. Once installed, add to your Pd search path the exact path to the ml.lib folder inside Pd/externals.

Compiling from source

Instructions for compiling ml-lib from source can be found here.

Library structure

ml-lib objects follow the naming convention ml.* where “*” is an abbreviated form of the algorithm implemented by the object.

A full list of all objects and their parameters can be found here.

For more detailed descriptions of the underlying algorithms, see links below.

Objects fall into one of five categories:

Pre-processing: pre-process data prior to it being used as input to a classification or regression object
Post-processing: post-process data after being output from a classification or regression object
Feature extraction: extract “features” from control data. Feature vectors can be used as input to classification or regression objects
Classification: take feature vectors as input, and output a value representing the class of the input. For example an object detecting hand position might output 0 for left, 1 for right, 2 for top and 3 for bottom.
Regression: perform an M x N mapping between an input vector and an output vector with one or more dimensions. For example an object may map x and y dimensions of hand position to a single dimension representing the distance from origin (0, 0).
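
As a rough illustration of the difference between the last two categories, the sketch below writes out the two examples above in plain Python. It does not use ml-lib or its API: the function names and the reference positions in CLASS_CENTRES are assumptions made for this example. The regression target is computed directly here, whereas a regression object would learn to approximate it from training pairs.

```python
import math

# Hypothetical reference positions for each class label (not taken from ml-lib).
CLASS_CENTRES = {
    0: (-1.0, 0.0),  # left
    1: (1.0, 0.0),   # right
    2: (0.0, 1.0),   # top
    3: (0.0, -1.0),  # bottom
}


def classify_hand(x, y):
    """Classification: return the label of the nearest reference position."""
    return min(
        CLASS_CENTRES,
        key=lambda label: math.dist((x, y), CLASS_CENTRES[label]),
    )


def distance_from_origin(x, y):
    """Regression target: a 2-in, 1-out mapping to the distance from (0, 0)."""
    return math.hypot(x, y)
```

A classifier returns one of a fixed set of labels (here 0 to 3), while the regression mapping returns a continuous value.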

Object list

Pre-processing

No objects currently implemented

Post-processing

No objects currently implemented

Feature extraction

ml.minmax: output a vector of minima and maxima locations (peaks) from an input vector
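
To make the idea of minima and maxima locations concrete, here is a minimal Python sketch that returns the indices of strict local minima and maxima in a vector. It is an assumption-laden illustration, not the algorithm used by ml.minmax, which may treat endpoints, plateaus, and thresholds differently; the function name is hypothetical.

```python
def minmax_locations(values):
    """Return (minima, maxima): indices of strict local minima and maxima.

    Endpoints are never reported because they have only one neighbour.
    """
    minima, maxima = [], []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur < prev and cur < nxt:
            minima.append(i)
        elif cur > prev and cur > nxt:
            maxima.append(i)
    return minima, maxima
```

For the input [0, 2, 1, 3, 0], the sketch reports a minimum at index 2 and maxima at indices 1 and 3.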

Classification

ml.adaboost: Adaptive Boosting
ml.dtw: Dynamic Time Warping
ml.gmm: Gaussian Mixture Model
ml.hmmc: Hidden Markov Models
ml.knn: k-Nearest Neighbour
ml.mindist: Minimum Distance
ml.randforest: Random Decision Forest
ml.softmax: Softmax
ml.svm: Support Vector Machines

Regression

ml.linreg: Linear Regression
ml.logreg: Logistic Regression
ml.ann: Multi-layer Perceptron
ml.mulreg: Multiple Regression

See the help file for each component for further details about operation and usage.

Credits

This software has been designed and developed by Ali Momeni and Jamie Bullock. The [Gesture Recognition Toolkit](http://nickgillian.com/grt/index.html) is developed by Nick Gillian.

ml-lib is supported by Art Fab, the College of Fine Arts and The Frank-Ratchye STUDIO for Creative Inquiry at Carnegie Mellon University.

Special thanks to Niccolò Granieri for testing and assistance.

License

ml-lib is distributed under the GNU General Public License version 2. A copy of this is available in the accompanying LICENSE file. See also http://www.gnu.org/licenses/.

ml-lib's Issues

ml.mlp and documentation in general

there are a lot of methods and attributes in this class; i could really use a list/table that shows each attribute and a sentence that describes it; i'd love to move in this direction for the overall documentation also; if you could provide a very basic version of this (it could even be as comments in the code which would make the code more legible) i can work on a standard formatting for html documentation, like a javadoc for java classes, or max's built-in documentation, or this github page for maw.lib, another max development project of mine (this one, very comprehensive):

https://github.com/themaw/ofxLivedrawEngine/blob/master/docs/Livedraw_OSC_Command_Set.md

what do you think?

Remove debug messages from console window

The external is now posting a lot of text to the pd-window during prediction; seems like debugging messages. please remove; eg

prob_label 0 1
prob_label 1 2
prob_label 2 3
0.361318
0.218911
0.419771

use cases of DTW

i'd like to make an example that uses the familiar 3-axis accel of a mobile phone to distinguish between gestures.
how might this work? from the help file it seems that DTW works on time based functions of dimensionality ONE;
so in my case, i'd take just one of the accelerometers?

from the examples it also seems that in order to classify, one must have the entire test sequence; so is "on-the-fly" classification of a new sequence (i.e. gesture), as it's being performed, not possible?

ali
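
For reference, DTW in general is not restricted to one-dimensional signals: each time step can be a vector (for example the three accelerometer axes), with a vector distance used as the local cost. The Python sketch below shows that textbook formulation. It is not the ml.dtw implementation and says nothing about what the external currently supports; the function name is hypothetical. Note that this basic form does need both complete sequences, which matches the observation about on-the-fly classification above.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-dimension frames.

    Each frame is a tuple such as (x, y, z); the local cost is the
    Euclidean distance between frames.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in seq_a only
                cost[i][j - 1],      # step in seq_b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]
```

A gesture would then be assigned the label of the recorded template with the smallest DTW cost.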

Make all console messages report the external they belong to

All calls to error() and post() in the flext API should prefix their printed strings with the name of the external printing the message

ml-lib and DTW

i looked further into DTW
very simple and elegant
i think we should make this our next goal as it will be fairly fast to push out.

starting with existing code, or one of the optimized C++ libs you mentioned; your call.

what do you think?

ali

Training doesn't work in Max

Training doesn't currently work in Max. Spurious data is printed to the console.

Estimates and prediction are wrong if estimates is set to "1"

It turns out #4 was a red herring in that it masked a deeper problem. Estimates in general seem to be incorrect so if the "estimates" attribute is set to 1, both the estimates and the prediction are wrong.

This isn't an issue with the ml.libsvm wrapper, but rather seems to be an issue with the underlying libsvm library.

For example I create a training file with:

1 1 1 1
2 2 2 2
3 3 3 3

When I run the svm-train utility and then svm-predict, it gets 100% prediction accuracy.

However if I set the "probability" flag (-b 1) to include the probability estimates then it gets 0% classification accuracy.

I am therefore writing to the library author about this.

Use "help" instead of "bang" to get help

more ml-lib.... DTW, MLP, ANN and HMMM

A reminder for assessment report (on choice of source library to port, and time required to port it) of future directions for ml-lib, including added features to flext.

ali

Send "loaded" confirmation to right outlet after loading model

"estimates 1" in max

hello,
with the latest release:
https://github.com/jamiebullock/libsvm-flext-wrapper/releases/tag/v0.2.1-alpha

when i send "estimates 1" to the object, i get this error with every "predict" message:

error: probability attribute set to 1, but model doesn't support probability

i'm not able to get the continuous confidence vector out of the right outlet.
ideas?

Cross-validation always reports accuracy=0

estimates output from 2nd outlet of PD version

it seems that the list coming out of the 2nd outlet of the PD object is not a "good" list. i can not take off the word "estimates" with the "route" object as i expect.

the updated pd help patch shows this issue.

Make build system generate ZIP file for deployment builds

The build system should make it possible to generate a ZIP file containing the ml.* external + related help files via deployment builds

Change spelling from normalise to normalize

ml.mlp saving training data

i also noticed that "save /filename.txt" didn't work whereas "save /tmp/filename.txt" works.

it's also conventional for messages like "save", given with no argument, to open a file save dialog box; if possible to implement that for all objects that save things, it'd be a plus.

ali

ml.svm error messages and error messages in general

right now if i send something "bad" to ml.mlp, i get:

change printed error for unknown messages to something more legible than "ml.svm: message unhandled - inlet:0 args:0 symbol:getattributues"

would be nice to make the error more legible, for this and other classes

max/pd helpfiles in Git

i have max and pd helpfiles for ml.libsvm
i'd like to put them in git;
the releases should then pull them and include them in the downloadable archive.

looking into the future:
if the root folder of the ml-lib library is for all ml-ports we make, including the present libsvm and also others, then we should reorganize the folder structure; it'd be nice to see a folder for each ported lib (so far, only libsvm); it'd be nice if the name of each of these folders IS the name of the max/pd external (so far, only ml.libsvm); it'd then be best if the max/pd help patches for that port are in that folder.

can we make that happen?

XCode compilation errors

Hello,
I'm trying to compile the latest pull in XCode 5.0.1

i have my dependencies set up (flext, libsvm, max, pd):

i am getting this error:

here's the text:
ld: warning: directory not found for option '-L/Users/ali/Library/Developer/Xcode/DerivedData/ml_libsvm-hkikijpcatmqewduettembkghmuh/Build/Products/Development'
ld: warning: directory not found for option '-F/Users/ali/Library/Developer/Xcode/DerivedData/ml_libsvm-hkikijpcatmqewduettembkghmuh/Build/Products/Development'
ld: library not found for -lflext-pd_sd
clang: error: linker command failed with exit code 1 (use -v to see invocation)

what am i missing?

The "getweights" message puts out a whole bunch of spaces:

Change naming convention to ml.libsvm

This requires creating a flext "library" rather than just a standalone external

Make the class probability estimates a regular Pd list

please make this list a simple pd list, as opposed to a sparse list with colons and spaces

ml.mlp and output vectors

what i imagine for MLP is a system where you associate N-dim input vectors with M-dim output vectors. during training, one would associate a number of samples of N-dimensional vectors to produce a certain M-dim output vector.

this is what wekinator does.

an example would be to use the 3 axis accel data and control a synth that has 7 parameters.

is this how this object works?
do you use "input_activation_function" to provide the object with desired OUTPUT vectors?

or basically: how do you provide ml.mlp with the desired M-dim output vectors that your provided examples of N-dim input vectors should map to?

consistency of naming for "enable_estimates" across ml.*

all the classes that output continuous estimates should call it the same thing.
right now we have "estimates" (ml.svm) and "enable_estimates" (others).

i'm also into method names that make intuitive sense regardless of the underlying CS names; "continuous" makes more sense to me than "estimates" for all the classes that have this feature.

what do you think?

inlet/messages in max

in max, you can usually Shift-click on an inlet and see all the messages an object can receive. for example, shift clicking on the 1st inlet of the "print" object shows:

when i shift click on the 1st inlet of ml.libsvm, i don't get a whole lot of meaningful help:

can this be fixed?

ali

Explain criteria for correct operation when "estimates 1" in help file

Fix usage statements for base classes

Build ml.libsvm for RPi

Semantics of "classify"

We are currently using the classify message as a way to get the "decision" from our objects for a given input. However the semantics of "classify" only really make sense for classification objects. Most of the objects fit this category, however some are "regression" objects, and for these classify doesn't make much sense. Should we stick with "classify" for simplicity and ease-of-use, or introduce a new message for the regression objects? "map" springs to mind as an obvious choice.

Let me know...

ml.libsvm instantiation

i'm noticing an annoying feature of the way our library is made. typically, max/pd users place files that they want to use in the max/pd search-path. however, since the external ("ml") doesn't have the same name as the subclasses of the external whose names go into a max/pd object box (i.e. "ml.libsvm"), putting the external just anywhere in the search path is not sufficient; it must indeed be placed in the "start-up" folder. alternatively, you can put down an object in max/pd called "ml": max/pd will give an error (because there is no object simply called "ml" in our library), but then the object loads and you can put down an "ml.libsvm" object with no problem.

surprisingly, max and pd both behave the same way on this.
having to put externals in the start-up folder is atypical and potentially confusing to newcomers.

what do you think?

Print "normalized 1" instead of "normalised" when normalisation is complete

Label the probability estimates with a word, e.g. "estimates"

During prediction, the right outlet reports the continuous "confidence" weights; great; if this is what you mean by "weights" then please put the word "weights" before that list; if this is not what you mean by "weights" then please put another more appropriate word before the list

change ml.DTW to ml.dtw

please remove uppercase from ml.DTW to make ml.dtw, for consistency with max world

help file naming

the helpfiles for max and pd need to be named so that they are automatically opened when users option-click (max) or right-click-and-select-help (pd).

for max that would be:

ml.libsvm -> ml.libsvm.maxhelp
ml.dtw -> ml.dtw.maxhelp

for pd, you'd know better.

Set up separate development branch in Git repository

Mode doesn't automatically switch when loading training data from file

Steps to reproduce:

instantiate an ml.mlp
load some regression data from file

Expected:

the data loads and the object is set to regression mode (mode 1)

Actual:

the load operation fails

Suspected Reason:

ml.mlp attempts to load regression data into a classifier instead of automatically switching mode

Labelling vectors with "0" is not working

I'm training 4 positions, with labels/categories 0/1/2/3; i'm finding that svm never reports back a 0; are categories called "0" not allowed?

Add full help files for ml.dtw

ml.svm needs getattributes

this class, as well as every other class, should implement a "getattributes" message that works the way the present ml.mlp class works

Incorporate help files into releases

The build system needs to properly incorporate help files into releases

error while instantiating release 4 in max 6.1

the object instantiates, but i get this error in the max window:

ml-lib - machine learning library for Max and Pure Data

version 0.3.2 (c) 2013 Carnegie Mellon University

error: Class ml not found in library!
ml.libsvm: Support Vector Machines using the libsvm library

ml.peak

this is a useful class.
however, what i really need is a peak detector that works on vectors of data as opposed to streams.
here's an example:
you do an fft and have the bin magnitudes for an audio signal; now you want to find the peaks in this series so that you can feed them to SVM for training and classification.
we spoke about this before; our research showed that the best algorithms were actually from chemistry (lots of noisy data there).

this was among the best we found:
https://github.com/xuphys/peakdetect?source=c

what do you think?
