
haskell-spsa's Introduction

SPSA in Haskell

Simultaneous Perturbation Stochastic Approximation is a global optimizer for continuous loss functions of any dimension. It has a strong theoretical foundation and very few knobs that must be tuned. While it doesn't get the press of genetic algorithms or other evolutionary computation methods, it competes at their level on a firmer theoretical footing.

Status: stable-ish

Documentation

Still working on this...

Getting Started

You can install via cabal (or cabal-dev)

cabal install spsa

Getting the Source

Get the spsa source.

git clone git://github.com/yanatan16/haskell-spsa spsa

Set up a sandbox.

The first time through, you need to download and install a ton of dependencies, so hang in there.

cd spsa
cabal-dev install \
    --enable-tests \
    --enable-benchmarks \
    --only-dependencies \
    -j

The cabal-dev command is just a sandboxing wrapper around the cabal command. The -j flag above tells cabal to use all of your CPUs, so even the initial build shouldn't take more than a few minutes.

cabal-dev configure --enable-tests --enable-benchmarks
cabal-dev build

Note: For development, use --flags=developer. Development mode lets you run the tests from ghci easily, enabling a good old TDD workflow.

Running Tests

Once you've built the code, you can run the entire test suite in a few seconds.

dist/build/tests/tests +RTS -N

We run the executable directly rather than going through cabal-dev because cabal-dev doesn't pass options through very well. The +RTS -N above tells GHC's runtime system to use all available cores. If you want to explore, the tests program (dist/build/tests/tests) accepts a --help option. Try it out.

Tests From GHCI

You can run all the tests from ghci.

cabal-dev ghci

starts up the REPL.

> import Test.SPSA
> runAllTests

Or you can run a single test

> runTest "SPSA*" -- follows test patterns in Test.Framework
> runGroup 4 -- if you know the group number

Running benchmarks

You can run benchmarks similar to tests.

dist/build/benchmarks/benchmarks

Just like with tests, there's a --help option to explore.

Using SPSA

SPSA minimizes a loss function, so maximization problems must negate the fitness function. It can optionally apply a constraint mapper after each round. SPSA is an iterative optimization algorithm that can be viewed as a stochastic extension of gradient descent. Like Finite Difference Stochastic Approximation (FDSA), it approximates the gradient of the loss function being minimized; unlike FDSA, it perturbs all parameters simultaneously, so it needs only two loss measurements per iteration regardless of the dimension of the parameter vector, whereas FDSA needs 2p, where p is that dimension. Surprisingly, this shortcut has no ill effect on the asymptotic convergence rate and only a small effect on the conditions required for convergence.
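
To make the gradient estimate concrete, here is a minimal sketch of it in Haskell. This is illustrative only, not this library's internals; the names bernoulli and spsaGradient are hypothetical.

import System.Random (randomRIO)

-- Bernoulli +/-1 perturbation vector of dimension p
-- (the asymptotically optimal choice; see Tuning below).
bernoulli :: Int -> IO [Double]
bernoulli p = mapM (const pick) [1 .. p]
  where pick = do b <- randomRIO (False, True)
                  return (if b then 1 else -1)

-- Estimate every gradient component from just two loss measurements:
-- g_i(theta) ~ (loss(theta + c*delta) - loss(theta - c*delta)) / (2*c*delta_i)
spsaGradient :: ([Double] -> Double) -> Double -> [Double] -> IO [Double]
spsaGradient loss c theta = do
  delta <- bernoulli (length theta)
  let plus  = zipWith (\t d -> t + c * d) theta delta
      minus = zipWith (\t d -> t - c * d) theta delta
      diff  = loss plus - loss minus   -- two measurements, any dimension
  return [ diff / (2 * c * d) | d <- delta ]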

Since SPSA uses a small amount of randomness in its gradient estimate, it also injects some noise into the search. This noise is the good kind, however: it is what makes SPSA a global optimizer, and it comes at no cost to the convergence rate, which matches FDSA's.

SPSA has fewer tuning knobs than many other global optimization methods, but it still requires some work. The most important parameter is a in the gain sequence a(k) = a / (k + 1 + A) ^ alpha (see Tuning below); it can wildly affect results. Also, since SPSA uses two function measurements per iteration, if you need to cap measurements at N, pass in N / 2 as the number of rounds to run.
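
Putting the pieces together, a single SPSA run is just repeated gradient steps with decaying gains, with the constraint map applied every round. The sketch below reuses the hypothetical spsaGradient from above; runSPSASketch is a stand-in, not necessarily this package's exported API.

import Control.Monad (foldM)

-- theta(k+1) = constrain (theta(k) - a(k) * ghat(theta(k)))
runSPSASketch :: ([Double] -> Double)   -- loss (negate a fitness to maximize)
              -> ([Double] -> [Double]) -- constraint map, applied each round
              -> [Double]               -- gain sequence a(k)
              -> [Double]               -- gain sequence c(k)
              -> Int                    -- rounds (N / 2 for N measurements)
              -> [Double]               -- initial guess
              -> IO [Double]
runSPSASketch loss constrain ak ck n theta0 =
  foldM step theta0 (take n (zip ak ck))
  where
    step theta (a, c) = do
      g <- spsaGradient loss c theta
      return (constrain (zipWith (\t gi -> t - a * gi) theta g))

For a maximization problem with a budget of N measurements, you would call it with negate . fitness as the loss and N / 2 rounds.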

For more information on SPSA, please see Spall's papers from his website.

Tuning

To tune SPSA, I suggest the semiautomatic tuning method introduced by Spall in his book Introduction to Stochastic Search and Optimization (ISSO). There are three main knobs: the two gain sequences (a and c) and the perturbation distribution, delta. Delta is best chosen as Bernoulli +/- 1, which is asymptotically optimal (it must not be Normal or Uniform; see ISSO). The gain sequences must satisfy certain properties; the standard forms below work well:

a(k) = a / (k + 1 + A) ^ alpha
c(k) = c / (k + 1) ^ gamma

Here, the values alpha and gamma are tied together: the asymptotically optimal pair is alpha = 1 and gamma = 1/6, but alpha = 0.602 and gamma = 0.101 work well for finite runs. A should be about 10% of the total iterations, and c should be approximately the standard deviation of the noise in the loss measurements. a is the most volatile parameter and must be tuned carefully; in general, pick a so that the first step is not so large that it sends the algorithm off in the wrong direction.
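
In code, these standard gain sequences are just lazily evaluated infinite lists. The sketch below is a plain transcription of the formulas above; the names standardAk, standardCk, and semiautomaticTuning mirror ones that appear in this repository's issues, but their real signatures may differ.

-- a(k) = a / (k + 1 + bigA) ^ alpha, for k = 0, 1, 2, ...
standardAk :: Double -> Int -> Double -> [Double]
standardAk a bigA alpha =
  [ a / (fromIntegral k + 1 + fromIntegral bigA) ** alpha | k <- [0 :: Int ..] ]

-- c(k) = c / (k + 1) ^ gamma
standardCk :: Double -> Double -> [Double]
standardCk c gamma =
  [ c / (fromIntegral k + 1) ** gamma | k <- [0 :: Int ..] ]

-- Semiautomatic defaults: A is 10% of the expected iteration count,
-- with the finite-sample exponents alpha = 0.602 and gamma = 0.101.
semiautomaticTuning :: Int -> Double -> Double -> ([Double], [Double])
semiautomaticTuning nIter a c =
  (standardAk a (nIter `div` 10) 0.602, standardCk c 0.101)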

References

  • SPSA Website
  • Introduction to Stochastic Search and Optimization. James Spall. Wiley 2003.

License

The MIT License found in the LICENSE file.


haskell-spsa's Issues

`A` parameter in semiautomatic tuning

semiautomaticTuning :: Int -> Double -> Double -> ([Double],[Double])
semiautomaticTuning nIter a c = (standardAk a (ceiling $ fromIntegral nIter / 10) 0.602, standardCk c 0.101)

Add ability to track progress of SPSA

such as recording thetas every so often, or returning the iteration count

Maybe the API looks like:

(result, params) = runSPSAWithParams spsa guess
configSPSA = do
  setParamGatherer $ collectThetaEvery 5
