Giter Site home page Giter Site logo

metropolis's Introduction

Metropolis generator for Raku language

.github/workflows/test.yml

Metropolis–Hastings algorithm is Markov chain Monte Carlo (MCMC) method for generating sequence of random samples from probability distribution function.

It can be used to generate naturally looking data for applications testing and presentation. For example let's assume that you have software for online marketing campaigns. And you have to generate fake traffic (100 000 site visits) that occurred after campaign started 10 days ago to record tutorial for users. It's obvious that such traffic is not constant. After campaign start traffic is low, then it quickly gains momentum, reaches interest peak and slowly fades away like this:

Campaign traffic

f( x ) = 2x / ( x^3 + 0.5 ), { x > 0 } generated using awesome Desmos calculator.

With this module you will be able to generate such random samples with any probability distribution function within given domain.

Note: This is raw and fast implementation of Metropolis–Hastings algorithm. You may need to turn few dials to get best results for your cases. If you get bad results or timeouts be sure to read TWEAKS section.

TABLE OF CONTENTS

SYNOPSIS

use Metropolis;

my $m = Metropolis.new(
    function => -> $x { 2* $x / ( $x ** 3 + 0.5 ) },
    domain => 0 .. 10
);

say $m.sample( ) for ^42;

METHODS

new

Params:

  • function (mandatory) - probability distribution function, must accept exactly one positional argument.
  • domain (mandatory) - range within which samples should be generated.
  • jumper (optional) - function that will return new sample from given jumping distribution. Check TWEAKS section for more info.

sample

Params:

  • timeout (optional) - how long can algorithm look for next sample, no timeout by default.

Get next sample value of Num type. Nil will be returned in case of timeout.

graph

Params:

  • samples (mandatory) - how many samples should be gathered.
  • scale (optional) - how high y axis should be, 100 characters by default.

Crude plotting to quickly visualize requested distribution. With swapped axis (x is vertical, turn your monitor). All values are rounded to nearest integer. All bars are rounded to nearest 1/8. For example to display distribution from SYNOPSIS:

my $m = Metropolis.new(
    function => -> $x { 2 * $x / ( $x ** 3 + 0.5 ) },
    domain => 0 .. 10
);

$m.graph( samples => 10000, scale => 10 );
 0 ███▋ (1622)
 1 ██████████  (4545)
 2 ████ (1800)
 3 █▊ (803)
 4 ▉ (417)
 5 ▌ (251)
 6 ▌ (208)
 7 ▎ (135)
 8 ▎ (106)
 9 ▎ (91)
10 ▏ (22)

Potato-grade but useful :)

TWEAKS

Common issues:

  • Some places have accumulation of samples way too high for requested probability distribution there.
  • Algorithm is not reaching every part of the domain. Some zero probability gaps (like in f( x ) = sin x function) are not crossed by algorithm at all, places beyond them do not generate samples.
  • Sampling is realy, really slow.

Solutions:

  • Don't go crazy with domain width. By default algorithm is tuned to work with domain of total width <= 10. Remember that you get Num values and you can project them to bigger domain (like for example 864_000 timestamps from the last 10 days) if needed.
  • Avoid functions with zero probability gaps within domain. This algorithm can handle them to some extend. But you will get best results if probability is always positive across whole domain. Sometimes it is better to generate samples for each positive probability area separately.
  • Burn samples to forget initial state. This algorithm is based on Markov chain Monte Carlo method which needs warmup to eventually converge to the desired distribution. For example throw away first 1000 samples.

Expert:

You can provide your own jump function with desired distribution. By default normal distribution with standard-deviation = 2 is used. Uniform distribution is also provided with default delta = 3.

To use prebuilt and/or change its default:

my $m = Metropolis.new(
    ...,
    
    # let's jump bigger zero probability gaps!
    jumper => &uniform-distribution.assuming( delta => 15 ) 
);

To provide own:

my $m = Metropolis.new(
    ...,
    
    # discrete decrease of longer jump distance probability
    jumper => sub ( :$mean ) { # mean is current sample position
        
        # roll jump distance within 0..1 range
        my $jump = rand;
        
        # make small jumps less probable by randomly re-rolling
        $jump max= rand if $jump < 0.5;
        
        # return new sampling point
        # remember output must have equal chance to go both ways!
        return $mean + (1, -1).pick * $jump;
    }
);

metropolis's People

Contributors

bbkr avatar japhb avatar

Stargazers

 avatar  avatar

Watchers

 avatar

Forkers

japhb

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.