xxxnell / flex

Probabilistic deep learning for data streams.

License: MIT License

Scala 76.27% Python 23.42% Shell 0.23% HTML 0.08%
scala functional-programming probability probability-density-function probability-distribution statistics data-stream

flex's People

Contributors

kailuowang, kmsiapps, leifwickland, namukpark, tenkeyless, xxxnell

flex's Issues

How about adding CODE_OF_CONDUCT.md to flip?

Type of issue

  • Doc

Description

  • I think an open-source project should have a code of conduct. How about adding CODE_OF_CONDUCT.md to flip?
    You could add CODE_OF_CONDUCT.md from the link below 😄

Improve KL-divergence accuracy

When the KL-divergence is calculated, the boundary region vanishes, so its contribution is not included in the result. Therefore, when the number of samples is too small (fewer than 100) or the boundary ratio is too high (greater than 0.01), the numerically computed KL-divergence is inaccurate.
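The effect of a dropped boundary can be illustrated with a minimal sketch. This is not flex's implementation: `KldSketch`, `normalPdf`, and `kld` are hypothetical names, and a midpoint Riemann sum stands in for whatever numerical integration the project uses. For N(0, 1) against N(1, 1) the analytic KL-divergence is 0.5; integrating only over [-1, 1] gives roughly 0.34, because the tail contribution is lost.

```scala
object KldSketch {
  // Density of a normal distribution with mean mu and standard deviation sigma.
  def normalPdf(mu: Double, sigma: Double): Double => Double = { x =>
    val z = (x - mu) / sigma
    math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.Pi))
  }

  // Midpoint Riemann sum of p(x) * log(p(x) / q(x)) over [start, end] using n cells.
  // Cells where either density underflows to zero are skipped.
  def kld(p: Double => Double, q: Double => Double,
          start: Double, end: Double, n: Int): Double = {
    val dx = (end - start) / n
    (0 until n).map { i =>
      val x = start + (i + 0.5) * dx
      val px = p(x)
      val qx = q(x)
      if (px > 0 && qx > 0) px * math.log(px / qx) * dx else 0.0
    }.sum
  }
}
```

Comparing `kld(p, q, -10.0, 10.0, 2000)` with `kld(p, q, -1.0, 1.0, 200)` shows how much of the estimate depends on the region near the boundary.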

Implement `SelectiveSketch`

Implement SelectiveSketch, which performs deepUpdate only when there is a discrepancy between the temporarily collected sample data and the distribution recorded by Sketch.
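One way to picture the selective decision is sketched below. The issue does not say how the discrepancy should be measured, so this example assumes a Kolmogorov-Smirnov statistic between the buffered samples and the recorded CDF; `SelectiveUpdateSketch`, `ksStatistic`, `shouldDeepUpdate`, and the `threshold` parameter are all hypothetical and not part of flex.

```scala
object SelectiveUpdateSketch {
  // Kolmogorov-Smirnov statistic: the largest gap between the empirical CDF
  // of the samples and the reference cdf. Requires a non-empty sample.
  def ksStatistic(samples: Seq[Double], cdf: Double => Double): Double = {
    val sorted = samples.sorted
    val n = sorted.size
    sorted.zipWithIndex.map { case (x, i) =>
      val f = cdf(x)
      // Compare against the empirical CDF just before and just after x.
      math.max(math.abs(f - i.toDouble / n), math.abs((i + 1).toDouble / n - f))
    }.max
  }

  // Trigger the expensive update only when the buffered samples disagree
  // with the recorded distribution by more than the threshold.
  def shouldDeepUpdate(samples: Seq[Double], cdf: Double => Double,
                       threshold: Double): Boolean =
    samples.nonEmpty && ksStatistic(samples, cdf) > threshold
}
```

With a uniform CDF on [0, 1], evenly spread samples stay under a threshold of 0.2, while samples clustered near 0.9 exceed it and would trigger the update.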

Apply ND4J

ND4J, or N-Dimensional Arrays for Java, is a scientific computing library for the JVM. It is meant to be used in production environments, which means routines are designed to run fast with minimum RAM requirements.
It would be better to replace the current array computation with ND4J.

Support for comprehension

Currently, only Sketch has a working monad operation, so a for comprehension that includes any Dist other than Sketch will not work properly.
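For context, Scala desugars a for comprehension into `flatMap` and `map` calls, so every type that appears in a generator needs both operations. The toy below is a sketch under that assumption only: this `Dist` trait, `const`, and `uniform` are hypothetical stand-ins and do not mirror flex's actual `Dist` hierarchy.

```scala
import scala.util.Random

object ForCompSketch {
  // Toy distribution: it only knows how to draw one sample from a random source.
  trait Dist[A] { self =>
    def sample(rng: Random): A

    // Transform each drawn value.
    def map[B](f: A => B): Dist[B] = new Dist[B] {
      def sample(rng: Random): B = f(self.sample(rng))
    }

    // Draw a value, use it to choose the next distribution, then draw from that.
    def flatMap[B](f: A => Dist[B]): Dist[B] = new Dist[B] {
      def sample(rng: Random): B = f(self.sample(rng)).sample(rng)
    }
  }

  // A distribution that always yields the same value.
  def const[A](a: A): Dist[A] = new Dist[A] {
    def sample(rng: Random): A = a
  }

  // Uniform distribution on [0, 1).
  def uniform: Dist[Double] = new Dist[Double] {
    def sample(rng: Random): Double = rng.nextDouble()
  }

  // Desugars to uniform.flatMap(x => const(10.0).map(y => x + y)).
  val shifted: Dist[Double] = for {
    x <- uniform
    y <- const(10.0)
  } yield x + y
}
```

Because `map` and `flatMap` are defined on the shared trait here, different kinds of distribution can be mixed in one comprehension.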

Memo `icdf`

Computing icdf is an expensive operation. Therefore, cache (memoize) its results to improve performance.
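A minimal sketch of the caching idea, assuming the icdf can be treated as a plain `Double => Double` function. `IcdfMemoSketch` and `Memo` are hypothetical names, and the mutable, non-thread-safe cache is chosen only for brevity; a purely functional code base may prefer to carry the cache as explicit state.

```scala
import scala.collection.mutable

object IcdfMemoSketch {
  // Wraps an expensive function and remembers results by argument.
  // Not thread-safe: intended only to illustrate the idea.
  final class Memo[A, B](f: A => B) extends (A => B) {
    private val cache = mutable.HashMap.empty[A, B]

    // Computes f(a) on the first call for a given a; later calls hit the cache.
    def apply(a: A): B = cache.getOrElseUpdate(a, f(a))
  }
}
```

Wrapping a slow quantile function as `new IcdfMemoSketch.Memo(slowIcdf)` means repeated queries at the same probability evaluate the underlying function only once.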

Experiment with large scale `ConcatSmoothingPs`

When Sketch estimates the density, the KL-divergence value it obtains is too low because the boundary is not processed properly. Therefore, as a way of smoothing the edges, use the large-scale ConcatSmoothingPs and then re-examine the KL-divergence when performing deepUpdate.

Modularize `smoothing`

Smoothing operations are used in several places. Their use in UpdateCmap and DeepUpdate is especially important.

As part of refactoring the smoothing operation, it should be possible to apply several smoothing methods dynamically.

Execute all experiment codes in CLI

The sbt experiment command in the root project should execute all experiments (cf. the flex.experiment package). However, for now, only one experiment is executed (the one selected by arg0). Therefore, when the experiment command is given no argument, it should run all the experiment codes. See Tasks.

Add various samplings

Flip seems to be able to compose various sampling methodologies such as MCMC or Gibbs.

sbt task to execute all experiments

So far, I have had to call runMain to run each implemented experiment. However, as the number of experiments increases, it is no longer practical to run them one by one. Therefore, an sbt task that executes all experiments is needed.

Plot with Measurable

Currently, plot contains primitive records only. However, in some cases, a plot with a measurable range, or RangeM, would be useful.

Should sampling return Option?

Currently, sampling of SamplingDist returns an Option of DensityPlot to handle a Sketch with an empty structure. However, it could return DensityPlot.empty instead of None.
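The two styles can be compared in a small sketch. `Plot`, `samplingOpt`, and `sampling` below are hypothetical stand-ins, not flex's DensityPlot or SamplingDist; the example only shows how an empty value removes the need for callers to unwrap an Option.

```scala
object SamplingSketch {
  // Hypothetical stand-in for a density plot: a list of (x, weight) records.
  final case class Plot(records: List[(Double, Double)]) {
    def isEmpty: Boolean = records.isEmpty
    // Total weight of all records; 0.0 for the empty plot.
    def total: Double = records.map(_._2).sum
  }

  object Plot {
    val empty: Plot = Plot(Nil)
  }

  // Option-returning style: the empty case is signalled with None.
  def samplingOpt(data: List[Double]): Option[Plot] =
    if (data.isEmpty) None
    else Some(Plot(data.sorted.map(x => (x, 1.0 / data.size))))

  // Empty-value style: the empty case is an ordinary Plot.
  def sampling(data: List[Double]): Plot =
    samplingOpt(data).getOrElse(Plot.empty)
}
```

With the second style, `sampling(Nil).total` can be called directly, at the cost of callers no longer being forced by the type to consider the empty case.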

`icdfPlot` in `updateCmap` of `EqualSpaceCdfUpdate` doesn't return infinity at 0 and 1.

Theoretically, the inverse CDF (quantile function) returns ±∞ at 0 and 1. However, due to the limitations of the way Sketch treats boundaries, it returns only a large finite value.

For now, we take the approach of artificially removing the two boundary values, but we need a more sophisticated way of getting a new Cmap in this function.

`bind` returns NaN

bind returns NaN for this configuration:

    val samplingNo = 50

    implicit val conf: SketchConf = SketchConf(
      startThreshold = 50,
      thresholdPeriod = 100,
      boundaryCorr = 0.1,
      decayFactor = 0,
      queueSize = 30,
      cmapSize = samplingNo,
      cmapNo = 5,
      cmapStart = Some(-10d),
      cmapEnd = Some(10),
      counterSize = samplingNo
    )

For more detail, see the code.

Abstract and separate `sampling`

I have now packaged the sampling algorithm independently to separate the sampling methods. However, the legacy code is tightly coupled, so it has to be replaced with the new one.

See cmapForEqualSpaceCumCorr of EqualSpaceCdfUpdate