Giter Site home page Giter Site logo

interval-metrics's Introduction

interval-metrics

Data structures for measuring performance. Provides lockfree, high-performance mutable state, wrapped in idiomatic Clojure identities, without any external dependencies.

Codahale's metrics library is designed for slowly-evolving metrics with stable dynamics, where multiple readers may request the current value of a metric at any time. Sometimes, you have a single reader which collects metrics over a specific time window--and you want the value of the metric at the end of the window to reflect observations from that window only, rather than including observations from prior windows.

Clojars

https://clojars.org/interval-metrics

Rates

The Metric protocol defines an identity which wraps some mutable measurements. Think of it like an atom or ref, only instead of (swap!) or (alter), it accepts new measurements to merge into its state. All values in this implementation are longs. Let's keep track of a rate of events per second:

user=> (use 'interval-metrics.core)
nil
user=> (def r (rate))
#'user/r

All operations are thread-safe, naturally. Let's tell the rate that we handled 2 events, then 5 more events, than... I dunno, negative 200 events.

user=> (update! r 2)
#<Rate@df69935: 0.5458097134611932>
user=> (update! r 5)
#<Rate@df69935: 1.0991987655505422>
user=> (update! r -200)
#<Rate@df69935: -22.796646374437813>

Metrics implement IDeref, so you can always ask for their current value without changing anything. Rate's value is the sum of all updates, divided by the time since the last snapshot:

user=> (deref r)
-21.14793138384688

The Snapshot protocol defines an operation which gets the current value of an identity and atomically resets the value. For instance, you might call (snapshot! some-rate) every second, and log the resulting value or send it to Graphite:

user=> (snapshot! r)
-14.292050358924806
user=> r
#<Rate@df69935: 0.0>

Note that the rate became zero when we took a snapshot. It'll start accumulating new state afresh.

Reservoirs

Sometimes you want a probabilistic sample of numbers. uniform-reservoir creates a pool of longs which represents a uniformly distributed sample of the updates. Let's create a reservoir which holds three numbers:

user=> (def r (uniform-reservoir 3))
#'user/r

And update it with some values:

user=> (update! r 2)
#<UniformReservoir@49f17507: (2)>
user=> (update! r 1)
#<UniformReservoir@49f17507: (1 2)>
user=> (update! r 3)
#<UniformReservoir@49f17507: (1 2 3)>

The value of a reservoir is a sorted list of the numbers in its current sample. What happens when we add more than three elements?

#<UniformReservoir@399d2d53: (1 2 3)>
user=> (update! r 4)
#<UniformReservoir@399d2d53: (1 3 4)>
user=> (update! r 5)
#<UniformReservoir@399d2d53: (1 3 4)>

Sampling is probabilistic: 4 made it in, but 5 didn't. Updates are always constant time.

user=> (update! r -2)
#<UniformReservoir@399d2d53: (-2 1 3)>
user=> (update! r -3)
#<UniformReservoir@399d2d53: (-2 1 3)>
user=> (update! r -4)
#<UniformReservoir@399d2d53: (-4 -2 3)>

Snapshots (like deref) return the sorted list in O(n log n) time. Note that when we take a snapshot, the reservoir becomes empty again (nil).

user=> (snapshot! r)
(-4 -2 3)
user=> r
#<AtomicMetric@3018fc1a: nil>
user=> (update! r 42)
#<UniformReservoir@593c7b26: (42)>

There are also functions to extract specific values from sorted sequences. To get the 95th percentile value:

#'user/r
user=> (dotimes [_ 10000] (update! r (rand 1000)))
nil
user=> (quantile @r 0.95)
965

Typically, you'd run have a thread periodically take snapshots of your metrics and report some derived values. Here, we extract the median, 95th, 99th, and maximum percentiles seen since the last snapshot:

user=> (map (partial quantile (snapshot! r)) [0.5 0.95 0.99 1])
(496 945 983 999)

Measuring your code's performance

The interval-metrics.measure namespace has some helpers for measuring common things about your code.

(use ['interval-metrics.measure :only '[periodically measure-latency]]
     ['interval-metrics.core    :only '[snapshot! rate+latency]])

Define a hybrid metric which tracks both rates and latency distributions.

(def latencies (rate+latency))

Start a thread to snapshot the latencies every 5 seconds.

(def poller
  (periodically 5 
    (clojure.pprint/pprint (snapshot! latencies))))

The measure-latency macro times how long its body takes to execute, and updates the latencies metric each time.

(while true
  (measure-latency latencies
    (into [] (range 10))))

You'll see a map like this printed every 5 seconds, showing the rate of calls per second, and the latency distribution, in milliseconds.

{:time 1369438387321/1000,
 :rate 316831.2493798337,
 :latencies
 {0.0 2397/1000000,
  0.5 2463/1000000,
  0.95 2641/1000000,
  0.99 2371/500000,
  0.999 9597/1000000}}

Don't like rationals? Who doesn't! It's easy to map those latencies to a less precise type:

(measure/periodically 5
  (-> latencies
      metrics/snapshot!
      (update-in [:latencies]
                 (partial map (juxt key (comp float val))))
      pprint))

...

{:time 699491725283/500,
 :rate 554.994854087713,
 :latencies
 ([0.0 9.388433]
  [0.5 39.118896]
  [0.95 50.673603]
  [0.99 53.583065]
  [0.999 57.83346])}

Kill the loop with ^C, then shut down the poller thread by calling (poller).

You can configure the quantiles, reservoir size, and units for both the rate and the latencies by passing an options map to rate+latency:

(def latencies (rate+latency {:latency-unit :microseconds
                              :rate-unit    :weeks}))

Performance

All algorithms are lockless. I sacrifice some correctness for performance, but never drop writes. Synchronization drift should be negligible compared to typical (~1s) sampling intervals. Rates on a highly contended JVM are accurate, in experiments, to at least three sigfigs.

With four threads updating, and one thread taking a snapshot every n seconds, my laptop can push between 10 to 18 million updates per second to a single rate+latency, reservoir, or rate object, saturating 99% of four cores.

License

Licensed under the Eclipse Public License.

How to run the tests

lein midje will run all tests.

lein midje namespace.* will run only tests beginning with "namespace.".

lein midje :autotest will run all the tests indefinitely. It sets up a watcher on the code files. If they change, only the relevant tests will be run again.

interval-metrics's People

Contributors

aphyr avatar pyr avatar

Stargazers

Timothée Peignier avatar Erich Ocean avatar  avatar Mark Sto avatar vemv avatar zhongxiao avatar Kyle Feng avatar simon luetzeslchwab avatar Adrian Lanzafame avatar Narendra Joshi avatar Jon Tanner avatar Justin Tirrell avatar Jiacai Liu avatar Vebjorn Ljosa avatar Vic avatar Priyatam Mudi avatar hamlet avatar  avatar larluo avatar Chris Blom avatar  avatar James Conroy-Finn avatar Eli Zor avatar Thomas Engelschmidt avatar Reid D McKenzie avatar Ian Barfield avatar Brian Hurlow avatar  avatar Andrea Crotti avatar  avatar Angus H. avatar gaurav patel avatar Martin Seeler avatar Martin Spasovski avatar Athan avatar Josef Galea avatar Ross Delinger avatar Zak Kristjanson avatar Michael Hood avatar Eric Bailey avatar Michael Drogalis avatar Graydon Hoare avatar Laco Gubík avatar Bahadir Cambel avatar Changsu Jiang avatar Gleicon Moraes avatar Raphael Randschau avatar Khaja Minhajuddin avatar Nate Smith avatar Zeeshan Lakhani avatar Jeremy Heiler avatar Michael Rubanov avatar Yan avatar Bruno Vecchi avatar Benjamin G avatar Chris McDevitt avatar bai avatar Andrii Mishkovskyi avatar  avatar Dominic LoBue avatar Michał Marczyk avatar James Thornton avatar Andy Fingerhut avatar Thomas Heller avatar Philipp Küng avatar Andrey Antukh avatar Will O'Brien avatar Zev Blut avatar Javier Honduvilla Coto avatar  avatar Thomas Matthijs avatar Edward Wible avatar Mike Fletcher avatar Don Jackson avatar Ryan McGowan avatar Ben Mabey avatar  avatar Homer Strong avatar Paul Hammond avatar Arnout Roemers avatar Tom Hickey avatar Frank Shearar avatar Michael Bridgen avatar Volodymyr Epifanov avatar Joël Perras avatar Julien avatar Ali Baitam avatar Paul Bergeron avatar Max Barnash avatar Bruce Durling avatar Andreas avatar Markus Kohler avatar Henrik Feldt avatar Cara Mitchell avatar Johan Prinsloo avatar Joel Boehland avatar Nathan Witmer avatar Ronen avatar Jay avatar Cagatay Kavukcuoglu avatar

Watchers

 avatar Max Penet avatar aliraza avatar hamlet avatar James Cloos avatar  avatar

interval-metrics's Issues

update! on rate+latency returns a regular rate

This was a little surprising to me:

ns=> (def r (rate+latency))
ns=> (type r)
interval_metrics.core.RateLatency
ns=> (def u (update! r 1))
ns=> (type u)
interval_metrics.core.Rate

I think maybe this should just return this. Happy to write a PR.

The uploaded jar on clojars does not contain interval-metrics.core/rate+latency

Hi Kyle,

When dependencies are pull from clojars, riemann won't start because of a interval-metrics.core/rate+latency does not exist. This is not a version mismatch since 0.0.2 contains the code on github, I guess a previous jar was pushed with the right pom.xml.

Releasing a 0.0.3 would fix this (with correct dependencies in riemann).

File t_measure.clj could be removed

If not removed, it should have its namespace changed. Right now its namespace does not match its file name, and is a duplicate of the namespace defined in file t_core.clj

Found using pre-release version of Eastwood Clojure lint tool

Reservoir race condition

Reservoir updates its size before writing a new value in. A concurrently reading thread might see the initial value from the array. We should fill the array with a default value (say, Long.MIN_VALUE), document that choice, and filter out that value during derefs.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.