statsd's Introduction

A Ruby client for StatsD

Installing

Bundler:

gem "statsd-ruby"

Basic Usage

# Set up a global Statsd client for a server on localhost:9125
$statsd = Statsd.new 'localhost', 9125

# Set up a global Statsd client for a server on the IPv6 loopback address (::1), port 9125
$statsd = Statsd.new '::1', 9125

# Send some stats
$statsd.increment 'garets'
$statsd.timing 'glork', 320
$statsd.gauge 'bork', 100

# Use {#time} to time the execution of a block
$statsd.time('account.activate') { @account.activate! }

# Create a namespaced statsd client and increment 'account.activate'
statsd = Statsd.new('localhost').tap{|sd| sd.namespace = 'account'}
statsd.increment 'activate'

Testing

Run the specs with rake spec

Performance

  • A short note about DNS: If you use a DNS name for the host option, you will want a local caching DNS service (e.g. nscd) for optimal performance.

Extensions / Libraries / Extra Docs

Contributing to statsd

  • Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet

  • Check out the issue tracker to make sure someone hasn’t already requested it and/or contributed it

  • Fork the project

  • Start a feature/bugfix branch

  • Commit and push until you are happy with your contribution

  • Make sure to add tests for it. This is important so I don’t break it in a future version unintentionally.

  • Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or a change there is otherwise necessary, that is fine, but please isolate it in its own commit so I can cherry-pick around it.

Contributors

  • Rein Henrichs

  • Alex Williams

  • Andrew Meyer

  • Chris Gaffney

  • Cody Cutrer

  • Corey Donohoe

  • Dotan Nahum

  • Erez Rabih

  • Eric Chapweske

  • Gabriel Burt

  • Hannes Georg

  • James Tucker

  • Jeremy Kemper

  • John Nunemaker

  • Lann Martin

  • Mahesh Murthy

  • Manu J

  • Matt Sanford

  • Nate Bird

  • Noah Lorang

  • Oscar Del Ben

  • Peter Mounce

  • Ray Krueger

  • Reed Lipman

  • rick

  • Ryan Tomayko

  • Schuyler Erle

  • Thomas Whaples

  • Trae Robrock

Copyright © 2011, 2012, 2013 Rein Henrichs. See LICENSE.txt for further details.

statsd's People

Contributors

agis, ajedi32, atmos, aw, ccutrer, elyahou, erez-rabih, gaffneyc, gburt, hannesg, j-manu, jeremy, jnunemaker, jondot, jorgemanrubia, kyrylo, lann, mzsanford, natebird, noahhl, olleolleolle, oscardelben, petergoldstein, raggi, raykrueger, reinh, rtomayko, schuyler, technoweenie, trobrock

statsd's Issues

sample_rate should only be supported for counters.

I think the sample_rate argument should be removed from gauge, set, and time/timing. Allowing users to set a sample rate that statsd will simply ignore invites misunderstandings and confusion about what's actually being recorded and flushed to graphite.

https://github.com/etsy/statsd/blob/53aae0ad2cd33879d79a786e067a5e3fd2548e1d/stats.js#L167 Note that the 3rd field (the sample rate) is only acted on if the type is not ms, g, or s.

I will submit a PR if this is something you'll accept.
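To make the proposal concrete, here is a minimal sketch of a formatter that attaches the sample rate only to counters. The name format_metric is hypothetical and not part of this gem; the wire format follows the "stat:value|type|@rate" shape used by send_stats, and the rule that only counters carry a rate is this issue's suggestion, not confirmed library behavior.

```ruby
# Hypothetical helper (not part of statsd-ruby): build a statsd line,
# appending "|@rate" only for counters ("c"), as proposed in this issue.
def format_metric(stat, value, type, sample_rate = 1)
  rate = type == 'c' && sample_rate != 1 ? "|@#{sample_rate}" : ''
  "#{stat}:#{value}|#{type}#{rate}"
end
```

With this shape, a gauge or timing called with a sample rate would still be formatted, but without a rate the server would ignore.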

UDPSocket usage floods connections

It seems that the current usage of UDPSocket is massively flooding my network's connection pools, saturating them and bringing things to a messy halt. To wit:

def send_to_socket(message)
  socket.send(message, 0, @host, @port)
end

def socket
  Thread.current[:statsd_socket] ||= UDPSocket.new
end

opens a new connection for every call to send_to_socket (Ruby does reuse the object, however!). I can run code that fires off a bunch of metric messages and watch my active connection count skyrocket in my DD-WRT control panel. These connections are cleaned up by my router 30 seconds later, but I can easily saturate my connection pool.

I've modified my local copy to open the socket once, then hang onto it:

def send_to_socket(message)
  socket.send(message, 0)
end

def socket
  Thread.current[:statsd_socket] ||= UDPSocket.new.tap do |socket|
    socket.connect @host, @port
    at_exit { socket.close }
  end
end

This solves the connection storming issue, and my data is still being sent. Is there a reason that it was done the way it is? I don't want to regress anything, but it seems pretty obvious that opening a new socket per message is massively suboptimal.

multiple calls to Statsd.new doesn't recreate socket, address family conflicts

I'm using the IPv6 support from PR #46.

Because the library uses a single per-thread socket that is created only once, calling .new again with a hostname that is either a literal address of another address family or a name that resolves only to an address of another address family leaves the client unable to send: sendto(2) fails with EAFNOSUPPORT (Address family not supported by protocol), and stats are silently discarded.

$statsd1 = Statsd.new '127.0.0.1', 9125 # socket type is AF_INET, seen in lsof
$statsd1.increment "abc"
$statsd2 = Statsd.new '::1', 9125 # socket type should be AF_INET6, but is not
$statsd2.increment "abc" # sendto(2) errors out, seen in strace

Because Thread.current[:statsd_socket] is already defined, the object instance returned by the second call to .new, $statsd2, reuses the IPv4 socket created by the first call, and that socket cannot send to an IPv6 address.

Creating multiple instances might be done if you wanted to use multiple statsd servers for different kinds of metrics. Right now, you can call .new again but it only works if the address family of the hostname in the second call is the same as the first. This problem will also come up if DNS returns either A or AAAA records (but not both) on subsequent attempts to send.

I see a couple of fixes, and which one is appropriate depends on how you intend this library to work.

  1. If you want this library to produce a singleton instance as the result of calling .new, then when .new is called, if Thread.current[:statsd_socket] is a socket, close it and set it to nil. The next time .socket is called, it will create a new socket of the appropriate address family that matches the address family of the hostname passed. The object returned is effectively (thread) global even if the variable holding the instance isn't a ruby global variable.

  2. Recreate a socket of the appropriate address family on every call to send_to_socket, or perhaps only do that conditionally by storing the address family of the last created socket in thread local storage and only recreating it if the address family of the latest hostname used is a different address family.

  3. Make it support multiple instances by storing the socket in the object instance instead of in thread local storage. This way, each call to .new gets its own socket that matches the address family of the hostname passed in.

  4. Convert to a UDP "connected" socket and store it in the object instance. Connect it at creation time and don't bother to store the hostname and port in the object instance (it would no longer be necessary). This has the added bonus of not having to do repeated DNS lookups if a hostname (vs an address) is given (negating that warning about performance and nscd in the README).

Since the socket is UDP and has no on-the-wire protocol state to be maintained (unlike, say, a database connection over TCP), there should be no problems with sharing it between threads, so the socket can be stored on the instance object rather than in thread local storage. This would make options 3 or 4 easy with less code.

The way the docs are written seems to indicate that you should be able to use multiple object instances (and you can, as long as the address family doesn't change), but the way the tests are written exploits the fact that there's only one socket created in TLS (by manipulating it directly to set up the tests).
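Options 3 and 4 could look roughly like the sketch below. PerInstanceClient is a hypothetical class, not the gem's Statsd; it resolves the host once with Addrinfo.udp, creates a socket of the matching address family, and keeps the connected socket on the instance.

```ruby
require 'socket'

# Hypothetical client (not the gem's Statsd class): one connected UDP
# socket per instance, created with the address family of the target host.
class PerInstanceClient
  def initialize(host, port)
    addr = Addrinfo.udp(host, port)        # resolved once, at creation time
    @socket = UDPSocket.new(addr.afamily)  # AF_INET or AF_INET6 as needed
    @socket.connect(addr.ip_address, addr.ip_port)
  end

  # Send a counter increment over the already-connected socket.
  def increment(stat)
    @socket.send("#{stat}:1|c", 0)
  end

  def close
    @socket.close
  end
end
```

Each instance then owns a socket that matches its own host, so a client for '127.0.0.1' and a client for '::1' no longer interfere with each other.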

DNS bottleneck

(I reported this bug in another statsd client; reinh/statsd exhibits the same behavior):

Currently, the DNS service is queried once per sent packet if the supplied host is a hostname and not an IP address.

Ideally, the hostname is resolved only once (ever, or per configurable X seconds) and the resulting IP address is what's used in sending the UDP packet.

Now that I think of it, I believe (though untested) that if the DNS service is down or slow, the sending of the UDP packet in the current code will block while the resolving library tries to resolve the hostname to an IP address - so the whole benefit of UDP being fire-and-forget is overshadowed by the blocking DNS operation before the sending.

Ideally, DNS resolution (wherever it's done - by this code or lower) is done with an appropriate, configurable timeout - and not done too often.
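A minimal sketch of the idea, assuming a hypothetical CachedResolver helper that is not part of this gem: it resolves the hostname at most once per ttl seconds, bounds each lookup with a timeout, and falls back to the last known address when a lookup fails. The lookup: parameter is an illustrative hook for substituting the resolver; by default it uses IPSocket.getaddress.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper (not part of statsd-ruby): cache a hostname lookup
# for `ttl` seconds and bound each lookup with `timeout` seconds.
class CachedResolver
  def initialize(host, ttl: 60, timeout: 1, lookup: nil)
    @host = host
    @ttl = ttl
    @timeout = timeout
    @lookup = lookup || ->(h) { IPSocket.getaddress(h) }
    @address = nil
    @expires_at = 0
  end

  # Return the cached address, refreshing it only after the TTL expires.
  def address
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    return @address if @address && now < @expires_at

    @address = Timeout.timeout(@timeout) { @lookup.call(@host) }
    @expires_at = now + @ttl
    @address
  rescue Timeout::Error, SocketError
    @address # stale address, or nil if the host was never resolved
  end
end
```

The client would then pass resolver.address to UDPSocket#send instead of the hostname. Whether Timeout can actually interrupt a blocked system resolver depends on the Ruby version and platform, so treat the timeout as best effort.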

Send data with specific timestamp

We currently use this gem to send data to Graphite, and it works perfectly, but we would like to add a method that sends data with a specific timestamp. That way we could backfill the data we have been collecting over the last few months with the date it was created rather than the day we sent it.
Perhaps you know a gem that already does this, or maybe you are interested in our contribution. Let us know!
Thanks
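For context, the statsd line format shown elsewhere on this page carries no timestamp field, so one possible workaround is to bypass statsd for backfills and write to Graphite's plaintext listener directly. The sketch below assumes Graphite's plaintext protocol ("path value timestamp" per line, conventionally on TCP port 2003); graphite_line and backfill are hypothetical helpers, not part of this gem.

```ruby
require 'socket'

# Hypothetical helper: format one line of Graphite's plaintext protocol.
def graphite_line(path, value, time)
  "#{path} #{value} #{time.to_i}\n"
end

# Hypothetical helper: write pre-formatted lines to a Graphite (carbon)
# plaintext listener. Port 2003 is the conventional default, assumed here.
def backfill(host, lines, port: 2003)
  TCPSocket.open(host, port) do |socket|
    lines.each { |line| socket.write(line) }
  end
end
```

For example, graphite_line('signups.daily', 42, Time.utc(2013, 1, 1)) yields a line stamped with that date instead of the time of sending.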

Needs new maintainer

I don't use Ruby any more, let alone Statsd.

This project needs a new maintainer.

add support for batching

From Etsy's statsd documentation: "All metrics can also be batch send in a single UDP packet, separated by a newline character."

It would be great if one could start a batch session (within a request for instance) and flush it at the end which would only push a single UDP packet.
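A rough sketch of what such a batch session might look like, assuming a hypothetical StatsdBatch class that is not part of this gem. It buffers formatted metrics and sends them as one newline-separated datagram on flush. It uses a default (IPv4) UDPSocket and does not split batches that exceed the network's maximum datagram size.

```ruby
require 'socket'

# Hypothetical batching helper (not part of statsd-ruby).
class StatsdBatch
  def initialize(host, port)
    @host = host
    @port = port
    @buffer = []
    @socket = UDPSocket.new # IPv4 by default
  end

  # Buffer a counter increment instead of sending it immediately.
  def increment(stat)
    @buffer << "#{stat}:1|c"
  end

  # Buffer a timing in milliseconds.
  def timing(stat, ms)
    @buffer << "#{stat}:#{ms}|ms"
  end

  # Send everything buffered so far as a single newline-separated datagram.
  def flush
    return if @buffer.empty?

    @socket.send(@buffer.join("\n"), 0, @host, @port)
    @buffer.clear
  end
end
```

In a web request, the batch could be created at the start of the request and flushed in an ensure block at the end.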

Gauge support

Is this gem still maintained? Would it be possible to add gauge support? There are already a lot of implementations of it.

Fix the build!

It looks like the Travis CI build is failing. We should fix that!

Stats with sample rates < 1 are being sent randomly?

So reading over the source code, I found this line:

def send_stats(stat, delta, type, sample_rate=1)
  if sample_rate == 1 or rand < sample_rate # <-- Wait, what?
    # Replace Ruby module scoping with '.' and reserved chars (: | @) with underscores.
    stat = stat.to_s.gsub('::', '.').tr(':|@', '_')
    rate = "|@#{sample_rate}" unless sample_rate == 1
    send_to_socket "#{prefix}#{stat}#{postfix}:#{delta}|#{type}#{rate}"
  end
end

I realize the idea is to only send stats sample_rate*100% of the time, but this implementation seems really strange to me.

First of all, using rand instead of some kind of cycling counter seems like a bad idea, as variances in the random number generator could result in long successive runs of either stats not being sent, or always being sent.

Secondly, my personal expectation about the sample rate argument was that it was simply a way of telling the statsd server what the sample rate was, not a way of telling the statsd gem how often to actually send the metric that I just told it to send. That seems like very strange behavior to me.
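For comparison, the "cycling counter" alternative mentioned above could be sketched as follows. CyclingSampler is a hypothetical class, not part of this gem; it accumulates the rate as an exact Rational so that, for example, a rate of 0.1 selects exactly one call in every ten. It is not thread-safe as written.

```ruby
# Hypothetical deterministic sampler (not part of statsd-ruby).
class CyclingSampler
  def initialize(sample_rate)
    # Parse via the decimal string so 0.1 becomes exactly 1/10.
    @rate = Rational(sample_rate.to_s)
    @accumulated = Rational(0)
  end

  # Returns true on the calls that should actually be sent.
  def sample?
    @accumulated += @rate
    return false if @accumulated < 1

    @accumulated -= 1
    true
  end
end
```

One trade-off to keep in mind: a fixed cycle can line up with periodic workloads and consistently sample the same kind of event, which is a common reason for preferring random sampling.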

Minimum RubyGems Version

Hi Rein,

Any idea why 1.3.6 is the minimum rubygems version required?

$ gem install statsd
ERROR:  Error installing statsd:
    statsd requires RubyGems version >= 1.3.6
$ gem --version
1.3.2

Thanks,
Jerry

Statsd#time not resilient to exceptions

The Statsd#time call will fail to record any timings if the block raises an exception. Since execution time can be tightly correlated with exceptional circumstances, it'd be great if these timings were recorded as well.
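A minimal sketch of the requested behavior, assuming a hypothetical time_block helper rather than the gem's actual Statsd#time. The recorder argument is any object that responds to timing(stat, ms), such as the client shown in Basic Usage. The ensure clause records the elapsed time whether or not the block raises, and the exception still propagates to the caller.

```ruby
# Hypothetical helper (not the gem's Statsd#time): record the elapsed time
# of a block even when the block raises.
def time_block(recorder, stat)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
ensure
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  recorder.timing(stat, (elapsed * 1000).round)
end
```

Because ensure does not change the method's result, time_block still returns the block's value on success.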

Add to rubygems

Currently this library doesn't exist in rubygems, it seems.

Drop 1.8.7 support

SimpleCov no longer supports it, and Ruby 1.8.7 itself is no longer supported.

Propose removing the json dep just added, and setting required_ruby_version > 1.9.3.

All those not in favor speak promptly or forever backport on your own time.

cc @reinh

UDPSockets leaked under load

Ran into some issues today with our Sidekiq/Statsd setup. We have a nice sidekiq middleware component that wraps each job in a "time" block; it works great.

The problem is that, under load, garbage collection does not get to run and all of the UDPSockets attached to threads are left behind.

To remedy this I added a tweak to the middleware to clean up the UDPSocket after the job finishes.

class StatsdMiddleware
  def call(worker, msg, queue)
    statsd = $statsd # obtain your Statsd client however your app provides it
    statsd.time("#{worker.class}.perform") do
      yield
    end
  ensure
    socket = Thread.current[:statsd_socket]
    socket.close if socket
    Thread.current[:statsd_socket] = nil
  end
end

This is then added into Sidekiq as a server middleware...

Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add StatsdMiddleware
  end
end

This may be applicable to other threading/forking situations like unicorn and passenger as well.

I could package this up as an optional class or something, not sure if that's going too far. If nothing else, a wiki page might be good.

Arbitrary metric name parsing

Hi,

In line 387 of file lib/statsd.rb we replace "::" with "." to have directory structure which resembles the ruby module scoping.

While I understand the logic behind this decision, I'd like to propose making it configurable rather than mandatory. I, for example, would like to replace "::" with "-" instead of "." to get a flat metric hierarchy.

If this idea is accepted I'm also willing to make a PR.

I'd be happy to hear your thoughts.
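To illustrate, the substitution could take the separator as a parameter. The sanitize_stat helper and its separator: keyword are hypothetical names; the gsub/tr logic mirrors the line in send_stats quoted in the sample-rate issue, and the default of '.' preserves the current behavior.

```ruby
# Hypothetical helper (not part of statsd-ruby): sanitize a metric name
# with a configurable replacement for Ruby's "::" module scoping.
def sanitize_stat(stat, separator: '.')
  # Replace "::" with the separator and reserved chars (: | @) with underscores.
  stat.to_s.gsub('::', separator).tr(':|@', '_')
end
```

Calling sanitize_stat('Billing::Invoice', separator: '-') would then produce a flat name, while the default call keeps the dotted hierarchy.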

Add changelog

Fortunately this project isn't too large in terms of commits, but it'd be a lot easier to tell what changed from release-to-release if there were a changelog. It'd also be a good place to expound upon anything that was only hinted at briefly in the commit message.
