StatsD. Which implementation is still alive?

Ipeacocks
Nov 12, 2018

StatsD is a popular daemon for gathering statistics and sending them to backends such as Graphite and Datadog.

In a nutshell, it acts as a proxy between a client and a long-term database and thereby reduces the number of disk I/O operations. Besides that, StatsD can process metric data on the fly (it calculates percentiles and upper and lower values) and in some cases can compensate for the limitations of backends like Graphite. For instance, Graphite can't store data more often than once per second, and if a client sends data several times within that interval, Graphite keeps only the latest value. This is where StatsD comes into play: it aggregates the metrics itself and sends the result on to the backend.
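The difference between "last write wins" and aggregation can be illustrated with a minimal Python sketch. It assumes counter-style samples and a one-second bucket for simplicity (a real StatsD server aggregates over its configured flush interval); the function and variable names are hypothetical.

```python
from collections import defaultdict


def aggregate(samples):
    """Sum counter samples that fall into the same one-second bucket.

    `samples` is a list of (timestamp_in_seconds, value) pairs.
    """
    buckets = defaultdict(int)
    for timestamp, value in samples:
        buckets[int(timestamp)] += value
    return dict(buckets)


# Three increments arrive within the same second, one in the next second.
samples = [(100.1, 1), (100.4, 1), (100.9, 1), (101.2, 1)]

# A store that keeps only the last write per second loses two increments.
last_write_wins = {int(ts): value for ts, value in samples}

aggregated = aggregate(samples)

print(last_write_wins)  # {100: 1, 101: 1}
print(aggregated)       # {100: 3, 101: 1}
```

With aggregation in front of the store, all three increments in second 100 are preserved as a single data point.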

The closest thing to an original StatsD implementation is Etsy's. It is written in Node.js and well documented, but apparently it is no longer maintained. In this article I'll go through StatsD servers that are still alive and written in compiled languages.

Gostatsd

An Atlassian project written in Go. It is feature-rich and supports sending metrics to Graphite, New Relic, AWS and Datadog. In my opinion it is not that well documented, but its metric naming scheme is the same as in Etsy's original implementation.

Gostatsd has to be compiled before use, and I'll show how to do that on Ubuntu 16.04. First of all, install Go 1.10:

Specify $GOPATH:

Clone the code and start the build:

If everything succeeds, check the build directory:

Finally! Now let's create a basic config for gostatsd:

As you may notice, I will use the Graphite backend as long-term storage. legacy_namespace means the old statsd naming scheme, which seems quite strange to me. With this config we are ready to launch gostatsd:

Logically, all command-line arguments should have equivalent options in config.toml, but the documentation is so basic that I couldn't find them.

Metrics can be sent to statsd with a variety of client libraries or simply with nc:

The words count and gauge in the metric names are not arbitrary: Graphite uses them to pick the correct aggregation.
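For reference, sending a metric amounts to writing one short text line to the server over UDP. Here is a minimal Python sketch, assuming a StatsD-compatible server listening on UDP port 8125 on localhost; the metric names are hypothetical examples, and the type codes c, g and ms are the counter, gauge and timer types from the StatsD line format.

```python
import socket


def format_metric(name, value, metric_type):
    """Build one StatsD line: <name>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Send a single metric line as one UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, used only for illustration.
    send_metric(format_metric("example.requests.count", 1, "c"))    # counter
    send_metric(format_metric("example.queue.gauge", 42, "g"))      # gauge
    send_metric(format_metric("example.response_time", 320, "ms"))  # timer
```

UDP is connectionless, so the client gets no confirmation that the server received anything; that is what makes this cheap enough to call from application code.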


You can read about all the metric types and what they mean at the following links:

https://github.com/etsy/statsd/blob/master/docs/metric_types.md

https://github.com/etsy/statsd/issues/157

https://github.com/etsy/statsd/blob/master/lib/process_metrics.js

If you use Graphite as I do, don't forget to configure data aggregation for StatsD. Without it, Graphite simply averages values when rolling data up to a coarser retention period, which is wrong in the majority of cases.
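To see why averaging is the wrong rollup for counters, consider this small Python sketch. It assumes six 10-second counter values being rolled up into a single one-minute point; the numbers are made up for illustration. In Graphite itself the method is chosen per metric pattern via aggregationMethod in storage-aggregation.conf.

```python
def rollup(values, method):
    """Collapse several fine-grained points into one coarse point."""
    if method == "sum":
        return sum(values)
    if method == "average":
        return sum(values) / len(values)
    raise ValueError(f"unsupported aggregation method: {method}")


# Hypothetical request counts reported every 10 seconds during one minute.
per_flush_counts = [10, 0, 5, 0, 0, 15]

averaged = rollup(per_flush_counts, "average")
summed = rollup(per_flush_counts, "sum")

print(averaged)  # 5.0 -> looks like 5 requests per minute
print(summed)    # 30  -> the real number of requests in that minute
```

For gauges and timer means, averaging is usually reasonable; for counts it silently shrinks the totals, and for upper and lower values max and min are the natural choices.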

Telegraf

Also written in Go. There is no need to compile it, because InfluxData, the company behind Telegraf, publishes it in its own repositories. Installation is therefore easy:

OK, Telegraf is not just a statsd server. It can do much more, such as gathering statistics from remote servers or program performance data and writing them to different databases. It looks like a really good project, so try it if you are interested. We will use only its statsd interface, so let's strip out everything we don't need:

I really tried to make it as small as I could, but before configuring a production instance it is better to read the documentation.

That's it. After a restart, Telegraf is ready to accept statsd metrics on UDP port 8125 and send them to the Graphite server graphite.address.example.com on TCP port 2003. Like Gostatsd, Telegraf can calculate various percentiles, but it doesn't offer the same variety of timing and histogram stats: for example, Telegraf can't calculate upper_percentile, lower_percentile, count_percentile and so on. If you really need them, Gostatsd is the better choice.
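What a StatsD server ultimately writes to Graphite's port 2003 is the plaintext protocol: one "path value timestamp" line per data point. A minimal Python sketch of that last hop, assuming the plaintext listener is enabled on the Graphite side; the metric path is hypothetical and send_to_graphite is defined but not called here.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Build one newline-terminated line of Graphite's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_graphite(line, host, port=2003):
    """Open a TCP connection, write the line and close the connection."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric path and a fixed timestamp, for illustration only.
line = graphite_line("example.requests.count", 30, 1542000000)
print(repr(line))  # 'example.requests.count 30 1542000000\n'
```

Calling send_to_graphite(line, "graphite.address.example.com") would deliver the point, given a reachable server at that address.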


In terms of Etsy's statsd or Gostatsd, number_percentile is upper_number, i.e. 90_percentile in Telegraf is upper_90 in the previous statsd implementations. This has to be reflected in Graphite's aggregation rules, something like the following:
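The naming correspondence itself can be written down as a small helper. This is a hypothetical Python sketch that only renames the last path segment; the metric paths are made up, and it says nothing about whether the two servers compute the value in exactly the same way.

```python
import re


def telegraf_to_etsy_name(metric):
    """Rename a trailing '<N>_percentile' segment to 'upper_<N>'."""
    head, _, last = metric.rpartition(".")
    match = re.fullmatch(r"(\d+)_percentile", last)
    if not match:
        return metric  # not a percentile metric, leave it untouched
    renamed = f"upper_{match.group(1)}"
    return f"{head}.{renamed}" if head else renamed


print(telegraf_to_etsy_name("timers.request.90_percentile"))  # timers.request.upper_90
print(telegraf_to_etsy_name("timers.request.mean"))           # timers.request.mean
```

The same pattern, a number followed by _percentile at the end of the metric name, is what a Graphite aggregation rule for Telegraf's percentiles would need to match.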

Brubeck

This one is a GitHub project written in C; they use (or used) it in their own infrastructure. It seems to have very good performance, but like Gostatsd it has to be compiled first. That is not too hard:

Compilation was done on Ubuntu 16.04; without code modifications Brubeck won't work on newer releases because of its dependencies. Now let's launch it:

A good starting point is the default config, config.default.json.example, which sits in the build directory next to the brubeck binary.


Brubeck is missing many of the features of the original StatsD. GitHub implemented only what they felt was necessary for their metrics stack. For example, there are no custom percentiles, and the default namespace for metrics can't be changed. But the main problem with Brubeck is that it is not actively developed and has overflow issues.

Again, Graphite's aggregation rules need to be adjusted to match the names of the percentiles.

That’s all folks!
