StatsD. Which implementation is still alive?


StatsD is a popular daemon for gathering statistics and sending them to backends (e.g., Graphite, Datadog and others).

In a nutshell, it acts as a proxy between the client and a long-term database and thereby reduces the number of disk I/O operations. Besides that, StatsD can process metric data on the fly (it computes percentiles and upper and lower values) and in some cases can compensate for limitations of backends like Graphite. For instance, Graphite can't store a metric more often than once per second, and if a client sends data several times within that interval, Graphite keeps only the latest value. This is where StatsD comes into play: it aggregates the metric itself and sends the result on to the backend.
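To make the aggregation idea concrete, here is a minimal sketch of what happens between two flushes. It is not how any of the servers below are implemented; the class name FlushBuffer and its methods are made up for illustration, and only counters are handled.

```python
from collections import defaultdict


class FlushBuffer:
    """Toy StatsD-style buffer (hypothetical): sum counters in memory,
    emit one value per metric on flush."""

    def __init__(self):
        self.counters = defaultdict(float)

    def record(self, line):
        # A StatsD line looks like "name:value|type"; only counters ("c") here.
        name, rest = line.split(":", 1)
        value, metric_type = rest.split("|", 1)
        if metric_type != "c":
            raise ValueError("this sketch only handles counters")
        self.counters[name] += float(value)

    def flush(self):
        # One aggregated value per metric goes to the backend, then we reset.
        snapshot = dict(self.counters)
        self.counters.clear()
        return snapshot
```

If a client sends foo.count:100|c three times before the flush, the backend receives a single point with the value 300 instead of three writes of which only the last one would survive.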

The closest thing to the original StatsD implementation is Etsy's. It's written in Node.js and well documented, but apparently no longer maintained. In this article I'm going to tell you about StatsD servers written in compiled languages that are still alive.

Gostatsd

An Atlassian project written in Go. It is feature-rich and supports sending metrics to Graphite, New Relic, AWS and Datadog. In my opinion it's not that well documented, but the scheme it uses for saving metrics is the same as in the original implementation by Etsy (and that is good). Gostatsd needs to be compiled before use, and I'll show you how to do that on Ubuntu 16.04. First of all, you have to install Go 1.10:

# add-apt-repository ppa:gophers/archive
# apt update
# apt install golang-1.10-go -y
$ export PATH=$PATH:/usr/lib/go-1.10/bin
$ go version
go version go1.10.4 linux/amd64

Specify $GOPATH:

$ export GOPATH=$HOME/go
$ export PATH=$PATH:$GOPATH/bin

Clone the code and start the build:

$ git clone https://github.com/atlassian/gostatsd.git $GOPATH/src/github.com/atlassian/gostatsd
$ cd $GOPATH/src/github.com/atlassian/gostatsd
$ make setup
$ make build

If everything succeeded, check the build directory:

$ cd build/bin/linux/
$ ./gostatsd --version
Version: 9.1.0 - Commit: cc898d7 - Date: 2018-11-11-13:23

Finally! Now let's create a basic config for gostatsd:

$ vim config.toml
[graphite]
legacy_namespace = false
address = "graphite.address.example.com:2003"

As you may have noticed, I use the Graphite backend as long-term storage. legacy_namespace enables the old statsd naming scheme, which seems quite strange to me. With this config we are ready to launch gostatsd:

$ ./gostatsd --verbose --config-path config.toml --flush-interval "30s" --percent-threshold "75 90 95 99"

Logically, all of these arguments should have equivalent options in config.toml, but because the documentation is so basic I couldn't find them.

Metrics can be sent to statsd with a variety of client libraries or simply with nc:

$ echo "foo.count:100|c" | nc -u -w0 gostatsd.address.example.com 8125
$ echo "foo.gauge:200|g" | nc -u -w0 gostatsd.address.example.com 8125
$ echo "foo.ms:200|ms" | nc -u -w0 gostatsd.address.example.com 8125
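The same thing can be done from code without any client library, because the protocol is just a short text line in a UDP datagram. Below is a minimal sketch using only the Python standard library. The helper names format_metric and send_metric are mine, not part of any StatsD client, and the default host and port are assumptions for a local test.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a StatsD line such as 'foo.count:100|c'."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one StatsD line as a single UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

Calling send_metric(format_metric("foo.count", 100, "c"), "gostatsd.address.example.com") would be the equivalent of the first nc command above. Since UDP gives no delivery confirmation, the call succeeds even if nothing is listening.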

The count and gauge parts of the metric names are not arbitrary: Graphite's aggregation rules match on them to pick the correct aggregation method.

You can get acquainted with all the metric types and what they mean via the following links:

https://github.com/etsy/statsd/blob/master/docs/metric_types.md

https://github.com/etsy/statsd/issues/157

https://github.com/etsy/statsd/blob/master/lib/process_metrics.js

If you are using Graphite, as I am, don't forget to configure data aggregation for StatsD. Without it, Graphite simply averages values when rolling them up into a longer retention period, which is wrong in many cases.
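Why averaging is wrong for counters is easy to see with a small sketch. The downsample function below is hypothetical and only imitates the idea of a rollup; it is not Graphite code.

```python
def downsample(points, factor, method):
    """Roll every `factor` consecutive points into one (toy rollup)."""
    reducers = {
        "sum": sum,
        "average": lambda xs: sum(xs) / len(xs),
        "max": max,
    }
    fn = reducers[method]
    return [fn(points[i:i + factor]) for i in range(0, len(points), factor)]


# Four 30-second counter values rolled up into two 60-second points.
counts = [10, 20, 30, 40]
print(downsample(counts, 2, "sum"))      # [30, 70]
print(downsample(counts, 2, "average"))  # [15.0, 35.0]
```

With the average method half of the events disappear from the rolled-up data, so counters need sum, while for gauges or percentiles something like average or max is the better fit.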

Telegraf

Also written in Go. There is no need to compile it, because InfluxData, the company behind Telegraf, publishes it in its own repositories. Installation is therefore easy:

$ curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
$ source /etc/lsb-release
$ echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
$ sudo apt-get update
$ sudo apt install telegraf

OK, Telegraf is not just a statsd server. It can do a lot more, such as collecting server statistics or program performance data by itself, writing them to different databases and so on. It seems to be a really good project, so try it if you are interested. We will use only its statsd interface, so let's strip everything else from the config:

$ cat /etc/telegraf/telegraf.conf
# Telegraf Configuration
#
# Configuration for telegraf agent
[agent]
interval = "30s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
# Logging configuration:
debug = false
quiet = false
logfile = ""
hostname = ""
omit_hostname = false
# Configuration for Graphite server to send metrics to
[[outputs.graphite]]
servers = ["graphite.address.example.com:2003"]
# Prefix metrics name
prefix = "statsd"
# Graphite output template
# see https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_OUTPUT.md
template = "measurement.field"
# Statsd UDP/TCP Server
[[inputs.statsd]]
# Protocol, must be "tcp", "udp", "udp4" or "udp6" (default=udp)
protocol = "udp"
# Address and port to host UDP listener on
service_address = ":8125"
# Reset gauges every interval (default=true)
delete_gauges = false
# Percentiles to calculate for timing & histogram stats
percentiles = [99,95,90,75]
# separator to use between elements of a statsd metric
metric_separator = "."

I really tried to make it as small as I could, but before configuring a production instance it's better to read the documentation.

That's it. After a restart Telegraf is ready to accept statsd metrics on UDP port 8125 and send them to the Graphite server graphite.address.example.com on TCP port 2003. Like Gostatsd, Telegraf can compute different percentiles. But it does not offer the same variety of timing and histogram stats: for example, Telegraf can't compute upper_percentile, lower_percentile, count_percentile and so on. If you really need them, it's better to choose Gostatsd.
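For readers who have not met these stats before, here is a rough sketch of what a threshold statistic such as upper_90 means for a timer. It is written in the spirit of Etsy's process_metrics.js linked above, but the function name threshold_stats and the exact rounding rule are my assumptions, so treat it as an illustration and not as the reference algorithm.

```python
import math


def threshold_stats(values, pct):
    """Stats over the lowest `pct` percent of samples (simplified sketch)."""
    ordered = sorted(values)
    # Assumption: round half up to decide how many samples fall under the threshold.
    keep = math.floor(pct / 100 * len(ordered) + 0.5)
    if keep == 0:
        return {}
    kept = ordered[:keep]
    return {
        f"count_{pct}": keep,            # how many samples are within the threshold
        f"upper_{pct}": kept[-1],        # largest value within the threshold
        f"mean_{pct}": sum(kept) / keep, # mean of the values within the threshold
    }
```

For ten timings of 1 to 10 ms and a threshold of 90, this gives count_90 = 9, upper_90 = 9 and mean_90 = 5.0: the single slowest sample is excluded.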

What Telegraf calls number_percentile corresponds to upper_number in Etsy's statsd and Gostatsd. For example, 90_percentile in Telegraf is upper_90 in the previous statsd implementation. The Graphite aggregation rules have to reflect this, something like the following:

$ vim /etc/carbon/storage-aggregation.conf
...
[percentiles]
pattern = \.(\d+)?_percentile$
xFilesFactor = 0.1
aggregationMethod = max
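Before restarting carbon it is worth checking which metric names such a pattern really catches. The sketch below does that with Python's re module, assuming carbon applies the pattern as an unanchored regex search. The metric names are hypothetical examples, not output captured from a real Telegraf instance.

```python
import re

# Same pattern as in the [percentiles] rule above.
PERCENTILE_RULE = re.compile(r"\.(\d+)?_percentile$")


def matches_rule(metric_name):
    """True if the metric name would fall under the [percentiles] rule."""
    return PERCENTILE_RULE.search(metric_name) is not None


print(matches_rule("statsd.foo.ms.90_percentile"))  # True
print(matches_rule("statsd.foo.ms.upper_90"))       # False
```

Note that the digits are optional in this pattern, so a name ending in ._percentile would match as well. Names in the Etsy or Gostatsd style (upper_90) need a rule of their own.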

Brubeck

This one is a GitHub project, written in C, which they use (or used) in their own infrastructure. It seems to have very good performance, but like Gostatsd it has to be compiled first. That is not too hard:

$ sudo apt install libjansson-dev libcrypto++-dev libmicrohttpd-dev libssl-dev -y
$ git clone https://github.com/github/brubeck.git
$ cd brubeck
$ ./script/bootstrap

Compilation was done on Ubuntu 16.04; without code modifications Brubeck doesn't work on newer releases because of its dependencies. Finally, let's launch it:

$ vim config.json
{
  "sharding" : false,
  "server_name" : "brubeck_debug",
  "dumpfile" : "./brubeck.dump",
  "capacity" : 15,
  "expire" : 20,
  "backends" : [
    {
      "type" : "carbon",
      "address" : "graphite.address.example.com",
      "port" : 2003,
      "frequency" : 30
    }
  ],
  "samplers" : [
    {
      "type" : "statsd",
      "address" : "0.0.0.0",
      "port" : 8125,
      "workers" : 4,
      "multisock" : true,
      "multimsg" : 8
    }
  ]
}
$ ./brubeck --config=config.json

A good starting point is the default config, config.default.json.example, which is located in the build directory next to the brubeck binary.

Brubeck is missing many of the features of the original StatsD. GitHub only implemented what they felt was necessary for their metrics stack. For example, there are no custom percentiles, the default namespace for metrics can't be changed, and so on. But the main problem with Brubeck is that it's not an actively developed project.

Again, the Graphite aggregation rules need to be adjusted according to the names of the percentiles.

That’s all folks!