StatsD-centered monitoring setup

Michał Łowicki
2 min read · Oct 18, 2015

--

At Opera we have tons of metrics. Some are business-related, like new sign-ins to synchronization per second; others are used by our engineers to monitor our services, debug issues, find anomalies, etc. Various services use different tools, and below is the setup used by some of them.

The central point for metrics on a single host is a StatsD daemon. It's the gateway through which we send all metrics to a Graphite instance. To prefix each metric with its location, our StatsD config has, among others, the following properties:

"backends": ["./backends/graphite"],
"graphite": {
    "globalPrefix": "sync.production.ams.front1",
    "legacyNamespace": false
},
"graphiteHost": "stats.sync.ams.osa",
"graphitePort": 2003,

This way we don't need to add the prefix in other places like Logstash, Diamond, uWSGI or the Python application.
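To make the effect of the prefix concrete, here is a minimal sketch of how a short metric name sent by a client ends up namespaced in Graphite. It assumes StatsD's default per-type prefixes (counters, timers, gauges) with legacyNamespace disabled; the function name and the metric api.requests are made up for illustration.

```python
# Assumed StatsD defaults for the per-type namespaces (legacyNamespace: false).
TYPE_PREFIXES = {"c": "counters", "ms": "timers", "g": "gauges"}


def graphite_namespace(global_prefix, metric_type, name):
    """Return the Graphite namespace StatsD would use for a client metric.

    StatsD appends stat-specific suffixes on top of this namespace
    (for example "count" and "rate" for counters).
    """
    return ".".join([global_prefix, TYPE_PREFIXES[metric_type], name])


# The client only sends "api.requests"; the location comes from StatsD.
print(graphite_namespace("sync.production.ams.front1", "c", "api.requests"))
```

The client code stays identical on every host; only the StatsD config differs.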

The format of the graphite.globalPrefix property is as follows:

project.deploy.datacenter.host

where deploy can be production, playground, staging etc.
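If the prefix is generated by provisioning scripts, a small helper can keep the four-part format consistent. The following is only a sketch; Location, build_prefix and parse_prefix are hypothetical names, not something from our tooling.

```python
from collections import namedtuple

# The four components of graphite.globalPrefix, in order.
Location = namedtuple("Location", ["project", "deploy", "datacenter", "host"])


def build_prefix(location):
    """Join the components with dots, rejecting empty or dotted parts."""
    for part in location:
        if not part or "." in part:
            raise ValueError("invalid prefix component: %r" % (part,))
    return ".".join(location)


def parse_prefix(prefix):
    """Split a prefix back into its four named components."""
    parts = prefix.split(".")
    if len(parts) != 4:
        raise ValueError("expected project.deploy.datacenter.host")
    return Location(*parts)


print(build_prefix(Location("sync", "production", "ams", "front1")))
```

Rejecting dots inside a component matters because Graphite treats every dot as a level in the metric tree.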

Logstash has built-in support for StatsD as an output plugin.

Logging metrics from uWSGI to StatsD is also just a simple configuration change. Consider using the statsd-no-workers option to reduce the number of metrics.

Diamond is a Python daemon that collects system metrics, as well as metrics from other programs such as RabbitMQ, through pluggable collectors. Many people use tools like Munin, but we've found many drawbacks there. Its charts are much less readable compared to Graphite / Grafana, and Diamond makes it very simple to add collectors implemented in Python.
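As an illustration of how small a custom collector can be, here is a sketch of one that reports the number of files in a spool directory. The collector, the metric name and the path are hypothetical. It subclasses diamond.collector.Collector and calls publish() from collect(); a stand-in base class is defined only so the sketch also runs where Diamond is not installed.

```python
import os

try:
    import diamond.collector
    BaseCollector = diamond.collector.Collector
    HAVE_DIAMOND = True
except ImportError:
    HAVE_DIAMOND = False

    class BaseCollector(object):
        """Stand-in so the sketch runs without Diamond installed."""

        def __init__(self):
            self.published = {}

        def publish(self, name, value):
            self.published[name] = value


def count_files(path):
    """Return the number of regular files directly inside `path`."""
    return sum(1 for entry in os.listdir(path)
               if os.path.isfile(os.path.join(path, entry)))


class SpoolCollector(BaseCollector):
    """Hypothetical collector: how many files wait in a spool directory."""

    spool_dir = "/var/spool/myapp"  # hypothetical path

    def collect(self):
        # Diamond calls collect() on every interval; publish() passes the
        # value on to whatever handlers are configured.
        self.publish("spool.files", count_files(self.spool_dir))
```

In a real deployment the directory would more likely come from the collector's configuration than from a class attribute.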

To log metrics from our application we use the statsd package or wrappers dedicated to various frameworks, like django-statsd-mozilla.
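Under the hood, client libraries send short text lines over UDP to the local StatsD daemon. The sketch below shows that wire format using only the standard library, so it is an illustration rather than how our applications actually do it (they use the packages mentioned above). The metric names are made up, and port 8125 is StatsD's default listening port.

```python
import socket


def format_metric(name, value, metric_type):
    """Render one metric in the StatsD line format: <name>:<value>|<type>."""
    return "{}:{}|{}".format(name, value, metric_type)


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget a single metric line over UDP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# "c" is a counter, "ms" a timer (milliseconds), "g" a gauge.
send_metric(format_metric("api.requests", 1, "c"))
send_metric(format_metric("api.latency", 320, "ms"))
```

Because the transport is UDP, a missing or restarting StatsD daemon does not slow down or break the application; the metric is simply lost.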

The above example presents the setup used by our frontend boxes. On database boxes we don't have uWSGI, but we use, for example, Jolokia through its collector in Diamond to monitor Cassandra nodes. Besides database and frontend machines we have many other roles, and StatsD always plays a central role.

I strongly encourage you to try out Grafana, as we've found it much easier to work with than dashboards in Graphite, which are less readable and less flexible. It has some really good features like annotations, so you can display events (e.g. releases) and correlate them with changes in metrics.

--

Michał Łowicki

Software engineer at Datadog, previously at Facebook and Opera, never satisfied.