Monitor Icecast and Wowza listeners with dockerized InfluxDB, Grafana and Go

As part of my freelance activity I’m working for a streaming media service provider that offers streaming solutions for radio stations. Our infrastructure mainly consists of Icecast-Kh and Wowza media servers. Some of them operate as master servers, and more than 10 servers are load-balanced edges.

To provide our customers with statistics about how their channels perform and to obtain an overview of our infrastructure, we already analyze the log files that Icecast and Wowza produce (using ElasticSearch and Go, but that’s another story). But there is one huge drawback to this set-up: both Icecast and Wowza write a log entry only after a listener finishes a session. That’s totally OK for getting an impression of listeners over a certain period in the past, but it is anything but real-time.

So we came up with these requirements:
* collect data every few seconds to see the actual situation
* aggregate data for every host and every mount (channel)
* save data to be able to do historical reports
* visualize data in a nice and easy to use dashboard
* run everything as Docker containers

Impression of the final Grafana dashboard

Introducing InfluxDB

Some months ago InfluxDB caught my attention. InfluxDB is a time-series database specialized in storing data like metrics and time-based events. There are competitors like OpenTSDB and Prometheus, but I like InfluxDB’s easy HTTP API for writing data, its powerful query language, its great performance and the built-in admin UI. On top of that, InfluxDB is written in Go, my preferred language. So I decided to build our new monitoring upon InfluxDB.

Although there are already some InfluxDB Docker images, I came up with my own, which you can find on my Docker Hub page.

docker run -d --name=influxdb -v <data-dir>:/data:rw -p 8083:8083 -p 8086:8086 pteich/influxdb

This command mounts a local directory of the host as the data directory into the container and binds ports 8083 (admin UI) and 8086 (HTTP API) to the host’s network. If the provided data directory is empty, it creates a new InfluxDB admin user named docker and generates a random password. You can look up this password using docker logs influxdb.

If everything’s up and running, you can open your host’s port 8083 in a browser and connect to your database with the newly created credentials:

InfluxDB Admin UI

You can use the Query Templates button to create a new database and user. If you are more into command line interfaces, use the interactive shell that’s part of InfluxDB and also part of my container. Assuming you started InfluxDB with the command above, use this:

docker run -it --link influxdb:influxdb --rm pteich/influxdb /opt/influxdb/influx -host=influxdb

Key concepts of InfluxDB are measurements (in my case: listeners or response), tags (here host and mount with their associated values) and the actual values. Measurements act as containers (probably best compared to tables in classical databases) that store values and mark them with tags (more accurately, with a tag key and a tag value).
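To make these concepts more tangible, here is a minimal sketch of how one point could be rendered in InfluxDB's line protocol. The function name formatPoint, the field name value and the host name edge1 are assumptions made for illustration only; they are not taken from the actual aggregator. The measurement is assumed to be a plain name without commas or spaces.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tagEscaper escapes the characters that are special inside tag keys and
// tag values of the line protocol: commas, equals signs and spaces.
var tagEscaper = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)

// formatPoint renders one integer value as a line protocol entry:
//   measurement,tagkey=tagvalue,... value=<n>i
// No timestamp is appended, so the server assigns its own receive time.
// The field name "value" is an assumption for this sketch.
func formatPoint(measurement string, tags map[string]string, value int64) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic output regardless of map order

	var b strings.Builder
	b.WriteString(measurement)
	for _, k := range keys {
		b.WriteString("," + tagEscaper.Replace(k) + "=" + tagEscaper.Replace(tags[k]))
	}
	// The trailing "i" marks the field as an integer.
	fmt.Fprintf(&b, " value=%di", value)
	return b.String()
}

func main() {
	tags := map[string]string{"host": "edge1", "mount": "/mount1"} // hypothetical host name
	fmt.Println(formatPoint("listeners", tags, 80))
	// prints: listeners,host=edge1,mount=/mount1 value=80i
}
```

Here listeners is the measurement, host and mount are the tag keys, and 80 is the stored value.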

Visualize Data with Grafana

Grafana is a great tool for creating elegant visualizations and dashboards. In addition to InfluxDB it supports several other data sources like ElasticSearch or Prometheus, and these can even be mixed together.

I use the official Grafana Docker image for my setup and mount a local host directory to persist Grafana’s settings database:

docker run -d --name grafana -p 3000:3000 -v <data-dir>:/var/lib/grafana grafana/grafana

After opening Grafana on port 3000 in a browser window, it’s now time to add our running InfluxDB as a data source:

Adding a new data source in Grafana

Collect data from Icecast

As stated above, it is not possible to gather the desired data using log files. But Icecast offers an admin interface which allows us to request statistics using HTTP calls with basic auth. For my needs I can use /admin/listmounts and /admin/stats. I created a Go service that connects to these endpoints on all our hosts and sends this data to InfluxDB. As a side effect I can measure how long it takes to connect to each Icecast host and record that as an additional metric. Because InfluxDB’s line protocol for writing data is so easy, I can send all data directly using Go’s built-in HTTP client library.

/admin/listmounts
This endpoint provides an XML document with a listener count per mount that looks somewhat like this:

<?xml version="1.0"?>
<icestats>
<source mount="/mount1">
<listeners>80</listeners>
<Connected>5397029</Connected>
<content-type>audio/aacp</content-type>
</source>
<source mount="/mount2">
<listeners>85</listeners>
<Connected>5396932</Connected>
<content-type>audio/mpeg</content-type>
</source>
</icestats>

My Go aggregator calls /admin/listmounts every few seconds for all of our hosts simultaneously. It then calculates a total listener count for every host and a total listener count for every mount across all hosts. (Note: it’s not strictly necessary to do this at this point, because the same could be achieved using InfluxDB’s continuous queries, but I need these values anyway for other services that have direct access to my data collector.)
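The aggregation step can be sketched roughly as follows. This is a simplified illustration, not the original code: fetching the documents over HTTP is left out, the type and function names are made up, and the host names edge1 and edge2 are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// iceStats mirrors only the parts of the /admin/listmounts document
// that are needed for counting listeners.
type iceStats struct {
	Sources []struct {
		Mount     string `xml:"mount,attr"`
		Listeners int    `xml:"listeners"`
	} `xml:"source"`
}

// aggregate takes the raw listmounts XML per host and returns the total
// listener count per host and per mount across all hosts.
func aggregate(docs map[string][]byte) (perHost, perMount map[string]int, err error) {
	perHost = make(map[string]int)
	perMount = make(map[string]int)
	for host, doc := range docs {
		var stats iceStats
		if err := xml.Unmarshal(doc, &stats); err != nil {
			return nil, nil, fmt.Errorf("%s: %w", host, err)
		}
		for _, s := range stats.Sources {
			perHost[host] += s.Listeners
			perMount[s.Mount] += s.Listeners
		}
	}
	return perHost, perMount, nil
}

func main() {
	// Shortened sample documents for two hypothetical hosts.
	docs := map[string][]byte{
		"edge1": []byte(`<icestats>
<source mount="/mount1"><listeners>80</listeners></source>
<source mount="/mount2"><listeners>85</listeners></source>
</icestats>`),
		"edge2": []byte(`<icestats>
<source mount="/mount1"><listeners>20</listeners></source>
</icestats>`),
	}
	perHost, perMount, err := aggregate(docs)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(perHost)  // map[edge1:165 edge2:20]
	fmt.Println(perMount) // map[/mount1:100 /mount2:85]
}
```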

This part of the Go service reads from /admin/listmounts

/admin/stats
This endpoint provides a significantly larger XML document that contains a lot of information for every mount.

<?xml version="1.0"?>
<icestats>
<source mount="/mount1">
<audio_codecid>2</audio_codecid>
<audio_info>ice-samplerate=44100;ice-bitrate=192;ice-channels=2</audio_info>
<bitrate>192</bitrate>
<connected>5397563</connected>
<genre>Pop</genre>
<ice-bitrate>192</ice-bitrate>
<ice-channels>2</ice-channels>
<ice-samplerate>44100</ice-samplerate>
<incoming_bitrate>192272</incoming_bitrate>
<listener_connections>172</listener_connections>
<listener_peak>802</listener_peak>
<listeners>89</listeners>
<listenurl>http://xxxx:8080/mount1</listenurl>
<max_listeners>unlimited</max_listeners>
<metadata_updated>02/Nov/2015:12:58:22 +0100</metadata_updated>
<mpeg_channels>2</mpeg_channels>
<mpeg_samplerate>44100</mpeg_samplerate>
<outgoing_kbitrate>1518</outgoing_kbitrate>
<public>0</public>
<queue_size>110341</queue_size>
<server_description>Mount 1</server_description>
<server_name>Mount 1</server_name>
<server_type>audio/mpeg</server_type>
<slow_listeners>145</slow_listeners>
<source_ip>x.x.x.x</source_ip>
<stream_start>01/Sep/2015:02:39:27 +0200</stream_start>
<title>Artist - Title</title>
<total_bytes_read>129550843897</total_bytes_read>
<total_bytes_sent>913635379200</total_bytes_sent>
<total_mbytes_sent>871310</total_mbytes_sent>
</source>
</icestats>

Because retrieving this really huge document (the excerpt above covers only one mount) takes some time on hosts with loads of mounts, I query it only once a minute. Nevertheless it contains one very interesting piece of information: a timestamp for the stream start of each mount (XML node stream_start). This timestamp helps to detect short disconnects or general problems with stream sources that toggle. An added bonus is the metadata that this XML contains (title node), which I make accessible to other services of our infrastructure.
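One way to use the stream_start value is to compare it between two polls: if it has moved forward, the source must have reconnected in the meantime. The sketch below shows this idea under the assumption that the timestamps always have the format seen in the excerpt; the function name restarted is made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Go reference layout matching values such as "01/Sep/2015:02:39:27 +0200".
const streamStartLayout = "02/Jan/2006:15:04:05 -0700"

// restarted reports whether the source reconnected between two polls, i.e.
// whether stream_start is later than the previously seen value.
func restarted(previous, current string) (bool, error) {
	prev, err := time.Parse(streamStartLayout, previous)
	if err != nil {
		return false, err
	}
	cur, err := time.Parse(streamStartLayout, current)
	if err != nil {
		return false, err
	}
	return cur.After(prev), nil
}

func main() {
	changed, err := restarted("01/Sep/2015:02:39:27 +0200", "02/Nov/2015:12:58:22 +0100")
	fmt.Println(changed, err) // true <nil>
}
```

Parsing the values instead of comparing the raw strings keeps the check correct when the UTC offset changes, for example around a daylight saving switch.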

Collect Data from Wowza Media Server

Wowza provides a similar HTTP API to query current listeners. This time the endpoint is /connectioncounts.xml but unlike Icecast it uses HTTP digest authentication.

<?xml version="1.0"?>
<WowzaMediaServer>
<VHost>
<Name>_defaultVHost_</Name>
<TimeRunning>8448800.34</TimeRunning>
<ConnectionsLimit>16000</ConnectionsLimit>
<ConnectionsCurrent>1424</ConnectionsCurrent>
<ConnectionsTotal>2073249</ConnectionsTotal>
<ConnectionsTotalAccepted>2073249</ConnectionsTotalAccepted>
<ConnectionsTotalRejected>0</ConnectionsTotalRejected>
<MessagesInBytesRate>594420.0</MessagesInBytesRate>
<MessagesOutBytesRate>1.3341501E7</MessagesOutBytesRate>
<Application>
<Name>channel1</Name>
<Status>loaded</Status>
<TimeRunning>4673955.291</TimeRunning>
<ConnectionsCurrent>999</ConnectionsCurrent>
<ConnectionsTotal>831897</ConnectionsTotal>
<ConnectionsTotalAccepted>831897</ConnectionsTotalAccepted>
<ConnectionsTotalRejected>0</ConnectionsTotalRejected>
<MessagesInBytesRate>160429.0</MessagesInBytesRate>
<MessagesOutBytesRate>9266311.0</MessagesOutBytesRate>
<ApplicationInstance>
<Name>_definst_</Name>
<TimeRunning>4673955.289</TimeRunning>
<ConnectionsCurrent>999</ConnectionsCurrent>
<ConnectionsTotal>831897</ConnectionsTotal>
<ConnectionsTotalAccepted>831897</ConnectionsTotalAccepted>
<ConnectionsTotalRejected>0</ConnectionsTotalRejected>
<MessagesInBytesRate>155545.0</MessagesInBytesRate>
<MessagesOutBytesRate>9266311.0</MessagesOutBytesRate>
<Stream>
<Name>http%3A%2F%2Fx.x.x.x%2Fchannel1</Name>
<SessionsFlash>6</SessionsFlash>
<SessionsCupertino>0</SessionsCupertino>
<SessionsSanJose>0</SessionsSanJose>
<SessionsSmooth>0</SessionsSmooth>
<SessionsRTSP>0</SessionsRTSP>
<SessionsTotal>6</SessionsTotal>
</Stream>

</ApplicationInstance>
</Application>
</VHost>
</WowzaMediaServer>

The interesting value here is SessionsTotal, which is specified for each stream.
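Extracting these values from the document could look roughly like the following sketch. It sums SessionsTotal over all streams of an application; the type and function names are assumptions for illustration, and the HTTP request with digest authentication is left out because Go's standard library does not provide it out of the box.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// wowzaStats mirrors only the elements of connectioncounts.xml that are
// needed to read the session counts.
type wowzaStats struct {
	VHosts []struct {
		Applications []struct {
			Name      string `xml:"Name"`
			Instances []struct {
				Streams []struct {
					Name          string `xml:"Name"`
					SessionsTotal int    `xml:"SessionsTotal"`
				} `xml:"Stream"`
			} `xml:"ApplicationInstance"`
		} `xml:"Application"`
	} `xml:"VHost"`
}

// sessionsPerApplication sums SessionsTotal over all streams of each
// application, keyed by the application name.
func sessionsPerApplication(doc []byte) (map[string]int, error) {
	var stats wowzaStats
	if err := xml.Unmarshal(doc, &stats); err != nil {
		return nil, err
	}
	totals := make(map[string]int)
	for _, vh := range stats.VHosts {
		for _, app := range vh.Applications {
			for _, inst := range app.Instances {
				for _, st := range inst.Streams {
					totals[app.Name] += st.SessionsTotal
				}
			}
		}
	}
	return totals, nil
}

func main() {
	// Shortened version of the document shown above.
	doc := []byte(`<WowzaMediaServer><VHost><Application>
<Name>channel1</Name>
<ApplicationInstance><Name>_definst_</Name>
<Stream><Name>http%3A%2F%2Fx.x.x.x%2Fchannel1</Name><SessionsTotal>6</SessionsTotal></Stream>
</ApplicationInstance>
</Application></VHost></WowzaMediaServer>`)
	totals, err := sessionsPerApplication(doc)
	fmt.Println(totals, err) // map[channel1:6] <nil>
}
```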

Enrich the Dashboard with Annotations

One great Grafana feature is annotations. They provide a way to mark specific points in time on the graph across all visualizations. I use annotations to show connection errors (every error that occurs in my Go aggregator is sent to InfluxDB) but also events coming from other sources, e.g. other services of our infrastructure that log to ElasticSearch.

Over the last month this set-up has worked really well in production. It has given us a whole new real-time view of our infrastructure and helped us recognize problems before they become serious.