Server Monitoring in a Docker World

Rho AI
Apr 1, 2016

It is helpful (and wise) to keep tabs on your servers’ resource utilization, hence the rise of services like New Relic and Datadog, along with applications such as Nagios.

If you search for the general topic of “Docker Host Monitoring”, the majority of articles and available tools relate to monitoring the Docker *containers*. While this clearly has its place, it does not let you monitor the resources of the *host machine*.

The tools that do exist for *host machine* monitoring almost exclusively depend on installing some sort of agent software on the host machine itself. This is at odds with a fully Dockerized application architecture, which should have no host dependency other than Docker itself. The notion of ‘no dependencies other than Docker’ becomes even more important as you move from an architecture that runs on multiple servers to distributed servers and, eventually, to transient servers (e.g. spot instances on AWS).

To address this need, we built a Docker image that monitors the resources of its *host machine* in a very simple fashion: collect resource usage stats (with read-only access) and log them to stdout.

You can see the source code here:

To use this image, launch a single container per host and mount the host's `/proc` directory read-only at `/prochost`. By default it reports CPU, Memory, and Disk utilization every 5 seconds. This is all configurable; for example, the following command reports every 30 seconds:

    docker run -v /proc:/prochost:ro pitrho/docker-host-stats -cmd -f 30

Example output:

    CPU Usage: 2% Memory Usage: 75.28% (11.59GB of 15.40GB) Disk Usage: 9% (5.1G/59G)
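For a sense of how numbers like these can be derived from the mounted directory, here is a minimal Python sketch. It is not the image's actual implementation: the function names are hypothetical, "used" memory is assumed to mean `MemTotal - MemAvailable` (which needs a kernel that exposes `MemAvailable`), and CPU usage is assumed to be the non-idle share of time between two samples of the aggregate `cpu` line. Disk usage is left out because it is not derived from these two files.

```python
import time
from pathlib import Path

# Assumption: the host's /proc is mounted read-only at /prochost.
PROC_ROOT = Path("/prochost")


def parse_meminfo(text):
    """Return the fields of a meminfo file as a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def memory_usage_percent(meminfo_text):
    """Used memory in percent, assuming used = MemTotal - MemAvailable."""
    info = parse_meminfo(meminfo_text)
    total = info["MemTotal"]
    return 100.0 * (total - info["MemAvailable"]) / total


def parse_cpu_times(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of a stat file."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            # Only the first eight counters are summed: guest time is
            # already included in the user and nice counters.
            values = [int(v) for v in line.split()[1:9]]
            idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
            return idle, sum(values)
    raise ValueError("no aggregate cpu line found")


def cpu_usage_percent(before, after):
    """Non-idle share of CPU time between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (total_delta - idle_delta) / total_delta


def report_once(proc_root=PROC_ROOT, interval=1.0):
    """Sample the mounted proc files once and return a one-line report."""
    first = parse_cpu_times((proc_root / "stat").read_text())
    time.sleep(interval)
    second = parse_cpu_times((proc_root / "stat").read_text())
    mem = memory_usage_percent((proc_root / "meminfo").read_text())
    cpu = cpu_usage_percent(first, second)
    return f"CPU Usage: {cpu:.0f}% Memory Usage: {mem:.2f}%"
```

Inside a container started with the `-v /proc:/prochost:ro` mount, calling `print(report_once())` in a loop would produce a line per interval on stdout. The parsing functions take plain text, so they can be tried against a copy of `/proc/meminfo` or `/proc/stat` from any Linux machine.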

If you are using a tool such as Rancher, this becomes very convenient, particularly for an architecture that includes transient hosts. Create a new service that is scheduled to run globally (so it also starts on any new host) and you have an automated, albeit primitive, resource utilization monitor. The following example shows a Rancher-specific `docker-compose` file that logs CPU, Memory, and Disk utilization every 30 seconds.

    docker-host-stats:
      labels:
        io.rancher.scheduler.global: 'true'
      command:
      - -cmd
      - -f
      - '30'
      image: pitrho/docker-host-stats
      volumes:
      - /proc:/prochost:ro

The power of this approach is, in some ways, its simplicity. You can use the log output for alarms, alerts, or triggers as you see fit, which amounts to your own basic version of one of the more popular server monitoring services! We are currently using LogEntries for this (we deployed our own HA Graylog2 setup, but an external service is more cost-effective given our log volumes right now).
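As a rough illustration of turning this log output into alerts, the Python sketch below parses a line in the format of the example output and lists the metrics that exceed a limit. The regular expression assumes all three metrics appear on a single line, as in the example; the function names and threshold values are hypothetical and only for illustration. A log service would typically offer its own alerting rules, so this is just one way to do the same thing by hand.

```python
import re

# Assumption: one log line carries all three metrics, as in the example output.
LINE_PATTERN = re.compile(
    r"CPU Usage: (?P<cpu>[\d.]+)%\s+"
    r"Memory Usage: (?P<memory>[\d.]+)%.*?"
    r"Disk Usage: (?P<disk>[\d.]+)%"
)

# Hypothetical alert limits, in percent.
THRESHOLDS = {"cpu": 90.0, "memory": 90.0, "disk": 80.0}


def parse_stats(line):
    """Extract the three percentages from a log line, or None if it does not match."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


def breached(stats, thresholds=THRESHOLDS):
    """Return the sorted names of metrics at or above their threshold."""
    return sorted(
        name for name, limit in thresholds.items()
        if stats.get(name, 0.0) >= limit
    )
```

For example, feeding each line from `docker logs` through `parse_stats` and then `breached` gives a list of metric names that a script could forward to whatever notification channel you already use.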

Let us know how you have solved this need or if you can think of ways to improve this approach!

by Gilman Callsen • @gcallsen • Founder & CTO of Pit Rho

