Service Health Checks, Alerts and a bit of Graphite Plotting

As summer interns on the Measure backend team in Bucharest, we got to implement functionality for both Hootsuite Insights and Analytics. We worked on three main projects, involving controllers and services.

What are controllers? Our data processing system is laid out as a pipeline. The entities within it are called controllers, and they communicate through queues. Data goes through the pipeline, where we might access extra information from our services or the Internet.

As for services, they define a functionality or a set of functionalities that different clients can reuse for different purposes, together with the policies that control their usage.

Health Check Service for Insights

Services are implemented using gRPC, for its simple service definition and automatic generation of idiomatic client and server stubs for a variety of languages and platforms. gRPC uses Protocol Buffers (a powerful binary serialization toolset and language) to define a service’s protocol and then it generates stubs which can be used in your server/client implementation.

The server side of a service runs inside containers within Kubernetes Pods. To be more precise, a dockerized server is deployed in Kubernetes, which automatically adjusts the number of pods (and the containers inside them) to match the “traffic” requirements.

So far, we know what a service is, how to implement it and where it will run. So, what about health checks? Kubernetes does not provide native health checks for gRPC services, so we decided to develop them ourselves.

Health checks are also implemented as a service. The main difference from a regular service is that you “attach” this service to the existing ones. How? You simply register it alongside them:

```go
grpcServer := grpc.NewServer(grpc.UnaryInterceptor(interceptors.CreateMonitoringInterceptor()))
pb.RegisterOutboundAnalyticsServer(grpcServer, newServer(es_driver))
health_pb.RegisterHealthCheckServer(grpcServer, health_check.NewServer(make([]func() bool, 0)))
```

How does it actually work?
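Roughly speaking, the health check server keeps the slice of `func() bool` checks it was given and exposes a single RPC that runs them all. Here is a minimal sketch of that idea; the `health_pb` message and field names are our assumptions for illustration, not the exact internal API:

```go
import "context"

// healthServer holds the check functions registered for a service.
type healthServer struct {
    checks []func() bool // each check reports whether one dependency is healthy
}

func NewServer(checks []func() bool) *healthServer {
    return &healthServer{checks: checks}
}

// HealthCheck runs every registered check; the service is reported healthy
// only if all of them pass.
func (s *healthServer) HealthCheck(ctx context.Context, req *health_pb.HealthCheckRequest) (*health_pb.HealthCheckResponse, error) {
    for _, check := range s.checks {
        if !check() {
            return &health_pb.HealthCheckResponse{Healthy: false}, nil
        }
    }
    return &health_pb.HealthCheckResponse{Healthy: true}, nil
}
```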

How do I make my own checks for my service?
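Since a check is just a `func() bool`, writing your own comes down to probing a dependency and returning whether it responded. A hypothetical example (the `pingElasticsearch` helper below is invented for illustration):

```go
// A custom check probes one dependency and returns true when it is healthy.
esCheck := func() bool {
    return pingElasticsearch() == nil // hypothetical helper wrapping an ES ping
}

// Register the health check service with your own checks instead of an empty slice.
health_pb.RegisterHealthCheckServer(grpcServer, health_check.NewServer([]func() bool{esCheck}))
```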

We have already integrated health checks for all Insights services in production. For easier deployment on Kubernetes, we have also defined some Jinja templates that use the health-checking system as readiness and liveness probes. Later on, the system was also ported to Analytics by our colleagues.


One problem with this system was that every time someone wanted to modify the values of the parameters, a deployment needed to be done, which took a lot of time. We decided it would be a big improvement if we could configure these parameters dynamically.

A technology already used by Hootsuite for configuration management is Consul. Amongst other features, Consul provides a distributed, hierarchical key-value store. For ease of access, Consul exposes a web UI where these values can be modified on the spot, without going through the development process of changing the code and redeploying it.

We configured the alert-generating system to use Consul for controller and service alerts on both Insights and Analytics.

Services Alerts

What we did was:

  • generate metadata about services and their RPCs using the Protocol Buffer definitions
  • generate default alerts for each service using the previously obtained metadata and insert these values into Consul KV (see the sketch after this list)
  • modify the existing regeneration scripts (which generate .cfg files for Nagios) to use the values from Consul KV
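For the Consul KV step, the official Go client (`github.com/hashicorp/consul/api`) is enough. A sketch of inserting one default alert value, under a hypothetical key layout:

```go
import consulapi "github.com/hashicorp/consul/api"

// writeDefaultAlert stores a default alert threshold for one RPC of a service.
// The "alerts/services/<service>/<rpc>" key layout is an assumption.
func writeDefaultAlert(service, rpc, threshold string) error {
    client, err := consulapi.NewClient(consulapi.DefaultConfig())
    if err != nil {
        return err
    }
    pair := &consulapi.KVPair{
        Key:   "alerts/services/" + service + "/" + rpc,
        Value: []byte(threshold),
    }
    _, err = client.KV().Put(pair, nil)
    return err
}
```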

Controllers Alerts

When a controller drifts outside its expected operating parameters, the system needs to:

  • generate alerts in order to signal the unexpected behavior;
  • scale that controller’s instances up in order to get the jobs processed.

Every type of controller has a set of parameters it needs to stay within, defining the number of instances or the size and speed of the input queue. These values used to be defined statically in some YAML configuration files. We decided to load them into Consul, so that their updated values can be fetched during the scaling process and at alert-generating time.

What we did was:

  • modify the auto-scaling and alert-generating mechanisms to use the values stored inside Consul; if Consul fails, we use the static values as a fallback (see the sketch below).
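A minimal sketch of that fetch-with-fallback logic (the key names and the static defaults are illustrative):

```go
import (
    "strconv"

    consulapi "github.com/hashicorp/consul/api"
)

// fetchParameter reads a controller parameter from Consul KV, falling back
// to the statically configured value when Consul is unreachable, the key is
// missing, or the stored value is malformed.
func fetchParameter(key string, fallback int) int {
    client, err := consulapi.NewClient(consulapi.DefaultConfig())
    if err != nil {
        return fallback
    }
    pair, _, err := client.KV().Get(key, nil)
    if err != nil || pair == nil {
        return fallback
    }
    value, err := strconv.Atoi(string(pair.Value))
    if err != nil {
        return fallback
    }
    return value
}
```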

To simplify the developer’s life even further, we decided to use a web app called Dynamic Config, previously developed during a Hackathon by some of our colleagues from the Bucharest office. It features a Flask server that accepts HTTP requests to modify or fetch the controllers’ Consul configurations, and a web UI where developers can easily change or read these values with just a few clicks.

Analytics: Graphite Plotting of Mongo Calls from Go

The problem

Graphite is a time-series database. It can plot different data over extended periods of time and it’s commonly used for monitoring the usage of different services. Our Python components were already reporting their Mongo calls to Graphite, but our Go components were not.

The design choice

We had multiple ideas about how to achieve this, but we decided from the start that:

  1. The solution should be easy to use (just plug and play)
  2. The developer should not have to worry about how it’s implemented and they should still be able to use all of the methods and fields available in the mgo (Mongo Go) package
  3. Code readability should not be affected at all

Since in our Python code this functionality was implemented using decorators, the first idea was to try to add them here, too. In Go they can be implemented with reflection. The major downside was that this affected readability, and that the decorator itself would be hard to understand for any maintainer who wanted to modify it later.

Another idea that came to mind was to define wrappers for the basic objects used from the mgo package. At first, we wanted to be able to write unit tests, so we had to mock the calls to Mongo by using an interface.

With anonymous fields (struct embedding), we were not required to implement all the methods on the wrapper type: if one has a `MongoDatabase` object and calls a method defined on the embedded implementation, that method is what actually gets called through the `MongoDBWrapperInterface`. Unfortunately, this design had a major flaw: if we wanted to access an attribute of `mgo.Database` (like `Name`) through the wrapper interface, it would not be visible at compile time, since interfaces cannot have fields. You would have to use a work-around like `db.(MongoDBWrapper).Name` or require a `getName()` method on the interface, which would make the code really ugly.
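To make the flaw concrete, here is an abridged sketch of the rejected design (the interface is reduced to one method for brevity; `Run` is promoted from the embedded `*mgo.Database`):

```go
import (
    "fmt"

    mgo "gopkg.in/mgo.v2"
)

// The mockable interface can only list methods: interfaces have no fields.
type MongoDBWrapperInterface interface {
    Run(cmd interface{}, result interface{}) error
}

// MongoDatabase embeds *mgo.Database, so its methods are promoted and the
// interface above is satisfied without re-implementing anything.
type MongoDatabase struct {
    *mgo.Database
}

func printName(db MongoDBWrapperInterface) {
    // fmt.Println(db.Name) // does not compile: Name is a field, not a method
    fmt.Println(db.(*MongoDatabase).Name) // the ugly type-assertion work-around
}
```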

After discussing with our mentors, we decided to switch to integration tests and avoid mocking for the moment. The final design looks like this:

```go
type GraphiteMonitoredDatabase struct {
    *mgo.Database
}

type GraphiteMonitoredCollection struct {
    *mgo.Collection
}

func (db *GraphiteMonitoredDatabase) C(name string) *GraphiteMonitoredCollection {
    return &GraphiteMonitoredCollection{db.Database.C(name)}
}
```

Some other functions are redefined too. Adding the calls to Graphite was pretty basic:

```go
func (c *GraphiteMonitoredCollection) Find(query interface{}) *mgo.Query {
    startTime := time.Now()
    result := c.Collection.Find(query)
    elapsedTime := time.Since(startTime).Seconds()
    addMongoCallMetrics(c, "find", elapsedTime)
    return result
}
```
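`addMongoCallMetrics` is where the measurement leaves the process. As a rough sketch, assuming Graphite’s plaintext protocol and an invented metric path and host (a real implementation would reuse one connection or a statsd client instead of dialing on every call):

```go
import (
    "fmt"
    "net"
    "time"
)

// addMongoCallMetrics reports the duration of one Mongo call to Graphite
// using the plaintext protocol: "<path> <value> <timestamp>\n".
func addMongoCallMetrics(c *GraphiteMonitoredCollection, op string, seconds float64) {
    conn, err := net.Dial("tcp", "graphite.example.com:2003") // hypothetical host
    if err != nil {
        return // metrics are best-effort: never fail the actual query
    }
    defer conn.Close()
    // c.Name is promoted from the embedded *mgo.Collection
    fmt.Fprintf(conn, "mongo.%s.%s %f %d\n", c.Name, op, seconds, time.Now().Unix())
}
```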

Using the wrappers instead of the actual mgo objects inside the code went smoothly as well. Here’s an example from the tests:

```go
func (suite *GraphiteMonitoredCollectionSuite) TestInsertAndFind() {
    t := suite.T()
    for i := 0; i < 5; i++ {
        suite.collection.Insert(&testType{TestInt: i})
    }
    var findResult []testType
    suite.collection.Find(bson.M{}).All(&findResult)
    assert.Equal(t, 5, len(findResult)) // all five inserted documents come back
}
```

The final result

Now all our Go components are also plotting their Mongo calls.


Of course, the 5 parties, a great team building trip, the everyday foosball and billiards championships, as well as the weekly basketball matches, might have also contributed to our cool experience :).

About the Authors

Monica and Alex are fourth-year students at University POLITEHNICA of Bucharest, Faculty of Automatic Control and Computer Science. They found their interest in technology during their high school years and decided to follow their passion as a future career. In her spare time, Monica likes playing ping pong and reading. Alex enjoys occasional scuba diving and has also been a professional dancer for 10 years. After a summer working on Hootsuite’s projects, they are prepared to meet all the challenges of their fourth year at university.
