How we monitor asynchronous tasks

Pierre B.
Jan 21, 2018 · 2 min read

In this post, I will walk you through how we monitor our Celery workers. We use Bucky to collect metrics and push them into Graphite, which is then queried by Grafana.


Celery monitoring

We collect metrics by subscribing to Celery's task signals. We mostly use four of them (in Celery each name carries a task_ prefix, e.g. task_prerun):

  • prerun: before execution
  • postrun: after execution
  • success: on success
  • failure: on failure

Pretty straightforward.

You then just need a Bucky instance to push metrics to your Graphite database.
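To make the idea concrete, here is a minimal sketch of what subscribing to those four signals could look like. It is not our production code: it assumes Bucky is listening for StatsD packets on localhost:8125, and the metric prefix and helper names (counter_packet, make_handler, connect_signals) are made up for the example.

```python
import socket

# Assumptions (not from the original setup): Bucky's StatsD listener runs on
# localhost:8125 and metrics live under a hypothetical "celery.tasks" prefix.
STATSD_ADDR = ("127.0.0.1", 8125)
PREFIX = "celery.tasks"

_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)


def send_udp(packet):
    """Fire-and-forget: a failed send must never break a task."""
    try:
        _sock.sendto(packet, STATSD_ADDR)
    except OSError:
        pass


def counter_packet(task_name, event):
    """Build a StatsD counter increment for one task event."""
    safe_name = task_name.replace(".", "_")  # one Graphite node per task
    return f"{PREFIX}.{safe_name}.{event}:1|c".encode("utf-8")


def make_handler(event, send=send_udp):
    """Return a signal receiver that counts one `event` for the sending task."""
    def handler(sender=None, **kwargs):
        # For these four signals, Celery passes the task object as the sender.
        task_name = getattr(sender, "name", "unknown")
        send(counter_packet(task_name, event))
    return handler


def connect_signals(send=send_udp):
    """Subscribe to the four task signals; call once when the worker starts."""
    from celery import signals  # imported lazily so the helpers stay stdlib-only

    for event, signal in (
        ("prerun", signals.task_prerun),
        ("postrun", signals.task_postrun),
        ("success", signals.task_success),
        ("failure", signals.task_failure),
    ):
        # weak=False: the handlers are closures, so Celery has to hold a
        # strong reference or they would be garbage-collected.
        signal.connect(make_handler(event, send), weak=False)
```

Calling connect_signals() from the module that defines the Celery app is enough for every task run to emit one counter per signal. Passing a different send callable makes the handlers easy to exercise without a network.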

Below is an example of a dashboard showing the metrics of a set of tasks.

Task monitoring

This dashboard carries three pieces of information:

  • is it working?
  • timeline of events
  • task success and failure volumes

The timeline of events is a graph panel plotting signal counts over time, and the volumes are singlestat panels showing signal counts.

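The "is it working?" panel uses Graphite's diffSeries to compute the difference between the number of tasks that succeeded (success signal) and the number of tasks that started (prerun signal). diffSeries subtracts the later series from the first, so with the success series first the result is 0 when every started task succeeded and negative otherwise. Map values to text, i.e. OK=0 and NOK<0, and map values to background colors, i.e. green=0 and red<0. See Grafana's value-to-text mapping and coloring options.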
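As an illustration of that panel's logic, here is a small sketch that builds the Graphite target and applies the same mapping in Python. The metric paths and the graphite.example host are hypothetical (the real paths depend on how Bucky names your series), and treating positive values as OK is my assumption, since the mapping above only covers 0 and negative values.

```python
from urllib.parse import urlencode

# Hypothetical metric paths; adjust to whatever Bucky writes into Graphite.
SUCCESS = "sumSeries(celery.tasks.*.success)"
PRERUN = "sumSeries(celery.tasks.*.prerun)"


def health_target(success=SUCCESS, prerun=PRERUN):
    """diffSeries subtracts every later series from the first: success - prerun."""
    return f"diffSeries({success},{prerun})"


def render_url(base_url, target, window="-1h"):
    """Build a Graphite render API URL that returns JSON for `target`."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def status(value):
    """Mirror the panel mappings: 0 -> OK/green, negative -> NOK/red.

    Positive values are not covered by the mapping; treating them as OK
    is an assumption of this sketch.
    """
    if value < 0:
        return ("NOK", "red")
    return ("OK", "green")
```

For example, render_url("http://graphite.example", health_target()) gives a URL you can paste into a browser to check the series before wiring it into a singlestat panel.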

Broker monitoring

In addition to monitoring tasks, we also monitor workers using broker metrics.

We have a homemade script that uses Celery's event receiver and the Tornado IO loop.
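The sketch here is not that script. It is an independent, minimal illustration of the same idea, and it deliberately swaps the Tornado IO loop for Celery's blocking capture loop to stay short. The metric prefix, the heartbeat_packets helper and the send callable are assumptions made for the example; the active and processed fields come from Celery's worker-heartbeat event.

```python
def _node(hostname):
    """Turn 'celery@web1.local' into a single Graphite-safe path node."""
    return hostname.replace("@", "_").replace(".", "_")


def heartbeat_packets(event, prefix="celery.workers"):
    """Convert one worker-heartbeat event into StatsD gauge packets.

    `active` is the number of tasks currently executing on the worker and
    `processed` is the total it has processed so far.
    """
    host = _node(event["hostname"])
    return [
        f"{prefix}.{host}.{field}:{event.get(field, 0)}|g".encode("utf-8")
        for field in ("active", "processed")
    ]


def monitor(broker_url, send):
    """Block forever, forwarding worker heartbeats through `send`.

    `send` is any callable that accepts a bytes packet, for instance a
    function that writes it to Bucky's StatsD port over UDP.
    """
    from celery import Celery  # imported lazily so the helpers stay stdlib-only

    app = Celery(broker=broker_url)

    def on_heartbeat(event):
        for packet in heartbeat_packets(event):
            send(packet)

    with app.connection() as connection:
        receiver = app.events.Receiver(
            connection, handlers={"worker-heartbeat": on_heartbeat}
        )
        # wakeup=True asks all workers to send a heartbeat right away.
        receiver.capture(limit=None, timeout=None, wakeup=True)
```

Because capture blocks, this version needs its own process. Running the receiver inside an IO loop, as our script does with Tornado, avoids that at the cost of more plumbing.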

Below is an example of a dashboard showing the metrics of a set of workers.

Broker monitoring

Our monitoring screens carry very valuable information, which is why we put TV screens on the walls: they spread that information across the office to technical and non-technical people alike.

Grafana is a beautiful tool and has helped us tremendously in monitoring asynchronous tasks, micro-services, databases and servers.

MeilleursAgents Engineering

MeilleursAgents Engineering Teams (Product, Web & Data Teams)

Thanks to Pierrick.

Pierre B.

Engineering manager @MeilleursAgents