Monitoring Akka Streams Kafka (Alpakka) Apps with Prometheus in Kubernetes

Scraping Consumer and Producer Metrics from any Scala or Java App

Jeroen Rosenberg
Jul 4, 2019 · 6 min read

TL;DR

This post focuses on monitoring your Kafka deployment in Kubernetes with Prometheus. Kafka exposes its metrics through JMX, and so do applications built with its Java SDK. To have those metrics pulled in by Prometheus we need a way to extract them via the JMX protocol and expose them over HTTP. This is where JMX Exporter comes in handy. It's quite effective to run it as a sidecar in your Kafka client application pods and have Prometheus scrape them using scrape annotations. For the impatient: all sample code is available here.

In my previous article "Monitoring Kafka in Kubernetes" I mainly focused on monitoring the server side of Kafka, while in this post we're going to have a look at gathering and plotting its client-side metrics.

Why Monitor Kafka Client Applications

Message passing is becoming an increasingly popular choice for sharing data between different apps, making tools like Kafka the backbone of your architecture. A well-functioning Kafka cluster can handle lots of data, but poor performance or degraded cluster health will likely cause issues across your entire stack. Hence, it's crucial to stay on top of this and have dashboards available that provide the necessary insights.

Producer and Consumer metrics out-of-the-box

The Kafka Java SDK provides a vast array of metrics on performance and resource utilisation, which are (by default) available through a JMX reporter. It took me a while to figure out which metrics are available and how to access them. The fact that they changed a few times across Kafka releases didn't really help. Confluent provides a nice (and mostly correct) overview of the available metrics in the more recent Kafka versions. Kafka client metrics can be broken down into two categories: producer metrics and consumer metrics.

As mentioned in my previous post there's a nice write-up on which metrics are important to track per category. For producers these would be things such as request/response rates and byte rates. For consumers the obvious ones are consumer lag and commit rates.

Before we dive into getting those metrics, let's first set up our monitoring infrastructure. Our reporting backend of choice will be Prometheus, with a Grafana frontend for building dashboards. As a platform we'll assume Kubernetes, which in my case is running on GKE. In the next paragraphs we'll walk through the required steps to get up and running.

Setting up Prometheus on Kubernetes

Plenty has been written on how to set up Prometheus on Kubernetes, so we'll keep this short and concise. There are Helm charts available, and there's also a Prometheus Operator; both get you going in a couple of minutes. For completeness I'll include a minimalistic sample of my setup here.
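A minimal sketch of the ConfigMap holding the Prometheus configuration could look like the following (resource names, the scrape interval and the exact relabel rules are illustrative, not necessarily identical to my actual setup):

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
    scrape_configs:
      # discover pods that carry Prometheus scrape annotations
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          # keep only pods annotated with prometheus.io/scrape: "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"
          # by convention, only scrape container ports named "metrics"
          - source_labels: [__meta_kubernetes_pod_container_port_name]
            action: keep
            regex: metrics
          # allow overriding the metrics path via the prometheus.io/path annotation
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          # copy pod labels (e.g. name) onto the scraped series for filtering later
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
      # discover services that carry Prometheus scrape annotations
      - job_name: kubernetes-services
        kubernetes_sd_configs:
          - role: service
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: "true"

A Deployment labeled name=prometheus that mounts this ConfigMap, plus a Service called prometheus exposing port 9090, complete the setup; that Service name is what we'll later point Grafana at.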

Note that the scrape_configs section in the ConfigMap contains jobs called kubernetes-pods and kubernetes-services. These contain the configuration required to discover pods and services, respectively, that carry Prometheus scrape annotations. We'll get back to that later.

Setting up Grafana on Kubernetes

There are Helm charts and Kubernetes Operators available for Grafana as well, but the Grafana setup is even simpler than the Prometheus one, so I'll include it down here.
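A minimal sketch of such a Grafana Deployment and Service could look like this (image tag and resource names are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      name: grafana
  template:
    metadata:
      labels:
        name: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana
          ports:
            - containerPort: 3000
              name: http
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  selector:
    name: grafana
  ports:
    - port: 3000
      targetPort: 3000

The Service makes Grafana reachable inside the cluster; for this walkthrough we'll simply port-forward to the pod instead of exposing it externally.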

Setting up Kafka on Kubernetes

For Kafka we can use the same setup as in my previous post.

We also need ZooKeeper.
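Since the full manifests live in the previous post, here's a condensed, single-broker sketch of both (the image choices and wurstmeister-style environment variables are assumptions; a production setup would use StatefulSets and persistent volumes):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zookeeper
spec:
  replicas: 1
  selector:
    matchLabels:
      name: zookeeper
  template:
    metadata:
      labels:
        name: zookeeper
    spec:
      containers:
        - name: zookeeper
          image: zookeeper:3.5
          ports:
            - containerPort: 2181
---
apiVersion: v1
kind: Service
metadata:
  name: zookeeper
spec:
  selector:
    name: zookeeper
  ports:
    - port: 2181
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      name: kafka
  template:
    metadata:
      labels:
        name: kafka
    spec:
      containers:
        - name: kafka
          image: wurstmeister/kafka
          ports:
            - containerPort: 9092
          env:
            - name: KAFKA_ZOOKEEPER_CONNECT
              value: zookeeper:2181
            - name: KAFKA_LISTENERS
              value: PLAINTEXT://0.0.0.0:9092
            - name: KAFKA_ADVERTISED_LISTENERS
              value: PLAINTEXT://kafka:9092
---
apiVersion: v1
kind: Service
metadata:
  name: kafka
spec:
  selector:
    name: kafka
  ports:
    - port: 9092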

Alright, now that we have a working Kafka cluster we might want to put some data in it. You could do that manually by connecting to the cluster with a kafka-console-producer and entering some records, or you could for instance set up a CronJob to periodically pull in some data from an external source. For example, consider the following CronJob.
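A sketch of such a CronJob could look like this (the schedule and image are illustrative; the image is assumed to ship curl, bash and kafkacat, and the Mockaroo collection id and API key are placeholders):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: orders-producer
spec:
  schedule: "*/5 * * * *"          # every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: orders-producer
              # placeholder: any image that contains curl, bash and kafkacat will do
              image: <kafkacat-image-with-curl>
              command:
                - /bin/bash
                - -c
                # pull a random number of mock JSON orders from Mockaroo and pipe them into the orders topic
                - >
                  curl -s "https://api.mockaroo.com/api/<collection-id>?count=$((RANDOM % 100 + 1))&key=<api-key>"
                  | kafkacat -P -b kafka:9092 -t orders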

The script requests a "random" number of JSON records from a Mockaroo collection and pipes them into a Kafka topic called orders using kafkacat.

Creating a Kafka Client with Alpakka

Now that we've got data flowing through our Kafka cluster, let's create a small Alpakka application to consume it. The Alpakka Kafka connector, formerly known as Akka Streams Kafka or Reactive Kafka, lets us connect Kafka to Akka Streams.

We'll start off with a basic build.sbt defining the one and only dependency.
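A sketch of that build file could look like this (version numbers are indicative for mid-2019, and the packaging plugin is an assumption to make the docker:publish step below work):

// build.sbt
name := "orders-consumer"

scalaVersion := "2.12.8"

// the one and only dependency: the Alpakka Kafka connector
libraryDependencies += "com.typesafe.akka" %% "akka-stream-kafka" % "1.0.4"

// enable sbt-native-packager's Docker support so `sbt docker:publish` works later;
// this assumes addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.3.25") in project/plugins.sbt
enablePlugins(JavaAppPackaging)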

Then we'll create a simple OrderConsumer object which sets up the consumer.
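A minimal sketch of that object could look like this (the bootstrap server address and group id are assumptions matching the Kafka Service defined earlier):

import akka.actor.ActorSystem
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer

object OrderConsumer extends App {

  implicit val system: ActorSystem = ActorSystem("orders-consumer")
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  // consumer settings; in a real app these would come from configuration
  val consumerSettings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("kafka:9092")
      .withGroupId("orders-consumer")
      .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  // plain source with an auto subscription on the orders topic; print every record value to STDOUT
  Consumer
    .plainSource(consumerSettings, Subscriptions.topics("orders"))
    .map(_.value)
    .runWith(Sink.foreach(println))
}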

That wasn't too hard, was it? We've just created a plain source consumer with an auto subscription on the orders topic (to which we produce data with our CronJob). Let's build and publish a Docker image of our new app.

$ sbt docker:publish

Now we can write a simple Kubernetes deployment descriptor so we can deploy our app. We'll make sure to enable JMX on port 9999 by setting the appropriate flags in the JAVA_OPTS environment variable:

env:
  - name: JAVA_OPTS
    value: "-Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"

This ensures the consumer metrics from the Kafka Java SDK are exposed as JMX beans.

Scraping Kafka metrics from our pod

Now we need to make sure Prometheus can pull in the Kafka consumer metrics. Prometheus pulls metrics from HTTP endpoints, so we need a bridge that reads the metrics from JMX and serves them over HTTP. This is where JMX Exporter comes in. We'll add it as a sidecar to our deployment and configure it to query localhost:9999, since that's where our JMX beans are accessible. We'll expose containerPort 5556, the default port JMX Exporter exposes the metrics on, and name it metrics.

name: "jmx-exporter"          
image: sscaling/jmx-prometheus-exporter
ports:
- containerPort: 5556,
name: metrics,
protocol: TCP

By convention Prometheus will scrape pods on ports named metrics and query them on the /metrics path. This path can be overridden with an annotation, but since the default is respected by JMX Exporter all we need to do is enable the scraping by means of an annotation:

metadata:
  labels:
    name: orders-consumer
  annotations:
    prometheus.io/scrape: 'true'

The full descriptor file ties all of the above together.
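A sketch of such a descriptor, wiring together the consumer container, the JMX Exporter sidecar and the scrape annotation, could look like this (the app image reference is a placeholder, and the exporter's config file path is assumed to be /opt/jmx_exporter/config.yml, which depends on the exporter image you use):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      name: orders-consumer
  template:
    metadata:
      labels:
        name: orders-consumer
      annotations:
        prometheus.io/scrape: 'true'
    spec:
      containers:
        # the Alpakka consumer app, with JMX enabled on port 9999
        - name: orders-consumer
          image: <your-registry>/orders-consumer:latest   # published via sbt docker:publish
          env:
            - name: JAVA_OPTS
              value: "-Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
        # JMX Exporter sidecar bridging the JMX beans to an HTTP /metrics endpoint
        - name: jmx-exporter
          image: sscaling/jmx-prometheus-exporter
          ports:
            - containerPort: 5556
              name: metrics
              protocol: TCP
          volumeMounts:
            - name: jmx-exporter-config
              mountPath: /opt/jmx_exporter/config.yml
              subPath: config.yml
      volumes:
        - name: jmx-exporter-config
          configMap:
            name: jmx-exporter-config
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: jmx-exporter-config
data:
  config.yml: |
    # point the exporter at the JMX port of the consumer container in the same pod
    hostPort: localhost:9999
    rules:
      - pattern: ".*"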

After deploying this we should see the fake orders being consumed from the orders topic and printed to the container's STDOUT logs. Our JMX Exporter should already have collected and exposed some meaningful metrics. Let's have a look at Prometheus to see if it managed to collect some data from our app. Pro tip: you can port-forward to the Prometheus pod for easy access: kubectl port-forward $(kubectl get po -lname=prometheus -o jsonpath="{.items[0].metadata.name}") 9090:9090 will make Prometheus accessible on http://localhost:9090. We should see that Prometheus picked up our pod as a scrape target.

Prometheus found our pod as a scrape target

Thanks to the pod metadata Kubernetes provides, we can also see the labels, which we can use for filtering later on. When we turn to the expression browser, we can see that all sorts of Kafka metrics are already indexed.

Kafka Consumer metrics from Alpakka app

If we select one metric, we can see that many labels are available for filtering, which will come in handy once we add more Kafka consumer (or producer) scrape targets later on.

Sample Kafka consumer metric with labels

Visualising Kafka consumer metrics in Grafana

Awesome! Now we can create some meaningful charts in Grafana. The easiest way is to use port forwarding again: kubectl port-forward $(kubectl get po -lname=grafana -o jsonpath="{.items[0].metadata.name}") 3000:3000. Navigate to http://localhost:3000/datasources and add a Prometheus datasource. Thanks to our Service we can point it to http://prometheus:9090. If we now create a dashboard based on the new datasource, we can see the data from the orders-consumer app flowing in!

Remember, you can use the labels to filter or drill down. That way, you only need one dashboard for all your Kafka consumer apps.

Conclusion

It's very easy to monitor more Kafka consumer and/or producer apps. Any application using the Kafka Java SDK is eligible for scraping metrics through JMX Exporter (including MirrorMaker instances). There's really no excuse for not keeping an eye on your consumers and producers.

Thanks for reading! All sample code is available on my GitHub.
