An intimate look at your Kafka cluster with Klustr

Katrina Villanueva
4 min read · Feb 25, 2021


Get to know your Kafka cluster by using the newly launched open-source visualization and monitoring tool, Klustr.

Credit: @ThisisEngineering RAEng

Before diving into Klustr, let’s start from the core of it all by asking… what is event streaming?

Patience is a virtue, except when it comes to technology. Users expect instant gratification at every turn, and real-time feedback is the gold standard. You deposit a check through your phone and *boom*, your account balance is automatically updated. You are currently watching “You’ve Got Mail” on Netflix and *boom*, your recommended movies are now all rom-coms. You are waiting for the bus, send a quick status check over text and *boom*, you receive a response that you missed it by a minute, and “Drat!” now you have to walk instead.

Event streaming is responsible for all these light-speed luxuries. It captures data (aka events) in real time from sources such as databases, cloud services and applications, then stores, processes and routes those events in a continuous flow so they get to where they need to go, immediately.

Apache Kafka is the leader in this event-streaming world. With client businesses running full-speed all day, every day, this open-source platform keeps up with their demands by automating the data transmission process in a highly scalable, fault-tolerant, and secure little package.¹

Ok… but really… how does it work?

At its core, Kafka is a distributed system consisting of a cluster of one or more servers, paired with clients that communicate with that cluster. Let’s break that down further, shall we?

Producers write events and send them to the Kafka cluster. ➜ Events enter the storage layer, which is made up of servers called brokers. ➜ Within the brokers, events are categorized by Topics. ➜ Topics are Partitioned into different “buckets” spread out across the different brokers. ➜ Consumers subscribe to Topics and read the events in them.
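
To make the Topic ➜ Partition ➜ Broker relationship concrete, here is a minimal toy sketch. Every name in it (`assignPartitions`, `partitionForKey`, the `orders` topic) is hypothetical, and the string hash is only a stand-in: Kafka's real default partitioner uses a different hash, and real partition placement is decided by the cluster, not by code like this.

```typescript
// Toy model of topics, partitions and brokers.
// Assumption: all names are illustrative; this is not Kafka's or Klustr's code.

interface PartitionAssignment {
  topic: string;
  partition: number;
  leaderBroker: number; // id of the broker that hosts this partition's leader
}

// Spread a topic's partitions across the available brokers round-robin.
function assignPartitions(
  topic: string,
  partitionCount: number,
  brokerIds: number[]
): PartitionAssignment[] {
  if (brokerIds.length === 0) {
    throw new Error("at least one broker is required");
  }
  const assignments: PartitionAssignment[] = [];
  for (let p = 0; p < partitionCount; p++) {
    assignments.push({
      topic: topic,
      partition: p,
      leaderBroker: brokerIds[p % brokerIds.length],
    });
  }
  return assignments;
}

// Pick a partition ("bucket") for an event key.
// Assumption: a simple string hash stands in for Kafka's real partitioner.
// The point is only that the same key always lands in the same partition.
function partitionForKey(key: string, partitionCount: number): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % partitionCount;
}

// Usage: one topic with 4 partitions on a 3-broker cluster.
const orderAssignments = assignPartitions("orders", 4, [1, 2, 3]);
const orderPartition = partitionForKey("user-42", 4);
```

Because the partition is derived from the key, every event for `user-42` goes to the same bucket, which is what lets a consumer read that user's events in order.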

Because thousands of companies rely on Kafka’s powerful message queueing system for high-performance data pipelines, streaming analytics and data integration, there is no room for errors! Software engineers have to moonlight as doctors to make sure their clusters are in great health, and for that they need a reliable and user-friendly monitoring tool to handle the job.

Enter Klustr.

Klustr is a graphical user interface for navigating and monitoring pertinent metrics related to your Kafka application’s rate of message consumption and data usage, with results displayed in real time.

Using KafkaJS, Klustr grabs cluster information and displays it in easily comprehensible tables. Are you curious to know how many Topics and Consumers there are? Klustr’s got you. Do you want to see what partitions there are in each topic and what their current offsets are? Klustr will deliver. Interested in seeing a relational view of the Cluster and its Brokers? Klustr lays it all out.
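
As a rough illustration of turning cluster information into a table, the sketch below formats per-partition offsets into rows. It assumes input shaped like the result of the KafkaJS admin client's `fetchTopicOffsets(topic)` call (a `partition` number plus `offset`, `high` and `low` as strings). The `toOffsetRows` helper and the row fields are hypothetical and are not taken from Klustr's source.

```typescript
// Assumption: this shape mirrors what KafkaJS's admin.fetchTopicOffsets(topic)
// resolves to. Offsets are strings there, so they are strings here too.
interface PartitionOffsets {
  partition: number;
  offset: string;
  high: string; // latest offset (high watermark)
  low: string;  // earliest offset still stored
}

// Hypothetical row shape for a table view.
interface OffsetRow {
  topic: string;
  partition: number;
  earliest: string;
  latest: string;
  offsetSpan: number; // latest minus earliest
}

// Turn raw per-partition offsets into sorted, display-ready rows.
function toOffsetRows(topic: string, offsets: PartitionOffsets[]): OffsetRow[] {
  return offsets
    .slice() // copy so the caller's array is not reordered
    .sort(function (a, b) { return a.partition - b.partition; })
    .map(function (o) {
      return {
        topic: topic,
        partition: o.partition,
        earliest: o.low,
        latest: o.high,
        // Number() is fine for a sketch; very large offsets would need BigInt.
        offsetSpan: Number(o.high) - Number(o.low),
      };
    });
}

// Usage with made-up sample data for a hypothetical "orders" topic.
const rows = toOffsetRows("orders", [
  { partition: 1, offset: "500", high: "500", low: "0" },
  { partition: 0, offset: "31004", high: "31004", low: "421" },
]);
console.log(rows);
```

In a live application the array would come from a connected KafkaJS admin client rather than hard-coded samples; that wiring is left out here so the formatting logic stays self-contained.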

But you don’t only care about cluster information, right? You’d want to know if your Kafka cluster is healthy! Klustr uses JMX Exporter to reveal additional metrics that will convey exactly that. The top 3 are:

  • Active Controllers: There can only be one! The controller is the broker that manages partition state and coordinates the election of partition leaders, so you should worry if this count is any value other than 1.
  • Under-replicated Partitions: In a healthy cluster, the number of in-sync replicas should equal the total number of replicas. If a broker becomes unavailable, those numbers will no longer match and this metric will rise, warranting an immediate investigation. Make sure it stays at 0!
  • Offline Partitions: This metric reports the number of partitions without an active leader. Because all read and write operations are only performed on partition leaders, you should alert on any non-zero value for this metric to prevent service interruptions!²
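
The three rules above can be sketched as a small check. This is only an illustration under stated assumptions: the `ClusterHealthSample` shape and `findHealthProblems` function are hypothetical names, and the sample values are made up. The comments name the Kafka JMX MBeans these numbers typically come from, but how Klustr reads and evaluates them is not shown in this article.

```typescript
// Hypothetical snapshot of the three health metrics discussed above.
interface ClusterHealthSample {
  // kafka.controller:type=KafkaController,name=ActiveControllerCount
  // (summed across all brokers in the cluster)
  activeControllerCount: number;
  // kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
  underReplicatedPartitions: number;
  // kafka.controller:type=KafkaController,name=OfflinePartitionsCount
  offlinePartitionsCount: number;
}

// Return a human-readable list of problems; an empty list means healthy.
function findHealthProblems(sample: ClusterHealthSample): string[] {
  const problems: string[] = [];
  if (sample.activeControllerCount !== 1) {
    problems.push(
      "expected exactly 1 active controller, saw " + sample.activeControllerCount
    );
  }
  if (sample.underReplicatedPartitions > 0) {
    problems.push(sample.underReplicatedPartitions + " under-replicated partition(s)");
  }
  if (sample.offlinePartitionsCount > 0) {
    problems.push(sample.offlinePartitionsCount + " offline partition(s)");
  }
  return problems;
}

// Usage with made-up values: one healthy snapshot, one degraded snapshot.
const healthyReport = findHealthProblems({
  activeControllerCount: 1,
  underReplicatedPartitions: 0,
  offlinePartitionsCount: 0,
});
const degradedReport = findHealthProblems({
  activeControllerCount: 1,
  underReplicatedPartitions: 12,
  offlinePartitionsCount: 0,
});
console.log(healthyReport, degradedReport);
```

Returning a list of problems rather than a single boolean keeps every failing rule visible at once, which is closer to what a dashboard needs.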

These are only a few of the available metrics to peruse. Others, such as broker topic bytes in and out, CPU usage, disk read and write usage, producer requests and the total time to service a request, also help create a snapshot of how well your cluster is doing. All of these metrics are displayed in graphs that update in real time, just like your live data streams!

By intimately understanding an Apache Kafka cluster with Klustr, users can expect accurate, meaningful metrics to keep track of system health. They can easily see troublesome states and leverage this information to ensure optimal performance.

Please give Klustr a try and let us know about any features you’d like to see us implement! We look forward to receiving feedback! To download the app or to contribute to this open-source project, please head on over to Klustr on GitHub! If you like this article, please throw some 👏 our way and check out klustr.app for more information.

Klustr Engineers:
Shah Chaudri, GitHub & LinkedIn
Paul Kim, GitHub & LinkedIn
Eric Tacher, GitHub & LinkedIn
Cris Newsome, GitHub & LinkedIn
Katrina Villanueva, GitHub & LinkedIn

[1]: Apache. Apache Kafka Introduction. https://kafka.apache.org/intro

[2]: Datadog. Monitoring Kafka performance metrics. https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/
