Making waves with Kafka Sonar: Leveraging Docker to streamline Kafka cluster monitoring and analysis
Co-authored by Michael Way, Steven Kim, and Kareem Saleh
Our team is thrilled to announce the launch of Kafka Sonar, the first-and-only Docker Desktop Extension built to enhance the Kafka developer experience. For developers monitoring or testing their Kafka clusters, Sonar offers an overview of cluster health via visualizations of 20 essential metrics, and archives the metrics for mid- or post-run retrieval and analysis.
If you’re already sold and keen to get started with Sonar, click here to open Docker Desktop and preview / install the Extension. The rest of this article covers the challenges Sonar was built to address, how to use it, and its roadmap for future development and adoption.
First, what’s Kafka?
While most businesses start up as monoliths, they wholly or partially adopt microservice architectures as they scale. Message queues facilitating inter-service messaging become bottlenecks for companies when traffic spikes.
“More than 80% of all Fortune 100 companies trust, and use Kafka.” (Apache Kafka project website)
Kafka (created at LinkedIn) addresses the scaling and reliability issues that plague traditional message queues. It’s a message broker built to ingest, commit, and persist Big Data. While a queue hands each real-time message to a consumer in order and does not retain it after processing, Kafka appends published messages to partitions, which are append-only logs. Each partition is replicated across multiple brokers, and partition data is persisted to disk. Services that consume committed messages subscribe to topics, each of which is split into one or more partitions. This pub-sub model enables high throughput (reported at 1–2 million messages / second) and asynchronous (read: non-blocking) communication between microservices.
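To make the log model concrete, here is a minimal in-memory sketch in Python. It is a toy, not Kafka's API: the `ToyTopic` class, its method names, and the CRC32-based partition choice are all assumptions made for illustration (Kafka's own partitioner and storage layer work differently). It only shows that messages are appended to a partition, get an increasing offset, and remain readable after they are consumed.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """Toy stand-in for a partitioned, append-only log (hypothetical, not Kafka's API)."""

    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # One append-only list per partition.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def publish(self, key: str, value: str) -> tuple:
        # Assumption: choose a partition by hashing the key with CRC32.
        index = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        self.partitions[index].append(value)
        offset = len(self.partitions[index]) - 1
        return index, offset

    def read(self, partition: int, offset: int) -> list:
        # Reading never deletes: a consumer can re-read from any offset.
        return self.partitions[partition][offset:]


topic = ToyTopic(num_partitions=3)
p, first_offset = topic.publish("order-42", "created")
_, second_offset = topic.publish("order-42", "paid")

# Two independent consumers keep their own position in the same log.
billing_view = topic.read(p, 0)
audit_view = topic.read(p, 1)
```

Because the same key always hashes to the same partition, both events for `order-42` land in one log in publish order, and each consumer decides where to start reading.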
Challenges with Kafka
Kafka is a great solution for scaling event-driven architectures. It’s also undeniably complex, which makes for a steep learning curve for engineers.
Systems utilizing Kafka are custom-designed for their business’s use cases. Since partitions can number in the millions at enterprise scale, they are spread across clusters of brokers, which coordinate how messages / events are written to partitions and how those partitions are replicated in order to distribute load. Debugging cluster issues as they arise and ensuring ongoing system health carries a costly maintenance overhead.
The complexity is compounded by a lack of transparency. Kafka ships with CLI tools, but no graphical interface. There is no standardized GUI available to visualize a cluster, let alone a system of clusters.
Why did we build Sonar?
Countless open source and paid (e.g., Confluent) platforms have been built to address the above issues. We realize there is no one-size-fits-all solution for every business’s Kafka needs. Sonar’s aim is not to visualize your cluster itself or automate maintenance tasks, but to display cluster activity over time and offer analytical, and eventually debugging, capabilities we’ve not seen in established platforms.
Real quick, what’s Docker?
Docker automates distributed deployment of software applications. An app is versioned and shipped with all its necessary dependencies and runtime environment configuration as a portable artifact called an image, which can be pulled from Docker Hub and run in isolation, on any OS, as a lightweight container.
Business applications comprise numerous services. Containerized services can be composed into a network and configured to communicate with each other, which lends itself to a microservices-like model of app design and deployment. That model, along with cross-OS compatibility, has gained Docker millions of users and a massive industry following.
Why build Sonar as a Docker Desktop Extension?
In May 2022, Docker launched its Desktop Extensions Marketplace, allowing developers to customize and extend the Docker ecosystem. You can search for, install, and run an Extension image as a container from Desktop.
Using Docker, Kafka brokers (i.e., servers) can be individually containerized, composed into a network, and configured to listen to other brokers as well as non-Kafka system components, creating a message broker-model microservices architecture.
Building Sonar as an Extension meant we could containerize it alongside the containerized Kafka cluster it would connect to, so a dev would be able to monitor and troubleshoot their clusters entirely from Docker Desktop. Additionally, we could easily deploy Sonar to Docker’s massive user base.
Sonar in action
Sonar visualizes activity of containerized Kafka clusters set up to expose JMX data. To begin, enter your cluster’s information into the Extension UI’s “Add A Connection” stepper flow. Your connection info will be saved under your account and displayed in an interactive grid on the homepage.
To connect Sonar to your newly added cluster, ensure it is running, select its corresponding row in the “Saved Connections” grid, and click “Connect to Selected Cluster”. Et voilà! The connection state will update from “Sonar is disconnected” to “Sonar is emitting metrics for the client [cluster’s client ID].” You will be taken to the Current Run Metrics screen where your data will start to render. When done testing, return to the “Saved Connections” page and click “Disconnect Running Cluster”.
You can download metrics for past runs of a cluster and even up-to-the-minute metrics for a currently running cluster connection. On the “Saved Connections” grid, select the cluster of interest and click “Download Latest Metrics”. You will receive a CSV of all metrics for all runs.
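Once you have the CSV, a few lines of Python are enough for a first pass at analysis. The sketch below is only illustrative: the column names `timestamp`, `metric`, and `value`, the long-format layout, and the sample rows are assumptions, not Sonar's documented export schema, so adjust them to the headers in your own file.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def summarize_metrics(csv_text: str) -> dict:
    """Return {metric: (min, mean, max)} from a long-format metrics CSV."""
    samples = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical column names; match them to your export's headers.
        samples[row["metric"]].append(float(row["value"]))
    return {
        name: (min(values), mean(values), max(values))
        for name, values in samples.items()
    }


# Made-up sample data, only to show the assumed shape of the input.
SAMPLE = """timestamp,metric,value
2023-01-01T00:00:00Z,messages_in_per_sec,100
2023-01-01T00:00:15Z,messages_in_per_sec,300
2023-01-01T00:00:00Z,offline_partitions,0
"""

summary = summarize_metrics(SAMPLE)
```

For a real export, read the downloaded file's text and pass it to `summarize_metrics` in place of `SAMPLE`.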
Sonar requires new users to sign up for an account. We ask for a username, not an email, because we don’t advertise or market any other product through the Extension. Sonar saves the configuration information you provide for each cluster to facilitate rapid re-connection for re-tests. Sonar also saves run metrics. You can delete a saved connection and its associated historical metrics at any time. Your account credentials, Kafka credentials, and metrics are stored exclusively in a containerized database on your machine. Sonar does not externally transmit any sensitive data. You have complete control.
You can find step-by-step technical instructions to get up and running here.
When a dev runs a containerized cluster, they can select any running broker in Docker Desktop’s Containers tab and inspect its Logs. Desktop does not aggregate logs across containers or distinguish stderr from stdout. A data-driven dev will be hard-pressed to efficiently isolate useful Kafka analytics or pick out errors from all the event output, even when it is divided up by broker.
In Sonar, a dev can monitor overall Kafka cluster health in real-time with numeric or time series visualizations. Current metrics include total and failed broker counts, message throughput, request times, total and failed partition counts, and local machine resource usage.
Docker could store activity (i.e., the non-numeric broker-by-broker logs) for a running cluster in volumes somewhere on the dev’s machine. Even if the logs were useful, the dev would have to find the volumes and manually share the logs with their team. Sonar, however, stores the scraped run datasets in a Timescale database under the dev’s account. Any team member with access to the account under which the cluster’s runs are stored will be able to download a CSV of past run metrics for that cluster.
Though we are confident Sonar’s cluster activity visualizations and metrics download function form a solid foundation, we have yet to incorporate aids for debugging. Per early feedback, additions we’re considering include:
- Metrics highlighting replication state (a key indicator of cluster data integrity), such as out-of-sync replicas and average replication factor
- Presenting useful partition metadata (leader, replicas, in-sync replicas, human-readable errors to diagnose why certain partitions failed)
- Providing consumer group offsets, by partition, to spot consumer lags
- A console UI displaying overall Kafka cluster errors for at-a-glance discovery (as Docker displays errors within the broker container logs)
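Two of the candidate metrics above reduce to simple arithmetic, sketched here in Python. This is not Sonar's implementation: the function names, input shapes, and sample numbers are hypothetical. Out-of-sync replicas are the brokers assigned to a partition that are missing from its in-sync replica list, and consumer lag for a partition is its log end offset minus the consumer group's committed offset.

```python
def out_of_sync_replicas(replicas: list, in_sync: list) -> list:
    """Broker IDs assigned to a partition but absent from its in-sync replicas."""
    in_sync_set = set(in_sync)
    return [broker for broker in replicas if broker not in in_sync_set]


def consumer_lag(end_offsets: dict, committed: dict) -> dict:
    """Per-partition lag: log end offset minus the group's committed offset."""
    # Assumption: a partition with no committed offset counts from offset 0.
    return {
        partition: end - committed.get(partition, 0)
        for partition, end in end_offsets.items()
    }


# Made-up numbers for a three-partition topic.
end_offsets = {0: 1500, 1: 1480, 2: 900}
committed = {0: 1500, 1: 1200}
lag = consumer_lag(end_offsets, committed)
missing = out_of_sync_replicas(replicas=[1, 2, 3], in_sync=[1, 3])
```

A lag that keeps growing on one partition points to a slow or stalled consumer, while a non-empty out-of-sync list means that partition currently has less redundancy than its replication factor promises.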
Interested in contributing?
We urge Kafka developers to utilize Sonar in their workflow and provide product feedback via GitHub issues. We encourage potential contributors to visit Sonar’s Roadmap to see a list of possible codebase enhancements and new features we’re looking to implement in the near future. Thank you in advance for your support!
- Michael Way || LinkedIn, GitHub
- Steven Kim || LinkedIn, GitHub
- Upasana Natarajan || LinkedIn, GitHub
- Kareem Saleh || LinkedIn, GitHub
Kafka Sonar is an open source product developed under tech accelerator OSLabs. It is not officially sponsored by Apache Kafka.