Moving a Busy Kafka Cluster (Part 1)

Ralph Brendler
project44 TechBlog
Published in project44 TechBlog · Oct 5, 2022

At project44, we use Kafka for lots of things: logging, pipelines, event ingestion, streaming APIs, asynchronous job processing, and much more. We manage clusters ranging in size from 3 to 9 nodes, and our busiest stacks are processing 3–5 TB of data per day.

That’s a LOT of data, and it comes in 24/7/365.

When the Strimzi Kafka operator was introduced, we were really excited. The ability to have a rock-solid Kafka infrastructure backed by Kubernetes would not only reduce the demands on our DevOps team, but also make complex Kafka upgrades and configuration changes fast and simple.

The only question was: How can we safely migrate a busy Kafka cluster and dozens of Kafka-based services to an entirely new stack without losing messages, violating ordering guarantees, or incurring downtime?

The Usual Approach

Normally, when you are moving a Kafka cluster onto new hardware, the approach is to use replication to move the data. The process looks something like this:

  • Add a new broker to the cluster, hosted on the new hardware
  • Use the Kafka CLI tools to move the replicas from one of the old brokers to the new broker
  • Once all replicas have been moved, decommission the old broker and do a rolling restart of the cluster
  • Repeat for the next broker until all brokers are on the new hardware

It’s a time-consuming process, but fairly straightforward and easily scriptable.
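
For reference, the replica moves in the second step are driven by a plain JSON plan that you feed to the kafka-reassign-partitions tool with its --execute flag. A minimal sketch, with a made-up topic and broker IDs (broker 4 standing in for the new hardware):

{
  "version": 1,
  "partitions": [
    { "topic": "topic.name", "partition": 0, "replicas": [4, 2, 3] },
    { "topic": "topic.name", "partition": 1, "replicas": [2, 4, 3] }
  ]
}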

And it won’t work at all for moving to Strimzi.

The Strimzi Approach — XDCR

Since Strimzi manages the configuration of the Kafka cluster, the usual trick of expanding and contracting the cluster is not really an option. There was no way for us to inject nodes from an outside cluster into Strimzi, or vice-versa. We needed an approach that would allow us to run multiple Kafka clusters that acted as if they were a single cluster.

The solution we came up with was to use a fairly recent feature added to Kafka: cross-data-center replication (XDCR for short). The latest version of Kafka MirrorMaker (MirrorMaker 2, a.k.a. MM2) provides full support for XDCR, and Strimzi allows easy deployment and configuration of MM2 through its operators.

With XDCR, a cluster can define a list of topics that should be replicated to a different cluster (the list can be either a whitelist or a blacklist, and can use regular expressions). MM2 will forward any messages sent to these topics to the other cluster. To identify the replicated data, MM2 creates a new topic with a cluster-specific prefix for each shared topic.

Configuring XDCR

In our case, we alias the original cluster as "OLD", so its replicated topics show up on the new cluster as "OLD.*", and we define our MM2 replication rules so that messages sent to any topic on the old cluster are forwarded to the new Strimzi cluster. The Kubernetes deployment manifest defines a Strimzi KafkaMirrorMaker2 resource with the following stanza:

clusters:
  - alias: "OLD"
    bootstrapServers: {old cluster address}
  - alias: "NEW"
    bootstrapServers: {Strimzi cluster address}
mirrors:
  - sourceCluster: "OLD"
    targetCluster: "NEW"
    topicsPattern: ".*"
    sourceConnector:
      tasksMax: 64

This setup will forward any messages sent to topics on the old cluster to the new cluster with the prefix OLD. added to the topic name. For example, a message sent to topic.name on the old cluster will be replicated to the new cluster on topic OLD.topic.name.
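
For context, here is roughly what that stanza looks like inside a complete KafkaMirrorMaker2 resource. The metadata, replica count, and Kafka version below are illustrative placeholders, not our production values:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaMirrorMaker2
metadata:
  name: old-to-new-mirror   # hypothetical name
  namespace: kafka          # hypothetical namespace
spec:
  version: 3.2.0            # placeholder Kafka version
  replicas: 3
  # the Connect workers run against the target cluster, so connectCluster
  # points at the new Strimzi cluster's alias
  connectCluster: "NEW"
  clusters:
    - alias: "OLD"
      bootstrapServers: {old cluster address}
    - alias: "NEW"
      bootstrapServers: {Strimzi cluster address}
  mirrors:
    - sourceCluster: "OLD"
      targetCluster: "NEW"
      topicsPattern: ".*"
      sourceConnector:
        tasksMax: 64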

When this configuration is deployed, a Kafka Connect instance is created that will begin to move ALL data from the old cluster to the new. This will obviously put some stress on the Kafka cluster and use up a lot of network bandwidth, so we performed a series of experiments to determine how aggressively we could move the data without affecting the consumers of the cluster.

We eventually settled on using 64 tasks. This increased the CPU load on the old cluster’s brokers by about 15%, and the network I/O by about 40% (both fairly reasonable numbers), and did not measurably slow down the existing cluster’s response times.
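
Beyond tasksMax, the sourceConnector stanza also accepts arbitrary MirrorSourceConnector options through a config map, which is where further tuning would go. A small sketch with illustrative values (not our tuned production settings):

sourceConnector:
  tasksMax: 64
  config:
    # replication factor used for the mirrored topics created on the new cluster
    replication.factor: 3
    # how often MM2 rescans the source cluster for new topics and partitions
    refresh.topics.interval.seconds: 60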

Using this configuration, the initial sync took several hours, after which the load on the old cluster was insignificant.

Once the clusters were in sync, we measured the replication delay on the new cluster and found it quite manageable: even under heavy usage, the new cluster stayed within a few seconds of the old one.

Net Result

With the new mirroring configuration, the server side of our Kafka infrastructure migration was complete. We were running two full Kafka clusters, and mirroring all traffic from the old cluster to the new.

In the next installment, we'll talk about the second half of the migration: updating the clients to use the new infrastructure.

Ralph Brendler
project44 TechBlog

Principal software engineer with a long history of startup work. Primary focus is currently on scalability and distributed computing.