Apache Kafka Guide #52 Kafka Connect: Distributed Mode Rebalance

Paul Ravvich
Apache Kafka At the Gates of Mastery
3 min read · May 28, 2024

Hi, this is Paul, and welcome to part #52 of my Apache Kafka guide. Today we will discuss how a rebalance works in Kafka Connect distributed mode.

Here’s a detailed example of what this distributed architecture looks like.

Here is your Kafka Connect cluster. It has four workers.

Remember, each worker is a process, and usually each process runs on its own server. So, in this case, we have four workers, four processes, and four servers.

Here is my first connector. It consists of three tasks: task one, task two, and task three. As you can see, these tasks are distributed among workers: workers one, three, and four. This is just an example, as the actual distribution can vary in real scenarios.

Next, we have our second connector. It could be another source connector, and it has two tasks: task one and task two.

Finally, here is our third connector. It could be a sink connector and it comprises four tasks. These tasks are executed on workers one, two, three, and four.
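To keep the layout straight, it can help to write it down as data. Below is a minimal Python sketch of the example cluster. The names (worker-1, connector-1, and so on) are just labels for this article, and the workers for connector 2 are my assumption, since the example does not pin them down.

```python
from collections import defaultdict

# Task placement from the example, written as (connector, task) -> worker.
# Connector 2's workers (2 and 3) are an assumption for illustration.
ASSIGNMENT = {
    ("connector-1", 1): "worker-1",
    ("connector-1", 2): "worker-3",
    ("connector-1", 3): "worker-4",
    ("connector-2", 1): "worker-2",  # assumed
    ("connector-2", 2): "worker-3",  # assumed
    ("connector-3", 1): "worker-1",
    ("connector-3", 2): "worker-2",
    ("connector-3", 3): "worker-3",
    ("connector-3", 4): "worker-4",
}


def tasks_by_worker(assignment):
    """Invert the mapping: worker -> sorted list of (connector, task)."""
    grouped = defaultdict(list)
    for task, worker in assignment.items():
        grouped[worker].append(task)
    return {worker: sorted(tasks) for worker, tasks in sorted(grouped.items())}


if __name__ == "__main__":
    for worker, tasks in tasks_by_worker(ASSIGNMENT).items():
        print(worker, tasks)
```

Looking at the cluster per worker makes it obvious which tasks are at risk if a given worker goes away.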

Now suppose your fourth worker dies: the server crashes, or it loses its network connection. The exact cause may be unclear, but the worker is lost. In such cases, a rebalance occurs, similar to the rebalance that happens when a consumer in a consumer group dies. Workers in a distributed Kafka Connect cluster form a group (they share the same group.id) and use the same group membership protocol as consumers, so when a worker dies, a rebalance is triggered within that worker group. Let’s examine what happens next.

Rebalance

Task number three of connector number one has been moved from worker four to worker two. Similarly, task number four of connector number three has been moved from worker four to worker one. As you can see, worker one is now handling two tasks of connector three. This is acceptable, as multiple tasks of the same connector can be assigned to the same worker.
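Here is a rough Python sketch of the idea behind this reassignment. It is not Kafka Connect's actual assignment algorithm. It is a simple greedy rule I picked for illustration: each orphaned task goes to the live worker with the fewest tasks. The worker and connector names are labels for this article, and connector 2's placement is assumed.

```python
from collections import Counter

WORKERS = ["worker-1", "worker-2", "worker-3", "worker-4"]

# (connector, task) -> worker before the failure.
# Connector 2's placement is an assumption for illustration.
BEFORE = {
    ("connector-1", 1): "worker-1",
    ("connector-1", 2): "worker-3",
    ("connector-1", 3): "worker-4",
    ("connector-2", 1): "worker-2",
    ("connector-2", 2): "worker-3",
    ("connector-3", 1): "worker-1",
    ("connector-3", 2): "worker-2",
    ("connector-3", 3): "worker-3",
    ("connector-3", 4): "worker-4",
}


def rebalance_after_failure(assignment, workers, dead_worker):
    """Return a new assignment with the dead worker's tasks moved elsewhere.

    Greedy rule (illustrative only): each orphaned task goes to the live
    worker that currently has the fewest tasks, ties broken by name.
    Tasks on surviving workers are left where they are.
    """
    live = sorted(w for w in workers if w != dead_worker)
    if not live:
        raise RuntimeError("no live workers left to take over tasks")

    result = {t: w for t, w in assignment.items() if w != dead_worker}
    orphans = sorted(t for t, w in assignment.items() if w == dead_worker)

    load = Counter({w: 0 for w in live})
    load.update(result.values())
    for task in orphans:
        target = min(live, key=lambda w: (load[w], w))
        result[task] = target
        load[target] += 1
    return result


if __name__ == "__main__":
    after = rebalance_after_failure(BEFORE, WORKERS, "worker-4")
    for task, worker in sorted(after.items()):
        moved = " (moved)" if BEFORE[task] != worker else ""
        print(task, "->", worker + moved)
```

With this tie-breaking rule the two orphaned tasks may land on different workers than in the example above. That is fine: the point is that every task ends up on a live worker and the load stays even.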

In a distributed architecture, if a worker fails, a rebalance occurs. The rebalance moves the failed worker's tasks to the remaining workers, which is what makes the cluster fault tolerant. If you lose a server or even an entire rack of servers, the Kafka Connect cluster will rebalance, and the connectors keep running as long as the surviving workers have enough capacity for the extra tasks. This demonstrates the advantage of running in distributed mode over standalone mode. In standalone mode there is only one worker, so losing that server stops every connector and task it was running until you restart it yourself. In contrast, a distributed Kafka Connect cluster allows other servers to take over tasks from a failed server, maintaining continuous operation.
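If you want to check where tasks ended up after a rebalance, the Connect REST API exposes GET /connectors/{name}/status, which reports a worker_id for each task. Below is a small Python sketch that reads such a response. The base URL assumes a worker listening on localhost:8083, and the sample payload (connector name and worker addresses) is made up for illustration. Note that real task ids start at 0, not 1.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def fetch_status(connector, base_url="http://localhost:8083"):
    """Call GET /connectors/{name}/status on a Connect worker (needs a live cluster)."""
    with urlopen(f"{base_url}/connectors/{quote(connector)}/status") as resp:
        return json.load(resp)


def task_placement(status):
    """Map task id -> (state, worker_id) from a status response."""
    return {t["id"]: (t["state"], t["worker_id"]) for t in status["tasks"]}


# Illustrative payload in the shape of a status response.
# The connector name and worker addresses are made up.
SAMPLE = json.loads("""
{
  "name": "connector-3",
  "connector": {"state": "RUNNING", "worker_id": "10.0.0.1:8083"},
  "tasks": [
    {"id": 0, "state": "RUNNING", "worker_id": "10.0.0.1:8083"},
    {"id": 1, "state": "RUNNING", "worker_id": "10.0.0.2:8083"},
    {"id": 2, "state": "RUNNING", "worker_id": "10.0.0.3:8083"},
    {"id": 3, "state": "RUNNING", "worker_id": "10.0.0.1:8083"}
  ],
  "type": "sink"
}
""")

if __name__ == "__main__":
    for task_id, (state, worker) in sorted(task_placement(SAMPLE).items()):
        print(f"task {task_id}: {state} on {worker}")
```

In the sample, two tasks report the same worker_id, which matches the situation described above where one worker picks up a second task of the same connector.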
