Building Multi-Datacenter Applications using Kafka

Satendra Tiwari
Published in Hackmamba · Aug 19, 2017

In this article, I will explain how Kafka supports horizontal scalability, how it lets you scale your application dynamically, and how you can configure Kafka so that your application survives the failure of one data center in a multi-DC environment.

Support for horizontal scalability in Kafka
Horizontal scalability means that we can scale the system by adding more machines. I am assuming the reader has a basic understanding of Kafka and its terminology.

To understand horizontal scalability in Kafka, let's assume that we have two brokers and a topic divided into six partitions, each with two replicas. Broker1 is the leader for partitions 1, 3 and 5, and Broker2 is the leader for partitions 2, 4 and 6. Each broker is therefore the leader for three partitions and a follower for the other three.

Broker1 -> P1, P2, P3, P4, P5, P6
Broker2 -> P1, P2, P3, P4, P5, P6

In Kafka, each partition is replicated from its leader to its followers. All reads and writes go through the partition leader; when a write arrives, the followers replicate it from the leader.
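To make the setup concrete, here is a minimal sketch of creating such a topic with Kafka's Java AdminClient, assuming a hypothetical topic name "orders" and placeholder broker addresses:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder addresses -- point these at your own brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Six partitions with two replicas each, matching the layout above.
            NewTopic topic = new NewTopic("orders", 6, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}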

Now, assume that two brokers can no longer handle your workload, so you decide to add a third broker. When a broker is added, the existing partitions can be redistributed across the brokers, but this reassignment has to be initiated manually (for example with the kafka-reassign-partitions tool).

Broker1 -> P1, P2, P5, P6
Broker2 -> P2, P3, P4, P6
Broker3 -> P1, P3, P4, P5

So now each broker holds only four partition replicas, compared to six previously. Also, each broker is now the leader for two partitions and a follower for two others.
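The classic way to initiate this redistribution is the kafka-reassign-partitions tool with a JSON plan; newer Kafka versions (2.4 and later) also expose it through the Java AdminClient. A minimal sketch of moving a single partition, assuming the hypothetical "orders" topic and broker ids 1, 2 and 3:

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignPartitionsExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Move one partition of the "orders" topic onto brokers 2 and 3
            // (the first broker in the replica list becomes the preferred leader).
            Map<TopicPartition, Optional<NewPartitionReassignment>> moves = new HashMap<>();
            moves.put(new TopicPartition("orders", 3),
                    Optional.of(new NewPartitionReassignment(Arrays.asList(2, 3))));
            admin.alterPartitionReassignments(moves).all().get();
        }
    }
}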

Same cluster with six brokers:

Broker1 -> P1, P2
Broker2 -> P2, P3
Broker3 -> P3, P4
Broker4 -> P4, P5
Broker5 -> P5, P6
Broker6 -> P6, P1

Now, each broker is the leader for one partition and a follower for one partition. With six partitions and two replicas each, there are twelve replicas in total, so we can scale to at most twelve brokers, at which point each broker hosts exactly one replica as either a leader or a follower. Using a very large number of partitions is not recommended; its costs are out of the scope of this article.

Dynamically scaling your application with Kafka

Kafka also supports increasing the number of consumers in a consumer group, which lets us scale the consuming application. Assume that we have a topic with six partitions, consumed by a group of two consumers.

C1 -> P1, P3, P5
C2 -> P2, P4, P6

If two consumers cannot keep up with the workload, we can add another consumer to the group. In such an event, Kafka automatically reassigns the partitions so that they are evenly distributed across the three instances. This is called a consumer rebalance.

C1 -> P1, P2
C2 -> P3, P4
C3 -> P5, P6

Here, we should note that the number of consumers can only be scaled up to six (the number of partitions); any additional consumers in the group would sit idle. Kafka also tracks the health of consumers through heartbeats and reassigns their partitions if one of the consumers becomes unavailable.
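Scaling out is therefore just a matter of starting more copies of the same process with the same group id. A minimal consumer sketch, assuming the Java client, the hypothetical "orders" topic and a group id of "order-processors":

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        // Every instance started with the same group.id joins the same consumer group;
        // Kafka spreads the six partitions across however many instances are alive.
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("orders"));
            while (true) {
                // As long as the instance keeps polling and heartbeating, the group
                // coordinator considers it alive; otherwise its partitions are reassigned.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}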

Configuring Kafka to handle datacenter failures

We want to configure Kafka so that an outage in one of the data centers does not impact the uptime of our service.

Let's assume that we have six brokers evenly split across three data centers, and a single topic with six partitions and three replicas per partition. Each partition's replicas are spread across all three data centers (Kafka can place them this way automatically when each broker's broker.rack is set to its data center).

DC1 -> B1 (P1, P2, P3), B2 (P4, P5, P6)
DC2 -> B3 (P4, P5, P6), B4 (P1, P2, P3)
DC3 -> B5 (P1, P2, P3), B6 (P4, P5, P6)

To survive the outage of one data center, we have to ensure that a write is replicated to at least two data centers before it is acknowledged. This can be done with the producer setting acks=all together with the topic (or broker) setting min.insync.replicas=2.
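A producer-side sketch of these settings, assuming the Java client, placeholder broker addresses and the hypothetical "orders" topic (min.insync.replicas is a topic- or broker-level setting, so it is configured when the topic is created rather than on the producer):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class DurableProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait until all in-sync replicas have the record before the send is acknowledged.
        // Combined with min.insync.replicas=2 on the topic, every acknowledged write
        // exists in at least two data centers.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "order-42", "created"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            // The write could not be replicated to enough brokers in time.
                            exception.printStackTrace();
                        }
                    });
            producer.flush();
        }
    }
}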

These settings mean that each message is written to at least two replicas before the producer receives an acknowledgement. Now let's assume that DC3 becomes unavailable.

DC1 -> B1 (P1, P2, P3), B2 (P4, P5, P6)
DC2 -> B3 (P4, P5, P6), B4 (P1, P2, P3)

P3 and P6 had their leaders in DC3, which is now unavailable. Partition leadership fails over automatically: when a data center goes down, Kafka elects one of the in-sync followers as the new leader, and reads and writes move to it. Here, B1 and B3 take over leadership of P3 and P6. Each partition still has two in-sync replicas, and since two is the minimum in our configuration, writes continue to succeed.

Thus, even when one data center is down, the service remains available.
