Apache Kafka Guide #16 Partition Rebalance & Static Group Membership

Paul Ravvich
Apache Kafka At the Gates of Mastery
5 min read · Jan 22, 2024


Hi, this is Paul, and welcome to part #16 of my Apache Kafka guide. Today we will discuss Partition Rebalance and Static Group Membership.

Partition Rebalancing

Let’s discuss consumer groups and strategies for rebalancing partitions. When consumers join or leave a group, partitions are reassigned. This reassignment process is known as a rebalance. A rebalance occurs not only when consumers enter or exit a group, but also when, for instance, an administrator introduces new partitions to a topic.

Consider a scenario where a topic has three partitions and a consumer group has two consumers. A new consumer then joins the group. The critical question is: how are these partitions distributed among the consumers, and what are the implications of this redistribution?

Eager Rebalance

The first approach is known as ‘Eager Rebalance,’ a common default behavior. When consumer three enters the group, all consumers stop and give up their partition assignments up front (hence ‘eager’), so for a moment no consumer reads from any partition. All consumers then rejoin the group and receive new partition assignments. Each consumer may therefore end up with different partitions than it held before.

However, this process has a drawback. For a brief period the entire consumer group stops processing, which is known as a ‘stop the world’ event. There is also no guarantee that consumers will regain the partitions they previously held. This leads to two issues. First, you may want consumers to keep their original partitions. Second, you may want to avoid halting consumers whose partitions do not need to move, so that these ‘stop the world’ events do not occur.

  • All consumers stop.
  • All consumers rejoin the consumer group and receive new partition assignments.
  • During this process, the consumer group processes no data.
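
To make the ‘stop the world’ behavior concrete, here is a minimal plain-Java sketch that does not use the Kafka client at all. The class name EagerRebalanceSketch and its rebalance method are hypothetical, and the round-robin hand-out is only an assumed stand-in for whichever assignor the group really uses. It models one idea: every assignment is dropped first, then all partitions are handed out again from scratch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of an eager rebalance; this is not Kafka's real implementation.
class EagerRebalanceSketch {

    // Every consumer first gives up all of its partitions, then the full set
    // is handed out again from scratch (round-robin here, as an assumption).
    static Map<String, List<Integer>> rebalance(List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>()); // revoked: nobody owns anything
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Two consumers, three partitions
        System.out.println(rebalance(List.of("consumer-1", "consumer-2"), 3));
        // A third consumer joins: the whole assignment is rebuilt
        System.out.println(rebalance(List.of("consumer-1", "consumer-2", "consumer-3"), 3));
    }
}
```

In this toy run each consumer ends up with one partition after the third member joins, but note that the previous assignment is never consulted, so nothing here guarantees that a consumer keeps what it had.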

Cooperative Rebalance (Incremental Rebalance)

In Kafka, there’s a relatively new concept known as Cooperative Rebalance, also referred to as Incremental Rebalance. Unlike previous methods that reassigned all partitions among all consumers, this strategy redistributes only a small subset of partitions from one consumer to another. This allows consumers without reassigned partitions to continue processing data smoothly. The process may undergo several rounds to achieve a stable assignment, thus the term “incremental.” This approach prevents scenarios where all consumers halt data processing.

Consider an example: a consumer group has two consumers handling three partitions. When a new consumer joins, the incremental rebalance intelligently decides to revoke only partition two. Consequently, consumers one and two can persist in reading from partitions zero and one. Following this, partition two is assigned to the new consumer, enabling it to start processing data from that partition. This method is less disruptive, maintaining continuous reading from topics and reallocating just one partition to the new consumer. It enhances stability within the consumer group.

  • Reassigns only a subset of the partitions from one consumer to another.
  • The other consumers keep working in parallel, so processing does not stop.
  • May take several rounds to reach a stable assignment (hence “incremental”).
  • No “Stop-The-World” pause in which all consumers stop processing data.
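
The example above can be sketched in plain Java as a toy model. The class CooperativeRebalanceSketch and its join method are hypothetical names, and the "take one partition from the busiest consumer" rule is a simplifying assumption, not the actual logic of Kafka's cooperative protocol. The point it illustrates is that only the partitions that must move are revoked, while every other assignment stays untouched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of a cooperative (incremental) rebalance; not Kafka's real code.
class CooperativeRebalanceSketch {

    // Moves partitions to the new consumer one at a time, always taking from
    // the consumer that currently owns the most, until the newcomer is within
    // one partition of the busiest member. All other partitions stay put.
    static Map<String, List<Integer>> join(Map<String, List<Integer>> current, String newConsumer) {
        Map<String, List<Integer>> next = new LinkedHashMap<>();
        current.forEach((consumer, partitions) -> next.put(consumer, new ArrayList<>(partitions)));
        List<Integer> newcomer = new ArrayList<>();
        next.put(newConsumer, newcomer);

        while (true) {
            List<Integer> busiest = newcomer;
            for (List<Integer> owned : next.values()) {
                if (owned.size() > busiest.size()) {
                    busiest = owned;
                }
            }
            if (busiest.size() - newcomer.size() <= 1) {
                break; // balanced enough, nothing else needs to move
            }
            // Revoke a single partition from the busiest consumer and
            // hand it to the newcomer.
            newcomer.add(busiest.remove(busiest.size() - 1));
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> current = new LinkedHashMap<>();
        current.put("consumer-1", List.of(0, 2));
        current.put("consumer-2", List.of(1));
        // Only partition 2 moves; partitions 0 and 1 keep their owners.
        System.out.println(join(current, "consumer-3"));
    }
}
```

Starting from consumer-1 owning partitions 0 and 2 and consumer-2 owning partition 1, the sketch moves only partition 2 to the new consumer, which mirrors the scenario described above.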

How can you use cooperative rebalancing in a Kafka consumer? It comes down to choosing a partition assignment strategy (partition.assignment.strategy). The original default is the RangeAssignor, which assigns partitions topic by topic and can therefore leave consumers unevenly loaded. Another option, the RoundRobinAssignor, assigns partitions across all topics in a circular manner, giving a nearly equal distribution among consumers. The StickyAssignor starts out balanced like the RoundRobinAssignor and then minimizes partition movement when consumers join or leave.

These three strategies are categorized as eager, meaning they temporarily disrupt the consumer group during execution, especially problematic in large groups due to time-consuming partition reassignments. However, Kafka now offers the CooperativeStickyAssignor. Like the StickyAssignor, it limits partition movement but supports the cooperative protocol, allowing consumers to continue processing their current partitions if not reassigned. This makes CooperativeStickyAssignor an optimal choice in my view.
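
The imbalance that a per-topic strategy can produce is easy to see with numbers. Below is a simplified plain-Java sketch that only counts how many partitions each consumer would receive under a range-style and a round-robin-style split. The class AssignorBalanceSketch, its methods, and the topic names are hypothetical, and the logic is an approximation of the idea rather than the real assignor code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Counts partitions per consumer under two simplified assignment styles.
class AssignorBalanceSketch {

    // Range-style: each topic is split on its own, and the first consumers
    // in the list absorb the remainder of every topic.
    static Map<String, Integer> rangeCounts(List<String> consumers, Map<String, Integer> topics) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String consumer : consumers) {
            counts.put(consumer, 0);
        }
        for (int partitions : topics.values()) {
            int base = partitions / consumers.size();
            int extra = partitions % consumers.size();
            for (int i = 0; i < consumers.size(); i++) {
                counts.merge(consumers.get(i), base + (i < extra ? 1 : 0), Integer::sum);
            }
        }
        return counts;
    }

    // Round-robin-style: the partitions of all topics form one sequence
    // that is dealt out to the consumers in a circle.
    static Map<String, Integer> roundRobinCounts(List<String> consumers, Map<String, Integer> topics) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String consumer : consumers) {
            counts.put(consumer, 0);
        }
        int next = 0;
        for (int partitions : topics.values()) {
            for (int p = 0; p < partitions; p++) {
                counts.merge(consumers.get(next % consumers.size()), 1, Integer::sum);
                next++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> consumers = List.of("consumer-1", "consumer-2");
        Map<String, Integer> topics = Map.of("orders", 3, "payments", 3); // hypothetical topics
        System.out.println("range:       " + rangeCounts(consumers, topics));
        System.out.println("round-robin: " + roundRobinCounts(consumers, topics));
    }
}
```

With two topics of three partitions each and two consumers, the range-style split gives the first consumer four partitions and the second only two, while the round-robin-style split gives each consumer three.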

In Kafka 3.0, the default setting is a list containing both RangeAssignor and CooperativeStickyAssignor. By default the group uses RangeAssignor, but once RangeAssignor is removed from the list, CooperativeStickyAssignor takes over after a single rolling restart of the consumers. I’ll demonstrate this shortly.
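
As a rough sketch of what that switch looks like in configuration, the snippet below builds the two values of partition.assignment.strategy with nothing but java.util.Properties. The class and method names are hypothetical; the assignor class names are written out as strings so the example compiles without the Kafka client on the classpath, and the "default list" is assumed to be the two-assignor list described above.

```java
import java.util.Properties;

// Builds the assignor list before and after dropping RangeAssignor.
class AssignorUpgradeSketch {

    static final String KEY = "partition.assignment.strategy";
    static final String RANGE = "org.apache.kafka.clients.consumer.RangeAssignor";
    static final String COOPERATIVE = "org.apache.kafka.clients.consumer.CooperativeStickyAssignor";

    // Both assignors listed: the group keeps using RangeAssignor.
    static Properties defaultList() {
        Properties props = new Properties();
        props.setProperty(KEY, RANGE + "," + COOPERATIVE);
        return props;
    }

    // RangeAssignor removed: once every consumer has been restarted with
    // this value, only the cooperative assignor remains available.
    static Properties cooperativeOnly() {
        Properties props = new Properties();
        props.setProperty(KEY, COOPERATIVE);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(defaultList().getProperty(KEY));
        System.out.println(cooperativeOnly().getProperty(KEY));
    }
}
```

The configuration value is a comma-separated list of class names, so removing RangeAssignor is a one-line change that is rolled out consumer by consumer.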

Kafka Connect and Kafka Streams users should note that cooperative rebalancing is already enabled by default; Kafka Streams does this through its StreamsPartitionAssignor. Let’s now delve into practical application.

Kafka Consumer: partition.assignment.strategy

  • RangeAssignor (default): assigns per topic, which can lead to imbalance.
  • RoundRobinAssignor: assigns across all topics for a near-optimal balance.
  • StickyAssignor: balanced like RoundRobinAssignor, and minimizes partition movement when consumers join or leave the group.
  • CooperativeStickyAssignor (≥ v2.4): like StickyAssignor, but supports cooperative rebalances, so consumers can keep processing data.
// Opt in to cooperative (incremental) rebalancing.
properties.setProperty(
    "partition.assignment.strategy",
    CooperativeStickyAssignor.class.getName()
);

Static Group Membership

Alright, let’s examine one final aspect before proceeding to practice. It involves static group membership in Kafka.

In Kafka, consumer group changes trigger rebalances to ensure all partitions are processed. However, it’s possible to avoid this by using static membership. When a consumer exits and rejoins, it usually gets a new member ID, causing a reassignment. But with a group instance ID in the consumer config, the consumer becomes a static member.

Imagine consumers named Consumer 1, Consumer 2, and Consumer 3. If Consumer 3 exits as a static member, its partitions won’t be reassigned unless it fails to rejoin within the session timeout (session.timeout.ms).

Additionally, static membership benefits consumers needing a local cache by maintaining consistent partition assignments, thus avoiding cache rebuilding. Whether to use this feature depends on your specific requirements.

Static membership prevents rebalances during short disconnections, which is useful in scenarios like Kubernetes deployments.

  • By default, when a consumer leaves the group, its partitions are revoked and reassigned.
  • If the consumer joins back, it gets a new member ID and a new partition assignment.
  • Specifying group.instance.id turns the consumer into a static member.
properties.setProperty(
    "group.instance.id",
    "unique_id_for_each_consumer"
);
  • A static member has session.timeout.ms to rejoin and get its partitions back without a rebalance; otherwise its partitions are reassigned.
properties.setProperty(
    "session.timeout.ms",
    "10000"
);
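
Putting the two settings together, a minimal sketch of a static-member configuration might look like the following. The helper staticMemberProps, the group name, and the idea of taking the instance ID from the HOSTNAME environment variable (for example a stable pod name) are all assumptions made for illustration. Only the property keys come from the text above, and plain java.util.Properties is used so the example runs without the Kafka client.

```java
import java.util.Properties;

// Collects the settings that make a consumer a static group member.
class StaticMemberConfigSketch {

    static Properties staticMemberProps(String groupId, String instanceId, int sessionTimeoutMs) {
        if (instanceId == null || instanceId.isEmpty()) {
            throw new IllegalArgumentException("a static member needs a non-empty instance id");
        }
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        // Must be unique per consumer and stable across restarts.
        props.setProperty("group.instance.id", instanceId);
        // How long the member may be away before its partitions are reassigned.
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        return props;
    }

    public static void main(String[] args) {
        // Assumption: HOSTNAME holds a stable name for this instance;
        // fall back to a fixed id when it is not set.
        String instanceId = System.getenv().getOrDefault("HOSTNAME", "consumer-0");
        Properties props = staticMemberProps("orders-group", instanceId, 10000);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The session timeout is the trade-off to tune: a longer value tolerates slower restarts without a rebalance, but it also delays reassignment when a consumer is really gone.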

Thank you for reading until the end.
