Apache Kafka Guide #33 Producer Default Partitioner & Sticky Partitioner

Paul Ravvich
Apache Kafka At the Gates of Mastery
5 min read · Mar 21, 2024

Hi, this is Paul, and welcome to part #33 of my Apache Kafka guide. Today we will discuss how the Producer Default Partitioner and the Sticky Partitioner work.

Producer Default Partitioner if the key is not null

Let’s summarize one important point: the default partitioner used by Kafka producers. When a key is present, the record passes through the partitioner logic, which determines which partition it is assigned to. This procedure is known as key hashing, a method that maps a key to a specific partition.

In the case of the default Kafka partitioner, keys are hashed using the murmur2 algorithm, following the formula: partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions. Because murmur2 is deterministic, identical keys are consistently assigned to the same partition, as long as the inputs and the formula stay the same.

However, a critical aspect to consider is the impact of increasing the number of partitions, reflected by the numPartitions variable in the formula. Adding partitions to a topic changes the result of the formula, breaking the guarantee that the same key always maps to the same partition. So in situations requiring additional partitions, it’s advisable to create a new topic instead.
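The key-hashing logic described above can be sketched in Python. This is an illustrative port of the murmur2 hash used by the Java Kafka client (Utils.murmur2 in the Java client is the authoritative implementation); the demo at the end shows why adding partitions can remap a key.

```python
def murmur2(data: bytes) -> int:
    """32-bit MurmurHash2, ported from the Java Kafka client for illustration."""
    length = len(data)
    seed = 0x9747B28C
    m = 0x5BD1E995
    mask = 0xFFFFFFFF
    h = (seed ^ length) & mask
    # Mix the input four bytes at a time.
    for i in range(length // 4):
        k = int.from_bytes(data[i * 4:i * 4 + 4], "little")
        k = (k * m) & mask
        k ^= k >> 24
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k
    # Mix in the remaining 1-3 bytes (mirrors the Java switch fall-through).
    tail = length & ~3
    rem = length % 4
    if rem >= 3:
        h ^= data[tail + 2] << 16
    if rem >= 2:
        h ^= data[tail + 1] << 8
    if rem >= 1:
        h ^= data[tail]
        h = (h * m) & mask
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h

def partition_for_key(key: bytes, num_partitions: int) -> int:
    # Kafka masks off the sign bit ("toPositive") before taking the modulo.
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions

key = b"user-42"
print(partition_for_key(key, 5))  # stable while the partition count stays fixed
print(partition_for_key(key, 6))  # may differ once partitions are added
```

Note how the same key can land on a different partition the moment num_partitions changes, which is exactly why growing a topic breaks the key-to-partition guarantee.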

While it’s generally unnecessary, and not recommended, to modify the default partitioner logic, there are advanced use cases that call for custom partitioner logic. In such instances, Kafka allows this customization through the partitioner.class parameter of the producer.
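In the Java client, custom logic means implementing the Partitioner interface and registering it via partitioner.class. As a language-neutral sketch, here is a hypothetical routing rule in Python: "vip-" keys are invented for this example, and crc32 is a stand-in hash (Kafka's default uses murmur2).

```python
import zlib

def vip_partitioner(key: bytes, num_partitions: int) -> int:
    """Hypothetical custom rule: pin "vip-" keys to a dedicated partition 0,
    and spread every other key over the remaining partitions."""
    if key.startswith(b"vip-"):
        return 0
    # crc32 stands in for a real hash here, purely for illustration.
    return 1 + (zlib.crc32(key) % (num_partitions - 1))

print(vip_partitioner(b"vip-1", 6))   # prints 0
print(vip_partitioner(b"user-9", 6))  # somewhere in partitions 1..5
```

A rule like this only makes sense when you genuinely need locality guarantees the default hashing cannot give you; otherwise the default partitioner's even spread is the safer choice.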

Producer Default Partitioner if the key is null

When the key is set to null, an interesting optimization occurs. The default partitioner operates in two distinct modes. Up to Kafka 2.3, the behavior follows a round-robin pattern, explained in the next section. From Kafka 2.4 onwards, a sticky partitioner is used instead, covered after the round-robin explanation. The core idea behind the sticky partitioner is to significantly improve performance in high-throughput scenarios where the key is null.

Kafka ≤ v2.3: when no partition is specified and the key is null, the default partitioner works as round-robin

How does the Round Robin partitioner function?

Imagine sending six messages through a producer to a topic that consists of five partitions. On older Kafka versions (2.3 and earlier), the distribution mechanism for these messages follows a round-robin approach. The first message is assigned to partition one, the second message to partition two, and so forth, cycling through the partitions in sequence. Once the fifth partition receives a message, the sixth message cycles back to partition one and the process repeats.

This method is logical and predictable because it ensures an even distribution of messages across all partitions. However, spreading records thinly like this produces many small batches, one in flight per partition, instead of a few large ones. Efficiency suffers as a result: smaller batches mean more requests, which can raise latency.
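The pre-2.4 behavior can be sketched in a few lines, showing six null-key messages cycling over five partitions exactly as described above (a sketch, not the client's actual code):

```python
import itertools

class RoundRobinPartitioner:
    """Assigns null-key records to partitions in cycling order (pre-2.4 sketch)."""

    def __init__(self, num_partitions: int):
        self._cycle = itertools.cycle(range(num_partitions))

    def partition(self) -> int:
        return next(self._cycle)

rr = RoundRobinPartitioner(num_partitions=5)
print([rr.partition() for _ in range(6)])  # [0, 1, 2, 3, 4, 0]
```

Six messages end up on six different batches across five partitions, which is the small-batch problem the sticky partitioner fixes.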

Kafka ≥ v2.4: when no partition is specified and the key is null, the default partitioner works as a sticky partitioner

In version 2.4 and later, the developers introduced a new default partitioner known as the sticky partitioner. This change significantly boosts performance by allowing records to be grouped into batches and sent to the same partition. The sticky partitioner adheres to a specific partition until either the batch is filled or the linger.ms time has elapsed. After a batch is dispatched, the target partition changes for the next batch.

The mechanism of this partitioner involves a batching process, where initially, all messages are directed to partition one. Once the batch for this partition is complete, the process shifts to partition two, and this pattern continues, potentially moving to partition three and beyond for new batches. This approach results in larger batch sizes and, consequently, reduced latency due to fewer, larger requests. It also increases the likelihood of reaching the optimal batch size.
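The stick-then-switch mechanism can be sketched like this. It is a simplification: the real client switches partitions when a batch is actually sent, while this sketch switches after a fixed record count or a linger timeout, and the batch_size and linger_ms values here are invented for illustration.

```python
import random
import time

class StickyPartitioner:
    """Sticks to one partition until the batch fills or linger.ms elapses (sketch)."""

    def __init__(self, num_partitions: int, batch_size: int = 3, linger_ms: int = 5):
        self.num_partitions = num_partitions
        self.batch_size = batch_size   # records per batch, stand-in for batch.size
        self.linger_ms = linger_ms     # stand-in for the linger.ms producer setting
        self._new_batch()

    def _new_batch(self) -> None:
        # Pick a partition for the next batch and reset the batch counters.
        self.current = random.randrange(self.num_partitions)
        self.count = 0
        self.started = time.monotonic()

    def partition(self) -> int:
        elapsed_ms = (time.monotonic() - self.started) * 1000
        if self.count >= self.batch_size or elapsed_ms >= self.linger_ms:
            self._new_batch()
        self.count += 1
        return self.current

sp = StickyPartitioner(num_partitions=5, batch_size=3, linger_ms=60_000)
print([sp.partition() for _ in range(3)])  # three records, one shared partition
```

The first three records all land on the same partition and so share one batch, which is exactly the fewer-but-larger-requests effect described above.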

Over time, the outcome mirrors that of Round Robin, with messages being evenly distributed across all partitions.

Performance Improvement

When examining the performance enhancements, it’s clear that latency is significantly reduced with the sticky partitioner compared to the older round-robin default.

Latency is lower with the sticky partitioner (Source: Apache Kafka Wiki, KIP-480: Sticky Partitioner)

This difference becomes even more apparent when observing scenarios with a large number of partitions. For instance, considering three producers handling 10,000 messages per second in a topic that includes 125 partitions, the latency reduction is profoundly noticeable. Such improvements mark a considerable enhancement in performance.

Latency is lower even with many partitions (Source: Apache Kafka Wiki, KIP-480: Sticky Partitioner)

The only requirement to achieve this benefit is to upgrade your producer clients to version 2.4 or higher.

Thank you for reading until the end.


Software Engineer with over 10 years of XP. Join me for tips on Programming, System Design, and productivity in tech! New articles every Tuesday and Thursday!