Kafka Consumer Groups Demystified

Abhinav Nath
Aug 15, 2022
Photo by Sharon McCutcheon on Unsplash

What is a Consumer Group?

A consumer group is a set of consumers that share the same group id (the group.id consumer setting). Every consumer group processes every message in a topic independently of other groups.

Processing the messages of a given topic is the collective responsibility of a consumer group. Within a group, each partition of the topic is assigned to exactly one consumer, and a single consumer may be assigned several partitions. Kafka balances the partitions across the available consumers in the group.

Each consumer group maintains its own offset for each topic partition.

Consumer groups split the processing load of a topic by sharing its partitions between consumers in a group.
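To make the balancing idea concrete, here is a minimal sketch in plain Java. It is not Kafka's assignor code: the class name GroupAssignmentSketch and the method assign() are made up for illustration, and the simple "deal partitions out in turn" rule is only an assumption that mimics the outcome described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of how partitions are spread over the consumers of ONE group.
// This is NOT Kafka's real assignment strategy; it only mimics the outcome.
class GroupAssignmentSketch {

    // Deals partitions 0..numPartitions-1 out to the consumers in turn.
    static Map<String, List<Integer>> assign(int numPartitions, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            // Every partition gets exactly one owner within the group.
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(1, List.of("ConsumerA")));              // {ConsumerA=[0]}
        System.out.println(assign(2, List.of("ConsumerA")));              // {ConsumerA=[0, 1]}
        System.out.println(assign(1, List.of("ConsumerA", "ConsumerB"))); // {ConsumerA=[0], ConsumerB=[]}
        System.out.println(assign(2, List.of("ConsumerA", "ConsumerB"))); // {ConsumerA=[0], ConsumerB=[1]}
    }
}
```

The empty list for ConsumerB in the third line is the "idle consumer" situation: there are more consumers in the group than partitions to hand out.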

Let’s explore Kafka consumers and consumer groups through a case-by-case analysis using a very simple Java application.

Here is the GitHub link for the full source code.

Let’s start by spinning up all the required components, namely ZooKeeper, Kafka and Kafdrop (a UI for Kafka), using this docker-compose.yml.

Run the following command:

docker-compose up -d

Let’s now create a Producer which has a method named produceMessages() and a Consumer which has a method named consumeMessages().

Finally, let’s create a simple Main program to play with our Producer and Consumer:

Nothing fancy so far. Our setup is ready, and we can now dive into the various cases.

Case 1: One Partition, One Consumer in a Consumer Group

Let’s create a Kafka topic named TestTopic1 with one partition using Kafdrop.


Now let’s run our Main program.

Here is the output:

Observation: As there is only one partition, it is assigned to the single consumer and thus ConsumerA receives all the messages.

Case 2: Two Partitions, One Consumer in a Consumer Group

Create a new topic TestTopic2 with two partitions:


And let’s just update the topic name to TestTopic2 in our Main program:

Here is the output:

Question: Why did all the messages go to the same partition (partition 1)?

Observation: There are two partitions and one consumer. All the messages have the same partitioning key, “apple”, so they all go to just one of the two partitions (in this case, partition 1). Since there is only one consumer, both partitions are assigned to ConsumerA, and ConsumerA therefore receives all the messages.
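The key-to-partition rule can be sketched in a few lines of plain Java. Kafka's default partitioner hashes the serialized key (with murmur2) and takes the result modulo the number of partitions. The sketch below swaps in Arrays.hashCode as a stand-in hash to stay dependency-free, so the concrete partition number it prints may differ from what a real broker shows. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrates "same key -> same partition" with a stand-in hash.
// Assumption: Arrays.hashCode replaces Kafka's murmur2 hash, so the actual
// partition numbers will not necessarily match a real Kafka cluster.
class KeyPartitionSketch {

    // Maps a key to a partition in the range [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // floorMod keeps the result non-negative even for negative hash values.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 10; i++) {
            // The key never changes, so the chosen partition never changes either.
            System.out.println("Message " + i + " with key 'apple' -> partition "
                    + partitionFor("apple", 2));
        }
    }
}
```

Whatever hash is used, the function is deterministic, which is why ten messages with the key “apple” can never be spread over both partitions.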

Case 3: One Partition, Two Consumers in Same Consumer Group

Create a new topic TestTopic3 with one partition:


And this time, let’s spin up two Consumers (ConsumerA and ConsumerB) in the same Consumer Group (ConsumerGroup1):

Here is the output:

Question: Why didn’t ConsumerB receive any message at all?

Observation: There is one partition and two consumers in the same consumer group. A partition can be assigned to only one consumer within a group, so the other consumer stays idle. In this case ConsumerA received all the messages and ConsumerB remained idle.

Case 4: Two Partitions, Two Consumers in Same Consumer Group

Create a new topic TestTopic4 with two partitions:

If the key is the same, all the messages will still go to the same partition, and hence one of the two consumers will get all the messages.

Therefore, let’s change the key for every message in the Producer:

Let’s update the topic name in our Main program:

Output:

This is interesting!

Observation: There are two partitions and two consumers. Each message has a different key, so the messages are spread over both partitions (0 or 1, depending on the partitioning logic). Ordering is still preserved within each partition, but it is not guaranteed across the two partitions. Partition 0 is assigned to ConsumerA and Partition 1 is assigned to ConsumerB, so ConsumerA reads all the messages from Partition 0 and ConsumerB reads all the messages from Partition 1.
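Here is a small plain-Java sketch of why ordering holds inside a partition but not across partitions. It numbers ten messages, gives each one a different key, and appends each message to the partition picked from its key. The key scheme ("key-0", "key-1", ...), the stand-in hash and all names are assumptions made for illustration and are not taken from the original Producer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Routes numbered messages with distinct keys to partitions and keeps
// the arrival order inside each partition, like an append-only log.
class PerPartitionOrderSketch {

    // Stand-in for Kafka's murmur2-based default partitioner (assumption).
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    // Appends message numbers 0..count-1 to the partition chosen by each message's key.
    static List<List<Integer>> route(int count, int numPartitions) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new ArrayList<>());
        }
        for (int i = 0; i < count; i++) {
            String key = "key-" + i; // hypothetical key scheme: a new key per message
            partitions.get(partitionFor(key, numPartitions)).add(i);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = route(10, 2);
        for (int p = 0; p < partitions.size(); p++) {
            // Each list is in ascending order: order is kept within a partition.
            System.out.println("Partition " + p + " holds messages " + partitions.get(p));
        }
    }
}
```

Each partition's list is ascending, but nothing tells two independent consumers how to interleave the two lists, so the overall order in which the ten messages get processed is not fixed.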

Case 5: One Partition, Two Consumers in Different Consumer Groups

Create a new topic TestTopic5 with one partition:


This time let’s keep the same key “apple” for all the messages.

Create two consumers (ConsumerA and ConsumerB) in two different Consumer Groups (ConsumerGroup1 and ConsumerGroup2):

Output:

Observation: There is one partition and two consumers in different Consumer Groups. All the messages have the same key (“apple”). In this case all 10 messages are read by both ConsumerA and ConsumerB, because they belong to different Consumer Groups and each group tracks its own offset. The ordering of the messages is guaranteed in both consumers because all the messages sit in the same partition.
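The "each group has its own offset" idea can be modelled with a list and a map. This is a toy model and not how the broker stores offsets. The class GroupOffsetsSketch and its methods are hypothetical, and it assumes every new group starts reading from the beginning of the partition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of ONE partition that remembers a separate read position per consumer group.
class GroupOffsetsSketch {
    private final List<String> log = new ArrayList<>();                  // the partition's messages, in order
    private final Map<String, Integer> nextOffsetByGroup = new HashMap<>(); // group id -> next offset to read

    void append(String message) {
        log.add(message);
    }

    // Returns everything this group has not read yet and advances this group's offset only.
    // Assumption: a group seen for the first time starts at offset 0.
    List<String> poll(String groupId) {
        int from = nextOffsetByGroup.getOrDefault(groupId, 0);
        List<String> batch = new ArrayList<>(log.subList(from, log.size()));
        nextOffsetByGroup.put(groupId, log.size());
        return batch;
    }

    public static void main(String[] args) {
        GroupOffsetsSketch partition = new GroupOffsetsSketch();
        for (int i = 1; i <= 10; i++) {
            partition.append("Message " + i);
        }
        // Reading by one group does not move the other group's offset.
        System.out.println("ConsumerGroup1 read " + partition.poll("ConsumerGroup1").size() + " messages"); // 10
        System.out.println("ConsumerGroup2 read " + partition.poll("ConsumerGroup2").size() + " messages"); // 10
        System.out.println("ConsumerGroup1 read " + partition.poll("ConsumerGroup1").size() + " messages"); // 0
    }
}
```

Both groups see all ten messages in the same order, and a second poll by ConsumerGroup1 returns nothing new because only its own offset has moved.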

Conclusion

With these small and simple experiments, we can deduce the following:

  1. If multiple messages are sent with the same key, ordering is guaranteed at the consumer side because all of them are sent to the same partition.
  2. If messages have different keys, they may land in different partitions, and ordering across partitions is not guaranteed at the consumer side.
  3. We can scale message consumption by increasing the number of consumer instances up to the number of partitions on the topic. This is the approach Kafka suggests for parallel processing.
  4. If we scale our consumer instances beyond the number of partitions, the excess consumers will remain idle.
  5. The same partition will never be assigned to two active consumers belonging to the same consumer group.

Thanks for reading!
