How did I Scale my Kafka Consumers?

Like a Kafka producer that optimizes writes to Kafka, a consumer is used for optimal consumption of Kafka data. The primary role of a Kafka consumer is to take Kafka connection and consumer properties to read records from the appropriate Kafka broker.

Complexities of concurrent application consumption, offset management, delivery semantics, and a lot more are taken care of by Consumer APIs. Writing Kafka consumers has always been a part of developer’s hot talks, getting it done is another story and there is no apparent solution online. In this article, I’ll try to brief on how we can write simple pull based Kafka consumer and how we can scale on demand. We are going to use Docker container service for this one.


· Sound Knowledge of Docker.

· Sound Knowledge of Kafka Producer and Consumer concepts.

· Already configured Kafka cluster.

· In-case if you’re using confluent Kafka you can configure the number of partition from their confluent control centre.

· If you’re using OEM Apache Kafka you may need to configure the partition at the time of creation of the topic.


Kafka consumer could be anything, it can be used to ingest data to database, or a downstream to another micro-service or a notification system. The way we scale these consumer is really important, it depends on the number of partition has been created before the topic has been put on use.


I followed the extremely straight forward formula to make sure consumers are scaled and no unnecessary consumers are idle.

P is number of partitions in the topic.

C is the number of consumers listening to P partitions.

Best practice is to maintain the equilibrium of partitions and consumers.


Tnc>=Tnp Where, Tnc < = Tnp+2(constant)

Tnc = Total number of consumers

Tnp = Total number of partitions

You can always have total consumers greater than total partition. But, make sure higher the count of consumer chances are those would be idle as long as other consumers does not go down and partition reallocation takes place.

Once the consumer is ready deploy the container as a service using following command.

docker service create –-env <env_file> image_name.

Once the service is created now you can scale based on the number of partitions. Make sure the number of instance should be equal or greater than the partition itself, or else backpressure will build up where one or the other consumer is overloaded with the message. The best practice is to use non-keyed messages. The messages will be allocated in round robin fashion across all partitions. Finally, use the following command to scale the consumer instance.

docker service scale service_id=<num_of_instance>

Once the services are scaled now you can actually see the logs of each instance running and once you send the message each replica will consume message from the partitions they were allocated.

In the end, this approach really performed phenomenally well. The added benefit you get from docker is it is immune to replica failure in case your consumer goes down it might spin up again and avoid unnecessary rebalancing; it depends what happens first. In my next trial, I would be testing this approach on a much larger machine with 100s of topics in place. Feel free to leave your comments down in the comment section.

He’s a DevOps engineer have a keen interest in Business Analysis and UX. In the meantime, he spends quality time cultivating his hobbies such as making music.