Network: How It Affects the Performance of Kafka Consumers

Ferdi Tatlisu
Published in Trendyol Tech
4 min read · Apr 28, 2023

Here we are with another article about Kafka performance. I developed an application that searches within a topic, and I needed to improve its end-to-end search speed. Two factors determine that speed: the search algorithm that finds a specific keyword, which I won't cover in this article, and the speed at which the partitions are consumed.

In this article, just like in my other articles, I'll use a consumer written in Go with the segmentio/kafka-go library. You can find more details and the source code there.

Our goal is straightforward: to ensure that consumers can consume the topic as quickly and efficiently as possible. Let's begin.

Batch Consuming

Imagine we have a topic with 1.5 billion messages, and we want to consume all of them in one pass. Should we fetch messages one by one or batch by batch? Many articles say batching is better than single fetches, and I agree, so I won't dwell on that. The real question is: what batch size works best? Let's try to find out.

These configurations determine how much data we get in every fetch. While there are numerous configuration options available, we will only discuss two of them.

config := kafka.ReaderConfig{
    MinBytes: 1,
    MaxBytes: 10e5, // 1 MB
}

The broker completes a fetch once at least MinBytes of data is available, and it returns at most MaxBytes of data per fetch. I tried different values for these settings on my topics and observed what happened.
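To make the mechanics concrete, here is a minimal sketch of a reader built with these settings. The broker address, topic name, and MaxWait value are placeholder assumptions for illustration, not my benchmark setup; it assumes github.com/segmentio/kafka-go and "time" are imported.

r := kafka.NewReader(kafka.ReaderConfig{
    Brokers:  []string{"localhost:9092"}, // placeholder broker address
    Topic:    "my-topic",                 // placeholder topic name
    MinBytes: 1,                          // broker responds as soon as any data is available...
    MaxBytes: 10e5,                       // ...but never returns more than ~1 MB per fetch
    MaxWait:  500 * time.Millisecond,     // upper bound on how long the broker waits for MinBytes
})
defer r.Close()

With a higher MinBytes, the broker buffers more data before answering, so each network round trip carries a bigger batch; MaxWait caps how long the broker may hold the request.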

Results

I share only network and CPU metrics, not memory, because memory usage was efficient in every scenario.

The consumer always began at the starting offset in every scenario. The topic contains about 1.5 billion messages and occupies roughly 450 GB, with an average message size of about 500 bytes.

config := kafka.ReaderConfig{
    MinBytes: 10e3, // 0.01 MB
    MaxBytes: 10e4, // 0.1 MB
}
[Charts: network throughput and CPU usage]
  • The consumers fetch about 200 MB of events per minute.
  • Most partitions were fully consumed in 110 minutes; consuming every message took 130 minutes.
  • Nearly all available CPU is used.
config := kafka.ReaderConfig{
    MinBytes: 10e3, // 0.01 MB
    MaxBytes: 10e5, // 1 MB
}
[Charts: network throughput and CPU usage]
  • The consumers fetch about 650 MB of events per minute.
  • Most partitions were fully consumed in 45 minutes; consuming every message took 65 minutes.
  • Nearly all available CPU is used.
config := kafka.ReaderConfig{
    MinBytes: 10e3, // 0.01 MB
    MaxBytes: 10e6, // 10 MB
}
[Charts: network throughput and CPU usage]
  • The consumers fetch about 750 MB of events per minute.
  • Most partitions were fully consumed in 17 minutes; consuming every message took 25 minutes.
  • CPU usage is approximately 80%.

After experimenting with various configurations, I found the last one to be the most effective for my scenario. Keep in mind that the outcome may vary with different configurations and with the characteristics of your topic.
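For reference, here is a complete, minimal consumer using the winning values. This is a sketch: the broker address, topic name, and group ID are placeholders for illustration, not my production setup.

package main

import (
    "context"
    "log"

    "github.com/segmentio/kafka-go"
)

func main() {
    r := kafka.NewReader(kafka.ReaderConfig{
        Brokers:  []string{"localhost:9092"}, // placeholder broker address
        Topic:    "my-topic",                 // placeholder topic name
        GroupID:  "search-consumer",          // placeholder consumer group ID
        MinBytes: 10e3,                       // 0.01 MB
        MaxBytes: 10e6,                       // 10 MB
    })
    defer r.Close()

    for {
        m, err := r.ReadMessage(context.Background())
        if err != nil {
            log.Printf("read error: %v", err)
            break
        }
        // Process m.Value here, e.g. run the keyword search on it.
        _ = m
    }
}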

If that's something you're interested in, I suggest checking out my other article on the performance of a specific layer across programming languages.

END

I am pleased to report that I have successfully improved the response time of my Kafka-search application. Moving forward, I will continue to strive for even better results and will be sure to keep you, my valued readers, updated on my progress.

If you’re interested in joining our team, you can apply for the backend developer role or any of our current open positions.

Thank you for taking the time to read this far. I look forward to seeing you in my next article.
