Mindfulness before Kafka
--
Everyone uses Kafka, let’s use it!
If you have an idea for using Kafka as part of your stack, please don’t just use it because some “other company” is using it, so you have to use it too. Also, successfully running Kafka on your local computer doesn’t mean it will behave the same in the development or even the production environment.
You need to think it through for production use cases and the production environment. Start with: What if something goes wrong? Who is going to operate it? What are the implications?
There are a lot of articles about Kafka’s do’s and don’ts you can read. If your use case fits what Kafka is meant for, then great!
So the next question is how much data will go through Kafka. If the data is too tiny, ask yourself, “Is there another way to do it without Kafka?”, and if the answer is yes, you are better off doing it without Kafka. It will not be worth the hassle.
But if your use case is a perfect fit and you will have a lot of data going through Kafka, great! The next question is, “How big should your Kafka cluster be?” You need to start planning the sizing of your CPUs, RAM, and storage capacity.
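As a rough back-of-the-envelope illustration (all numbers here are made up for the example): if your producers write 10 MB/s into the cluster, you keep the data for 7 days, and you use a replication factor of 3, you need about 10 MB/s × 604,800 s × 3 ≈ 18 TB of disk across the brokers, before adding any headroom for growth or rebalancing.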
A few example mistakes
Kafka has a lot of configuration: broker, producer, consumer, and topic configs all need to be set properly so the cluster can run and serve you well. Below are a few examples among many.
For example, at the start it’s easier to just set auto.create.topics.enable to true so producers can immediately start pushing data to Kafka, without needing to check whether the topic exists and create it first if it doesn’t. That is fine as a basic setup for exercising at the start, but in the production environment you need to disable auto.create.topics.enable and even add ACLs on top of it for authentication and role management.
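As a minimal sketch of what “create the topic explicitly first” looks like (assuming the Java kafka-clients AdminClient; the broker addresses, topic name, and counts below are made-up placeholders):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.Collections;
import java.util.Properties;

public class CreateTopicExplicitly {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092,broker2:9092"); // placeholders

        try (AdminClient admin = AdminClient.create(props)) {
            // With auto.create.topics.enable=false, topics must be created explicitly.
            boolean exists = admin.listTopics().names().get().contains("orders");
            if (!exists) {
                // 6 partitions, replication factor 3 -- illustrative values only.
                admin.createTopics(Collections.singleton(new NewTopic("orders", 6, (short) 3)))
                     .all()
                     .get();
            }
        }
    }
}
```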
The next example is topic partitions. In a Kafka broker you can set many partitions for one topic. Unfortunately, for topic partitions there will never be a simple rule, because there are a lot of use cases out there. A simple one that I can offer is: never use too many partitions in one topic, because a big partition count can hurt the producer’s throughput and increase its memory usage (the producer keeps a batch buffer per partition).
The next one is the retention period; the default setting is a week. But do you need to keep the topic’s data that long? Or do you need more retention? You should set this in the topic-level config so each topic can have a different configuration. The retention period is directly tied to the storage you have on the brokers.
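For illustration, here is a sketch of overriding retention for a single topic via the retention.ms topic config, again with the Java AdminClient and made-up names (one day instead of the default seven):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class SetTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "clickstream");
            // Keep this topic's data for 1 day; other topics keep the broker default.
            AlterConfigOp setRetention = new AlterConfigOp(
                    new ConfigEntry("retention.ms", String.valueOf(24L * 60 * 60 * 1000)),
                    AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                    Map.of(topic, Collections.singleton(setRetention));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```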
Monitoring? For what? It already works!
A Kafka cluster is a distributed system, and there are a lot of ways it can go wrong. In my personal opinion, observability for the Kafka cluster is a must.
To start, you need to monitor the metrics of the Kafka cluster; you can begin by reading about JMX and Kafka Exporter. It is essential to have tools for this effort. There are a lot of Kafka monitoring tools out there, from open source to enterprise, and you can choose depending on your needs and budget.
You can monitor the health of your cluster, consumer lag, throughput, and many more useful details you can take advantage of. You can set an alert for when a broker is down or low on storage.
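To make “consumer lag” concrete, here is a rough sketch with the Java AdminClient (the group name and addresses are invented); lag is simply the gap between the group’s committed offsets and the partitions’ latest offsets:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

public class ConsumerLagCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder

        try (AdminClient admin = AdminClient.create(props)) {
            // Offsets the consumer group has committed so far.
            Map<TopicPartition, OffsetAndMetadata> committed =
                    admin.listConsumerGroupOffsets("my-consumer-group") // invented group id
                         .partitionsToOffsetAndMetadata()
                         .get();

            // Latest (end) offsets of the same partitions.
            Map<TopicPartition, OffsetSpec> latestSpec = committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest()));
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> latest =
                    admin.listOffsets(latestSpec).all().get();

            // Lag per partition = latest offset - committed offset.
            committed.forEach((tp, meta) ->
                    System.out.println(tp + " lag=" + (latest.get(tp).offset() - meta.offset())));
        }
    }
}
```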
What I find most useful is to always monitor throughput, because sometimes you rush into topic creation without a proper configuration, and it always affects the performance of your Kafka cluster. If you exercise this in the dev environment, you can tune the topic configuration into a better one before you declare the config production ready.
Let’s upgrade the version! It’s better!
It’s a common thing for us to want to try out the latest version of a piece of software. But when the major version jumps from 1 to 2, or from 2 to 3, there is always a fundamental change. The steps for installing might be the same, but the steps for upgrading might be different.
One thing always worth checking is the data structure. Is it affected? Did they change something in the upgrade so the data needs to be converted in some way to be readable in the new version?
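For what it’s worth, in ZooKeeper-based Kafka versions this is what the broker settings inter.broker.protocol.version and log.message.format.version are for: during a rolling upgrade they pin how brokers talk to each other and how messages are laid out on disk until every broker is running the new version.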
Kafka upgrades usually involve a rolling restart of the brokers. When this happens, it will affect any producers and consumers that connect to the cluster without the full bootstrap server list (only one IP). So you need to make sure that every application connected to Kafka always uses the full bootstrap server list.
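A small sketch of that point on the client side (Java producer; the addresses and topic are placeholders): list several brokers in bootstrap.servers so the client can still bootstrap while any one broker is being restarted:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class MultiBrokerProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Several brokers, not a single IP, so a rolling restart doesn't lock the client out.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                  "broker1:9092,broker2:9092,broker3:9092"); // placeholders
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "some-key", "some-value"));
        }
    }
}
```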
Please use the dev environment to exercise the upgrade, but your dev environment needs to have the same configuration as the production environment. A common mistake is that many dev environment configs are set up differently from production, and this can cause you serious trouble.
So yes, upgrading is always better!
Conclusion
I wrote this article from personal experience, and yes, it’s very high level. The main reason I wrote it is that I have encountered a few companies using Kafka for use cases you can achieve without Kafka, and most of them use Kafka for a very small amount of data.
And some of them run Kafka in Docker for production, with a single broker in it. Come on, dudes!
So, if you need to use Kafka inside your technology stack, please “take a deep breath, calm down, relax, and think it through” before you decide to use it.
There’s a lot of effort involved in using Kafka, especially if you decide to self-manage it. If you want peace of mind and to avoid the hassle, you can choose a managed Kafka service.
Thank You!