How to set up Kafka in a multi-node environment?

Feb 14, 2023

Prerequisites:
A basic understanding of Kafka, including topics, producers, and consumers.

For high availability of the Kafka service, we need to set up Kafka in cluster mode.

What is a cluster?
A cluster is a group of items that work together as a single unit. In Kafka's case, the cluster is a group of brokers that share the work of storing and serving data, so the service can stay available when one broker goes down.

For this article, I use an example of one ZooKeeper server with three Kafka brokers, all running on the same machine.

Steps to configure brokers:
1. Download Kafka from the Apache site:

Apache Kafka downloads (kafka.apache.org): at the time of writing, 3.4.0 is the latest release and the current stable version.

Please ensure that you download the binaries and not the source files.

2. Setting up the first broker:

To set up the first broker, extract the downloaded archive. The first step is to modify the configuration: head over to the config folder and open the server.properties file in an editor.

Make the following changes in server.properties:

a. Set broker.id to 0.
   (Each broker in the cluster must have a unique ID.)

b. Uncomment the listeners property and set it to 
   PLAINTEXT://localhost:9091. This property tells the Kafka broker which 
   address and port to listen on for connection requests. I have changed 
   the port for the first broker to 9091. 

c. We want to keep the logs (data) of the Kafka brokers in a known place. 
   So, change the log.dirs property to a directory of your choice on your 
   machine. Each broker needs its own directory, for example a separate 
   subfolder per broker under one parent folder.

After following the above steps, your server.properties should contain the three properties from steps a to c with your values (it is fine if they are not in the same order or if you have additional properties as well).
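A quick way to check the result without scrolling through the file is to print just those three properties. This is only a sketch: the function name show_broker_settings is made up for illustration, and the log directory in the comments is an example value, not something Kafka requires.

```shell
# Print the three properties edited in steps a to c.
# Usage: show_broker_settings <path-to-server.properties>
show_broker_settings() {
  grep -E '^(broker\.id|listeners|log\.dirs)=' "$1"
}

# Example, run from the Kafka root folder:
#   show_broker_settings config/server.properties
# If the steps were followed with these example values, the output
# should look along these lines:
#   broker.id=0
#   listeners=PLAINTEXT://localhost:9091
#   log.dirs=/tmp/kafka-logs-0
```

If the listeners line is missing from the output, it is most likely still commented out in the file.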

That’s all!! We have configured our first broker.

3. Set up the remaining brokers:

Follow the same steps to configure two more brokers. The easiest way to do this is to copy and paste the entire Kafka folder twice and modify the server.properties of each copy.

Note: Make sure that each broker gets a unique broker.id, port, and log.dirs directory. For simplicity, I have assigned broker IDs 1 and 2 and ports 9092 and 9093 to the other two brokers.
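The copy-and-edit routine can also be scripted. Below is a minimal sketch, not a definitive setup: the function name clone_brokers, the "-broker1" / "-broker2" folder suffixes, and the log directory prefix are all assumptions made for illustration, and the sed patterns assume the stock server.properties layout.

```shell
# Copy an already configured Kafka folder (broker 0) twice and give each
# copy its own broker.id, port, and log directory.
# Usage: clone_brokers <kafka-root-of-broker-0> <log-dir-prefix>
clone_brokers() {
  base=$1; logprefix=$2
  for id in 1 2; do
    copy="${base}-broker${id}"        # folder name is an assumption
    port=$((9091 + $id))              # 9092 and 9093, as in the note above
    cp -R "$base" "$copy"
    conf="$copy/config/server.properties"
    # Rewrite the three per-broker properties; everything else stays as is.
    sed -e "s|^broker\.id=.*|broker.id=${id}|" \
        -e "s|^#\{0,1\}listeners=.*|listeners=PLAINTEXT://localhost:${port}|" \
        -e "s|^log\.dirs=.*|log.dirs=${logprefix}-${id}|" \
        "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
  done
}

# Example (paths are assumptions):
#   clone_brokers ./kafka /tmp/kafka-logs
```

Run it only once, before the copies exist; if a target folder is already there, cp -R will nest the copy inside it instead of replacing it.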

Now that our brokers are all configured, let's see them in action.

4. Start the Kafka servers

As those familiar with Kafka will know, we first need to start ZooKeeper.

ZooKeeper handles metadata management in the Kafka world and keeps track of which brokers are part of the Kafka cluster. Kafka also relies on ZooKeeper for leader election, which determines which broker is the leader of a given topic partition. You can find more details on this elsewhere, so I am skipping the technicalities here.

To start ZooKeeper, execute the command below from the root folder of Kafka (the folder that contains the bin and config folders).

./bin/zookeeper-server-start.sh config/zookeeper.properties

You should see the ZooKeeper startup logs in the console. Remember that ZooKeeper runs on port 2181 by default (unless explicitly changed).
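If you do change the port, each broker has to be told about it: the clientPort in zookeeper.properties and the zookeeper.connect entry in every server.properties must agree. A small sketch of that check follows; the function name check_zk_port is made up for illustration.

```shell
# Check that a broker's zookeeper.connect points at ZooKeeper's clientPort.
# Usage: check_zk_port <zookeeper.properties> <server.properties>
check_zk_port() {
  zk_port=$(sed -n 's/^clientPort=//p' "$1")
  connect=$(sed -n 's/^zookeeper\.connect=//p' "$2")
  case "$connect" in
    # Match host:port at the end, before a comma, or before a chroot path.
    *":$zk_port" | *":$zk_port",* | *":$zk_port"/*)
      echo "ok: brokers use port $zk_port" ;;
    *)
      echo "mismatch: clientPort=$zk_port but zookeeper.connect=$connect"
      return 1 ;;
  esac
}

# Example, run from the Kafka root folder:
#   check_zk_port config/zookeeper.properties config/server.properties
```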

Let's start our brokers as well.

Once ZooKeeper is up and running, the next step is to start each broker by running the following command from the root folder of each of the three brokers.

./bin/kafka-server-start.sh config/server.properties

You should see the broker id, port and the configuration we provided in the startup logs.
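Running the command three times ties up three terminals. As an alternative sketch, the brokers can be started in the background with the script's -daemon flag; the function name start_brokers and the folder names in the example are assumptions made for illustration.

```shell
# Start one broker per Kafka folder in the background.
# Usage: start_brokers <kafka-root> [<kafka-root> ...]
start_brokers() {
  for dir in "$@"; do
    # -daemon detaches the broker so the loop can move on to the next one.
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && ./bin/kafka-server-start.sh -daemon config/server.properties )
  done
}

# Example (folder names are assumptions):
#   start_brokers ./kafka ./kafka-broker1 ./kafka-broker2
```

In daemon mode the startup output is not printed to the console; look for it in the logs folder under each Kafka root instead.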

5. Test the brokers

Create a sample topic for testing purposes with 10 partitions and replication factor 3.

./bin/kafka-topics.sh --create --topic sample --bootstrap-server localhost:9091 --replication-factor 3 --partitions 10

To get more details about the topic:

./bin/kafka-topics.sh --describe --topic sample --bootstrap-server localhost:9091
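The describe output lists, for every partition, its leader, its replicas, and its in-sync replicas (Isr). With replication factor 3 and all three brokers healthy, each Isr list should contain three broker IDs. Here is a rough sketch that counts the partitions falling short; the function name count_short_isr is made up, and the parsing assumes the usual "Partition: … Isr: 1,2,0" line layout, which may differ between Kafka versions.

```shell
# Count partitions whose in-sync replica (Isr) list has fewer than N brokers.
# Reads `kafka-topics.sh --describe` output on stdin.
# Usage: <describe command> | count_short_isr <N>
count_short_isr() {
  awk -v want="$1" '
    /Partition:/ {
      # Find the "Isr:" label; the next field is the comma-separated list.
      for (i = 1; i < NF; i++)
        if ($i == "Isr:") { n = split($(i + 1), r, ","); if (n < want) bad++ }
    }
    END { print bad + 0 }'
}

# Example:
#   ./bin/kafka-topics.sh --describe --topic sample \
#     --bootstrap-server localhost:9091 | count_short_isr 3
```

A result of 0 means every partition has at least three in-sync replicas; stopping one broker and running the check again should raise the count.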

Common pitfalls:

1. Exception in thread "main" joptsimple.UnrecognizedOptionException: zookeeper is not a recognized option.

This error appears when a command such as kafka-topics.sh is run with the old --zookeeper option, which Kafka 3.x no longer accepts. Use --bootstrap-server with a broker address instead, as in the commands above. The Stack Overflow question with the same title (stackoverflow.com) covers this in more detail.

2. Understanding where the internal offsets are stored.

Offsets stored in Zookeeper or Kafka?

Older versions of Kafka (pre 0.9) store offsets in ZK only, while newer versions of Kafka by default store offsets in…