A beginner's guide to Spring Boot + Apache Kafka

Alfred Skaria
5 min read · Jun 8, 2019


Spring Boot + Kafka example

We are living in the age of the data revolution. Have you ever wondered how huge amounts of real-time data are processed? How does an eventing system work?

Okay, cool… it's Apache Kafka. Don't worry, you have come to the right place. In this Kafka tutorial, we will see what Kafka is and how to develop a Spring Boot application with Apache Kafka.

What is Kafka?

Kafka can be defined as a distributed publish-subscribe messaging system that guarantees speed, scalability, and durability. Its unique design lets it handle very high levels of throughput. At its core, it is a messaging system.

Before developing our Spring Boot application, let's go through the basic terminology of Apache Kafka.

Producer: The application that sends messages. A message could be anything; in Kafka terms, it is just an array of bytes.

Consumer: An application that receives the messages sent by the producer. The consumer doesn't consume messages directly from the producer. Instead, the producer sends messages to a Kafka server, and the consumer consumes them from there.

Kafka Broker: Just another name for a Kafka server. Producers and consumers use the Kafka broker as an agent to send and receive messages.

Cluster: As we all know, a cluster is a group of something. Here it is a group of computers, each running one instance of a Kafka broker.

Topic: Since the producer sends many messages to a Kafka server, it would be very difficult for each consumer to know which messages it should consume. A topic is a name for a data stream: each consumer listens to a particular topic, and whenever there is data in that topic, the consumer receives it.

Partition: The data we are dealing with can be very large. In that case, the data in each topic can be broken into partitions, which are distributed to different brokers across the cluster.

Offset: A sequence number assigned to each message arriving at the Kafka server. Offsets are not assigned globally; they are assigned locally within each partition.

Consumer groups: A group of consumers acting as a single unit. Partitioning and consumer groups are the tools Kafka gives us for scaling an application.

Okay, enough theory. Now we can jump into our Spring Boot + Kafka application. Come, let's get our hands dirty.

First, we need to create a producer application. Go to start.spring.io and create a new Spring Boot project with the Spring for Apache Kafka dependency.

Create a controller package and write an API for publishing the messages.

producer controller class
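The embedded gist isn't reproduced above, so here is a minimal sketch of what such a controller could look like. The package name, endpoint path, and ProducerService class are illustrative assumptions, not necessarily the exact code from the repo:

package com.example.kafka.controller;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import com.example.kafka.service.ProducerService; // the service created in the next step

// Hypothetical REST controller exposing an endpoint for publishing messages.
@RestController
@RequestMapping("/kafka")
public class ProducerController {

    @Autowired
    private ProducerService producerService;

    // POST /kafka/publish?message=HelloKafka
    @PostMapping("/publish")
    public String publish(@RequestParam("message") String message) {
        producerService.sendMessage(message);
        return "Message published successfully";
    }
}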

Now, create a service package for sending messages to a topic.

producer service class
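Again as a hedged sketch (the topic name my_topic is an assumption), the service simply wraps Spring's KafkaTemplate, which we configure in the next step:

package com.example.kafka.service;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

// Hypothetical service that publishes messages to a Kafka topic.
@Service
public class ProducerService {

    private static final String TOPIC = "my_topic"; // assumed topic name

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void sendMessage(String message) {
        // Sends the message to the topic asynchronously; with default broker
        // settings, Kafka auto-creates the topic on first use.
        kafkaTemplate.send(TOPIC, message);
    }
}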

Alright!! We have now created the topic and are ready to send messages. Before that, we have to write some configuration for message serialization. So, create a config package and follow the code below to implement it.

producer configuration class
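A sketch of a typical spring-kafka producer configuration matching the description below (the class and bean names are assumptions):

package com.example.kafka.config;

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;

@Configuration
public class KafkaProducerConfig {

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        // The broker runs on the local machine at Kafka's default port 9092.
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Keys and values are plain strings, so use the String serializer for both.
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}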

Here the bootstrap server is our local machine, hosted on port 9092. We also define the serializers for the message key and the value (the message itself).

Apart from this, change the port for running the producer Spring Boot application to 8082 in the application.properties file.
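In application.properties:

server.port=8082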

Now, we need to create our consumer application. Start another Spring Boot project with the Kafka dependency.

Create a service class for listening to the messages sent by the Kafka producer.

consumer service class
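A minimal sketch of such a listener (the topic and group names are assumptions; the topic must match the one the producer sends to):

package com.example.kafka.service;

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class ConsumerService {

    // Invoked for every message arriving on the topic; groupId identifies
    // the consumer group this listener joins.
    @KafkaListener(topics = "my_topic", groupId = "group_id")
    public void consume(String message) {
        System.out.println("Consumed message: " + message);
    }
}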

Here we should give the same topic name that we used in the producer application. And group_id is the consumer group id: consumers that join the same group_id together form a consumer group.

Now, create a config class for the consumer application as well.

Consumer configuration class
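This mirrors the producer configuration, with deserializers in place of serializers (again, class and bean names are assumptions):

package com.example.kafka.config;

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@EnableKafka
@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        config.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
        // Deserialize keys and values back into plain strings.
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(config);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        // Backs the @KafkaListener annotation in the service class.
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
}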

Apart from this, change the port for running the consumer Spring Boot application to 8083 in the application.properties file.
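In application.properties:

server.port=8083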

Finally, we have completed both producer and consumer applications. Before running the applications, we need to set up the Kafka server on our local machine.

We will follow the official Apache Kafka documentation. Go to https://kafka.apache.org/downloads and download the latest stable release (2.2.1 was the stable release when this article was written).

After downloading, extract the file, open the folder, and create a data folder inside it. Inside the data folder, create two subfolders named kafka and zookeeper. These two folders will store the logs generated when we start Kafka and ZooKeeper. Kafka uses ZooKeeper to manage the cluster; ZooKeeper coordinates the brokers and the cluster topology.

After creating these folders, we need to point to these locations in the server and ZooKeeper property files. Open the config folder and open zookeeper.properties; set the dataDir field to the location of the zookeeper folder we created earlier. Similarly, in the server.properties file, set the log.dirs field to the location of the kafka folder.
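For example, assuming the archive was extracted to C:\kafka (a hypothetical path; forward slashes avoid backslash-escaping issues in Java properties files), the two entries would look like this:

# config/zookeeper.properties
dataDir=C:/kafka/data/zookeeper

# config/server.properties
log.dirs=C:/kafka/data/kafka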

Now we are ready to start ZooKeeper and the Kafka server. In this example, Kafka will use the local machine as the server.

Open the command prompt and change the directory to the extracted folder.

bin\windows\zookeeper-server-start.bat config\zookeeper.properties

bin\windows\kafka-server-start.bat config\server.properties

Use these commands, in separate terminals, to start ZooKeeper and then the Kafka server; the broker needs ZooKeeper to be running before it starts.

Try hitting the API with Postman, and don't forget to run both the producer and consumer applications 😜
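For example, with the hypothetical endpoint sketched earlier and the producer running on port 8082, the request would be:

POST http://localhost:8082/kafka/publish?message=HelloKafka

The consumer application should then print the message to its console.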

Final output

Please take a look at my GitHub repo for the full implementation of both applications, and feel free to contribute.

https://github.com/AlfredSkaria/kafka
