Introduction to Apache Kafka

Chakresh Tiwari
ShoutLoudz
Published in Kafka Series · 5 min read · May 6, 2024

Have you heard of Apache Kafka? This powerful tool plays a crucial role in building event-driven microservice architectures. Throughout this blog series, we’ll delve into Kafka’s functionalities and explore how it empowers developers to create scalable and efficient microservices. The following topics will be covered in this series:

1. Introduction to Apache Kafka

2. Setting Up Apache Kafka on macOS

3. Core Concepts of Apache Kafka

  Deep dive into fundamental Kafka concepts:
  • Topics (streams of data)
  • Partitions (division of topics for scalability)
  • Offsets (consumer tracking of message position)
  • In-Sync Replicas (copies of partitions for fault tolerance)
  • Consumers (applications receiving messages)
  • Consumer Groups (consumers working collaboratively)

4. Working with Apache Kafka CLI

  This post will focus on using the Kafka command-line interface for:
  • Topic management
  • Producer operations (publishing messages)
  • Consumer operations (consuming messages)

5. Building Spring Boot Microservices as Kafka Producers

  • Demonstrate how to develop Spring Boot microservices that publish events onto Kafka topics.

6. Building Spring Boot Applications as Kafka Consumers

  • Showcase how to create Spring Boot applications that consume messages from specific Kafka topics.

Introduction

Definition of Apache Kafka

Apache Kafka is an open-source event-streaming platform. It is used for ingesting, processing, storing, and publishing real-time streams of data, known as events.

You can think of Kafka as an append-only log. Just as writing to a log file appends each new entry after the previous one, Kafka appends each incoming event as a record, and assigns it a sequential position (an offset) that identifies it within the log. Records are stored durably: retention is configurable, so Kafka can keep records for a set time, up to a size limit, or indefinitely.

Each record in Kafka is a key-value pair: the key is often used to group related records together, and the value carries the event data.
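The append-only log idea can be sketched as a toy model (this is an illustration of the concept only, not the real Kafka API):

```python
# Toy model of Kafka's append-only log: records are key-value pairs,
# and each append is assigned the next sequential offset.
class TopicLog:
    def __init__(self):
        self.records = []  # append-only list of (offset, key, value)

    def append(self, key, value):
        offset = len(self.records)
        self.records.append((offset, key, value))
        return offset

    def read_from(self, offset):
        # A consumer resumes reading from any offset it has tracked.
        return self.records[offset:]

log = TopicLog()
log.append("order-1", "ORDER_PLACED")
log.append("order-2", "ORDER_PLACED")
log.append("order-1", "ORDER_SHIPPED")

# A consumer that has processed up to offset 1 resumes from there:
print(log.read_from(1))
```

Note that records are never modified or reordered after being appended; consumers simply track how far they have read.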

What is an Event

An event is any state change in an application: placing an order, creating a new product, or removing a product are all events. Events carry associated data, and anything that can be serialised (String, JSON, Avro, etc.) can be event data. Events are published from a producer to a Kafka topic, so keep each event as small as possible, because it will travel over the network.
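As a sketch, here is a hypothetical "order placed" event serialised to JSON bytes before publishing (the field names are made up for illustration):

```python
import json

# Hypothetical event payload: any serialisable structure can be event data.
event = {
    "type": "ORDER_PLACED",
    "order_id": "ord-123",
    "amount": 49.99,
}

# Serialise to bytes before publishing -- Kafka transports raw bytes,
# so a compact payload reduces network overhead.
payload = json.dumps(event).encode("utf-8")
print(len(payload), "bytes on the wire")

# The consumer deserialises it back into the same structure:
received = json.loads(payload.decode("utf-8"))
print(received["type"])
```

The same round trip applies to any format (Avro, Protobuf, plain strings); JSON is just the easiest to show.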

Message vs Event

  • Messages: Imagine messages as envelopes. They represent the complete package delivered through Kafka. A message contains a key, a value (the event itself), headers, and a timestamp.
  • Events: Think of events as the information carried within the message (the letter inside the envelope). They represent a specific occurrence or state change within your system.

In essence, a Kafka message encapsulates the event and its associated metadata.
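A toy model of that envelope structure (the field names mirror Kafka's record layout, but this is not the client API):

```python
import time
from dataclasses import dataclass, field

# Toy model of a Kafka message: the value is the event itself (the
# "letter"); key, headers, and timestamp are the surrounding metadata.
@dataclass
class KafkaMessage:
    key: str
    value: bytes                          # the event payload
    headers: dict = field(default_factory=dict)
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))

msg = KafkaMessage(
    key="order-123",
    value=b'{"type": "ORDER_PLACED"}',
    headers={"source": "order-service"},
)
print(msg.key, msg.headers["source"])
```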

Use Cases

Apache Kafka has numerous use cases; a few of them are listed below:

  1. Pub-Sub System: Kafka follows the publisher-subscriber model, which is used in developing event-driven applications. A producer application publishes an event to Kafka, and consumer applications consume it. This helps in developing more decoupled applications that can be scaled independently.
  2. Real-Time Data Pipelines: These are systems designed to process and analyze data as it is generated, providing immediate insights and responses to events.
  3. Stream Processing: Stream processing applications in Kafka perform real-time processing of continuous streams of data, enabling organisations to derive insights, perform transformations, and trigger actions as data flows through the system.
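The decoupling in the pub-sub model can be sketched with a minimal in-memory broker (a toy illustration of the model, not Kafka itself): the producer publishes to a named topic without knowing who consumes it.

```python
from collections import defaultdict

# Minimal in-memory pub-sub sketch: subscribers register per topic,
# and a publish fans the event out to every subscriber.
class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers[topic]:
            handler(event)

broker = Broker()
received = []
# Two independent "services" subscribe to the same topic:
broker.subscribe("orders", lambda e: received.append(("email", e)))
broker.subscribe("orders", lambda e: received.append(("inventory", e)))

# The producer publishes once; both consumers receive the event.
broker.publish("orders", "ORDER_PLACED")
print(received)
```

Adding a third consumer requires no change to the producer, which is exactly the decoupling the pub-sub model provides.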

Developing Event-Driven Microservices using Spring Boot and Kafka

Microservice

A microservice is a small, independent application designed to perform one piece of business logic (like search, notification, or order placement), and it can be deployed and scaled independently. Each task gets its own separate application, and the applications are loosely coupled.

Microservices are formed by splitting a monolith into multiple parts. A monolith has all the functionality in one big application, so a problem in one feature can bring all other features down, because everything ships in a single deployment. Microservice architecture came into the picture to overcome this problem.

In an e-commerce application, there will be multiple domains like User, Order, Product, Notification, Search, etc. In a monolith, changing some logic in the User domain means rebuilding and redeploying the entire application; with microservices, there is a separate application for each domain.
Each microservice also has its own database. This is known as the database-per-service pattern.

Microservice Communication

Since each service is an independent unit, services need to communicate with each other to exchange data. One common way is sending an HTTP request.

When microservices interact, they can choose between two communication styles:

Synchronous Communication:

  • The calling service sends a request and waits for a response from the target service before proceeding.
  • This approach is simpler to implement but can lead to performance bottlenecks if the response time is high.

Asynchronous Communication:

  • The calling service sends a request but does not wait for a response.
  • It can immediately begin processing other tasks while the target service handles the request.
  • The response, when available, is delivered asynchronously (often via a callback or message queue).
  • This approach improves responsiveness and scalability for complex interactions.
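The contrast between the two styles can be sketched with a queue and a worker thread (a simplified model; real async messaging runs across processes, not threads):

```python
import queue
import threading
import time

def handle(request):
    time.sleep(0.05)          # simulate a slow target service
    return f"handled:{request}"

# Synchronous: the caller blocks until the response arrives.
start = time.time()
response = handle("req-1")
sync_elapsed = time.time() - start

# Asynchronous: the caller enqueues the request and moves on;
# a worker delivers the response later.
requests = queue.Queue()
responses = []

def worker():
    responses.append(handle(requests.get()))

t = threading.Thread(target=worker)
t.start()
start = time.time()
requests.put("req-2")         # returns immediately, no waiting
async_elapsed = time.time() - start
t.join()                      # response arrives asynchronously

print(f"sync waited {sync_elapsed:.2f}s, async returned in {async_elapsed:.4f}s")
```

The caller's wait time drops from the full handling time to essentially zero; the trade-off is that the response must be collected later.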

Challenges of Service-to-Service Communication with HTTP Requests:

  • Limited Scalability: Sending multiple HTTP requests for complex interactions can become unwieldy and difficult to manage.
  • Single Point of Failure: If one service is down, the entire communication flow can break, leading to data loss.
  • Tight Coupling: Code changes are required whenever new services are added or communication patterns evolve, hindering agility.

To overcome these problems Kafka is used.

Event-driven Architecture

A microservice that needs to send messages to multiple other microservices publishes them to a Kafka topic, and the other microservices consume them from that topic. This is how Kafka solves the problem of microservice communication. The publishing service is known as the Producer, and an application that consumes the message is known as a Consumer: the Producer publishes an Event, and Consumers consume that Event. This type of architecture is known as Event-Driven Architecture.
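Putting the pieces together, here is a toy model of that flow (service names are hypothetical, and this models the idea of independent consumer groups, not the Kafka client API): one producer appends to a topic, and each downstream service reads the full stream independently by tracking its own offset.

```python
from collections import defaultdict

# Toy event-driven flow: a topic is an append-only log, and each
# consumer group keeps its own read position (offset) in it.
class Topic:
    def __init__(self):
        self.log = []
        self.offsets = defaultdict(int)   # consumer group -> next offset

    def publish(self, event):
        self.log.append(event)

    def poll(self, group):
        new = self.log[self.offsets[group]:]
        self.offsets[group] = len(self.log)
        return new

orders = Topic()
orders.publish("ORDER_PLACED:ord-1")          # producer: order service

# Each downstream service consumes the same event independently:
print(orders.poll("notification-service"))    # sees the new event
print(orders.poll("inventory-service"))       # also sees it
print(orders.poll("notification-service"))    # nothing new for this group
```

Because each group tracks its own offset, adding a new consuming service later means it simply starts polling; the producer is untouched.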

Spring Boot provides good support for developing event-driven microservices. In the subsequent blogs, we will use Spring Boot and Kafka to develop event-driven microservices.

Thanks for reading,


Software Engineer at Cisco (AppDynamics), sharing my knowledge and experience related to work. I am here to help learners prepare for tech interviews.