Setting Up a Kafka Streams Application with Docker Compose

Building Block Breakdown — Services Explained:

MehmoodGhaffarMemon · 3 min read · Mar 14, 2024

Real-time processing is becoming increasingly important in big data. Apache Kafka, a distributed streaming platform, empowers you to ingest, process, and store massive data streams with ease. This blog post streamlines your development workflow by guiding you through setting up a Kafka Streams application using Docker Compose.

Let’s break down the magic behind the scenes with a Docker Compose configuration file (docker-compose.yml) that serves as the blueprint for your Kafka ecosystem. We’ll explain each service and its configuration, making it a breeze for you to get started, even if you’re new to Docker.


Docker Compose — Orchestrating Your Kafka Playground:

Docker Compose is a fantastic tool for managing multi-container applications. Here’s how a docker-compose.yml file orchestrates the Kafka environment:

1. Version — The File Format:

version: '3'

A Docker Compose file starts with a version declaration. This line specifies the Docker Compose file format version (version 3 in this case).

2. Services — The Building Blocks:

This section defines the individual services that comprise a Kafka ecosystem. Let’s explore each service in detail.

The image and ports are the most crucial parts of each service required to set up the ecosystem. Let’s break the Kafka ecosystem down as follows.

Zookeeper:

This is a distributed coordination service that manages the Kafka cluster.

Image: confluentinc/cp-zookeeper:latest - Here we leverage the official Confluent platform image for Zookeeper, a distributed coordination service for Kafka.

Container Name: zookeeper - A descriptive name for the container running Zookeeper.

Environment Variables:

ZOOKEEPER_CLIENT_PORT: 2181 - Defines the port exposed by Zookeeper for client communication (default is 2181).

ZOOKEEPER_TICK_TIME: 2000 - Sets the tick time for Zookeeper's internal heartbeat (default is 2000 milliseconds, i.e. 2 seconds).

Ports:

"2181:2181" - Maps the container's port 2181 to the host machine's port 2181, making Zookeeper accessible from your system.

Kafka Broker:

This is the server responsible for storing, managing, and serving messages within the Kafka cluster.

Image: confluentinc/cp-kafka:latest - Similar to Zookeeper, we utilize the official Confluent platform image for Kafka, the distributed streaming platform at the core.

Hostname: kafka-broker-1 - Sets the hostname for the container running Kafka (optional, but improves clarity).

Ports:

"19092:19092" - Maps the container's port 19092 (Kafka's default port) to the host machine's port 19092, allowing communication with your Kafka application (default is 9092).

Depends On: zookeeper - Ensures the Kafka broker starts only after Zookeeper is up and running.

Environment Variables: These variables configure crucial aspects of the Kafka broker:

KAFKA_BROKER_ID: 1 - Assigns a unique ID for this broker (typically 1 for the first broker).

KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181' - Specifies the Zookeeper connection string (pointing to the Zookeeper service within the Docker Compose network).

KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT - Defines the listener security protocols (using PLAINTEXT for simplicity in this example).

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-broker-1:9092,PLAINTEXT_INTERNAL://localhost:19092 - Configures the advertised listeners that clients use to connect: one for containers inside the Compose network (kafka-broker-1:9092) and one for clients on the host (localhost:19092).

KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 - Sets the replication factor for the Kafka offsets topic (defaults to 1 for a single broker setup).
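Assembled from the settings above, the broker service might be sketched like this in docker-compose.yml (same image, ports, dependency, and environment variables as described; how the listeners are bound internally can vary between Confluent image versions):

```yaml
  kafka-broker-1:
    # Official Confluent platform image for the Kafka broker
    image: confluentinc/cp-kafka:latest
    hostname: kafka-broker-1
    ports:
      - "19092:19092"               # host-facing listener
    depends_on:
      - zookeeper                   # start only after Zookeeper is up
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_INTERNAL:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-broker-1:9092,PLAINTEXT_INTERNAL://localhost:19092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1   # single-broker setup
```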

Kafka UI:

This is an optional but very handy service that lets developers avoid terminal-based Kafka operations such as consuming and producing messages, among others.

Image: provectuslabs/kafka-ui - We'll leverage the provectuslabs/kafka-ui image to provide a user-friendly web interface for visualizing and managing your Kafka cluster.

Ports:

"8090:8080" - Maps the container's port 8090 (default for Kafka UI) to the host machine's

Conclusion: Stream Processing Made Easy

By following this guide and utilizing the provided Docker Compose configuration, you’ve successfully established a foundation for developing Kafka Streams applications. This local Kafka environment empowers you to experiment, build, and test your stream processing logic with ease. If your clients run outside the Compose network, remember to adjust the advertised listener hostnames accordingly.

As you delve deeper into Kafka Streams development, explore official documentation and tutorials for advanced configurations and functionalities. The world of real-time data processing awaits!

Bonus Tip: Consider integrating a continuous integration/continuous delivery (CI/CD) pipeline to automate building, testing, and deploying your Kafka Streams applications for a streamlined development workflow.

