WTF: Setting up Kafka cluster using Docker Swarm

Prateek
6 min read · Apr 15, 2018


I always start my conversation with a disclaimer: I am not an expert in the things mentioned in this story. This is my story of setting up a Kafka cluster using Docker Swarm; there might be better ways to set it up.

I expand on some of these topics in my later stories.

What are we trying to set up using Docker Swarm?

  • A 3-broker Kafka cluster
  • A single-node Zookeeper ensemble

Set up Docker Swarm

  • Create 4 Ubuntu:16.04 VMs in VirtualBox (node1, node3, node4 and node5)
  • Install Docker on node3, node4 and node5 using these instructions.
    (We plan to create the Docker swarm using these three nodes)

The VMs are node1, node3, node4 and node5, as shown in the picture above. We plan to create a Docker swarm using node3, node4 and node5. We created an extra node, node1, so that we can test whether we can connect to the Kafka cluster from a node outside the swarm.

Step 1: Let’s make node3 the swarm manager

node3 > docker swarm init

Step 2: Let’s join node4 and node5 as swarm workers

docker swarm init prints the exact join command to run on the workers; the token can also be retrieved later with docker swarm join-token worker on the manager.

node4 > docker swarm join --token <swarm-token> node3:2377
node5 > docker swarm join --token <swarm-token> node3:2377

Create Docker image for Kafka and Zookeeper

This is the content of the directory.

kafka-docker 
- Dockerfile
- kafka_2.11-1.1.0.tgz

Dockerfile looks as follows:

# Dockerfile
FROM ubuntu:16.04
# Install the Java runtime (a JRE is enough to run Kafka)
RUN apt-get update && apt-get install -y openjdk-8-jre
# ADD auto-extracts the Kafka tarball; rename the result to kafka
ENV kafka_version=2.11-1.1.0
ADD ./kafka_${kafka_version}.tgz ./
RUN mv kafka_${kafka_version} kafka

Build the Docker image and push it to a Docker repository (private or public), so that every swarm node can pull it. Here the image is named kafka (I suggest naming yours differently); change the commands below to match the name you choose.

docker build -t kafka .

Kafka Setup

Let’s expand on the diagram:

  • Zookeeper service: zookeeper would run in node3
  • Broker service: kafka1 would run in node3
  • Broker service: kafka2 would run in node4
  • Broker service: kafka3 would run in node5

Setup labels in nodes

We label the Docker swarm nodes to control where each service gets placed.

node3 > docker node update --label-add zoo=1 node3
node3 > docker node update --label-add kafka=1 node3
node3 > docker node update --label-add kafka=2 node4
node3 > docker node update --label-add kafka=3 node5

Setup Kafka Cluster using docker swarm commands

First, we will set up the Kafka cluster using individual docker service create commands, so that we understand each piece before composing them.

  • Step 1: Create an overlay network: kafka-net
node3 > docker network create --driver overlay kafka-net
  • Step 2: Create a zookeeper service
node3 > docker service create \
--name zookeeper \
--mount type=volume,source=zoo-data,destination=/tmp/zookeeper \
--publish 2181:2181 \
--network kafka-net \
--constraint node.labels.zoo==1 \
--mode global \
kafka:latest \
/kafka/bin/zookeeper-server-start.sh /kafka/config/zookeeper.properties

Let us expand on this:

# Command to create service
docker service create
# The name of the service
--name zookeeper
# Create a volume zoo-data and map to /tmp/zookeeper
# /tmp/zookeeper is configured in zookeeper.properties
--mount type=volume,source=zoo-data,destination=/tmp/zookeeper
# Publish port 2181 so clients outside the swarm (like node1) can reach Zookeeper
--publish 2181:2181
# Attach to the network we had created
--network kafka-net
# deploy the service in node3 which has label zoo=1
--constraint node.labels.zoo==1
# global mode
--mode global
# image name
kafka:latest
# start zookeeper with default zookeeper properties
/kafka/bin/zookeeper-server-start.sh /kafka/config/zookeeper.properties
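The volume maps to /tmp/zookeeper because that is where the stock zookeeper.properties keeps its data. For reference, the relevant lines look roughly like this in the Kafka 2.11-1.1.0 distribution (verify against your own copy):

```properties
# the directory where the snapshot is stored
dataDir=/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
```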
  • Step 3: Create broker : kafka1
node3 > docker service create \  
--name kafka1 \
--mount type=volume,source=k1-logs,destination=/tmp/kafka-logs \
--publish 9093:9093 \
--network kafka-net \
--mode global \
--constraint node.labels.kafka==1 \
kafka:latest \
/kafka/bin/kafka-server-start.sh /kafka/config/server.properties \
--override listeners=INT://:9092,EXT://0.0.0.0:9093 \
--override listener.security.protocol.map=INT:PLAINTEXT,EXT:PLAINTEXT \
--override inter.broker.listener.name=INT \
--override advertised.listeners=INT://:9092,EXT://node3:9093 \
--override zookeeper.connect=zookeeper:2181 \
--override broker.id=1

Let’s expand on this:

# Command to create docker service 
docker service create
# Name of the service is kafka1
--name kafka1
# Create a volume k1-logs and map to /tmp/kafka-logs
# /tmp/kafka-logs is configured in server.properties
--mount type=volume,source=k1-logs,destination=/tmp/kafka-logs
# Publish the external kafka listener port
--publish 9093:9093

# attach the network we created
--network kafka-net
# global service
--mode global
# deploy to the node which has label kafka=1 (node3)
--constraint node.labels.kafka==1
#image name
kafka:latest
# start kafka with default server.properties
/kafka/bin/kafka-server-start.sh /kafka/config/server.properties
# override listeners in server.properties
# two listeners defined:
# INT:
# for broker to broker communication
# INT://:9092 will resolve to the <container hostname>:9092
# EXT:
# for producer and consumer communication
# EXT://0.0.0.0:9093 will listen on any network ip on port 9093
--override listeners=INT://:9092,EXT://0.0.0.0:9093
# map INT and EXT listeners to PLAINTEXT.
# This can be configured to whatever protocol we want
--override listener.security.protocol.map=INT:PLAINTEXT,EXT:PLAINTEXT
# tell kafka that use INT for inter-broker communication
--override inter.broker.listener.name=INT
# This is advertised listeners
# INT listener:
# INT://:9092 will resolve to <container hostname>:9092
# The brokers resolve the hostname being in same docker network
# EXT listener:
# EXT://node3:9093 for outside docker network producer/consumer
--override advertised.listeners=INT://:9092,EXT://node3:9093
# Connect to zookeeper. The zookeeper matches the service name
--override zookeeper.connect=zookeeper:2181
# The unique broker id
--override broker.id=1
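The --override flags replace defaults from the stock server.properties. For context, the defaults being overridden look roughly like this in the Kafka 1.1.0 distribution (verify against your own copy):

```properties
# overridden per broker on the command line above
broker.id=0
# why the k1-logs volume mounts at /tmp/kafka-logs
log.dirs=/tmp/kafka-logs
# overridden to point at the zookeeper service
zookeeper.connect=localhost:2181
```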
  • Step 4: Create broker: kafka2
node3 > docker service create \  
--name kafka2 \
--mount type=volume,source=k2-logs,destination=/tmp/kafka-logs \
--publish 9094:9094 \
--network kafka-net \
--mode global \
--constraint node.labels.kafka==2 \
kafka:latest \
/kafka/bin/kafka-server-start.sh /kafka/config/server.properties \
--override listeners=INT://:9092,EXT://0.0.0.0:9094 \
--override listener.security.protocol.map=INT:PLAINTEXT,EXT:PLAINTEXT \
--override inter.broker.listener.name=INT \
--override advertised.listeners=INT://:9092,EXT://node4:9094 \
--override zookeeper.connect=zookeeper:2181 \
--override broker.id=2

This is similar to kafka1, except for a few differences:

# Service name is kafka2
--name kafka2
# volume name is k2-logs
--mount type=volume,source=k2-logs,destination=/tmp/kafka-logs
# Different port is exposed
--publish 9094:9094
# deploy kafka2 in node with label kafka=2 (node4)
--constraint node.labels.kafka==2
# Ext listener listens on 9094
--override listeners=INT://:9092,EXT://0.0.0.0:9094
# Ext is advertised as node4 for external producer/consumer
--override advertised.listeners=INT://:9092,EXT://node4:9094
# broker id is 2
--override broker.id=2
  • Step 5: Create broker: kafka3
node3 > docker service create \  
--name kafka3 \
--mount type=volume,source=k3-logs,destination=/tmp/kafka-logs \
--publish 9095:9095 \
--network kafka-net \
--mode global \
--constraint node.labels.kafka==3 \
kafka:latest \
/kafka/bin/kafka-server-start.sh /kafka/config/server.properties \
--override listeners=INT://:9092,EXT://0.0.0.0:9095 \
--override listener.security.protocol.map=INT:PLAINTEXT,EXT:PLAINTEXT \
--override inter.broker.listener.name=INT \
--override advertised.listeners=INT://:9092,EXT://node5:9095 \
--override zookeeper.connect=zookeeper:2181 \
--override broker.id=3

This is similar to kafka1, except for a few differences (analogous to kafka2):

# Service name is kafka3
--name kafka3
# volume name is k3-logs
--mount type=volume,source=k3-logs,destination=/tmp/kafka-logs
# Different port is exposed
--publish 9095:9095
# deploy kafka3 in node with label kafka=3 (node5)
--constraint node.labels.kafka==3
# Ext listener listens on 9095
--override listeners=INT://:9092,EXT://0.0.0.0:9095
# Ext is advertised as node5 for external producer/consumer
--override advertised.listeners=INT://:9092,EXT://node5:9095
# broker id is 3
--override broker.id=3
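The five docker service create commands above can also be captured in a single stack file and deployed with docker stack deploy. A minimal sketch (the file name kafka-stack.yml is my own choice; image, ports, volumes and constraints mirror the commands above, and kafka2/kafka3 are elided since they follow the same pattern):

```yaml
version: "3.3"
services:
  zookeeper:
    image: kafka:latest
    command: /kafka/bin/zookeeper-server-start.sh /kafka/config/zookeeper.properties
    ports: ["2181:2181"]
    volumes: ["zoo-data:/tmp/zookeeper"]
    networks: [kafka-net]
    deploy:
      mode: global
      placement:
        constraints: [node.labels.zoo == 1]
  kafka1:
    image: kafka:latest
    command: >
      /kafka/bin/kafka-server-start.sh /kafka/config/server.properties
      --override listeners=INT://:9092,EXT://0.0.0.0:9093
      --override listener.security.protocol.map=INT:PLAINTEXT,EXT:PLAINTEXT
      --override inter.broker.listener.name=INT
      --override advertised.listeners=INT://:9092,EXT://node3:9093
      --override zookeeper.connect=zookeeper:2181
      --override broker.id=1
    ports: ["9093:9093"]
    volumes: ["k1-logs:/tmp/kafka-logs"]
    networks: [kafka-net]
    deploy:
      mode: global
      placement:
        constraints: [node.labels.kafka == 1]
  # kafka2 and kafka3 follow the same pattern with their own
  # ports (9094/9095), volumes, node labels and broker ids
volumes:
  zoo-data:
  k1-logs:
networks:
  kafka-net:
    driver: overlay
```

Deploy it with docker stack deploy -c kafka-stack.yml kafka on the manager; the stack creates the overlay network itself, so the manual docker network create step is not needed in this variant.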

Testing it out

  • Step 1: Create topic test from node1
node1>  bin/kafka-topics.sh  \
--zookeeper node3:2181 \
--create \
--replication-factor 1 \
--partitions 1 \
--topic test

Since we have published port 2181, we can point the command at node3:2181.

  • Step 2: Produce messages from node1
node1 > bin/kafka-console-producer.sh \
--broker-list node3:9093 \
--topic test

The producer on node1 should now be able to send messages to the Kafka cluster.

  • Step 3: Consume messages from node1
node1 > bin/kafka-console-consumer.sh --bootstrap-server node3:9093 --topic test --from-beginning
node1 > bin/kafka-console-consumer.sh --bootstrap-server node4:9094 --topic test --from-beginning
node1 > bin/kafka-console-consumer.sh --bootstrap-server node5:9095 --topic test --from-beginning
