Kafka/Zookeeper cluster on kubernetes
Some background
Kafka is a fast, scalable, durable, and fault-tolerant publish-subscribe messaging system which can be used for real-time data streaming. Kafka can be described as a distributed commit log that follows the publish-subscribe architecture. Zookeeper is a configuration management tool for distributed systems. Kafka uses Zookeeper to manage controller election, cluster membership, topic configuration, quotas, access control, etc.
Kafka and Zookeeper services can run in Docker containers. You can find more information about running single-node Kafka and Zookeeper with Docker in the first article listed under Reference. In this article I will explain how to deploy a multi-node Kafka/Zookeeper cluster with Docker and Kubernetes. All the source code related to this article is published on GitLab.
Kubernetes setup
I have deployed a two-node Kubernetes cluster on AWS. It contains one master node (IP 172.31.25.198) and one agent node (IP 172.31.26.5). The master node itself acts as an agent node, so I can deploy pods on it. I’m gonna set up a two-node Kafka cluster with a two-node Zookeeper cluster on this setup.
Deploy zookeeper cluster
Kafka requires Zookeeper as its coordination service. First I need to deploy a two-node Zookeeper cluster.
Zookeeper deployment
I’m gonna create Zookeeper deployments to deploy two Zookeeper containers/pods, zookeeper1 and zookeeper2. Both are defined in zk-deployment.yaml.
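As a rough illustration of what such a manifest might contain, the shell sketch below prints a Deployment for one Zookeeper node. The pod names, ports, and ZOOKEEPER_SERVER_* variables follow the description in this section. The image name example/zookeeper, the app label, and the helper name zk_deployment are placeholders made up for illustration, and whatever setting gives each node its own id is image-specific and left out.

```shell
# Print a Deployment manifest for Zookeeper node N (1 or 2).
# ASSUMPTIONS: "example/zookeeper" is a placeholder image, not the image
# used in this article; the "app" label is an assumed naming choice.
zk_deployment() {
  n="$1"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zookeeper${n}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: zookeeper${n}
  template:
    metadata:
      labels:
        app: zookeeper${n}
    spec:
      containers:
      - name: zookeeper${n}
        image: example/zookeeper
        ports:
        - containerPort: 2181   # client port
        - containerPort: 2888   # inter-node port
        - containerPort: 3888   # inter-node port
        env:
        - name: ZOOKEEPER_SERVER_1
          value: zoo1
        - name: ZOOKEEPER_SERVER_2
          value: zoo2
EOF
}
```

Calling zk_deployment 1 and zk_deployment 2, separated by a line containing only ---, would give one file holding both deployments.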
These containers expose ports 2181, 2888, and 3888. Clients connect to Zookeeper on port 2181. Nodes use ports 2888 and 3888 to connect with the other nodes in the cluster. The ZOOKEEPER_SERVER_1 and ZOOKEEPER_SERVER_2 environment variables define the hosts of the Zookeeper nodes in the cluster. These variables are set to zoo1 and zoo2, which are the names of the Kubernetes services. (The next section discusses how to create these two services.) Now I can create the deployment with kubectl.
# create deployment
kubectl create -f zk-deployment.yaml

# view pods
kubectl get pods
It should deploy two zookeeper containers/pods.
Zookeeper service
Then I need to set up two Kubernetes services, zoo1 and zoo2, for my two Zookeeper containers. Both are defined in zk-service.yaml.
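A service of this kind could look roughly like the output of the sketch below, which prints a Service for one Zookeeper node. The service names and ports are taken from this section. The selector assumes the pods carry a label such as app: zookeeper1, and that label, the port names, and the helper name zk_service are illustrative assumptions rather than the contents of the actual file.

```shell
# Print a Service manifest for Zookeeper node N (1 or 2).
# ASSUMPTION: the matching pod is labelled app=zookeeperN; the real
# manifest may select its pod differently.
zk_service() {
  n="$1"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: zoo${n}
spec:
  selector:
    app: zookeeper${n}
  ports:
  - name: client
    port: 2181
  - name: peer
    port: 2888
  - name: election
    port: 3888
EOF
}
```

Kubernetes requires each port to be named when a service exposes more than one, which is why the three ports carry names here.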
These services expose ports 2181, 2888, and 3888. Finally I’m assigning the zookeeper1 pod to the zoo1 service and the zookeeper2 pod to the zoo2 service. Now I can create the services with kubectl.
# create service
kubectl create -f zk-service.yaml

# view services
kubectl get services
It should deploy two services, zoo1 and zoo2.
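One way to confirm the pod-to-service assignment described above is to look at the endpoints behind each service. The helper below is only a convenience wrapper around kubectl get endpoints, and its name is made up for illustration.

```shell
# List the endpoints behind the given services. A service whose selector
# matches no running pod will show no pod addresses in the output.
check_endpoints() {
  kubectl get endpoints "$@"
}
```

For this cluster it would be called as check_endpoints zoo1 zoo2.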
Deploy kafka cluster
Now I have deployed a two-node Zookeeper cluster. The next step is to set up a two-node Kafka cluster that connects to the previously deployed Zookeeper cluster.
Kafka deployment
First I need to create Kafka deployments to deploy two Kafka broker containers/pods, kafka1 and kafka2. Both are defined in kafka-deployment.yaml.
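To make the shape of such a manifest concrete, the sketch below prints a Deployment for one Kafka broker. The pod names, port 9092, and the two environment variables come from this section. The image name example/kafka, the app label, and the helper name kafka_deployment are placeholders made up for illustration. Settings such as a distinct broker id per broker, or pinning each pod to the node whose IP it advertises, are not shown.

```shell
# Print a Deployment manifest for Kafka broker N, advertising HOST_IP.
# ASSUMPTIONS: "example/kafka" is a placeholder image, not the image used
# in this article; the "app" label is an assumed naming choice.
kafka_deployment() {
  n="$1"
  host_ip="$2"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka${n}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka${n}
  template:
    metadata:
      labels:
        app: kafka${n}
    spec:
      containers:
      - name: kafka${n}
        image: example/kafka
        ports:
        - containerPort: 9092   # client port
        env:
        - name: KAFKA_ADVERTISED_HOST_NAME
          value: "${host_ip}"
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zoo1:2181,zoo2:2181
EOF
}
```

For example, kafka_deployment 1 172.31.25.198 prints a manifest for the broker that advertises the master node's IP address.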
The Kafka container exposes port 9092 for clients. It uses the KAFKA_ADVERTISED_HOST_NAME environment variable to define the IP address on which the Kafka broker is running. External clients (external to the Kubernetes cluster) can connect to Kafka via this IP address. This value is set to the IP address of the host on which the Kafka broker pod is running (more info in the section below). Finally, it uses KAFKA_ZOOKEEPER_CONNECT to define the endpoints of the Zookeeper cluster (these endpoints are exposed with Kubernetes services). Now I can create the deployment with kubectl.
# create deployment
kubectl create -f kafka-deployment.yaml

# view pods
kubectl get pods
It should deploy two kafka containers/pods.
Kafka service
Then I need to set up two Kafka services, kaf1 and kaf2, and assign the Kafka deployments to them. Both are defined in kafka-service.yaml.
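A service that publishes a broker on a node's IP address could be sketched as follows. The service names, port 9092, and the externalIPs field come from this section. The selector assumes the broker pod is labelled app: kafka1 or app: kafka2, and that label, the port name, and the helper name kafka_service are illustrative assumptions.

```shell
# Print a Service manifest for Kafka broker N, reachable on EXTERNAL_IP.
# ASSUMPTION: the matching pod is labelled app=kafkaN; the real manifest
# may select its pod differently.
kafka_service() {
  n="$1"
  external_ip="$2"
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: kaf${n}
spec:
  selector:
    app: kafka${n}
  ports:
  - name: client
    port: 9092
  externalIPs:
  - ${external_ip}
EOF
}
```

For example, kafka_service 1 172.31.25.198 prints the service that exposes the first broker on the master node's address.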
These services expose port 9092, which is the client port. Most importantly, they use the externalIPs field to define the external IP addresses of the services. These IP addresses are set to the IP addresses of the Kubernetes nodes. For example, 172.31.25.198 is the Kubernetes master node’s IP address (the kaf1 service is running on the master node). By using these IP addresses, external clients can connect to the Kafka cluster running on Kubernetes. Finally I’m assigning the kafka1 pod to the kaf1 service and the kafka2 pod to the kaf2 service. These services can be deployed with kubectl.
# create services
kubectl create -f kafka-service.yaml

# view services
kubectl get services
It should create two services.
Test with kafkacat
Now everything is set up. I can create Kafka topics and work with them. To test Kafka I’m using the Kafka command line client kafkacat. With kafkacat I can do various operations on Kafka, for example list topics, create topics, create publishers/consumers, etc.
Install kafkacat
You can install kafkacat with apt on Linux or brew on macOS. It may be available for other distributions as well. In this example I’m using it on Linux.
# linux
sudo apt-get install kafkacat

# macos
brew install kafkacat
List topics
In our example, I can connect to one of the Kafka brokers on 172.31.25.198 or 172.31.26.5 and list the available topics.
# command
kafkacat -L -b <kafka broker host>:<kafka broker port>

# example
kafkacat -L -b 172.31.25.198:9092
kafkacat -L -b 172.31.26.5:9092
In this example I’m connecting to the Kafka broker on 172.31.25.198.
Create publisher
I’m gonna create a publisher for a topic called senz and publish messages to that topic. If the given topic does not exist, the command below will automatically create a topic with the given name, provided the broker allows automatic topic creation (the Kafka default).
# command
kafkacat -P -b <kafka broker host>:<kafka broker port> -t <topic>

# example
kafkacat -P -b 172.31.25.198:9092 -t senz
Here the publisher connects to the Kafka broker on the 172.31.25.198 node.
Create consumer
Here I’m creating a consumer for the topic senz from the Kafka broker on the 172.31.25.198 node.
# command
kafkacat -C -b <kafka broker host>:<kafka broker port> -t <topic>

# example
kafkacat -C -b 172.31.25.198:9092 -t senz
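For a quick non-interactive check, the publisher and consumer commands above can be wrapped in two small shell helpers. This is only a sketch: the function names are made up for illustration, and it relies on kafkacat reading one message per line from standard input in producer mode and on its -o beginning and -e options in consumer mode.

```shell
# Publish a single message.
# Arguments: broker (host:port), topic, message text.
publish_message() {
  printf '%s\n' "$3" | kafkacat -P -b "$1" -t "$2"
}

# Read a topic from its earliest offset and exit once the last message
# has been received, instead of waiting for new messages.
# Arguments: broker (host:port), topic.
consume_from_start() {
  kafkacat -C -b "$1" -t "$2" -o beginning -e
}
```

Running publish_message 172.31.25.198:9092 senz hello and then consume_from_start 172.31.26.5:9092 senz should print the message back through the other broker if both brokers have joined the same cluster.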
Reference
- https://medium.com/@itseranga/kafka-and-zookeeper-with-docker-65cff2c2c34f
- http://www.defuze.org/archives/351-running-a-zookeeper-and-kafka-cluster-with-kubernetes-on-aws.html
- https://www.admintome.com/blog/ultimate-guide-to-installing-kafka-docker-on-kubernetes/
- https://better-coding.com/building-apache-kafka-cluster-using-docker-compose-and-virtualbox/