How to Deploy an Apache Pulsar Cluster in Kubernetes

No war in Ukraine. Stop Russia!

oleksii_y
4 min read · Apr 5, 2019

Preface.

Apache Pulsar is an open-source distributed pub-sub messaging system originally created at Yahoo and now part of the Apache Software Foundation.

Pulsar is a multi-tenant, high-performance solution for server-to-server messaging. It is composed of a set of brokers and bookies, along with a built-in Apache ZooKeeper for configuration and management. The bookies come from Apache BookKeeper, which provides storage for messages until they are consumed.

In this article, we will deploy an Apache Pulsar cluster in Kubernetes. I will use Minikube, but the process is the same for a production cluster. The cluster will consist of:

  • Multiple brokers to handle incoming messages from producers and dispatch them to consumers
  • Apache BookKeeper to support message persistence
  • Apache ZooKeeper to store the cluster configuration
  • Apache Pulsar Dashboard

Additionally, we will deploy Prometheus and Grafana for monitoring purposes.

Step 1. ZooKeeper

You must deploy ZooKeeper as the first Pulsar component, as it is a dependency for the others:

zookeeper_micro.yaml

For testing purposes, I used a single ZooKeeper replica, but in production you should increase this to 3 or 5 (an odd number, so the ensemble can keep a majority quorum):

replicas: 1

Feel free to raise the CPU and memory limits for ZooKeeper:

CPU and memory limits

$ kubectl apply -f zookeeper_micro.yaml

Wait until all ZooKeeper server pods (just one in this test setup) are up and have the Running status. You can check the status of the ZooKeeper pods at any time:

$ kubectl get pods -l app=zk

This step may take several minutes, as Kubernetes needs to download the Docker image on the VMs.
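
Instead of re-running the status command by hand, you can block until the pods report Ready. This is a minimal sketch, assuming the app=zk label from the command above and a kubectl version that ships `kubectl wait`. The helper name wait_for_pods and the 300-second default timeout are my own choices for illustration, not something taken from the manifests:

```shell
# Block until every pod matching a label selector is Ready.
# wait_for_pods is a hypothetical helper; the default timeout is an assumption.
wait_for_pods() {
  selector="$1"
  timeout_seconds="${2:-300}"
  kubectl wait --for=condition=Ready pod -l "$selector" --timeout="${timeout_seconds}s"
}
```

Call it as `wait_for_pods app=zk`. Keep in mind that `kubectl wait` fails right away if no matching pods exist yet, so run it a few seconds after the apply.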

Step 2. Initialize cluster metadata

Once ZooKeeper is running, you need to initialize the metadata for the Pulsar cluster in ZooKeeper. This includes system metadata for BookKeeper and Pulsar more broadly. There is a Kubernetes job for this in the cluster-metadata.yaml file:

cluster-metadata.yaml

You only need to run it once:

$ kubectl apply -f cluster-metadata.yaml
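
Because the later components rely on this metadata, it can help to confirm that the job actually finished. The sketch below is one possible way to do that. The job name is not shown in this article, so it is passed in as an argument (use whatever metadata.name is set to in cluster-metadata.yaml), and the helper name and the 120-second default are assumptions of mine:

```shell
# Wait for a Kubernetes Job to reach the Complete condition.
# wait_for_job is a hypothetical helper; the job name must come from
# metadata.name in cluster-metadata.yaml, which is not shown here.
wait_for_job() {
  job_name="$1"
  timeout_seconds="${2:-120}"
  kubectl wait --for=condition=complete "job/${job_name}" --timeout="${timeout_seconds}s"
}
```

If the wait times out, `kubectl logs job/<job-name>` is usually the quickest way to see what went wrong.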

Step 3. Deploy the Bookies

Apache BookKeeper is a scalable, low-latency persistent log storage service that Pulsar uses to store data. We will deploy the bookies as a Kubernetes DaemonSet:

$ kubectl apply -f bookie.yaml

Output:
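
To check the bookies yourself: a DaemonSet schedules one pod per eligible node, so a quick health check is to compare the number of ready pods with the number of desired pods. The sketch below reads both values from the DaemonSet status. The DaemonSet name depends on bookie.yaml, so it is an argument here, and the helper name is hypothetical:

```shell
# Print "<ready>/<desired>" pod counts for a DaemonSet.
# daemonset_readiness is a hypothetical helper; pass the DaemonSet name
# defined in bookie.yaml (not shown in this article).
daemonset_readiness() {
  daemonset_name="$1"
  kubectl get daemonset "$daemonset_name" \
    -o jsonpath='{.status.numberReady}/{.status.desiredNumberScheduled}'
}
```

On a single-node Minikube I would expect both numbers to be equal once the bookie pod has started.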

Step 4. Brokers Deployment

Config file for brokers deployment:

For testing purposes, I used 1 Pulsar broker; feel free to increase the number in production.

$ kubectl apply -f broker.yaml
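
Once the broker pod is running, a simple smoke test is to publish a message from inside it with the pulsar-client tool. This is only a sketch under several assumptions: the broker image ships bin/pulsar-client relative to its working directory (as the official Pulsar images do, as far as I know), the pod name is something you look up with `kubectl get pods`, and the topic name smoke-test is arbitrary. Depending on the Pulsar version, you may need to create the tenant and namespace before producing:

```shell
# Produce one test message from inside a broker pod.
# broker_smoke_test is a hypothetical helper; the pod name is an argument
# and the default topic name is an arbitrary choice.
broker_smoke_test() {
  broker_pod="$1"
  topic="${2:-smoke-test}"
  kubectl exec "$broker_pod" -- bin/pulsar-client produce "$topic" --messages "hello-pulsar"
}
```

A successful run means the broker accepted the message and handed it to the bookies for storage.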

Step 5. Pulsar Dashboard Deployment

The Pulsar dashboard is a web application that enables users to monitor current stats for all topics in tabular form. The dashboard is a data collector that polls stats from all the brokers in a Pulsar instance (across multiple clusters) and stores all the information in a PostgreSQL database. A Django web app is used to render the collected data.

Config file for Dashboard deployment:

Let’s deploy it:

$ kubectl apply -f pulsar-dashboard.yaml

Wait until the Pulsar Dashboard is deployed, then open it at http://minikube_ip:30005, where minikube_ip is the IP address of your Minikube VM.
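
If you do not remember the Minikube address, `minikube ip` prints it. A small sketch that builds the dashboard URL from it follows. The helper name is hypothetical, and the default port 30005 is the one used in the URL above:

```shell
# Build the dashboard URL from the Minikube IP and the NodePort.
# dashboard_url is a hypothetical helper; 30005 is the port from this article.
dashboard_url() {
  node_port="${1:-30005}"
  echo "http://$(minikube ip):${node_port}"
}
```

Running `dashboard_url` prints an address you can paste into the browser.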

Pulsar Dashboard

You can find all necessary files in my tutorial repo.
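
With those files in the current directory, the whole sequence from this article can be sketched as one function. Treat it as an outline rather than a finished script: the function name, the retry count, and the sleep interval are my own assumptions, and the only readiness check is the one for ZooKeeper (using the app=zk label from Step 1):

```shell
# Apply the manifests from this article in dependency order.
# deploy_pulsar is a hypothetical helper; retry limits are assumptions.
deploy_pulsar() {
  kubectl apply -f zookeeper_micro.yaml || return 1

  # Retry the wait, because the pods may not exist yet right after the apply.
  attempts=0
  until kubectl wait --for=condition=Ready pod -l app=zk --timeout=60s; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge 10 ]; then
      echo "ZooKeeper did not become Ready" >&2
      return 1
    fi
    sleep 5
  done

  # The remaining components all depend on ZooKeeper being available.
  for manifest in cluster-metadata.yaml bookie.yaml broker.yaml pulsar-dashboard.yaml; do
    kubectl apply -f "$manifest" || return 1
  done
}
```

The function stops at the first failing command, so a broken manifest does not leave later components applied on top of it.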

Thank you for reading!
