Deploying an Apache Pulsar Cluster for Production

Sherlock Xu
10 min read · Dec 11, 2022


In my previous blog posts, I introduced how to deploy Pulsar in containerized environments, like Kubernetes and K3s. In this article, I would like to demonstrate a more traditional way of deploying Pulsar in a distributed fashion on Linux, which can be used for simple production scenarios. Compared with an all-in-one Helm chart installation on Kubernetes, I believe this conventional approach helps those new to Pulsar better understand how the building blocks of a Pulsar cluster fit together.

Before I introduce the specific installation steps, let’s take a look at the breakdown of cluster components.

  1. Pulsar brokers: process and load balance messages from publishers, distribute messages to consumers, and interact with Pulsar’s configuration store. In cases where you cannot directly expose the IP addresses of brokers, you can also set up a proxy layer for them, which serves as a gateway between clients and brokers, just like my demos in other Pulsar installation blog posts.
  2. A BookKeeper cluster: contains storage servers, or bookies, that provide persistence for messages sent to Pulsar brokers. This design, which separates computing from storage, is one of the features that differentiate Pulsar from other messaging systems.
  3. A ZooKeeper cluster: stores the metadata and configuration of the Pulsar cluster and coordinates cluster tasks.

Pulsar’s binary release already contains ready-to-use BookKeeper and ZooKeeper packages. That said, you can also use an external BookKeeper/ZooKeeper cluster. In this tutorial, I will create a separate ZooKeeper cluster and deploy all three components across 3 nodes.

If you have enough hardware resources, you can deploy brokers, BookKeeper, and ZooKeeper on separate machines, for example 3 nodes for each of them (9 in total). This makes each component easier to manage and scale individually, though many organizations choose to deploy them all on the same nodes for lower network latency.

Machine configurations

My hardware configurations are shown below for your reference. You can also check the Pulsar documentation to see the hardware requirements in more detail.

All 3 machines are hosted on a cloud platform with an available public IP address assigned to each of them. This allows me to directly connect to them via ssh.

Prerequisites

Before we begin, let’s do some preparation to make the installation easier.

1. Disable the firewall on all 3 machines.

systemctl stop firewalld
systemctl disable firewalld # also prevent it from starting again on reboot

2. Add the IP addresses and hostnames of these three machines to their /etc/hosts files.

# Make sure you add your own machines' hostnames
172.16.0.251 node1
172.16.0.250 node2
172.16.0.249 node3

3. Install Java on all three nodes. Each machine needs to have Java 8 or Java 11 installed.

yum install -y java-11-openjdk

4. Download the Pulsar binary package. I will use Pulsar 2.9.3 as an example in this demo. You can choose your preferred Pulsar version.

wget https://archive.apache.org/dist/pulsar/pulsar-2.9.3/apache-pulsar-2.9.3-bin.tar.gz

5. Untar the package.

tar xvzf apache-pulsar-2.9.3-bin.tar.gz

Installing and configuring ZooKeeper

As mentioned above, ZooKeeper stores the metadata of the Pulsar cluster. We need to get our ZooKeeper cluster ready before configuring BookKeeper and Pulsar brokers. ZooKeeper relies on a majority quorum for leader election, so I suggest you run an odd number (2n + 1, n > 0) of ZooKeeper servers; an ensemble of 3 tolerates the loss of 1 server.

1. Download ZooKeeper on one of the machines.

wget https://dlcdn.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz

2. Untar it and create a ZooKeeper configuration file.

tar xvzf apache-zookeeper-3.8.0-bin.tar.gz
cd apache-zookeeper-3.8.0-bin/conf
cp zoo_sample.cfg zoo.cfg

3. Edit zoo.cfg to change the data directory, add the ZooKeeper server addresses, and set a service port for AdminServer. AdminServer is an embedded Jetty server that provides an HTTP interface to the four-letter-word commands. By default, it is enabled and listens on port 8080, which causes a conflict because Pulsar brokers also use port 8080. Besides moving AdminServer to a different port, there are other ways to solve this problem, which will be explained later.

vi zoo.cfg
# ZooKeeper data storage directory
dataDir=/demo/zookeeper

# ZooKeeper servers
server.1=172.16.0.251:2888:3888
server.2=172.16.0.250:2888:3888
server.3=172.16.0.249:2888:3888

# Set a service port for AdminServer
admin.serverPort=8090

4. Create ZooKeeper’s data directory and myid file. The myid file only contains a single line of the machine's ID. This ID is used to identify the server that corresponds to the given data directory. The ID must be unique within the ZooKeeper ensemble and should have a value between 1 and 255.

mkdir -p /demo/zookeeper
echo 1 > /demo/zookeeper/myid # Change "1" to "2" and "3" when you run this command on the other two machines.
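Since the myid value must line up with the server.N entries in zoo.cfg, you can also derive it from the hostname instead of editing it by hand on each node. A small sketch, assuming the node1/node2/node3 naming used above; it writes to a scratch directory here for illustration, while a real node would use $(hostname -s) and /demo/zookeeper:

```shell
# Derive the ZooKeeper server ID from a nodeN-style hostname.
HOST=node2                   # on a real node: HOST=$(hostname -s)
DATA_DIR=./zookeeper-demo    # on a real node: DATA_DIR=/demo/zookeeper
ID=${HOST#node}              # node2 -> 2
mkdir -p "$DATA_DIR"
echo "$ID" > "$DATA_DIR/myid"
cat "$DATA_DIR/myid"         # prints 2
```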

5. Perform the above 4 steps for the other 2 nodes.

6. In the apache-zookeeper-3.8.0-bin directory, run the following command on all 3 nodes to start the ZooKeeper service.

./bin/zkServer.sh start

7. Expected output:

/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /demo/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

8. Check the status of ZooKeeper on all 3 nodes. As you can see, node2 is the leader in my ZooKeeper cluster.

[sherlock@node1 apache-zookeeper-3.8.0-bin]# ./bin/zkServer.sh status
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /demo/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
[sherlock@node2 apache-zookeeper-3.8.0-bin]# ./bin/zkServer.sh status
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /demo/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: leader
[sherlock@node3 apache-zookeeper-3.8.0-bin]# ./bin/zkServer.sh status
/usr/bin/java
ZooKeeper JMX enabled by default
Using config: /demo/apache-zookeeper-3.8.0-bin/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost. Client SSL: false.
Mode: follower
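For production, you will likely want ZooKeeper to come back after a reboot rather than being launched manually with zkServer.sh. A minimal systemd unit sketch, assuming ZooKeeper was untarred to /demo/apache-zookeeper-3.8.0-bin; the unit name, paths, and user are hypothetical, so adjust them to your environment:

```ini
# /etc/systemd/system/zookeeper.service (hypothetical path and unit name)
[Unit]
Description=Apache ZooKeeper server
After=network.target

[Service]
# zkServer.sh start daemonizes, hence Type=forking
Type=forking
ExecStart=/demo/apache-zookeeper-3.8.0-bin/bin/zkServer.sh start
ExecStop=/demo/apache-zookeeper-3.8.0-bin/bin/zkServer.sh stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

You would then enable it with systemctl enable --now zookeeper on each node.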

Configuring BookKeeper

The Pulsar binary package contains a BookKeeper configuration file, which allows us to set all parameters related to BookKeeper.

1. Create a directory to store BookKeeper journals and ledgers.

mkdir -p /demo/bookkeeper

2. In the apache-pulsar-2.9.3 directory, run the following command to edit the BookKeeper configuration file.

vi conf/bookkeeper.conf

3. Change the following parameters in this file. There are many available configuration values you can set in this file. See the Pulsar documentation for details.

# The directory where BookKeeper outputs its write-ahead log (WAL).
journalDirectory=/demo/bookkeeper/journal

# A specific hostname or IP address that the bookie should use to advertise itself to clients. Change it to node2 and node3 for the other two machines.
advertisedAddress=node1

# The directory where BookKeeper outputs ledger snapshots.
ledgerDirectories=/demo/bookkeeper/ledgers

# A list of ZooKeeper nodes.
zkServers=node1:2181,node2:2181,node3:2181
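Before starting the bookie, it is worth confirming that the journal and ledger directories are writable by the user that will run Pulsar. A quick sketch using the demo paths from above:

```shell
# Create the BookKeeper directories configured above and confirm they
# are writable (run as the user that will start the bookie).
for d in /demo/bookkeeper/journal /demo/bookkeeper/ledgers; do
  mkdir -p "$d"
  if [ -w "$d" ]; then
    echo "$d is writable"
  else
    echo "$d is NOT writable"
  fi
done
```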

4. Save the file and configure the other two bookies in the same way.

Configuring Pulsar brokers

The Pulsar binary package contains a broker configuration file, which allows us to set all parameters related to brokers.

1. In the apache-pulsar-2.9.3 directory, run the following command to edit the broker configuration file.

vi conf/broker.conf

2. Change the following parameters in this file. There are many available configuration values you can set in this file. See the Pulsar documentation for details.

# A list of ZooKeeper nodes.
zookeeperServers=node1:2181,node2:2181,node3:2181

# The configuration store quorum.
configurationStoreServers=node1:2181,node2:2181,node3:2181

# The hostname or IP address the service advertises externally. Change it to node2 and node3 for the other two machines.
advertisedAddress=node1

# The name of the Pulsar cluster. It must be consistent with the one configured when you initialize the metadata.
clusterName=pulsar-demo

3. Save the file and configure the other two brokers in the same way.

Initializing the metadata

With all the configurations ready, we can initialize our cluster data. In the apache-pulsar-2.9.3 directory, run the following command on one of the ZooKeeper nodes:

./bin/pulsar initialize-cluster-metadata \
--cluster pulsar-demo \
--zookeeper node2:2181 \
--configuration-store node1:2181 \
--web-service-url http://node1:8080,node2:8080,node3:8080 \
--web-service-url-tls https://node1:8443,node2:8443,node3:8443 \
--broker-service-url pulsar://node1:6650,node2:6650,node3:6650 \
--broker-service-url-tls pulsar+ssl://node1:6651,node2:6651,node3:6651
  • cluster: The name of the Pulsar cluster.
  • zookeeper: The ZooKeeper cluster address. We only need to include one ZooKeeper machine.
  • configuration-store: We can configure a global ZooKeeper cluster when we have multiple Pulsar clusters (for example, in geo-replication scenarios). For a single Pulsar cluster, we only need to set one local ZooKeeper machine for it.
  • web-service-url: The web service URL for the cluster with a port. We can use this port to manage Pulsar, like creating and deleting Pulsar topics. We can configure one or multiple addresses for this field.
  • web-service-url-tls: The web service URL for the cluster with the TLS protocol enabled.
  • broker-service-url: The broker service URL with a port. Clients use this URL to access brokers in the cluster. We can configure one or multiple addresses for this field.
  • broker-service-url-tls: The broker service URL with the TLS protocol enabled.

As I mentioned above, we need to reserve port 8080 for Pulsar brokers to avoid the port conflict with ZooKeeper’s AdminServer. To solve this problem, you can try the following:

  1. Set admin.serverPort in the zoo.cfg file with a different port number, like admin.serverPort=8090, which is what I did for this demo.
  2. Add admin.enableServer=false in the zoo.cfg file to disable it.
  3. Change the web service URL port in this step, which I do not recommend. If you do, you will also need to change the port in the conf/client.conf file when you connect to the Pulsar cluster later.

I did not know about this problem when I tried to install Pulsar for the first time, so I got a Failed to bind to 0.0.0.0:8080 error when I started the Pulsar brokers. Hopefully, the solutions listed here can save you some time. I also suggest you use netstat -anp (or ss -tlnp) to check port usage before you initialize your cluster metadata.
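The port check mentioned above can be scripted so you see at a glance which of this demo's ports are already bound. A sketch using ss, which replaces netstat on most modern distributions:

```shell
# Check whether the ports used in this demo are already bound.
# 8080: broker web service, 6650: broker service, 2181: ZooKeeper
# client port, 8090: the relocated AdminServer.
for p in 8080 6650 2181 8090; do
  if ss -tln 2>/dev/null | grep -q ":$p "; then
    echo "port $p is already in use"
  else
    echo "port $p looks free"
  fi
done
```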

Expected output for metadata initialization:

2022-12-09T22:13:56,383+0800 [main] INFO org.apache.bookkeeper.stream.storage.impl.cluster.ZkClusterInitializer - Successfully initialized the stream cluster : 
num_storage_containers: 16

2022-12-09T22:13:56,384+0800 [Curator-Framework-0] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting
2022-12-09T22:13:56,505+0800 [main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x200922313ee0002 closed
2022-12-09T22:13:56,505+0800 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x200922313ee0002
2022-12-09T22:13:57,100+0800 [main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x200922313ee0000 closed
2022-12-09T22:13:57,100+0800 [main-EventThread] INFO org.apache.pulsar.metadata.impl.ZKSessionWatcher - Got ZK session watch event: WatchedEvent state:Closed type:None path:null
2022-12-09T22:13:57,102+0800 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x200922313ee0000
2022-12-09T22:13:57,208+0800 [main] INFO org.apache.zookeeper.ZooKeeper - Session: 0x10092230ff60000 closed
2022-12-09T22:13:57,209+0800 [main-EventThread] INFO org.apache.pulsar.metadata.impl.ZKSessionWatcher - Got ZK session watch event: WatchedEvent state:Closed type:None path:null
2022-12-09T22:13:57,209+0800 [main] INFO org.apache.pulsar.PulsarClusterMetadataSetup - Cluster metadata for 'pulsar-demo' setup correctly
2022-12-09T22:13:57,209+0800 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x10092230ff60000

Starting bookies and brokers

1. In the apache-pulsar-2.9.3 directory, use the following command to run all the bookies in the background.

./bin/pulsar-daemon start bookie

You can also run them in the foreground.

./bin/pulsar bookie

2. Verify that the BookKeeper cluster is working properly.

./bin/bookkeeper shell bookiesanity

3. The result may look like this:

2022-12-09T22:32:43,974+0800 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x200922313ee0004 closed
2022-12-09T22:32:43,974+0800 [main] INFO org.apache.bookkeeper.tools.cli.commands.bookie.SanityTestCommand - Bookie sanity test succeeded
2022-12-09T22:32:43,975+0800 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x200922313ee0004

4. Similarly, run all the brokers in the background.

./bin/pulsar-daemon start broker

You can also run them in the foreground.

./bin/pulsar broker

5. Check the broker list.

./bin/pulsar-admin brokers list pulsar-demo

6. Expected result:

"node1:8080"
"node2:8080"
"node3:8080"

Testing the Pulsar cluster

With the Pulsar cluster up and running, let’s create a consumer and a producer to do some basic tests.

1. In the apache-pulsar-2.9.3 directory, run the following command to edit the client configuration file.

vi conf/client.conf

2. Add the web service URL and the broker service URL (you can point both at the local broker). If you want to use a remote client to connect to the Pulsar cluster, use the public IP address of any of the Pulsar nodes for the following fields and add that hostname and IP address to the /etc/hosts file of your remote machine.

webServiceUrl=http://node3:8080
brokerServiceUrl=pulsar://node3:6650

3. In the apache-pulsar-2.9.3 directory of one of the nodes, run the following command to create a consumer.

./bin/pulsar-client consume demo -n 100 -s "pulsar-consumer"

4. Create a producer by running the following command on another node.

./bin/pulsar-client produce demo -n 10 -m "hello world"

5. Expected output on the consumer side:

2022-12-10T12:15:01,844+0800 [pulsar-client-io-1-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [demo][pulsar-consumer] Subscribing to topic on cnx [id: 0x0962d75f, L:/172.16.0.251:40102 - R:node1/172.16.0.251:6650], consumerId 0
2022-12-10T12:15:01,917+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConsumerImpl - [demo][pulsar-consumer] Subscribed to topic on node1/172.16.0.251:6650 -- consumer: 0
2022-12-10T12:15:15,871+0800 [pulsar-client-io-1-1] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
----- got message -----
key:[null], properties:[], content:hello world
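If you redirect the consumer's output to a file (for example by appending > consumer.log 2>&1 to the consume command), you can verify that the delivered count matches the producer's -n value. A sketch; the two-message consumer.log below is fabricated purely for illustration:

```shell
# Fabricate a tiny capture for illustration; on a real run, redirect
# the pulsar-client consume output into consumer.log instead.
printf -- '----- got message -----\nkey:[null], properties:[], content:hello world\n' >  consumer.log
printf -- '----- got message -----\nkey:[null], properties:[], content:hello world\n' >> consumer.log

# Count delivered messages; should equal the -n passed to the producer.
grep -c -- '----- got message -----' consumer.log   # prints 2
```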

6. Expected output on the producer side:

2022-12-10T12:15:15,766+0800 [pulsar-client-io-1-1] INFO  org.apache.pulsar.client.impl.ProducerStatsRecorderImpl - Pulsar client config: {"serviceUrl":"pulsar://node3:6650","authPluginClassName":null,"authParams":null,"authParamMap":null,"operationTimeoutMs":30000,"lookupTimeoutMs":30000,"statsIntervalSeconds":60,"numIoThreads":1,"numListenerThreads":1,"connectionsPerBroker":1,"useTcpNoDelay":true,"useTls":false,"tlsTrustCertsFilePath":"","tlsAllowInsecureConnection":false,"tlsHostnameVerificationEnable":false,"concurrentLookupRequest":5000,"maxLookupRequest":50000,"maxLookupRedirects":20,"maxNumberOfRejectedRequestPerConnection":50,"keepAliveIntervalSeconds":30,"connectionTimeoutMs":10000,"requestTimeoutMs":60000,"initialBackoffIntervalNanos":100000000,"maxBackoffIntervalNanos":60000000000,"enableBusyWait":false,"listenerName":null,"useKeyStoreTls":false,"sslProvider":null,"tlsTrustStoreType":"JKS","tlsTrustStorePath":"","tlsTrustStorePassword":"*****","tlsCiphers":[],"tlsProtocols":[],"memoryLimitBytes":0,"proxyServiceUrl":null,"proxyProtocol":null,"enableTransaction":false,"socks5ProxyAddress":null,"socks5ProxyUsername":null,"socks5ProxyPassword":null}
2022-12-10T12:15:15,779+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x13c00d2b, L:/172.16.0.249:47370 - R:node1/172.16.0.251:6650]] Connected to server
2022-12-10T12:15:15,781+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [demo] [null] Creating producer on cnx [id: 0x13c00d2b, L:/172.16.0.249:47370 - R:node1/172.16.0.251:6650]
2022-12-10T12:15:15,798+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [demo] [pulsar-demo-7-1] Created producer on cnx [id: 0x13c00d2b, L:/172.16.0.249:47370 - R:node1/172.16.0.251:6650]
2022-12-10T12:15:15,839+0800 [main] INFO com.scurrilous.circe.checksum.Crc32cIntChecksum - SSE4.2 CRC32C provider initialized
2022-12-10T12:15:15,991+0800 [main] INFO org.apache.pulsar.client.impl.PulsarClientImpl - Client closing. URL: pulsar://node3:6650
2022-12-10T12:15:15,999+0800 [main] INFO org.apache.pulsar.client.impl.ProducerStatsRecorderImpl - [demo] [pulsar-demo-7-1] Pending messages: 0 --- Publish throughput: 44.18 msg/s --- 0.00 Mbit/s --- Latency: med: 10.000 ms - 95pct: 45.000 ms - 99pct: 45.000 ms - 99.9pct: 45.000 ms - max: 45.000 ms --- Ack received rate: 44.18 ack/s --- Failed messages: 0 --- Pending messages: 0
2022-12-10T12:15:16,007+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ProducerImpl - [demo] [pulsar-demo-7-1] Closed Producer
2022-12-10T12:15:16,013+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x7196b5c1, L:/172.16.0.249:52134 ! R:node3/172.16.0.249:6650] Disconnected
2022-12-10T12:15:16,031+0800 [pulsar-client-io-1-1] INFO org.apache.pulsar.client.impl.ClientCnx - [id: 0x13c00d2b, L:/172.16.0.249:47370 ! R:node1/172.16.0.251:6650] Disconnected
2022-12-10T12:15:18,047+0800 [main] INFO org.apache.pulsar.client.cli.PulsarClientTool - 10 messages successfully produced

Conclusion

When you build a Pulsar cluster from the ground up, there are many nuts and bolts in the installation that you need to pay attention to. For production, if you are looking for a fully managed Pulsar solution that spares you all these configurations and details, you can try StreamNative Cloud.

Reference

Deploy a cluster on bare metal

Pulsar configuration
