How to deploy a Zookeeper and Kafka cluster in Google Cloud Platform
One of the great advantages of Google Cloud Platform is how easily and quickly you can run experiments. For example, you can spin up a Zookeeper and Kafka cluster in a matter of minutes with very little configuration.
For reference, you can find the sample properties files for each server in this repository: Cluster Config Examples
Let’s spin up 3 machines that will form our Zookeeper cluster. The command below can be run from the Google Cloud Shell or from your PC using the Google Cloud Platform CLI. All you need to substitute are the project name, the instance name (e.g. zook-1) and the service account email.
gcloud compute --project "[YOUR-PROJECT]" instances create "[INSTANCE-NAME]"\
--zone "europe-west1-c"\
--machine-type "n1-standard-1"\
--subnet "default"\
--maintenance-policy "MIGRATE"\
--service-account "[SERVICE-ACCOUNT-EMAIL]"\
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring.write","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append"\
--tags "http-server"\
--image "centos-7-v20170523"\
--image-project "centos-cloud"\
--boot-disk-size "10"\
--boot-disk-type "pd-standard"\
--boot-disk-device-name "[INSTANCE-NAME]"
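Since the only thing that changes between the three Zookeeper VMs is the instance name, the command can also be wrapped in a small loop. This is a sketch trimmed to the essential flags; the remaining flags from the full command above (maintenance policy, service account, scopes, tags) can be appended unchanged.

```shell
# Create zook-1, zook-2 and zook-3 in one go; substitute your own
# project name, as in the full command above
for i in 1 2 3; do
  gcloud compute --project "[YOUR-PROJECT]" instances create "zook-$i" \
    --zone "europe-west1-c" \
    --machine-type "n1-standard-1" \
    --subnet "default" \
    --image "centos-7-v20170523" \
    --image-project "centos-cloud" \
    --boot-disk-size "10" \
    --boot-disk-type "pd-standard" \
    --boot-disk-device-name "zook-$i"
done
```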
Once the VM is running, we can SSH into it and run the following commands to install some dependencies and download Kafka.
# Installing dependencies
sudo su
yum install -y wget nano java
# Creating the Zookeeper myid file; use a unique ID (1, 2 or 3) on each server
mkdir -p /tmp/zookeeper
echo "[ZOOKEEPER-ID]" > /tmp/zookeeper/myid
# Downloading and extracting Kafka
wget https://archive.apache.org/dist/kafka/2.7.0/kafka_2.12-2.7.0.tgz
tar -xvzf kafka_2.12-2.7.0.tgz
cd kafka_2.12-2.7.0
Modify the Zookeeper properties to include the details of all the instances.
nano config/zookeeper.properties
# Directory where snapshots (and the myid file created above) are stored
dataDir=/tmp/zookeeper
# Port on which clients (the Kafka brokers) will connect
clientPort=2181
# In milliseconds
tickTime=2000
# In ticks
initLimit=10
syncLimit=5
maxClientCnxns=30
# Every Zookeeper server needs to be aware of the other Zookeepers in the cluster
server.1=zook-1:2888:3888
server.2=zook-2:2888:3888
server.3=zook-3:2888:3888
Once we’ve updated the properties on all three Zookeeper servers, we can start them.
bin/zookeeper-server-start.sh config/zookeeper.properties
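To confirm the ensemble has formed, you can query each server with Zookeeper’s four-letter-word commands. A sketch: srvr is whitelisted by default on the Zookeeper 3.5.x bundled with Kafka 2.7; other four-letter commands may need to be enabled via 4lw.commands.whitelist in zookeeper.properties.

```shell
# Ask each Zookeeper server for its status; one should report
# "Mode: leader" and the other two "Mode: follower"
for host in zook-1 zook-2 zook-3; do
  echo "--- $host ---"
  echo srvr | nc "$host" 2181 | grep Mode
done
```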
Kafka
Let’s spin up 3 machines that will form our Kafka cluster. All you need to substitute are the project name, the instance name (e.g. kafka-1), the host name and the broker ID.
gcloud compute --project "[YOUR-PROJECT]" instances create "[INSTANCE-NAME]"\
--zone "europe-west1-c"\
--machine-type "n1-standard-1"\
--subnet "default"\
--maintenance-policy "MIGRATE"\
--service-account "[SERVICE-ACCOUNT-EMAIL]"\
--scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring.write","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append"\
--tags "http-server"\
--image "centos-7-v20170523"\
--image-project "centos-cloud"\
--boot-disk-size "10"\
--boot-disk-type "pd-standard"\
--boot-disk-device-name "[INSTANCE-NAME]"
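Before moving on, it’s worth checking that all six VMs are up. This is an optional sanity check; the name~ filter performs a regular-expression match on the instance name.

```shell
# List the Zookeeper and Kafka instances with their internal/external IPs
gcloud compute instances list --filter='name~^(zook|kafka)'
```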
Let’s SSH into the VM and run the following commands to install some dependencies and download Kafka.
# Installing dependencies
sudo su
yum install -y wget nano java
# Downloading and extracting Kafka
wget https://archive.apache.org/dist/kafka/2.7.0/kafka_2.12-2.7.0.tgz
tar -xvzf kafka_2.12-2.7.0.tgz
cd kafka_2.12-2.7.0
Now, update the Kafka server properties with the values below:
nano config/server.properties
# The broker id needs to be unique for each Kafka server
broker.id=[BROKER-ID]
# These are the Zookeeper servers created earlier
zookeeper.connect=zook-1:2181,zook-2:2181,zook-3:2181
# Examples of host names: kafka-1, kafka-2, kafka-3.
# Note: host.name is deprecated; on newer Kafka versions prefer
# listeners=PLAINTEXT://[HOSTNAME]:9092
host.name=[HOSTNAME]
Once we’ve updated the properties on all three Kafka servers, we can start them.
bin/kafka-server-start.sh config/server.properties
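To verify the cluster end to end, you can create a replicated topic and pass a message through it. A sketch: the topic name smoke-test is arbitrary, and the tools below ship in the bin directory of the Kafka 2.7 distribution.

```shell
# Create a topic replicated across all three brokers
bin/kafka-topics.sh --create --bootstrap-server kafka-1:9092 \
  --replication-factor 3 --partitions 3 --topic smoke-test

# Check that every partition has a leader and three in-sync replicas
bin/kafka-topics.sh --describe --bootstrap-server kafka-1:9092 --topic smoke-test

# Produce a message ...
echo "hello" | bin/kafka-console-producer.sh \
  --bootstrap-server kafka-1:9092 --topic smoke-test

# ... and read it back through a different broker
bin/kafka-console-consumer.sh --bootstrap-server kafka-2:9092 \
  --topic smoke-test --from-beginning --max-messages 1
```

Reading the message back through a different broker than the one it was produced to confirms that replication between the brokers is working.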