Zookeeper Setup with Kafka Cluster: Part 1

The theory behind using Zookeeper with Kafka Cluster

Sara M.
Mar 20, 2024
Architecture using Zookeeper

What is Zookeeper?

Zookeeper is a pillar for so many distributed applications.

It provides several features:

  • Distributed configuration management: it can hold the configuration of distributed systems in one place.
  • Election and consensus: if servers ask who the leader is, Zookeeper will answer.
  • Coordination and locks.
  • In Kafka’s case, it acts as a key-value store, so it can hold configuration for topics and brokers...

Zookeeper is not only used with Kafka, but can be used with Hadoop and other big data systems.

Zookeeper is like a file system: it has an internal structure like a tree.

We all know what a file system looks like.

Tree structure, photo by Author

The root is at the very top, then come the branches, and at the bottom are the leaves, or nodes.

  • Each node is called a zNode.
  • Each zNode has a path.
  • Each zNode can be persistent or ephemeral. A persistent zNode stays alive until it is explicitly deleted; Zookeeper remembers it even when clients disconnect. An ephemeral zNode goes away when the app that created it disconnects.
  • Each zNode can hold multiple child zNodes.
  • You cannot rename a zNode.
  • Each zNode can be watched for changes. If a change occurs on a watched zNode, Zookeeper will let you know.
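To make these properties concrete, here is a minimal in-memory sketch of such a tree in Python. It is a toy model, not Zookeeper's client API: the class `ToyTree`, its methods, the session name `s1` and the address `host-a:9092` are all made up for illustration. Watches are modeled as firing once and then being dropped.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Watcher = Callable[[str, str], None]  # called with (path, event)


@dataclass
class ZNode:
    """One node of the toy tree: payload, lifetime flag, children, watchers."""
    data: bytes = b""
    ephemeral: bool = False
    owner: Optional[str] = None  # session that created an ephemeral zNode
    children: Dict[str, "ZNode"] = field(default_factory=dict)
    watchers: List[Watcher] = field(default_factory=list)


class ToyTree:
    """Hypothetical stand-in for a Zookeeper server, kept in memory."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _find(self, path: str) -> ZNode:
        node = self.root
        for part in filter(None, path.split("/")):
            node = node.children[part]  # KeyError if the path does not exist
        return node

    def _fire(self, node: ZNode, path: str, event: str) -> None:
        # Watches in this sketch are one-shot: clear them before notifying.
        watchers, node.watchers = node.watchers, []
        for watcher in watchers:
            watcher(path, event)

    def create(self, path: str, data: bytes = b"", ephemeral: bool = False,
               session: Optional[str] = None) -> None:
        parent_path, _, name = path.rpartition("/")
        parent = self._find(parent_path)
        if parent.ephemeral:
            raise ValueError("ephemeral zNodes cannot have children")
        if name in parent.children:
            raise ValueError(f"{path} already exists")
        parent.children[name] = ZNode(data, ephemeral, session if ephemeral else None)

    def set(self, path: str, data: bytes) -> None:
        node = self._find(path)
        node.data = data
        self._fire(node, path, "changed")

    def watch(self, path: str, watcher: Watcher) -> None:
        self._find(path).watchers.append(watcher)

    def close_session(self, session: str) -> None:
        """Remove every ephemeral zNode owned by the disconnecting session."""
        self._drop(self.root, "", session)

    def _drop(self, node: ZNode, prefix: str, session: str) -> None:
        for name in list(node.children):
            child = node.children[name]
            path = f"{prefix}/{name}"
            if child.ephemeral and child.owner == session:
                del node.children[name]
                self._fire(child, path, "deleted")
            else:
                self._drop(child, path, session)


tree = ToyTree()
tree.create("/brokers")                      # persistent
tree.create("/brokers/ids")                  # persistent
tree.create("/brokers/ids/1", b"host-a:9092", ephemeral=True, session="s1")

events = []
tree.watch("/brokers/ids/1", lambda path, event: events.append((path, event)))
tree.close_session("s1")                     # the app disconnects
print(events)
```

When session `s1` closes, the ephemeral zNode disappears and the watcher is told about it, while the persistent parents stay in place.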

What is the role of Zookeeper?

Let’s see in detail what role Zookeeper plays in Kafka.

  • One of the most important things it does is broker registration, with a heartbeat mechanism to keep the list of brokers current.

Heartbeats are very important when you have server to server communications.

If a broker gets disconnected, Zookeeper will remove it because it hasn’t sent a heartbeat.

A heartbeat is a small message that servers send each other regularly to say that they are alive.

  • It also maintains the list of topics, so all the topics in Kafka are configured in Zookeeper.

So it has their configuration, which includes the number of partitions, the replication factor and additional configuration.

But it also has the list of ISR, which is the list of in-sync replicas for each partition.

  • Zookeeper will be used to perform leader elections.

Zookeeper holds a vote among its servers and elects a new leader.

Leader election is very important: the faster it is done, the faster your cluster will be up and running again when a server goes down.

  • Zookeeper will be used to store the Kafka cluster ID (randomly generated at the first startup of the cluster).
  • Zookeeper will be used to store the ACLs if security is enabled.

Access control lists, or ACLs, can apply to topics, consumer groups and users.

  • Zookeeper will be used to store quota configuration if enabled.
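The broker registration and heartbeat mechanism described in the first point can be sketched in a few lines of Python. This is only an illustration of the idea, not Zookeeper's session protocol: the class `BrokerRegistry`, the 6-second timeout and the broker ids are assumptions, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, List


class BrokerRegistry:
    """Hypothetical registry that forgets brokers whose heartbeats stop."""

    def __init__(self, session_timeout: float) -> None:
        self.session_timeout = session_timeout
        self._last_seen: Dict[int, float] = {}

    def heartbeat(self, broker_id: int, now: float) -> None:
        # The first heartbeat registers the broker; later ones refresh it.
        self._last_seen[broker_id] = now

    def expire(self, now: float) -> List[int]:
        """Drop brokers that have been silent for longer than the timeout."""
        dead = [broker for broker, seen in self._last_seen.items()
                if now - seen > self.session_timeout]
        for broker in dead:
            del self._last_seen[broker]
        return dead

    def live_brokers(self) -> List[int]:
        return sorted(self._last_seen)


registry = BrokerRegistry(session_timeout=6.0)  # assumed timeout, in seconds
registry.heartbeat(1, now=0.0)
registry.heartbeat(2, now=0.0)
registry.heartbeat(1, now=5.0)       # broker 1 keeps sending heartbeats
removed = registry.expire(now=8.0)   # broker 2 has been silent for 8 seconds
print(removed, registry.live_brokers())
```

Broker 2 stopped sending heartbeats, so it is removed from the list, while broker 1 stays registered.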

Zookeeper Quorums

Zookeeper needs a strict majority of its servers to be up. A quorum is therefore sized at 2N+1 servers, of which N can go down.

If you have one Zookeeper server in your quorum, zero servers can go down. So if you have one and it goes down, the quorum is broken.

If you have three servers, only one can go down.

If you have five servers, two can go down.
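The arithmetic behind these numbers is short enough to check in Python. The function names below are made up for this sketch; the only rule it encodes is the strict majority described above.

```python
def majority(quorum_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return quorum_size // 2 + 1


def tolerated_failures(quorum_size: int) -> int:
    """How many servers can go down while a strict majority stays up."""
    return quorum_size - majority(quorum_size)


for size in (1, 3, 4, 5):
    print(f"{size} servers: majority={majority(size)}, "
          f"can lose {tolerated_failures(size)}")
```

Note that four servers tolerate only one failure, the same as three, which is why quorums are sized with an odd number of servers.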

Let’s compare quorums of 1, 3 and 5 zookeepers to understand which cases each one suits.

1 zookeeper

So let’s talk about the easiest, which has one zookeeper in the quorum.

1 zookeeper, photo by Author

With 1 zookeeper, the setup is very easy to put in place, with no distribution to worry about.

But it is clearly not resilient at all!

If that zookeeper crashes, your Kafka cluster goes down with it.

So a single Zookeeper instance is really good for development purposes, but not good at all for production deployments.

3 zookeepers

3 zookeeper, photo by Author

With 3 zookeepers, one can go down, as explained before.

This quorum setup is preferred for small Kafka deployments.

A lot of deployments have only three Zookeeper servers, and that’s more than enough.

Now let’s talk about five.

5 zookeepers

5 zookeeper, photo by Author

As you can see, it becomes more complicated. Each zookeeper needs to talk to all the others. This setup is very useful for big Kafka deployments. In this case, two servers can go down at the same time.

Big companies like LinkedIn and Netflix will have at least five Zookeeper servers. This allows two servers to go down, which gives a little more flexibility in production.

But on the other hand, you need very performant machines.

There is a lot of networking happening and Zookeeper is very sensitive to latency, so you need good structural decisions about the kind of machines and network you’re going to use.

Zookeeper Configuration

Zookeeper configuration and optimization can be really tricky. They depend on many characteristics of the network and the cluster used, but there are some defaults we’re going to use.

  • First of all, Zookeeper needs to store data, so you need to indicate a data directory with “dataDir”.
  • It listens on a client port, set with “clientPort”, which is usually 2181.
  • “maxClientCnxns” caps the number of concurrent connections from a single client (identified by IP address). We will set it to zero, which means unlimited. You could enforce a limit by setting a maximum number of connections if you want to.
  • “tickTime” is the basic time unit, in milliseconds, used for heartbeats; 2000 (2 sec) is a good number.
  • “initLimit” is how many ticks followers can take to initialize and synchronize with the leader. So if initLimit is 10, 20 seconds (10*2s) can be used for the initial synchronization. If the 20 seconds run out, it will fail.
  • “syncLimit” is how many ticks can pass between sending a request and getting an acknowledgement.
  • “server.(n)” entries define all your Zookeeper servers.

These configs are really important because they define Zookeeper’s timeouts, and they show how much a well-performing Zookeeper depends on low network latency and performant machines.

The next step is to set up a Zookeeper and Kafka cluster in AWS to see in depth everything we have learnt.
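As a rough sketch, the settings above can be put together and turned into timeouts with a few lines of Python. Several values are assumptions rather than something fixed by this article: the data directory `/data/zookeeper`, the host names `zookeeper1` to `zookeeper3`, a `syncLimit` of 5, and the ports 2888 and 3888, which are the ones conventionally used between Zookeeper servers. The helper names are made up too.

```python
from typing import Dict, List

TICK_TIME_MS = 2000   # one tick = 2 seconds
INIT_LIMIT = 10       # ticks allowed for the initial synchronization
SYNC_LIMIT = 5        # assumed value: ticks allowed between request and ack


def limit_in_seconds(ticks: int, tick_time_ms: int = TICK_TIME_MS) -> float:
    """Convert a limit expressed in ticks into seconds."""
    return ticks * tick_time_ms / 1000


def render_zoo_cfg(settings: Dict[str, object], servers: List[str]) -> str:
    """Render key=value lines plus one server.N entry per quorum member."""
    lines = [f"{key}={value}" for key, value in settings.items()]
    # 2888 (server to server) and 3888 (leader election) are assumed ports.
    lines += [f"server.{n}={host}:2888:3888"
              for n, host in enumerate(servers, start=1)]
    return "\n".join(lines)


settings = {
    "dataDir": "/data/zookeeper",   # hypothetical path
    "clientPort": 2181,
    "maxClientCnxns": 0,            # 0 means no limit
    "tickTime": TICK_TIME_MS,
    "initLimit": INIT_LIMIT,
    "syncLimit": SYNC_LIMIT,
}
servers = ["zookeeper1", "zookeeper2", "zookeeper3"]  # hypothetical host names

config_text = render_zoo_cfg(settings, servers)
print(config_text)
print("initial sync timeout (s):", limit_in_seconds(INIT_LIMIT))
print("sync timeout (s):", limit_in_seconds(SYNC_LIMIT))
```

With these numbers the initial synchronization may take up to 20 seconds and a request may wait up to 10 seconds for its acknowledgement, so raising `tickTime` stretches both timeouts at once.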
