Apache ZooKeeper

(λx.x)eranga
Published in Effectz.AI
Jun 23, 2015

About ZooKeeper

Distributed systems consist of multiple nodes (computers) that communicate and coordinate their actions by passing messages. Coordination simply means an act that multiple nodes must perform together. The following are some common forms of coordination in distributed systems:

  1. Leader election
  2. Group membership
  3. Locking
  4. Synchronisation
  5. Publisher/Subscriber

It is really hard to get these kinds of node coordination right. That is where ZooKeeper comes into play. Simply put, ZooKeeper is a highly available, reliable and scalable coordination service for distributed systems.

Zookeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers.

Suppose you have a distributed web server application running on 10 nodes. Say, you want to get total real-time hit count. One way to do this is to write an application which connects to the 10 nodes, gets count from each and present the sum. Alternatively, you can have each web server application write their hit counts to ZooKeeper on regular intervals and then query ZooKeeper to get the count

- ZooKeeper wiki
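The hit-count scenario in the quote can be sketched with a toy in-memory tree. This is only an illustration of the hierarchical name space idea: `ZNodeTree`, `total_hits` and the `/hits/web-N` paths are hypothetical names invented here, and nothing below talks to a real ZooKeeper ensemble.

```python
class ZNodeTree:
    """Toy in-memory stand-in for a hierarchical name space (not the ZooKeeper API)."""

    def __init__(self):
        self._data = {"/": b""}

    def create(self, path, value=b""):
        # A node can only be created under an existing parent.
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._data:
            raise KeyError(f"parent node {parent} does not exist")
        if path in self._data:
            raise KeyError(f"node {path} already exists")
        self._data[path] = value

    def set(self, path, value):
        if path not in self._data:
            raise KeyError(f"node {path} does not exist")
        self._data[path] = value

    def get(self, path):
        return self._data[path]

    def get_children(self, path):
        # Direct children only: one path segment below `path`.
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._data
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


def total_hits(tree, root="/hits"):
    """Sum the counters that each web server stored under `root`."""
    return sum(
        int(tree.get(f"{root}/{child}").decode())
        for child in tree.get_children(root)
    )


tree = ZNodeTree()
tree.create("/hits")
for i in range(10):
    tree.create(f"/hits/web-{i}", b"0")

# Each web server periodically overwrites its own node with its latest count.
for i in range(10):
    tree.set(f"/hits/web-{i}", str(100 + i).encode())

print(total_hits(tree))  # 1045
```

The reader only needs to know the parent path. Servers can be added or removed without changing the code that computes the total.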

Why use ZooKeeper for coordination?

While it’s possible to design and implement all of the previously mentioned coordination services from scratch, doing so is extra work, and the resulting bugs, race conditions and deadlocks are difficult to debug.

Just as you don’t go around writing your own random number generator or hashing function, you shouldn’t have to write your own name service or leader election service from scratch every time you need one.

Moreover, you could hack together a very simple group membership service relatively easily, but it would take much more work to make it reliable, replicated and scalable. This led to the development and open sourcing of Apache ZooKeeper, an out-of-the-box reliable, scalable, and high-performance coordination service for distributed systems.

ZooKeeper features

1. Locking

To allow serialised access to a shared resource in your distributed system, you may need to implement distributed mutexes. ZooKeeper provides an easy way to implement them.
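One common way to build such a mutex is to hand every contender a sequentially numbered ticket and treat the lowest ticket as the lock holder. The sketch below models only that ordering idea in memory. `ToyLock` and its methods are hypothetical names, not part of any ZooKeeper client library.

```python
import itertools


class ToyLock:
    """Toy model of a lock built from sequentially numbered tickets.

    Hypothetical helper for illustration; not the ZooKeeper client API.
    """

    def __init__(self):
        self._next_seq = itertools.count()
        self._queue = {}  # ticket number -> client name

    def enqueue(self, client):
        """Register interest in the lock and return the ticket number."""
        seq = next(self._next_seq)
        self._queue[seq] = client
        return seq

    def holder(self):
        """The client with the lowest ticket holds the lock."""
        return self._queue[min(self._queue)] if self._queue else None

    def release(self, seq):
        """Remove a ticket: an explicit unlock or a crashed client."""
        self._queue.pop(seq, None)


lock = ToyLock()
ticket_a = lock.enqueue("client-a")
ticket_b = lock.enqueue("client-b")
ticket_c = lock.enqueue("client-c")
print(lock.holder())    # client-a

lock.release(ticket_b)  # client-b crashes while waiting
lock.release(ticket_a)  # client-a finishes its critical section
print(lock.holder())    # client-c
```

Because a waiting client's ticket simply disappears when it crashes, a dead client never blocks the queue.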

2. Synchronisation

Hand in hand with distributed mutexes is the need to synchronise access to shared resources. Whether you are implementing a producer-consumer queue or a barrier, ZooKeeper provides a simple interface for doing so.

3. Configuration management

You can use ZooKeeper to centrally store and manage the configuration of your distributed system. This means that any new node picks up the up-to-date centralised configuration from ZooKeeper as soon as it joins the system. It also allows you to change the state of your distributed system centrally by changing the centralised configuration through one of the ZooKeeper clients.
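The following is a minimal in-memory sketch of that pattern: nodes read a configuration value when they join and are notified when it changes. `ConfigStore`, `AppNode` and the `/config/db_url` key are assumptions made for illustration. The one-shot notification that must be re-registered after it fires is modelled on how ZooKeeper watches behave, but this is not the ZooKeeper API.

```python
class ConfigStore:
    """Toy central configuration store with one-shot change notifications."""

    def __init__(self):
        self._values = {}
        self._watchers = {}  # key -> callbacks waiting for the next change

    def set(self, key, value):
        self._values[key] = value
        # Notifications are one-shot: take the current callbacks and clear them.
        for callback in self._watchers.pop(key, []):
            callback(key)

    def get(self, key, watch=None):
        if watch is not None:
            self._watchers.setdefault(key, []).append(watch)
        return self._values.get(key)


class AppNode:
    """Toy application node that keeps a local copy of one config value."""

    def __init__(self, name, store, key):
        self.name = name
        self._store = store
        # Read the current value on joining and ask to be told about changes.
        self.value = store.get(key, watch=self._on_change)

    def _on_change(self, key):
        # Re-read the value and re-register for the next change.
        self.value = self._store.get(key, watch=self._on_change)


KEY = "/config/db_url"
store = ConfigStore()
store.set(KEY, "db-1:5432")

early = AppNode("early", store, KEY)
late = AppNode("late", store, KEY)   # joins later, sees the same value
print(early.value, late.value)        # db-1:5432 db-1:5432

store.set(KEY, "db-2:5432")           # one central change
print(early.value, late.value)        # db-2:5432 db-2:5432
```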

4. Leader election

Your distributed system may have to deal with the problem of nodes going down, and you may want to implement an automatic fail-over strategy. ZooKeeper provides off-the-shelf support for doing so via leader election.
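A widely used election scheme has every candidate create a sequentially numbered entry that disappears when the candidate's session ends. The candidate that owns the lowest-numbered entry is the leader. The sketch below models that scheme in memory only. `ToyElection`, the `n_` prefix and the server names are hypothetical, and a real deployment would use a ZooKeeper client instead.

```python
class ToyElection:
    """Toy model of leader election via sequentially numbered entries."""

    def __init__(self):
        self._counter = 0
        self._candidates = {}  # entry name -> server name

    def volunteer(self, server):
        """Add a candidate entry with the next sequence number."""
        name = f"n_{self._counter:010d}"  # zero-padded so names sort by number
        self._counter += 1
        self._candidates[name] = server
        return name

    def session_expired(self, server):
        """Drop every entry owned by a server whose session has ended."""
        self._candidates = {
            name: owner
            for name, owner in self._candidates.items()
            if owner != server
        }

    def leader(self):
        """The owner of the lowest-numbered entry is the leader."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


election = ToyElection()
for server in ("app-1", "app-2", "app-3"):
    election.volunteer(server)
print(election.leader())            # app-1

election.session_expired("app-1")   # the leader goes down
print(election.leader())            # app-2 takes over

election.volunteer("app-1")         # app-1 returns at the back of the line
print(election.leader())            # app-2
```

A returning server gets a new, higher number, so it does not displace the current leader.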

5. Name service

A name service maps a name to some information associated with that name. For example, DNS is a name service that maps a domain name to an IP address. In your distributed system, you may want to keep track of which servers or services are up and running and look up their status by name. ZooKeeper exposes a simple interface to do that. A name service can also be extended to a group membership service, through which you can obtain information about the group associated with the entity whose name is being looked up.

How ZooKeeper works

ZooKeeper follows a simple client-server model where clients are nodes (i.e., machines) that make use of the service, and servers are nodes that provide the service. A collection of ZooKeeper servers forms a ZooKeeper ensemble.

The ZooKeeper service uses the quorum model and is available only while a majority of its servers are alive. For example, a five-server ensemble keeps working with up to two servers down. The servers in the ZooKeeper service must know about each other, and the clients must know the list of servers in the service.
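The majority rule is plain arithmetic, sketched below. The function names are made up for this example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


def is_available(ensemble_size, alive):
    """The service is available only while a majority is alive."""
    return alive >= quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

A four-server ensemble tolerates no more failures than a three-server one, which is why ensembles are usually given an odd number of servers.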

ZooKeeper architecture

At any given time, a ZooKeeper client is connected to exactly one ZooKeeper server over a TCP connection. Each ZooKeeper server can handle a large number of client connections at the same time. Each client periodically sends pings to the server it is connected to, letting the server know that it is alive and connected. The server responds with an acknowledgment of the ping, indicating that it is alive as well. When the client doesn’t receive an acknowledgment within the specified time, it connects to another server in the ensemble, and the client session is transparently transferred to the new server.
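The ping and failover behaviour can be sketched as a toy model. `ToyServer`, `ToyClient` and `heartbeat` are hypothetical names, the missed acknowledgment is reduced to a boolean, and servers are tried in list order purely to keep the example deterministic.

```python
class ToyServer:
    """Toy server that acknowledges pings only while it is alive."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def ping(self):
        return self.alive


class ToyClient:
    """Toy client that stays connected to exactly one server at a time."""

    def __init__(self, servers, session_id):
        self.servers = servers        # the client knows the whole ensemble
        self.session_id = session_id  # survives a move to another server
        self.current = self._connect()

    def _connect(self):
        for server in self.servers:
            if server.ping():
                return server
        raise ConnectionError("no server in the ensemble is reachable")

    def heartbeat(self):
        """Ping the current server; fail over if the ping is not acknowledged."""
        if not self.current.ping():
            self.current = self._connect()
        return self.current.name


servers = [ToyServer("zk1"), ToyServer("zk2"), ToyServer("zk3")]
client = ToyClient(servers, session_id=0x1234)
print(client.heartbeat())    # zk1

servers[0].alive = False     # zk1 stops acknowledging pings
print(client.heartbeat())    # zk2
print(hex(client.session_id))  # 0x1234
```

The session identifier stays the same across the switch, which is what makes the failover transparent to the application.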

For the service to be reliable and scalable, it is replicated over a set of machines. To keep the replicas consistent, ZooKeeper uses ZAB (ZooKeeper Atomic Broadcast), a protocol that resembles the well-known Paxos algorithm but is not identical to it. Each replica holds an in-memory image of the data tree, along with transaction logs and snapshots in a persistent store. Since the data is served from memory, ZooKeeper can achieve low latency and high throughput. However, complete replication limits the total size of the data that can be managed using ZooKeeper.

Key concepts of ZooKeeper

Atomic Broadcast

At the heart of ZooKeeper is an atomic messaging system that keeps all of the servers in sync.

Reliable delivery

If a message, “m”, is delivered by one server, it will eventually be delivered by all servers.

Total order

If a message “a” is delivered before message “b” by one server, “a” will be delivered before “b” by all servers. If “a” and “b” are delivered messages, either “a” will be delivered before “b” or “b” will be delivered before “a”.

Causal order

If a message “b” is sent after a message “a” has been delivered by the sender of “b”, “a” must be ordered before “b”. If a sender sends “c” after sending “b”, “c” must be ordered after “b”.
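The total order property can be illustrated with a toy replica that delivers messages strictly by sequence number. This is only a sketch of the guarantee, not of ZooKeeper's actual protocol. `Replica`, `receive` and the numbering from 1 are assumptions made for the example, and the out-of-order arrival is contrived to show that every replica still delivers in the same order.

```python
class Replica:
    """Toy replica that delivers messages strictly in sequence-number order."""

    def __init__(self):
        self._next_id = 1    # the next sequence number allowed to be delivered
        self._pending = {}   # messages that arrived ahead of their turn
        self.delivered = []

    def receive(self, msg_id, payload):
        self._pending[msg_id] = payload
        # Deliver as many consecutive messages as are now available.
        while self._next_id in self._pending:
            self.delivered.append(self._pending.pop(self._next_id))
            self._next_id += 1


# A single sequencer has stamped the messages with increasing ids.
messages = [(1, "a"), (2, "b"), (3, "c")]

r1 = Replica()
for msg_id, payload in messages:            # arrives in order
    r1.receive(msg_id, payload)

r2 = Replica()
for msg_id, payload in [messages[1], messages[2], messages[0]]:  # arrives as 2, 3, 1
    r2.receive(msg_id, payload)

print(r1.delivered)  # ['a', 'b', 'c']
print(r2.delivered)  # ['a', 'b', 'c']
```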

ZooKeeper use cases

ZooKeeper is well suited to web-scale, mission-critical applications. It is also popular among multi-host, multi-process applications (with C and Java bindings) running in data centers. The following are some ZooKeeper use cases:

1. HBase

HBase is the Hadoop database. It is an open-source, distributed, column-oriented store modeled after Google's Bigtable paper. HBase uses ZooKeeper for master election, server lease management, bootstrapping, and coordination between servers.

2. Neo4j

Neo4j is a graph database. It’s a disk-based, ACID-compliant transactional storage engine for big graphs and fast graph traversals, using external indexes like Lucene/Solr for global searches. Neo4j uses ZooKeeper in its High Availability components for write-master election and read-slave coordination.

3. 101tec

101tec is a consultancy in the area of enterprise distributed systems. It uses ZooKeeper to manage a system built out of Hadoop, Katta, Oracle batch jobs and a web component.

4. Katta

Katta serves distributed Lucene indexes in a grid environment. ZooKeeper is used for node, master and index management in the grid.

5. Rackspace

At Rackspace, the Email & Apps team uses ZooKeeper to coordinate sharding and responsibility changes in a distributed e-mail client that pulls and indexes data for search. ZooKeeper also provides distributed locking for connections to prevent a cluster from overwhelming servers.
