Distributed Systems Configuration Management Using Apache Zookeeper

Irtiza

Published in

OneByte

3 min read · Aug 12, 2020

Overview

This story is about using Apache Zookeeper for configuration management in distributed systems.

Designing and managing distributed systems involves many challenges:

Security
Scalability
Failure handling
Configuration Management
Concurrency
Transparency
Quality of service

In this story, we will focus on the configuration management of distributed systems.

Assumptions

This story is written with the following assumptions in mind:

A distributed system’s services use configuration to operate.
Configuration can change at runtime, and services should load the new configuration without being restarted.

Details

Suppose we have a distributed system with 100 running services, each processing message streams from multiple sources according to the configuration it has been given.

In the above scenario, how can we manage the configurations?

Some basic solutions and their shortfalls are given below:

The above problem can be solved by using Zookeeper.

Zookeeper

Zookeeper is an open-source project that provides a distributed configuration and synchronization service.

It is used in the following projects.

It stores the data in a hierarchical key-value store:
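As a rough illustration of what "hierarchical key-value store" means, the sketch below models a tree of data nodes addressed by slash-separated paths, where each node can hold data and have children. It is a toy in-memory model written for this story, not the Zookeeper client API; the class name `ZNodeTree`, its methods, and the paths and values are all assumptions made for the example.

```python
class ZNodeTree:
    """Toy in-memory model of a hierarchical key-value store (not the Zookeeper API)."""

    def __init__(self):
        # The root node always exists and holds no data.
        self._nodes = {"/": b""}

    @staticmethod
    def _parent(path):
        # "/config/service-a" -> "/config", "/config" -> "/"
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, data=b""):
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        if self._parent(path) not in self._nodes:
            raise KeyError(f"parent does not exist: {path}")
        self._nodes[path] = data

    def get(self, path):
        return self._nodes[path]

    def children(self, path):
        # Direct children only: nodes whose parent is exactly `path`.
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self._nodes
            if p != "/" and self._parent(p) == path
        )


# Hypothetical layout: one data node per service under /config.
tree = ZNodeTree()
tree.create("/config")
tree.create("/config/service-a", b'{"batch_size": 100}')
tree.create("/config/service-b", b'{"batch_size": 250}')

print(tree.children("/config"))
print(tree.get("/config/service-a"))
```

Keeping each service's settings under its own path is what makes the per-client segregation described later straightforward.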

How does it solve our problem?

It can be deployed as a multi-node service to maintain high availability.
We can store client-specific configuration on different data nodes to maintain data/configuration segregation.
Client libraries are available for most popular languages.
We can set a watch on a node so that a callback is executed whenever the node changes, which lets services pick up new configuration dynamically without a restart.
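The watch idea from the last point can be sketched in a few lines. Zookeeper's standard watches are documented as one-time triggers, so a client that wants continuous updates re-registers the watch each time it fires (some client libraries do this for you). The code below imitates that behaviour with plain Python objects; `WatchedNode` and `Service` are hypothetical names invented for this sketch, not part of any Zookeeper client library.

```python
class WatchedNode:
    """Toy data node with one-shot watches (an illustration, not the Zookeeper API)."""

    def __init__(self, data=b""):
        self._data = data
        self._watchers = []

    def get(self, watch=None):
        # Reading the node can optionally leave a watch behind.
        if watch is not None:
            self._watchers.append(watch)
        return self._data

    def set(self, data):
        self._data = data
        # Each watch fires once and is then dropped.
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback(self)


class Service:
    """Hypothetical service that reloads its configuration without restarting."""

    def __init__(self, node):
        self.reloads = 0
        self.config = node.get(watch=self._on_change)

    def _on_change(self, node):
        # Re-read the data and re-register the watch for the next change.
        self.config = node.get(watch=self._on_change)
        self.reloads += 1


node = WatchedNode(b"log_level=INFO")
service = Service(node)
node.set(b"log_level=DEBUG")
node.set(b"log_level=WARN")
print(service.config, service.reloads)
```

The service object is created once and never restarted; only its `config` attribute changes as the node is updated.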

Demo

A repository has already been created containing the manifests required for the demo. In the demo, the following actions will be performed:

A service will be deployed that retrieves data from a data node and sets a watch on that node to detect changes. It will continually print the configuration retrieved from the data node to the terminal.
Data will be added to the data node.
Data will be updated on the data node, which will trigger the service's watch method and update its configuration. The updated configuration can be seen in the logs of the service.
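The sequence of demo steps can be rehearsed without a cluster using a small simulation. This is only a sketch of the flow under stated assumptions, not the code from the repository: `DataNode`, `DemoService`, the `tick()` method, and the `rate=...` values are all made up for illustration, and the node here notifies subscribers on every write instead of using real Zookeeper watches.

```python
class DataNode:
    """Toy stand-in for the demo's data node; subscribers are called on every write."""

    def __init__(self):
        self.data = None          # None means no data has been added yet
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)
        callback(self.data)       # deliver the current value immediately

    def write(self, data):
        self.data = data
        for callback in self._subscribers:
            callback(data)


class DemoService:
    """Hypothetical service that keeps its configuration in memory and logs changes."""

    def __init__(self, node):
        self.config = None
        self.log = []
        node.subscribe(self._on_config)

    def _on_config(self, data):
        self.config = data
        self.log.append(f"config changed: {data!r}")

    def tick(self):
        # One call to tick() stands for one iteration of the service's print loop.
        line = f"current config: {self.config!r}"
        self.log.append(line)
        print(line)


node = DataNode()
service = DemoService(node)   # step 1: the service is deployed and starts watching
service.tick()
node.write("rate=10")         # step 2: data is added to the data node
service.tick()
node.write("rate=20")         # step 3: data is updated, the service picks it up
service.tick()
```

In the simulated log, each write is followed by a print of the new value, which mirrors what the demo's service logs are expected to show.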

Detailed instructions can be found in the README file of the repository.

Final Thoughts

I hope you liked this story. Please share feedback on anything that could be improved or that I have missed. Thank you :)