Implementing a distributed cache using the Ballerina language was my first internship project at WSO2. At the time I was a second-year undergraduate with decent programming skills, but I had no idea what distributed computing was, so this project was a unique experience for me. This article is an attempt to explain the exposure I gained to distributed systems and the process of designing and implementing a distributed cache from scratch.
Requirements and the Reasoning
This was a research project intended to introduce the intern to the fundamentals of distributed computing, as it covers many of the field's core concepts. You can check out the learning outcomes section for more details on this.
The requirement was to implement a self-contained distributed cache from scratch using the Ballerina language, which means I was not allowed to use tools such as ZooKeeper. Ballerina is a new cloud-native programming language designed by WSO2 to make it easier for developers to build distributed systems. I believe they wanted to test whether the language was capable of building a complex distributed application such as this one. Additionally, since Ballerina did not have existing libraries at the time, I had to learn and implement all of these algorithms myself.
When I started the project, I didn't even know what a distributed cache was, so I had to research existing systems to understand the domain and its use cases. I found solutions such as Memcached, Hazelcast IMDG, and Ehcache, and then carried out in-depth research on their features, architecture, and implementation. This gradually familiarized me with the domain and gave me an in-depth understanding of the design and implementation of a distributed cache.
While researching existing systems, I found that developers are a bit reluctant to use distributed caches, for various reasons.
Upon a cluster membership change, the cache executes an expensive relocation task to recover from the possible data loss caused by the change. When the node comes back online, it triggers yet another data recovery task. If the network is unreliable, this can occur repeatedly, resulting in a denial of service that can overload the database, significantly degrade application performance, and ultimately crash the entire cluster.
Further explanation of this problem can be found in the following blog post.
Evaluating the algorithms for each component was the most challenging and interesting part of this project. I had to learn all the related concepts and techniques and evaluate the most suitable one for each component.
The main design decision was how to ensure reliable cluster membership. To achieve this requirement, the Raft consensus protocol was selected. Raft provides the reliability behind popular platforms such as etcd, which Kubernetes in turn depends on.
The Raft protocol consists of the following mechanisms:
- Heartbeat mechanism to constantly monitor the status of the nodes of the cluster.
- Leader Election mechanism to elect a single leader that controls the cluster at any given time.
- Log Replication mechanism to ensure the data safety of the state machine.
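To make the leader election mechanism concrete, here is a minimal, illustrative sketch of a Raft-style election round. This is not the full protocol (no log comparison, no RPC timeouts, no real networking) and it is written in Python rather than the project's Ballerina; all names here are my own illustration, not the original implementation.

```python
import random

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.term = 0
        self.voted_for = None
        # Randomized election timeouts reduce the chance of split votes.
        self.election_timeout = random.uniform(150, 300)  # milliseconds

    def request_vote(self, candidate_id, candidate_term):
        # Grant the vote only if the candidate's term is newer; a node
        # votes at most once per term.
        if candidate_term > self.term:
            self.term = candidate_term
            self.voted_for = candidate_id
            return True
        return False

def elect_leader(nodes):
    # The node whose timeout fires first stands as candidate: it
    # increments its term, votes for itself, and requests votes.
    candidate = min(nodes, key=lambda n: n.election_timeout)
    candidate.term += 1
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate's own vote
    for node in nodes:
        if node is not candidate and node.request_vote(candidate.node_id, candidate.term):
            votes += 1
    # A candidate needs a strict majority to become leader.
    return candidate if votes > len(nodes) // 2 else None

nodes = [Node(i) for i in range(5)]
leader = elect_leader(nodes)
print(f"node {leader.node_id} elected leader for term {leader.term}")
```

The majority requirement is what guarantees at most one leader per term: two candidates cannot both collect a strict majority of the same voters.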
If you are interested in this awesome protocol, you can find more information in this blog:
Understanding RAFT Distributed Consensus
Raft includes a heartbeat mechanism, but it does not contain a proper failure detector. Since the problem described above originates from failures not being detected properly, a failure detector was designed as an extension to Raft's heartbeat mechanism.
The failure detector was designed with correctness in mind. The leader of the cluster keeps a log of the health of all the nodes and reduces a node's health score when it does not respond to heartbeats. When a node's health is low, the node is removed from the cache distribution and monitored for a certain time before being added back to the cache.
When a node is in the monitoring phase, the leader of the cluster keeps sending heartbeats to it via RPC calls. This monitor also acts as an advanced circuit breaker for detecting failures. All new cluster members and recovered nodes are health-checked before being added to the cache distribution.
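The health-score and monitoring logic above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's Ballerina code, and the thresholds and field names are my assumptions: health decays on missed heartbeats, an unhealthy node is removed from the distribution and monitored circuit-breaker style, and it is re-admitted only after passing consecutive health checks.

```python
MAX_HEALTH = 5          # assumed starting health score
RECOVERY_THRESHOLD = 3  # assumed consecutive successes needed to re-admit

class FailureDetector:
    def __init__(self, node_ids):
        self.health = {n: MAX_HEALTH for n in node_ids}
        self.monitored = {}            # node -> consecutive successful heartbeats
        self.active = set(node_ids)    # nodes in the cache distribution

    def record_heartbeat(self, node, responded):
        if node in self.monitored:
            # Monitoring phase: count consecutive successes, reset on failure.
            if responded:
                self.monitored[node] += 1
                if self.monitored[node] >= RECOVERY_THRESHOLD:
                    # Node passed the health check: re-admit it.
                    del self.monitored[node]
                    self.active.add(node)
                    self.health[node] = MAX_HEALTH
            else:
                self.monitored[node] = 0
            return
        if responded:
            self.health[node] = min(MAX_HEALTH, self.health[node] + 1)
        else:
            self.health[node] -= 1
            if self.health[node] <= 0:
                # Remove from the cache distribution and start monitoring.
                self.active.discard(node)
                self.monitored[node] = 0

fd = FailureDetector(["n1", "n2", "n3"])
for _ in range(5):
    fd.record_heartbeat("n2", responded=False)  # n2 stops responding
print(fd.active)                                # n2 removed from distribution
for _ in range(3):
    fd.record_heartbeat("n2", responded=True)   # n2 recovers
print(fd.active)                                # n2 re-admitted
```

Requiring several consecutive successes before re-admission is what prevents a flapping node on an unreliable network from triggering relocation task after relocation task.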
Cache Entry Distributor
Several factors were taken into account when the cache entry distributor was designed. An extended version of the consistent hashing algorithm was used for entry distribution. The following factors led to the selection of the Consistent Hashing with Bounded Loads algorithm.
1. Write and Read Performance
Consistent Hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed hash table by assigning them a position on an abstract circle, or hash ring. This allows servers and objects to scale without affecting the overall system.
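The hash-ring idea can be shown in a few lines. This is a minimal illustrative sketch in Python (with virtual nodes, a common refinement), not the project's actual Ballerina implementation:

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32

def ring_hash(value):
    # Map any string to a position on the ring.
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

class HashRing:
    def __init__(self, nodes, vnodes=100):
        # Each server is hashed to many "virtual" positions on the ring,
        # which smooths out the distribution of keys.
        self.ring = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self.points = [p for p, _ in self.ring]

    def get_node(self, key):
        # A key belongs to the first node clockwise from its hash
        # (wrapping around the end of the ring).
        idx = bisect.bisect(self.points, ring_hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
print(ring.get_node("user:42"))  # deterministic owner for this key
```

Because every client computes the same hash, any node can determine a key's owner locally, without a central directory — which is what makes reads and writes fast.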
2. Entry Replication

Entry replication is a crucial factor in any distributed environment. Consistent hashing provides a simple replication mechanism that replicates data without placing duplicates on the same node.
3. Data relocation
Consistent hashing minimizes the number of keys that need to be relocated when servers are added or removed.
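A small experiment makes this concrete: with naive modulo hashing (`hash(key) % N`), changing N remaps almost every key, while on a consistent-hash ring only the keys whose clockwise successor changed have to move — roughly K/N of them. This is an illustrative Python sketch, not code from the project:

```python
import bisect
import hashlib

def ring_hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

def owner(key, nodes, vnodes=100):
    # Compact consistent-hash lookup: first virtual node clockwise
    # from the key's hash owns the key.
    ring = sorted((ring_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
    points = [p for p, _ in ring]
    return ring[bisect.bisect(points, ring_hash(key)) % len(ring)][1]

keys = [f"key-{i}" for i in range(1000)]
before = {k: owner(k, ["a", "b", "c"]) for k in keys}
after = {k: owner(k, ["a", "b", "c", "d"]) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys relocated")  # roughly 1/4 of the keys
```

Note that every relocated key moves *to* the new server "d": the positions of the existing servers on the ring never changed, so no key shuffles between old servers.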
4. Load Balancing
Uneven data distribution is one of the main disadvantages of consistent hashing. The Consistent Hashing with Bounded Loads algorithm distributes entries across the cluster evenly, without overloading any of the nodes.
As mentioned at the beginning of this blog, I had no prior experience in distributed computing. Through this project I learned about the CAP theorem, consensus algorithms, membership schemes and protocols, low-overhead network protocols like gRPC, message broadcasting strategies, data distribution algorithms, hash functions, fault tolerance, chaos engineering, event ordering and time, data replication strategies, cloud design patterns, and formal specification languages such as TLA+.
If you are absolutely new to distributed computing, the keywords above might be a bit overwhelming. You can use the following resources to self-study distributed computing. Once you understand the fundamentals, learning specific algorithms becomes much easier.
- Distributed systems theory for the distributed systems engineer
- The Cloud Computing Specialization, offered by the University of Illinois at Urbana-Champaign
- Awesome Distributed Systems, a curated list of material on distributed systems
You can implement your own distributed cache as a hands-on project to put your learning into practice. Feel free to evaluate and use other algorithms if you believe they are better than the ones I've used, and share your reasoning in the comments.