Split brain in distributed systems

DhineshSunder Ganapathi
Published in Nerd For Tech
Jun 12, 2022


In a distributed environment with a central (or leader) server, the system must quickly find a substitute if that server dies; otherwise, it can rapidly deteriorate. The scenario in which a distributed system ends up with two or more active leaders is called split brain. In this article, let’s look at the implications of split brain and how we can handle it.


Split brain is a state of a server cluster in which nodes diverge from each other and conflict when handling incoming I/O operations. The servers may record the same data inconsistently or compete for resources. Typically the cluster stalls while the nodes wait for direction on how to resolve the conflict, which leads to downtime or, even worse, data corruption.

Problem

The core problem is that we cannot truly know whether the leader has stopped for good or has only experienced an intermittent failure, such as a stop-the-world GC pause or a temporary network disruption. Nevertheless, the cluster has to move on and pick a new leader. If the original leader had an intermittent failure, we now find ourselves with a so-called zombie leader: a leader node that was deemed dead by the system and has since come back online. Another node has taken its place, but the zombie leader might not know that yet. The system now has two active leaders that could be issuing conflicting commands. How can a system detect such a scenario, so that all nodes can ignore requests from the old leader and the old leader itself can detect that it is no longer the leader?
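To make the failure concrete, here is a minimal Python sketch of a follower that applies every write it receives without checking who sent it. The `Follower` class and the node names are hypothetical and exist only for illustration; no real system's API is implied.

```python
class Follower:
    """Hypothetical replica that applies every write, with no leader check."""

    def __init__(self):
        self.value = None
        self.history = []  # (sender, value) pairs, in arrival order

    def apply(self, sender, value):
        # Last writer wins, regardless of whether the sender is still the leader.
        self.value = value
        self.history.append((sender, value))


follower = Follower()
follower.apply("leader-A", "x=1")  # A is the elected leader
# A stalls (for example, a long GC pause) and the cluster elects B.
follower.apply("leader-B", "x=2")
# A wakes up, still believing it is the leader, and issues another write.
follower.apply("leader-A", "x=3")

# The zombie leader's write has silently overwritten the real leader's write.
print(follower.value)
```

Nothing in this follower can tell the late write from A apart from a legitimate one, which is exactly the gap that needs closing.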

Solution

The cluster maintains a generation number: a monotonically increasing number that identifies the current leader’s term. Every time a new leader is elected, the generation number is incremented, so if the old leader had a generation number of 1, the new one will have 2. The generation number is included in every request that the leader sends to other nodes. Nodes can then identify the real leader by trusting only the highest generation number they have seen and rejecting requests that carry a lower one. A rejected request also tells the old leader that a newer generation exists, so it can stop acting as leader. The generation number should be persisted on disk so that it remains available after a server reboot. One way to do this is to store it with every entry in the write-ahead log.
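The generation check can be sketched in a few lines of Python. This is an illustrative, in-memory sketch only: the `Follower`, `Leader`, and `StaleLeaderError` names are assumptions made for the example, the election itself is not modelled (the generations are simply assigned by hand), and the generation is kept in memory rather than persisted to disk as a real system would need.

```python
class StaleLeaderError(Exception):
    """Raised when a request carries a generation older than one already seen."""


class Follower:
    def __init__(self):
        self.generation = 0  # highest generation seen so far (in memory only here)
        self.log = []        # (generation, command) entries, like a write-ahead log

    def handle(self, generation, command):
        if generation < self.generation:
            # The sender is a zombie leader from an earlier generation.
            raise StaleLeaderError(self.generation)
        self.generation = generation
        # Store the generation with every entry, as the text suggests.
        self.log.append((generation, command))


class Leader:
    def __init__(self, name, generation):
        self.name = name
        self.generation = generation
        self.active = True

    def send(self, follower, command):
        if not self.active:
            return False
        try:
            follower.handle(self.generation, command)
            return True
        except StaleLeaderError:
            # A newer generation exists, so this node stops acting as leader.
            self.active = False
            return False


follower = Follower()
old_leader = Leader("A", generation=1)
new_leader = Leader("B", generation=2)  # elected while A was paused

first = old_leader.send(follower, "x=1")   # accepted: generation 1 is current
second = new_leader.send(follower, "x=2")  # accepted: follower moves to generation 2
third = old_leader.send(follower, "x=3")   # rejected: generation 1 is now stale
print(first, second, third, old_leader.active, follower.log)
```

The follower never needs to know why the old leader disappeared. Comparing two integers is enough to ignore it, and the rejection doubles as the signal that lets the old leader discover it has been replaced.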

Examples

Kafka: To handle split brain (where there could be multiple active controller brokers), Kafka uses an epoch number, which is simply a monotonically increasing number that indicates a server’s generation.

HDFS: ZooKeeper is used to ensure that only one NameNode is active at any time. An epoch number is maintained as part of every transaction ID to reflect the NameNode generation.

I hope this article helps you understand the split brain problem in distributed systems and how to mitigate it. Until next time, adios!
