Simplify Leader Election in Distributed System
A Distributed system can have multiple nodes/processes to achieve a common goal. Nodes communicate with each other using shared memory, to execute a task effectively the nodes communicate with each other to make a decision. The communication between nodes can be time-consuming and thus the decision-making process.
To overcome this symmetry of a distributed system, a Leader node is elected from the group of available nodes to reduce the complexity of the decision-making process.
Leader Election is a process of electing a Leader node from system nodes.
A Leader node is responsible for coordination, synchronization, and decision-making in the Leader Election process, ensuring smooth operation.
The Leader node is elected using the election process when it crashes. Election algorithms are used to elect the leader node from the network nodes, that is Leader Election in a nutshell.
Why Leader Election Matters
Leader election addresses several critical challenges in distributed systems:
- Coordination: A central leader simplifies coordination tasks like resource allocation, task assignment, and conflict resolution.
- Consistency: Leaders maintain a consistent global state, preventing conflicting updates and ensuring data integrity.
- Fault Tolerance: A new leader can be elected if a leader fails, ensuring system continuity and resilience.
Bully Algorithm
The Bully Algorithm is based on the Ranks of nodes, The Leader is elected based on the highest rank of the node after the election process.
A node with a higher rank as a Leader node will essentially “bully” nodes with lower ranks to step down, hence the name Bully Algorithm.
Working
- When a current leader fails and the election is triggered, a node sends an election message to all nodes with higher ranks.
- If a node receives a message from a lower-rank node, it responds with an “OK” message implying it is alive. If it does not receive any response, it assumes the higher rank node has failed.
- The node with the highest rank which did not get any “OK” response declares itself the winner and becomes the new Leader.
- The new leader node sends a “Victory” message to other nodes.
Algorithm
If (Node n realizes(leader crash))
sends an ELECTION message to all node with higher number
If(response from higher node)
select as LEADER and send ok message back to the sender to indicate that he is alive and will take over.
else node n is considered as LEADER
- Node N3 detects the Leader node has crashed, and N3 transmits an election message to all higher rank nodes i.e. N4 and N5.
- N4 and N5 respond with an “OK” message, and N3 notifies N5 is the highest-ranked node.
- N5 becomes the new Leader node.
- N5 being elected as the Leader node notifies all available nodes about the newly appointed leader.
Disadvantages
- It requires that every node should know the identity of every other node in the system that takes large space in the system.
- The message passing has order O(n2).
Ring Algorithm
The Ring Algorithm is another election algorithm used for leader election in a system that is based on the use of a ring. The nodes are physically or logically ordered with their successors.
When any node detects the leader's failure, a new election process starts. Every node communicates with its successor, if the node is unavailable then the node attempts to contact the node after the unavailable node.
All live nodes add themselves to the available set and pass it further to the next available node in a ring.
When the message comes back to the node that started the process the election process is completed, and the highest rank node from the list is appointed as a new Leader.
- The leader node is crashed and node N3 detects that and starts the election process.
- N3 adds itself to the Available nodes list and passes it over to the next successor i.e. N2.
- Every available node adds itself to the list but N0 skips the Leader node due to its unavailability and passes the list to N4.
- N3 finally receives back the list of active nodes and finds N5 to be the highest-ranked node, thus N5 is elected as the new leader throughout the ring and communicates about the new leader to every node in the ring.
Conclusion
A distributed system is a collection of computers, that work together but all systems are independent. A Leader is selected by the Leader election process that makes final decisions.
Follow me on Linkedin and GitHub
References
https://www.linkedin.com/pulse/leader-election-distributed-systems-deep-dive-saurav-prateek/