Consensus, Two Phase and Three Phase Commits

At the heart of every distributed system is a consensus algorithm. Consensus is the act by which a system of processes agrees upon a value or decision. In this post, let’s look at two well-known consensus protocols, two-phase commit and three-phase commit, which are widely used in database servers.

Each process proposes a value to the others and then agrees upon a value once it’s confident that every other process has agreed upon the same value. Consensus has three characteristics:

Agreement: every node in the set of nodes N decides on the same value.

Validity: the value that’s decided upon must have been proposed by some process.

Termination: a decision must eventually be reached.
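As a rough illustration, agreement and validity can be checked on the outcome of a finished round. The function name and the dictionary layout below are assumptions made for this sketch. Termination is not checked, because it is a property of the whole run rather than of a snapshot.

```python
def check_consensus(proposed, decided):
    """Check agreement and validity for one finished round.

    proposed: dict mapping process name -> value it proposed
    decided:  dict mapping process name -> value it decided on
    (Both layouts are illustrative assumptions.)
    """
    decisions = set(decided.values())
    # Agreement: exactly one distinct value was decided.
    agreement = len(decisions) == 1
    # Validity: every decided value was proposed by some process.
    validity = decisions <= set(proposed.values())
    return agreement, validity


proposed = {"p1": "x", "p2": "y", "p3": "x"}
print(check_consensus(proposed, {"p1": "x", "p2": "x", "p3": "x"}))  # (True, True)
print(check_consensus(proposed, {"p1": "x", "p2": "y", "p3": "x"}))  # (False, True)
print(check_consensus(proposed, {"p1": "z", "p2": "z", "p3": "z"}))  # (True, False)
```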

Two phase commit

This protocol requires a coordinator. The client contacts the coordinator and proposes a value. The coordinator then tries to establish consensus among a set of processes (a.k.a. participants) in two phases, hence the name.

1. In the first phase, the coordinator contacts all the participants, suggests the value proposed by the client, and solicits their responses.

2. After receiving all the responses, the coordinator decides to commit if all participants agreed upon the value, or to abort if any participant disagreed.

3. In the second phase, the coordinator contacts all participants again and communicates the commit or abort decision.

Img Courtesy, https://docs.particular.net/nservicebus/azure/understanding-transactionality-in-azure
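The three steps above can be sketched in a few lines of Python. This is a minimal in-memory sketch with no networking and no failures. The `Participant` class, its `prepare` and `finish` methods, and the state names are hypothetical and exist only for illustration.

```python
class Participant:
    """Hypothetical in-memory participant; a real one would be a remote process."""

    def __init__(self, name, will_accept=True):
        self.name = name
        self.will_accept = will_accept
        self.state = "init"
        self.value = None

    def prepare(self, value):
        # Phase 1: vote yes or no on the value proposed by the coordinator.
        self.state = "voted_yes" if self.will_accept else "voted_no"
        return self.will_accept

    def finish(self, decision, value):
        # Phase 2: apply the coordinator's decision.
        self.state = decision
        if decision == "commit":
            self.value = value


def two_phase_commit(value, participants):
    # Phase 1: propose the value and collect every vote.
    votes = [p.prepare(value) for p in participants]
    # Decide: commit only if the vote is unanimous.
    decision = "commit" if all(votes) else "abort"
    # Phase 2: tell every participant the outcome.
    for p in participants:
        p.finish(decision, value)
    return decision


nodes = [Participant("a"), Participant("b"), Participant("c")]
print(two_phase_commit("x=1", nodes))  # commit
nodes[1].will_accept = False
print(two_phase_commit("x=2", nodes))  # abort
```

Note that the participants never propose a value of their own; they only vote on the one the coordinator relays.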

We can see how the above-mentioned conditions are met. Agreement holds because the coordinator enforces the same commit or abort decision on all participants. Validity holds because the participants only vote yes or no on the value relayed by the coordinator and don’t propose values of their own. Termination is guaranteed only if all participants communicate their responses to the coordinator, which makes the protocol prone to failures.

When speaking about failures, what are the ways in which a node can fail?

Fail-stop model: nodes just crash and don’t recover at all.

Fail-recover model: nodes crash, recover after a certain time, and continue executing.

Byzantine failure: nodes start behaving arbitrarily, possibly trying to disrupt the regular behavior.

Now, coming to how failures affect two-phase commit:

1. The coordinator fails even before initiating phase 1. This simply means that the consensus round never started, so the protocol is still trivially correct.

2. The coordinator fails after initiating phase 1. Some nodes have received the message from the coordinator initiating a fresh round of 2PC. These nodes may have sent their responses and are now blocked, waiting for the second phase to start. This also means that no future rounds of 2PC can start. One way out is to use timeouts when waiting for responses: when a node times out waiting for the coordinator, it can assume that the coordinator is dead and take over the coordinator role. It can then reinitiate phase 1, contacting all other nodes and asking for consensus on the value for which it voted as a participant before the original coordinator crashed. However, if another node crashes before the recovery node gathers all the phase 1 messages, the protocol can’t proceed, because the recovery node doesn’t know the intended decision of the crashed node. All the other participants may have agreed to commit while the newly crashed node intended to abort, so the recovery node can’t declare a commit. The same argument applies the other way round.

3. Similarly, if a participant fails during phase 1 before the coordinator receives its response, the protocol, without a timeout, comes to a grinding halt. The reasoning is the same as in point 2: the coordinator doesn’t know the vote of the failed node and hence can’t decide whether to commit or abort. With a timeout, the coordinator can treat the missing vote as a no and abort, which is safe because no participant has committed yet.

4. Similarly, if the coordinator fails during phase 2, we would want another node to take over and shepherd the protocol to completion. Another big issue is that if a participant fails during the commit phase, the system is left in the dark: the coordinator doesn’t know whether the participant failed before or after committing, and hence can’t tell whether the transaction has been committed everywhere.
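The timeout idea mentioned in the failure cases above can be sketched on the coordinator side using Python’s standard `queue` module. The function name `collect_votes`, the use of a queue as the coordinator’s inbox, and the per-vote timeout (rather than one overall deadline) are assumptions of this sketch.

```python
import queue


def collect_votes(inbox, expected, timeout_s):
    """Wait for `expected` votes on `inbox`; a missing vote counts as a no."""
    votes = []
    for _ in range(expected):
        try:
            # Block for at most timeout_s seconds waiting for the next vote.
            votes.append(inbox.get(timeout=timeout_s))
        except queue.Empty:
            # A participant stayed silent: it may have crashed or be slow.
            # Nobody has committed yet in phase 1, so aborting is safe.
            return "abort"
    return "commit" if all(votes) else "abort"


inbox = queue.Queue()
inbox.put(True)
inbox.put(True)  # only two of the three participants answer
print(collect_votes(inbox, expected=3, timeout_s=0.1))  # abort
```

This only helps during phase 1. Once a participant has voted yes and is waiting for the decision, a timeout alone doesn’t tell it which way the decision went.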

To summarize, one of the biggest disadvantages of two-phase commit is that it’s a blocking protocol. Once a cohort (participant) sends an agreement message to the coordinator, it holds the resources associated with the consensus until it receives the commit or abort message from the coordinator. If the coordinator fails, the cohorts stay blocked and can’t release those resources on their own. The same reasoning applies to the failure of a participant.
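The blocking behavior can be seen in a small participant state machine. This is a sketch under assumed names: the class `BlockingParticipant`, its methods, and the state names are all hypothetical.

```python
class BlockingParticipant:
    """Hypothetical 2PC participant that tracks its state and its locks."""

    def __init__(self):
        self.state = "init"
        self.locks_held = False

    def on_prepare(self, can_commit):
        if can_commit:
            # Voting yes means holding the resources until a decision arrives.
            self.state, self.locks_held = "ready", True
            return "yes"
        self.state = "aborted"
        return "no"

    def on_decision(self, decision):
        self.state = "committed" if decision == "commit" else "aborted"
        self.locks_held = False

    def on_coordinator_timeout(self):
        if self.state == "init":
            # No vote cast yet, so aborting unilaterally is safe.
            self.state = "aborted"
        # In "ready" the outcome is unknown: others may already have
        # committed, so the participant must keep waiting with locks held.
        return self.state


p = BlockingParticipant()
p.on_prepare(can_commit=True)
print(p.on_coordinator_timeout(), p.locks_held)  # ready True
```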

Three phase commit

This is an extension of two-phase commit wherein the commit phase is split into two phases as follows.

a. Prepare to commit: after unanimously receiving yes in the first phase of 2PC, the coordinator asks all participants to prepare to commit. During this phase, all participants acquire locks and so on, but they don’t actually commit.

b. Commit: if the coordinator receives yes from all participants during the prepare-to-commit phase, it asks all participants to commit.

Img Courtesy, CC BY 3.0, https://en.wikipedia.org/w/index.php?curid=16452453
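The resulting three rounds can be sketched as follows. As before, this is an in-memory sketch without failures. `ThreePhaseParticipant`, its method names, and its state names are hypothetical.

```python
class ThreePhaseParticipant:
    """Hypothetical in-memory participant for the three-phase sketch."""

    def __init__(self, will_accept=True):
        self.will_accept = will_accept
        self.state = "init"
        self.staged = None

    def can_commit(self, value):
        # Phase 1: vote on the proposed value.
        self.state = "voted_yes" if self.will_accept else "voted_no"
        return self.will_accept

    def pre_commit(self, value):
        # Phase 2: acquire locks and stage the value, but do not apply it.
        self.staged = value
        self.state = "pre_committed"
        return True

    def do_commit(self):
        # Phase 3: apply the staged value.
        self.state = "committed"

    def abort(self):
        self.staged = None
        self.state = "aborted"


def three_phase_commit(value, participants):
    def abort_all():
        for p in participants:
            p.abort()
        return "abort"

    # Phase 1: the same voting round as in 2PC.
    if not all([p.can_commit(value) for p in participants]):
        return abort_all()
    # Phase 2: ask everyone to prepare to commit and wait for their acks.
    if not all([p.pre_commit(value) for p in participants]):
        return abort_all()
    # Phase 3: only now is anyone told to actually commit.
    for p in participants:
        p.do_commit()
    return "commit"


nodes = [ThreePhaseParticipant() for _ in range(3)]
print(three_phase_commit("x=1", nodes))  # commit
print(three_phase_commit("x=2", nodes + [ThreePhaseParticipant(will_accept=False)]))  # abort
```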

The pre-commit phase introduced above helps us recover from a participant failure, or from the failure of both the coordinator and a participant, during the commit phase. When a recovery coordinator takes over after the coordinator fails during what was phase 2 of 2PC, the new pre-commit phase comes in handy as follows. On querying the participants, if it learns that some nodes are in the commit phase, it can assume that the previous coordinator made the decision to commit before crashing, and hence it can shepherd the protocol to commit. Similarly, if a participant says that it has not received the prepare-to-commit message, the new coordinator can assume that the previous coordinator failed before completing the prepare-to-commit phase. Hence it can safely assume that no other participant has committed the changes, and can safely abort the transaction.
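The recovery rule described above can be written as a small decision function. The state names are hypothetical. The last branch, where every reachable participant reports that it is pre-committed, is not covered by the text, so finishing the commit there is an assumption of this sketch.

```python
def recovery_decision(reported_states):
    """Decide what a recovery coordinator does, given the participants' states."""
    states = set(reported_states)
    if not states:
        raise ValueError("no participant could be reached")
    if "committed" in states:
        # The old coordinator had already decided to commit before crashing.
        return "commit"
    if states & {"init", "voted_yes", "voted_no", "aborted"}:
        # Someone never received prepare-to-commit, so nobody has committed.
        return "abort"
    # Everyone is pre-committed (assumption, not stated in the text):
    # all votes were yes, so the commit can be completed.
    return "commit"


print(recovery_decision(["pre_committed", "committed", "pre_committed"]))  # commit
print(recovery_decision(["pre_committed", "voted_yes"]))  # abort
```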