An unsafe ambiguity in Paxos Made Simple

Michael Deardeuff
8 min readJan 31, 2022

--

Note: I originally wrote about this circa 2015, but that paper was never published and is now lost. I have repeatedly received requests about this ambiguity, so here is a re-write of that original (lost) paper. In many ways this paper is clearer and simpler.

Introduction

Paxos Made Simple [1] is an important paper in the development of distributed systems. It presents a simple consensus protocol used as the basis of other protocols and is also used in large-scale production systems. Unfortunately, Paxos Made Simple includes a line of prose that makes it easy to nullify its fundamental safety invariant: making it possible to chose multiple values.

In this paper we introduce two different styles of Paxos. We then show that Paxos Made Simple inadvertently mixes these two interpretations. Finally, we give a scenario where this mixed-style Paxos fails its safety invariant.

Two styles of Paxos

There are two interpretations of the Paxos protocol used in the literature. We introduce the terms Strong Accept Paxos and Strong Prepare Paxos to differentiate them. The following are descriptions for both interpretations. Strong Prepare Paxos is taken verbatim from Paxos Made Simple (pages 5–6). The differences between the two interpretations are highlighted (they are in Phase 2).

Strong Accept Paxos

Phase 1. (a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.
(b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.

Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each acceptor for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.
(b) If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request or an accept request having a number greater than n. It responds to the request with a promise not to accept any more proposals numbered less than n.

Strong Prepare Paxos

Phase 1. (a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors.
(b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.

Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.
(b) If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater than n.

An unsafe mixing of the two interpretations

It is safe to add the highlighted portions of both styles to a Paxos implementation — the guarantees are strictly stronger; we call this variant Doubly Strong Paxos. But removing the highlighted portions of both styles leads to a violation of the safety invariant that “Only a single value is chosen,” as we will show in the next section. We call this latter variant Unsafe Paxos.

Unfortunately, just before it presents Strong Prepare Paxos the paper Paxos Made Simple implies that the implementation doesn’t need its strengthener with the following quote (Paxos Made Simple page 5, emphasis ours):

A proposer issues a proposal by sending, to some set of acceptors, a request that the proposal be accepted. (This need not be the same set of acceptors that responded to the initial requests.) Let’s call this an accept request.

That statement only makes sense in the context of Strong Accept Paxos. But unsuspecting implementers may (and have) applied that statement in the context of Strong Prepare Paxos, rendering their implementations into Unsafe Paxos. This is understandable given this quote’s proximity to the Strong Prepare Paxos description in Paxos Made Simple.

We have found this error in several implementations of Paxos on Github. We do not reference them here.

A fatal scenario

In this section we present a scenario where the Unsafe Paxos violates the safety invariant that only a single value is chosen. We then show the same scenario with both Strong Accept Paxos and Strong Prepare Paxos to show that they keep the invariant.

The scenario in this section is not the only failing scenario, just one of an infinite number of failing scenarios for Unsafe Paxos.

The scenario has three acceptors (A1, A2, A3) and at least two proposers (P1, P2, P*) where P* is any proposer not limited to P1, P2. This scenario calls for specific dropped and delayed messages; in particular it has a message delay to the first proposer (P1).

The scenario is the following:

  1. P1 sends prepare, but A3 drops the message
  2. P2 sends prepare, but A3 drops the message
  3. P2 sends accept, but A1 drops the message
  4. P1 sends accept, but A1 drops the message
  5. P* sends prepare, but A2 drops the message
  6. P* sends accept

The following traces show how the acceptor’s variables change through the scenario; p is the promise made in Phase 1b (and in Phase 2b for Strong Accept Paxos); a is the pair of the value and its proposal number as saved in Phase 2b. If the variable is blank on a line, it has the same value as the line above it, or null if the variable was never set.

Unsafe Paxos

A1  | A2  | A3  | Notes
p a | p a | p a |
----+-----+-----+--------
| | | P1 sends Prepare(1) to A1, A2, A3 (A3 drops it)
1 | 1 | | A1, A2 send PrepareAck(null) to P1
| | | P2 sends Prepare(2) to A1, A2, A3 (A3 drops it)
2 | 2 | | A1, A2 send PrepareAck(null) to P1
| | | P2 sends Accept(y@2) to A1, A2, A3 (A1 drops it)
| y@2| y@2| A2, A3 send AcceptAck to P2
** y is committed ** | | | P1 sends Accept(x@1) to A1, A2, A3 (A1 drops it)
| | x@1| A3 sends AcceptAck to P1;
A2 sends AcceptNack to P1
| | | P* sends Prepare(3) to A1, A2, A3 (A2 drops it)
3 | | 3 | A1 sends PrepareAck(null);
A3 sends PrepareAck(x@1) to P*
| | | P* sends Accept(x@2) to A1, A2, A3
x@3| x@3| x@3| A1, A2, A3 send AcceptAck to P*
** x is committed; clobbering y **

Strong Accept Paxos

A1  | A2  | A3   | Notes
p a | p a | p a |
----+-----+------+--------
| | | P1 sends Prepare(1) to A1, A2, A3 (A3 drops it)
1 | 1 | | A1, A2 send PrepareAck(null) to P1
| " | | P2 sends Prepare(2) to A1, A2, A3 (A3 drops it)
2 | 2 | | A1, A2 send PrepareAck(null) to P1
| " | | P2 sends Accept(y@2) to A1, A2, A3 (A1 drops it)
| y@2| 2 y@2| A2, A3 send AcceptAck to P2
** y is committed ** | | | P1 sends Accept(x@1) to A1, A2, A3 (A1 drops it)
| | | A2, A3 send AcceptNack to P1
| | | P* sends Prepare(3) to A1, A2, A3 (A2 drops it)
3 | | 3 | A1 sends PrepareAck(null);
A3 sends PrepareAck(y@2) to P*
| | | P* sends Accept(y@3) to A1, A2, A3
y@3| y@3| y@3| A1, A2, A3 send AcceptAck to P*

Strong Prepare Paxos

A1  | A2  | A3  | Notes
p a | p a | p a |
----+-----+-----+--------
| | | P1 sends Prepare(1) to A1, A2, A3 (A3 drops it)
1 | 1 | | A1, A2 send PrepareAck(null) to P1
| | | P2 sends Prepare(2) to A1, A2, A3 (A3 drops it)
2 | 2 | | A1, A2 send PrepareAck(null) to P1
| | | P2 sends Accept(y@2) to A1, A2 (A1 drops it)
| y@2| | A2 sends AcceptAck to P2
| | | P1 sends Accept(x@1) to A1, A2 (A1 drops it)
| | | A2 sends AcceptNack to P1
| | | P* sends Prepare(3) to A1, A2, A3 (A2 drops it)
3 | | 3 | A1, A3 send PrepareAck(null) to P*
| | | P* sends Accept(z@3) to A1, A3
z@3| | z@3| A1, A3 send AcceptAck to P*
** z is committed **

A plug for Strong Accept Paxos

We prefer Strong Accept Paxos over Strong Prepare Paxos from an engineering perspective. Strong Accept Paxos recovers faster from dropped messages because it can always communicate with any acceptor.

It is also easy to repair Unsafe Paxos into Strong Accept Paxos: the acceptor merely sets its promise variable on both prepare and accept messages.

Conclusion

We have introduced two styles of Paxos — one where an Accept is also a Promise (Strong Accept Paxos) and one where Proposers only send to the promising Acceptors (Strong Prepare Paxos). We pointed out that while Paxos Made Simple explicitly presents Strong Prepare Paxos, it slips in a nice property of Strong Accept Paxos that weakens Strong Prepare Paxos into an Unsafe Paxos. This property is the ability to send accept messages to any Acceptor. We presented one scenario where Unsafe Paxos has two values being chosen.

Paxos implementers and maintainers should check their implementation to ensure that they do not implement Unsafe Paxos. The authors think that Strong Accept Paxos is simpler to implement and easier to retrofit into an Unsafe Paxos implementation.

Acknowledgements

This stemmed from a question on Stack Overflow asked by the user lambda [3]. That question gave me an uncomfortableness for a week before coming to this realization. After searching around for a while, I found that several proprietary and open-source implementations of Paxos had implemented what I now call Unsafe Paxos.

Around that time I conversed with Leslie Lamport via email. He was the first to suggest that I write this paper. (I didn’t think the original paper was very good, which is why I didn’t try to publishing it anywhere.)

Lamport has mentioned the ambiguity several times since our conversation [eg, 4]. He has a warning on his web page warning against the ambiguity [1].

I brought this to the attention of Marc Brooker, probably in 2015, who wrote about it in his blog in 2019 [2].

I’d like to thank Deokgi Hong for reaching out to me and being the straw-that-broke-the-camels-back and inspiring me to revive this paper. He also reviewed an early draft of this paper. It was in this struggle for clarity that I was inspired to put actual names to the styles of Paxos (viz. Strong Accept Paxos andStrong Commit Paxos).

I’d also like to thank my mentors, explicit or otherwise, that inspired me to think critically; for this project they were Al Vermuelen, Marc Brooker, Chris Newcombe, and, of course, Leslie Lamport.

Sources

  1. Lamport’s official list of papers, including a reference to the ambiguity: https://lamport.azurewebsites.net/pubs/pubs.html#paxos-simple
  2. Marc Brooker writes about the ambiguity : https://brooker.co.za/blog/2021/11/16/paxos.html
  3. The question that started it all: https://stackoverflow.com/questions/29880949/contradiction-in-lamports-paxos-made-simple-paper
  4. Lamport mentions the ambiguity in a talk (timestamped): https://youtu.be/8-Bc5Lqgx_c?t=4398

--

--