PACELC Theorem Explained: Distributed Systems Series

Lohith Chittineni
Distributed Systems Series
8 min read · Oct 6, 2023

Hi! In this article I’m going to further explore the trade-offs that many system designers face in their distributed systems and databases by going over the PACELC Theorem (“pass-elk”), which extends the popular CAP Theorem. If you aren’t familiar with the CAP Theorem, check out my previous article, which provides an in-depth summary of what it is and why it is useful!

CAP Theorem Explained: Distributed Systems Series

Now let’s dive in! As a refresher, what is a distributed system?

“A distributed system can be defined as a network of computers that work together to provide services or solve problems. These computers in the network are able to communicate with each other in order to execute tasks and applications.”

And what is CAP Theorem?

“ In a network subject to communication failures, it is impossible for any web service to implement an atomic read/write shared memory that guarantees a response to every request.”[ Gilbert and Lynch ]

But what is the PACELC Theorem? PACELC (“pass-elk”) is an acronym that stands for

  • Partition
  • Availability
  • Consistency
  • Else
  • Latency
  • Consistency

And the theorem states …

“ if there is a partition (P), how does the system trade off availability and consistency (A and C); else (E), when the system is running normally in the absence of partitions, how does the system trade off latency and consistency (L and C)” [Daniel J. Abadi]

Now that we know what the PACELC Theorem is, why does it exist?

Taking a look back at the CAP Theorem, we saw that in the absence of partitions (meaning we are operating on a stable network and no connections between nodes go down) a system can offer both full consistency and availability (CA). However, we noted that this is not a reasonable assumption in practice: partitions and network failures are effectively guaranteed, so we focus instead on available and partition-tolerant (AP) systems or consistent and partition-tolerant (CP) systems.

But let’s look again at a CA system. If there are no network failures or instability and we do not have to worry about partitions, are there really no other trade-offs to think about? This is one of the doubts many have about the CAP Theorem, and it is why PACELC was introduced. Since we have already talked about the trade-off between ‘availability’ and ‘consistency’, we can now focus our attention on the latter half of PACELC, the trade-off between ‘latency’ and ‘consistency’ when no network partition exists.

What is latency? Latency can be defined as ‘the delay or time it takes for data to travel from its source to its destination; the time interval between initiating a request or action and receiving a response or result’. Latency is present in your system whether a partition exists or not. If you look at the way ‘availability’ is described in the CAP Theorem, full availability can be thought of as zero latency, while a completely unavailable system is one whose latency is effectively unbounded. Based on PACELC, latency receives more attention when a partition does not exist. This aligns with the earlier point about the CAP Theorem: in practice most systems do not offer full availability, only a best effort, and latency is how that best effort is measured.

Now let’s take a look at a few example system designs showing how the trade-off between latency and consistency can play out! All diagrams start at the top and read left to right. In these diagrams we use the following definitions for synchronous and asynchronous…

Synchronous: once a request is sent, a response is required before the next tasks can continue (blocking).

Asynchronous: once a request is sent, a response is not required immediately and other tasks can continue (non-blocking).
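To make the distinction concrete, here is a minimal Python sketch (the replicate function and node names are placeholders I made up for illustration) that contrasts a blocking call with a fire-and-forget call using a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor
import time

def replicate(node: str, value: str) -> str:
    """Hypothetical network call that copies `value` to a replica node."""
    time.sleep(0.1)  # stand-in for network round-trip latency
    return f"{node} acknowledged {value}"

pool = ThreadPoolExecutor(max_workers=4)

# Synchronous (blocking): wait for the acknowledgement before moving on.
ack = replicate("replica-1", "x=42")
print(ack)  # the next task cannot start until this call has returned

# Asynchronous (non-blocking): submit the request and continue immediately.
future = pool.submit(replicate, "replica-2", "x=42")
print("continuing with other work...")  # runs before the replica responds
print(future.result())                  # the ack can be collected later (or never)
```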

Preprocessing Data Replication System Chart

Preprocessing Data Replication (Consistent, High Latency)

In this first example we show what data replication could look like when every request from the client nodes passes through a preprocessing system. As requests go through the preprocessor, all the replica nodes agree on how to process the update requests and modify their values in the same order. This maintains consistency across the replica nodes, but it introduces sources of increased latency. One source is that every request now has to go through the preprocessor, which takes time to order or sequence multiple requests. Another is that requests are no longer sent directly to the data nodes, and the preprocessing system could be geographically much farther from the client, increasing latency further.
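As a rough sketch of the idea (the class names and in-memory “replicas” are my own simplification, not the exact design in the diagram), the preprocessor can be modeled as a sequencer that assigns a global order to each update before every replica applies it in that same order:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Replica:
    """A replica that applies updates strictly in the order the sequencer assigns."""
    name: str
    data: Dict[str, str] = field(default_factory=dict)
    last_seq: int = -1

    def apply(self, seq: int, key: str, value: str) -> None:
        assert seq == self.last_seq + 1   # updates arrive in sequence order
        self.data[key] = value
        self.last_seq = seq

class Preprocessor:
    """Assigns a global sequence number to every update before fanning it out.
    The extra hop (and the ordering work) is where the added latency comes from."""
    def __init__(self, replicas: List[Replica]):
        self.replicas = replicas
        self.next_seq = 0

    def submit(self, key: str, value: str) -> int:
        seq = self.next_seq
        self.next_seq += 1
        for r in self.replicas:           # every replica sees the same order
            r.apply(seq, key, value)
        return seq

replicas = [Replica("r1"), Replica("r2"), Replica("r3")]
sequencer = Preprocessor(replicas)
sequencer.submit("x", "1")
sequencer.submit("x", "2")
print([r.data["x"] for r in replicas])    # ['2', '2', '2']: consistent everywhere
```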

Synchronous Data Replication System Chart

Synchronous Data Replication (Consistent, High Latency)

In this system, we introduce the idea of a primary node that accepts all write requests and replica/secondary nodes that accept all read requests. The replication is synchronous because after a write request reaches the primary node, updates are sent to all the replica nodes and the primary waits until every replica has successfully applied them. As shown in the diagram, a read request cannot be served until all synchronous updates have completed. This preserves consistency but increases latency, because every write waits on all the replica nodes; some nodes may be much farther from the primary, so the system is limited by the time taken to update the farthest node.
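Here is a minimal sketch of that behavior, again with hypothetical names and in-memory replicas, where the primary does not return from a write until every replica has applied it:

```python
import time
from typing import Dict, List

class Replica:
    def __init__(self, name: str, delay: float):
        self.name, self.delay = name, delay
        self.data: Dict[str, str] = {}

    def apply(self, key: str, value: str) -> None:
        time.sleep(self.delay)        # stand-in for network distance to this replica
        self.data[key] = value

class Primary:
    def __init__(self, replicas: List[Replica]):
        self.replicas = replicas
        self.data: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        for r in self.replicas:       # block until *every* replica has acknowledged
            r.apply(key, value)       # write latency >= latency of the slowest replica

replicas = [Replica("nearby", 0.01), Replica("far-away", 0.2)]
primary = Primary(replicas)

start = time.time()
primary.write("x", "42")
print(f"write took {time.time() - start:.2f}s")  # dominated by the far-away replica
print(replicas[1].data["x"])                     # safe to read from any replica now
```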

Asynchronous Data Replication I System Chart

Asynchronous Data Replication I (Consistent, High Latency)

Now let’s take a look at a system that sends asynchronous requests. Once the primary node receives a write request, it sends asynchronous updates to all the replica nodes. This means the primary does not wait for a response confirming that replication has completed; it assumes the update will be applied once the request has been sent. In this system the primary node is responsible for processing all read and write requests: when a client makes a read request, even if it first reaches a replica node, it is rerouted to the primary. As a result, consistency is maintained even though the updates are asynchronous. However, latency is increased because every request has to be routed all the way to the primary node before a response is returned.
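A rough sketch of this setup might look like the following, where replication is fire-and-forget but every read is answered by the primary (the names and structure are illustrative only):

```python
import threading
from typing import Dict, List

class Replica:
    def __init__(self, name: str):
        self.name = name
        self.data: Dict[str, str] = {}

class Primary:
    """Applies writes locally, replicates in the background, and serves all reads itself."""
    def __init__(self, replicas: List[Replica]):
        self.replicas = replicas
        self.data: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self.data[key] = value
        for r in self.replicas:
            # fire-and-forget: do not wait for the replica to confirm the update
            threading.Thread(target=r.data.update, args=({key: value},)).start()

    def read(self, key: str) -> str:
        # every read is rerouted here, so it always sees the latest write,
        # at the cost of an extra hop (and a single-node bottleneck)
        return self.data[key]

primary = Primary([Replica("r1"), Replica("r2")])
primary.write("x", "42")
print(primary.read("x"))   # consistent, even if the replicas have not caught up yet
```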

Asynchronous Data Replication II System Chart

Asynchronous Data Replication II (Inconsistent, Low Latency)

Let’s look at another asynchronous replication scenario. This is similar to the previous one in that the primary node still performs asynchronous updates to all the replica nodes, but now the primary node is also open for read requests. In this example, after client 1 makes a write request, client 2 makes a read request and the system sends it to both the primary node and a replica node. When the returned values differ, the system needs to maintain some sequence number that signals which value is the latest one, so the stale copy can be discarded. Overall the system may serve inconsistent reads, but tracking sequence numbers lets it return the most recent value it can find, mitigating the inconsistency. Latency, on the other hand, is much better because we no longer need to wait for the primary node to acknowledge any or all replica updates. Note that this example shows reads and writes going through the primary node, but the same trade-offs hold even if writes are allowed across multiple nodes.
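One simple way to picture the sequence-number idea (this is my own toy illustration, not the article’s exact mechanism) is to tag every stored value with a version and have the read keep whichever copy is newest:

```python
from typing import Dict, Optional, Tuple

# Each stored value carries a sequence number; a read that contacts several
# nodes keeps whichever copy has the highest one.
Versioned = Tuple[int, str]  # (sequence number, value)

primary: Dict[str, Versioned] = {"x": (2, "new")}
replica: Dict[str, Versioned] = {"x": (1, "old")}   # async update not applied yet

def read(key: str, *nodes: Dict[str, Versioned]) -> Optional[str]:
    """Return the value with the highest sequence number among the contacted nodes."""
    copies = [node[key] for node in nodes if key in node]
    if not copies:
        return None
    seq, value = max(copies)   # tuples compare by sequence number first
    return value

print(read("x", primary, replica))   # prints 'new': the stale replica copy is discarded
```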

Asynchronous Data Replication III System Chart

Asynchronous Data Replication III (Medium Consistency, Medium Latency)

What if, instead of doing only synchronous replication or only asynchronous replication, we tried a hybrid of both? That is what this system diagram portrays. In this scenario the primary node sends synchronous update calls to a subset of replica nodes and asynchronous calls to all other replica nodes. This creates a middle ground where we don’t fully sacrifice consistency or latency for the other. For this system we choose how many nodes must participate in synchronous reads and writes using the following relation:

R + W > N   or   R + W ≤ N

R: the number of replica nodes that must respond synchronously to a READ

W: the number of replica nodes that must synchronously acknowledge a WRITE

N: the total number of replica nodes

If R + W > N, consistency is maintained but the system is still subject to latency overhead. If R + W ≤ N, latency is reduced, but it is possible to read from nodes that have only received asynchronous updates, which can lead to inconsistent data. With this bound in place the system sacrifices either a little more consistency or a little more latency for the other, aiming for a best effort at both properties.
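As a tiny illustration of that rule (purely a sketch, assuming N replicas with read quorum R and write quorum W), the overlap check is a single comparison:

```python
def quorums_overlap(r: int, w: int, n: int) -> bool:
    """R + W > N guarantees every read quorum intersects every write quorum,
    so at least one node contacted on a read holds the latest write."""
    return r + w > n

# N = 5 replicas
print(quorums_overlap(r=3, w=3, n=5))  # True  -> consistent, but more waiting per request
print(quorums_overlap(r=1, w=1, n=5))  # False -> fast, but a read may miss the latest write
```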

From these examples it can be seen that many factors go into weighing the cost of consistency against the cost of latency, and each design offers its own pros and cons depending on which properties are given more priority:

  • Deciding between synchronous and asynchronous calls between data nodes
  • Taking into account geographical distance between nodes for routing data
  • Choosing how many nodes should be available for reads and writes in the cluster

Hopefully, this gave some insight into the compromises between consistency and latency when designing a distributed system and showed that there is no single clear-cut way to handle these bottlenecks. Neither the CAP nor the PACELC Theorem provides an absolute answer for building your system; it is up to designers to weigh the trade-offs they introduce. PACELC serves as an extension of the CAP Theorem and raises further questions about the potential faults and downsides you will need to think about when designing a system.

For a more in-depth discussion of the PACELC Theorem, make sure to check out Daniel J. Abadi’s original paper. Also check out his original blog post, where he discusses the motivation behind introducing PACELC.

Thanks for reading!
