CAP Theorem

Ahmed Akram
Jul 4, 2022 · 5 min read


Introduction

When designing a distributed web service, three properties are generally desired:

  • Consistency
  • Availability
  • Partition-tolerance

However, it is impossible to achieve all three at the same time. Most web services provide strongly consistent data and are expected to be highly available: every request should succeed and receive a response. At the same time, a web service typically runs on a highly distributed network, so it is desirable that it be fault tolerant, i.e. able to survive the network partitioning into multiple components.

Data Consistency

The data must be atomic: there must exist a total order on all operations such that each operation looks as if it were completed at a single instant. This is equivalent to requiring requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time.
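
As a rough illustration (this class and its methods are hypothetical, not something from the article), a register served by a single node that executes one operation at a time trivially satisfies this: once a write completes, every later read observes it.

```python
import threading

class AtomicRegister:
    """A single-node register: operations execute one at a time,
    so every read observes the latest completed write."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()  # serializes all operations

    def write(self, value):
        with self._lock:
            self._value = value

    def read(self):
        with self._lock:
            return self._value

reg = AtomicRegister()
reg.write(1)
assert reg.read() == 1  # once write(1) completes, every read returns 1
```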

Availability

Every request received by a non-failing node in the system must result in a response, even if only one node is available. The data returned may be stale, but the goal is that every non-failing node serves a response.

Partition-tolerance

If the network fails and the nodes split into two groups, the system should continue processing in both sub-groups; in other words, the system must tolerate a network partition or dropped messages between nodes.

CAP Conjecture

It is impossible for a distributed system to have all of these properties at the same time; we are left with a combination of two of them. Since no distributed system is safe from network failure, partition tolerance must be there, so we are left to choose one of the remaining properties: consistency or availability. If availability is not required, it is easy to achieve atomic consistency. If atomic consistency is not required, it is possible to provide high availability and partition tolerance. And if no partitions occur, we can have both consistency and availability.

Beating the CAP Theorem

Partially-Synchronous Model

In this model, a central node does two things:

  • It maintains the state of the database nodes in the system: nodes notify the central node whenever they make an update or request data.
  • It mediates the validity of responses to the other nodes: after sending an update to the central node, if the central node does not respond with an acknowledgement within a certain amount of time, the node that made the update assumes the message has been lost and lets the client know.

This model improves overall consistency in a highly available and partition-tolerant DBMS: it will return a valid response as long as the central node is available. But the response may contain no data, which violates atomic consistency.
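
A minimal sketch of this idea, assuming an in-process queue as the transport and a fixed acknowledgement timeout (none of these names or values come from the article): the updating node waits a bounded time for the central node's acknowledgement and otherwise tells the client the update may have been lost.

```python
import queue
import threading

ACK_TIMEOUT = 2.0  # seconds the updating node waits for the central node's ack

def send_update_via_central(central_inbox, update):
    """Send an update to the central node and wait for its acknowledgement.

    Returns True if the central node acknowledged in time, False if the
    updating node must assume the message was lost and tell the client so.
    """
    ack = queue.Queue(maxsize=1)
    central_inbox.put((update, ack))   # hand the update to the central node
    try:
        ack.get(timeout=ACK_TIMEOUT)   # block for at most ACK_TIMEOUT
        return True                    # central node confirmed the update
    except queue.Empty:
        return False                   # assume the message has been lost

def central_node(inbox, state):
    """The central node records node state and acknowledges each update."""
    while True:
        update, ack = inbox.get()
        state.update(update)           # record the new state
        ack.put("ok")                  # acknowledge back to the sender

inbox, state = queue.Queue(), {}
threading.Thread(target=central_node, args=(inbox, state), daemon=True).start()
print(send_update_via_central(inbox, {"x": 1}))  # True if acknowledged in time
```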

Weak or Eventual Consistency

The partially synchronous model leads to an eventually consistent system as long as availability is maintained across the nodes, and the system can then decide what level of staleness is acceptable in favour of availability. Data is ultimately replicated to enough available nodes that, once every update has been delivered to every node, the system does reach consistency. There will simply be windows of time during which the system is inconsistent, and the business must agree on how wide those windows are allowed to be.
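
One common way replicas converge in such a system is a last-writer-wins rule; the sketch below assumes per-update timestamps and made-up keys purely for illustration. Reads may be stale while an update is still propagating, but once every replica has received every update, they all return the same value.

```python
class LwwReplica:
    """A replica that keeps, per key, the value with the highest timestamp
    (last-writer-wins). Reads can be stale until all updates arrive."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None

a, b = LwwReplica(), LwwReplica()
a.apply("user:1", 1, "ahmed")      # the update reaches replica a first
print(b.read("user:1"))            # None: a stale read during the gap
b.apply("user:1", 1, "ahmed")      # ...until replication delivers it
assert a.read("user:1") == b.read("user:1")  # the replicas have converged
```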

Less complexity

The complexity caused by the CAP theorem is a symptom of fundamental problems in how we approach building data systems. Two problems stand out in particular:

  • The use of mutable state in databases.
  • The use of incremental algorithms to update that state.

We can beat the CAP theorem by preventing the complexity it normally causes. Choose availability over consistency: the system is then eventually consistent, but without any of the usual complexities of eventual consistency. Because the system is highly available, you can always write new data and compute queries.
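
For example (a deliberately simplified sketch; the fact format and the query are invented for illustration), replacing mutable state with an append-only log of immutable facts means queries are recomputed from the full history rather than incrementally updated:

```python
# Immutable, append-only facts: nothing is ever updated in place.
facts = [
    {"account": "A", "delta": +100, "time": 1},
    {"account": "A", "delta": -30,  "time": 2},
    {"account": "B", "delta": +50,  "time": 3},
]

def balance(account):
    """Recompute the query from scratch over all facts, instead of
    incrementally updating a mutable balance."""
    return sum(f["delta"] for f in facts if f["account"] == account)

print(balance("A"))  # 70, derived purely from the immutable history
```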

By rejecting incremental updates, embracing immutable data, and computing queries from scratch each time, you avoid those complexities. This is done with a batch/real-time architecture:

  • Batch computation: we need a system like Hadoop, which can easily store a large, constantly growing dataset and compute functions over it in a scalable way. Hadoop comprises two pieces: a distributed filesystem (HDFS) and a batch processing framework (MapReduce). HDFS is good at storing large amounts of data across files in a scalable way; MapReduce is good at running computations over that data in a scalable way. We store data in flat files on HDFS, and precomputing queries off that data is similarly straightforward: MapReduce is an expressive enough paradigm that nearly any function can be implemented as a series of MapReduce jobs, and tools like Cascalog, Cascading, and Pig make implementing these functions much easier. Finally, you need to index the results of the precomputation so that an application can access them quickly. There is a class of databases that is extremely good at this: read-only stores such as ElephantDB and Voldemort specialize in exporting key/value data from Hadoop for fast querying.
  • Realtime layer: the real-time system precomputes each query function for just the last few hours of data. To resolve a query, you query both the batch view and the real-time view and merge the results to get the final answer (see the sketch after this list). The real-time layer is where you use read/write databases like Riak or Cassandra, and it relies on incremental algorithms to update the state in those databases.
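
The query-time merge can be sketched as follows; the view contents and the merge rule are assumptions for illustration rather than the API of any particular store. The batch view covers everything up to the last batch run, the realtime view covers only the last few hours, and a query combines the two:

```python
# Hypothetical precomputed views for a page-view count query.
batch_view = {"/home": 10_000, "/about": 2_500}  # all data up to the last batch run
realtime_view = {"/home": 42, "/pricing": 7}     # only the last few hours of data

def pageviews(url):
    """Resolve the query by merging the batch view with the realtime view."""
    return batch_view.get(url, 0) + realtime_view.get(url, 0)

print(pageviews("/home"))     # 10042: batch result plus recent increments
print(pageviews("/pricing"))  # 7: only seen by the realtime layer so far
```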

What makes scalable data systems difficult isn’t the CAP theorem. It’s a reliance on incremental algorithms and mutable state that leads to complexity in our systems.

That complexity has always been there; it is only recently, with the rise of distributed databases, that it has gotten out of control.

Appendix

Consistency Models:

  • Strong Consistency: all accesses are seen by all parallel processes in the same sequential order, so only one consistent state can be observed.
  • Weak Consistency: accesses to critical sections are seen sequentially by all parallel processes, while other accesses may be seen in different orders on different nodes.
  • Atomic or Linearizable Consistency: a history of operations (an ordered list of invocation and response events) is linearizable if it may be extended by adding response events such that: the extended list can be re-expressed as a sequential history, and that sequential history is a subset of the original unextended list.
  • t-Connected Consistency: this guarantee allows for some stale data when messages are lost, but provides a time limit on how long it takes for consistency to return once the partition heals. This definition can be generalized to provide consistency guarantees when only some of the nodes are connected and when connections are available only some of the time.
  • Eventual consistency: If no new updates are made to a given data item, eventually, all accesses to that item will return the last updated value.
