ACID, CAP, and BASE

Feb 24, 2020

A lot has been said and published on these topics. “Designing Data-Intensive Applications” summarizes it all, IMHO, in one of the most impactful ways.

A colleague of mine pointed out the other day that there never was an ACID vs. CAP battle, nor an ACID vs. BASE or a CAP vs. BASE one. We cannot find a better way to state it. We will look at these concepts concisely, so that we can assess for ourselves which one applies when.

Source: https://unsplash.com/photos/L1bAGEWYCtk

ACID

Transferring $100 from account A to account B involves, at a high level, two steps:

1. Debit $100 from account A
2. Credit $100 to account B

However, a myriad of things can go wrong while performing these steps. For example:

The application server can crash after step 1.
The database can crash while performing step 2.
Two people, X and Y, can each initiate the transfer independently at the same time.

Transactions (see “Principles of Transaction-Oriented Database Recovery”) typically provide the fundamental guarantees against these failures. They do so through four basic properties: Atomicity, Consistency, Isolation, and Durability.

Atomicity: If a set of operations is initiated as an atomic transaction, either all of them succeed or none do. For example, when the $100 transfer from account A to B runs as a transaction, either both steps 1 and 2 take effect or neither does. There is no partial success.

Consistency: The data must satisfy the application's invariants before and after every transaction. For example, an employee ID in an organization must be unique for each employee, and every employee must belong to a department. The Consistency property ensures that such facts always hold. Relational database systems enforce it using unique keys, foreign key constraints, triggers, etc.

Isolation: An ongoing transaction should neither see nor be seen by other concurrently running transactions. Otherwise, a transaction that aborts could not cleanly restore the state it started from, because others might already have acted on its uncommitted changes. Relational database systems offer several isolation levels to tune this for specific needs, e.g. read uncommitted, read committed, repeatable read, and serializable.

Durability: Once a transaction is committed, its updates should not be lost, even if the system crashes afterward. Database systems use write-ahead logs, persistent storage such as hard disks, backups, etc. to achieve it.
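To make Atomicity and Consistency concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the account IDs, the `CHECK (balance >= 0)` rule, and the `crash_after_debit` flag are all assumptions made for illustration; they are not taken from any particular banking system.

```python
import sqlite3


def open_bank() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # a consistency rule
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
    conn.commit()
    return conn


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int,
             crash_after_debit: bool = False) -> bool:
    """Run the debit and the credit as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if crash_after_debit:
                # Stand-in for the application failing after step 1.
                raise RuntimeError("simulated crash after the debit")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except (RuntimeError, sqlite3.IntegrityError):
        # IntegrityError is raised when the CHECK constraint is violated.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    """Return the current balance of every account, keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Calling `transfer(conn, "A", "B", 100, crash_after_debit=True)` leaves both balances untouched because the debit is rolled back, and an overdraft such as `transfer(conn, "A", "B", 500)` is rejected by the constraint. Only a transfer that completes both steps changes the balances.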

So far so good, but mostly for single-machine data systems. The moment multiple machines form a system, strictly obeying the ACID rules gets tricky. We need other sets of rules and theorems for distributed systems. That is where CAP and BASE come in.

CAP

In a distributed environment, network partitions among the member nodes are a given. The CAP theorem states that, in the event of a network partition, the system can be either Available or Consistent, but not both.

Consistency: In the context of CAP, Consistency means that every replica holding a particular record must return the exact same value. This need not be a physical guarantee, where every replica stores identical data at every instant; a data system can provide a logical guarantee instead, for example by answering from a quorum of replicas.

Availability: Every node that is active at a given moment must be able to respond to the operations it receives.

Partition-tolerance: The system must be able to tolerate network partitions among its participant nodes.

Cassandra is commonly classified as an AP system and MongoDB as CP. A single-node MySQL is often labeled CA, but it is not distributed, so the partition question does not arise.
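The quorum idea and the Available-or-Consistent choice can be sketched with a toy in-memory model. Everything here is hypothetical: the `Replica` and `QuorumStore` names, the coordinator-side version counter, and the `reachable` flag that stands in for a network partition. It does not mirror how Cassandra or MongoDB are implemented; it only assumes the usual rule that a read quorum and a write quorum overlap when R + W > N.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    reachable: bool = True       # False simulates being cut off by a partition
    version: int = 0             # version of the last write this replica saw
    value: Optional[str] = None


class QuorumStore:
    """Toy coordinator for N replicas with write quorum W and read quorum R."""

    def __init__(self, replicas: List[Replica], write_quorum: int, read_quorum: int):
        self.replicas = replicas
        self.w = write_quorum
        self.r = read_quorum
        self.next_version = 1    # simplification: one coordinator assigns versions

    def _reachable(self) -> List[Replica]:
        return [rep for rep in self.replicas if rep.reachable]

    def write(self, value: str) -> bool:
        live = self._reachable()
        if len(live) < self.w:
            return False         # consistent choice: refuse the write
        for rep in live:
            rep.version, rep.value = self.next_version, value
        self.next_version += 1
        return True

    def read(self) -> Optional[str]:
        """Consistent read: needs R replicas and returns the newest version."""
        live = self._reachable()
        if len(live) < self.r:
            raise RuntimeError("read quorum not reachable")
        newest = max(live[: self.r], key=lambda rep: rep.version)
        return newest.value

    def read_available(self) -> Optional[str]:
        """Available read: answer from any reachable replica, possibly stale."""
        live = self._reachable()
        if not live:
            raise RuntimeError("no replica reachable")
        return live[0].value
```

With three replicas and W = R = 2, a read that contacts one up-to-date replica and one stale replica still returns the latest value, because the two quorums overlap. When only one replica is reachable, `read` and `write` refuse to answer (the CP choice), while `read_available` still responds but may return an old value (the AP choice).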

BASE

Basically-Available: A distributed system should respond to any incoming request with some acknowledgment, even if it is a failure message.

Soft-state: The state of the system may keep changing over time as it receives new information.

Eventually-consistent: The components in the system may not reflect the same value/state of a record at a given point in time. They will converge over time, though.

For example, consider two interconnected but independent services, an Order service and a Payment service. The following can be a series of high-level processing steps:

1. The customer places an Order.
2. The Order service updates the details in its local database. It marks the payment status as ‘payment_initiated’.
3. The Order service sends a message to the queue for the Payment service to consume.
4. The Payment service receives the message, but for some reason, the payment processing fails. It updates the details in its local database. It also sends a message to the Order service with the details.
5. The Order service updates its payment record as ‘payment_failed’ and sends alternate payment links to the user.
6. Steps 3 and 4 are carried out again when the user retries the payment. Upon successful completion of the payment, the Payment service updates its details as ‘payment_complete’ and sends another message with the updates to the Order service.
7. The Order service updates its record as ‘payment_complete’.

In this example, the Payment service was able to respond to the Order service even though the processing itself failed the first time (basically available). The Order service changed its local payment status based on the updates it received from the Payment service (soft state). And for a while the Order and Payment services held different payment statuses; the final state was agreed upon only after some time (eventually consistent). This, by the way, is a commonly cited example of the choreographed Saga pattern, and it follows the BASE principles.
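The flow above can be simulated in a few lines. This is only a sketch under stated assumptions: the `OrderService`, `PaymentService`, and `run_saga` names are hypothetical, two in-memory deques stand in for the message queue, and the `fail_first_attempts` parameter forces the first payment attempt to fail.

```python
from collections import deque


class PaymentService:
    def __init__(self, outbox: deque, fail_first_attempts: int = 1):
        self.db = {}                  # local database: order_id -> status
        self.outbox = outbox          # queue consumed by the Order service
        self.failures_left = fail_first_attempts

    def handle(self, message: dict) -> None:
        order_id = message["order_id"]
        if self.failures_left > 0:
            self.failures_left -= 1
            status = "payment_failed"
        else:
            status = "payment_complete"
        self.db[order_id] = status
        # Always respond, even when the processing itself failed.
        self.outbox.append({"order_id": order_id, "status": status})


class OrderService:
    def __init__(self, outbox: deque):
        self.db = {}                  # local database: order_id -> status
        self.outbox = outbox          # queue consumed by the Payment service

    def place_order(self, order_id: str) -> None:
        self.db[order_id] = "payment_initiated"
        self.outbox.append({"order_id": order_id})

    def retry_payment(self, order_id: str) -> None:
        self.outbox.append({"order_id": order_id})

    def handle(self, message: dict) -> None:
        self.db[message["order_id"]] = message["status"]


def run_saga(order_id: str = "o-1") -> list:
    """Replay the steps and record (order status, payment status) after each."""
    to_payment, to_order = deque(), deque()
    orders = OrderService(outbox=to_payment)
    payments = PaymentService(outbox=to_order)
    history = []

    def snapshot() -> None:
        history.append((orders.db.get(order_id), payments.db.get(order_id)))

    orders.place_order(order_id)
    snapshot()
    payments.handle(to_payment.popleft())   # first attempt fails
    snapshot()
    orders.handle(to_order.popleft())
    snapshot()
    orders.retry_payment(order_id)          # the user retries the payment
    payments.handle(to_payment.popleft())   # second attempt succeeds
    snapshot()
    orders.handle(to_order.popleft())
    snapshot()
    return history
```

The snapshots returned by `run_saga()` show the two services disagreeing at several points and agreeing on ‘payment_complete’ only at the end, which is the eventually consistent behavior described above.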

Conclusion

Things slip from the scope of one set of rules to another as we move from a single machine toward a distributed space. ACID, CAP, and BASE can be looked at as separate rules/theorems/principles guiding us on our path of software design depending upon various traits of the system in consideration.

Written by Pranabjyoti Bordoloi