The above two concepts are useful starting points for developing a solid understanding of core database concepts. I’m not going to reinvent the wheel, so first, read this quora post, which is the best summary of the above concepts that I’ve read. Now, let’s go a bit deeper.
The CAP Theorem is a useful way to consider the constraints on distributed systems. But it’s imperfect. First, the Quora post mentions that you can’t sacrifice partition tolerance in a reliable system. Let’s look at an example of why that’s the case. Consider the following system, in which each node is X, the master is X*, and a partition is |. The system involves a classic master-slave replication in which the master accepts writes and replicates to the read-only slaves.
Before the partition, everything is fine:
X X* X X X
After the partition, a subset of nodes must elect a new master in order to respond to write requests:
X X* | X X X
Or otherwise, the system will be trapped in a split-brain state, in which each set of nodes is unaware of the other, and operates independently.
Partition Tolerance involves detecting a partition, entering a partition tolerant state (such as rejecting writes), and recovering from a partition after the partition has been repaired. So in order to detect that there’s been a partition, a system must have partition tolerance. Therefore, pure CA is not possible in a distributed system! Note that the above example describes a multi-node failure, which can be caused by any hardware failures in a data center. In a single-node failure, requests can fail over to a replica in a transactionally consistent way, and the system can still respond.
Second, CAP doesn’t explicitly address latency. While availability describes a binary outcome in which a request is responded to or not during a system failure, latency describes a continuous spectrum of a system’s delay in responding to a request under normal operations. Of course, latency could be so high that the system is effectively unavailable, and the client gives up in some way such as timing out the request. Professor Abadi at Yale gives a great explanation of how the latency vs. consistency tradeoff provides a more nuanced model of distributed system performance during normal system operations (other than a partition). The three alternatives for implementing replication, and their variations, each come with a consistency-latency tradeoff:
- Data updates are sent to all replicas at the same time. Assuming multiple updates are submitted concurrently (from multiple clients), If the updates are not applied to all of the replicas in the same order, consistency could suffer. If a preprocessing layer ensures that the updates are applied to all of the replicas in the same order, the preprocessing step introduces extra latency.
- Data updates are sent to an agreed upon location first, usually called the “master node”. Once the master has committed the updates, it replicates the updates to all replicas. The replication can occur synchronously, asynchronously, or in some combination of the two. Again there’s a tradeoff — more asynchronous replication leads to more inconsistency as nodes could be read from before they receive the newest updates.
- Data updates are sent to a certain location depending on the data item or the state of the system (such as load balancer rerouting), and then propagated to the replicas.
So, CAP is a great heuristic for thinking about distributed systems, but should not be interpreted too rigidly.
I’ll add some clarifications to ACID. First, “a transaction is a sequence of operations performed as a logical unit of work” (Microsoft TechNet). The concept of a transaction makes atomicity possible by grouping commands so that they either all succeed or all fail.
Next, while the “C” in the CAP Theorem refers to each node in the system having the same data, the “C” in ACID refers to a single node enforcing the same rules on every potential commit, such as the type of a field or a field not being null.
Also, isolation, which is a database property that guarantees that one transaction will not depend on the results of a transaction that has not yet been committed, can be implemented in multiple ways, with varying degrees of strictness. From wikipedia, in decreasing order of isolation level:
- Serializable — all transactions require read and write locks, as well as range locks, and so the transactions occur one after the other, in a serialized fashion.
- Repeatable reads — “every lock acquired during a transaction is held for the duration of the transaction” (Percona). That means that reads of a set of rows that were read/written in any operation in a transaction are “repeatable” (will give the same result). However, there are some limitations to this isolation level.
“phantom reads” can occur in which the rows returned by two identical queries (aka “repeatable reads”) are different. For example:
/* Transaction 1, Query 1 */
SELECT * FROM users WHERE age BETWEEN 10 AND 30; /* While Transaction 1 is still in progress, Transaction 2, Query 2 occurs */
INSERT INTO users(id,name,age) VALUES ( 3, 'Bob', 27 ); COMMIT; /* Query 1 is executed again during Transaction 1, but its result will be different. */
SELECT * FROM users WHERE age BETWEEN 10 AND 30;
- Read committed — all transactions require write locks only. This isolation level means that if you write to a row during a transaction, that row will not be affected by any other transactions until the transaction is finished. However, rows that are read can be affected by other transactions, so reads are non-repeatable. Query 1 in Transaction 1 could give a different result from Query 2, because Transaction 2 could modify rows affected by Query 1. That could occur even without phantom reads.
- Read uncommitted — no locking, so in addition to reads being non-repeatable, dirty reads and dirty writes can occur. Dirty reads are when a transaction sees uncommitted changes made by other transactions.
The tradeoff on all of these isolation levels is speed of database operations vs. consistency. Less locks means more transactions can read/write a particular row in a given period of time, but also means that there could be more variation in reads/writes from the same row. The loosest level of isolation, read uncommitted, has the additional downside of transactions potentially reading data that is not committed. This could occur of Transaction 2 reads rows that Transaction 1 has modified, and then Transaction 1 is rolled back.
Knowing a one-sentence definition for each of the terms in the CAP and ACID acronym is important. Depending on your job, you may not have to go any deeper than that. With that in mind, I’ll end with my own one-sentence definitions:
CAP — describes distributed systems
C — consistency — different nodes respond with the same data to the same request
A — availability — the system responds to a request, even if the system isn’t working or its data is outdated
P — partition tolerance — the system detects, remains operational during, and heals a partition caused by the failure of one or many nodes
ACID — describes databases
A — atomicity — either all of the operations in a transaction succeed, or none of them do
C — consistency — the database enforces rules about its fields and the relationships between fields
I — isolation — the extent to which rows that a transaction is affecting can be affected by other transactions
D — durability — the database writes its data to a permanent medium (hard drive) so that data is not lost during a power failure or other system failure
Originally published at www.shanemkeller.com, January 2016.