An Overview of Databases — Part 3: CAP and BASE Theorem

Saeed Vayghani
Jul 24, 2024


Part 1: DBMS Flow
Part 2: Non-Relational DB vs Relational

Part 3: CAP and BASE Theorem

Part 4: How to choose a Database?
Part 5: Different Solutions for Different Problems
Part 6: Concurrency Control
Part 7: Distributed DBMS
>> Part 7.1: Distributed DBMS (Apache Spark, Parquet + Pyspark + Node.js)
Part 8: Clocks
>> Part 8.1: Clocks (Causal Consistency With MongoDB)
>> Part 8.2: Clocks (MongoDB Replica and Causal Consistency)
Part 9: DB Design Mastery
Part 10: Vector DB
Part 11: An interesting case, coming soon!

CAP and BASE Theorem:

CAP Theorem:

The CAP theorem states that a distributed database system cannot provide all three of the following guarantees simultaneously:

1. Consistency: Data is the same across the cluster.
Every read receives the most recent write or an error. This means that all nodes in the system see the same data simultaneously. When data is written to the database, all subsequent reads should reflect that write.

2. Availability: The DB cluster is always available.
Every request (read or write) receives a response, without guarantee that it contains the most recent write. This ensures that the system remains operational and responsive even in the face of failures.

3. Partition Tolerance: The DB continues functioning despite a network partition.
The system continues to operate despite arbitrary partitioning due to network failures. This means that the system can continue functioning even if there is a loss of communication between some of the nodes in the system.

Tip: You can combine any two of these guarantees, but only by giving up the third.

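1. CA (Consistency and Availability):
Definition: Data is consistent across all the nodes, and every request gets a response.
Constraint: This holds only while there is no network partition, that is, as long as all nodes are online and can reach each other.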

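2. CP (Consistency and Partition Tolerance):
Definition: Data stays consistent even when the network is partitioned.
Constraint: Some or all of the cluster becomes unavailable when a node goes down or cannot be reached, because requests are rejected rather than answered with possibly stale data.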

3. AP (Availability and Partition Tolerance):
Definition: Nodes remain online and keep answering requests even if there is a network partition.
Constraint: It is not guaranteed that all nodes will have the same data, either during the partition or for some time after it heals, until the replicas have reconciled.
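
The difference between CP and AP can be sketched in a few lines of Python. This is a toy model, not how any real database is implemented: the `Replica` and `Cluster` classes, the `mode` flag and the `partitioned` switch are all hypothetical names invented for illustration, and the cluster has only two in-memory nodes.

```python
class Replica:
    """One node holding a key-value store in memory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Two replicas plus a flag that simulates a network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas diverge.
                raise RuntimeError("unavailable: other replica unreachable")
            # AP: accept the write locally; the other replica is now stale.
            replica.data[key] = value
            return
        # No partition: apply the write to both replicas.
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)


for mode in ("CP", "AP"):
    cluster = Cluster(mode)
    cluster.write(cluster.a, "x", 1)   # replicated to both nodes
    cluster.partitioned = True         # the network splits
    try:
        cluster.write(cluster.a, "x", 2)
        outcome = "write accepted"
    except RuntimeError as err:
        outcome = f"write rejected ({err})"
    print(mode, outcome,
          "a =", cluster.read(cluster.a, "x"),
          "b =", cluster.read(cluster.b, "x"))
```

In CP mode the write during the partition is rejected and both replicas still agree on the old value. In AP mode the write succeeds, but the two replicas now return different values for the same key.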

Note: In real-world systems, network partitions are a reality, so most distributed databases choose to be partition-tolerant (P) and then decide between providing consistency (C) or availability (A):

CP Systems: Prioritize consistency over availability. A commonly cited example is HBase, which is a wide-column NoSQL store rather than a relational database, but favors consistent reads and writes over staying available during a partition.

AP Systems: Prioritize availability over consistency. Examples include NoSQL databases like Cassandra and DynamoDB, which are designed to be highly available even if the data might be slightly out-of-date.

BASE Theorem:

BASE is a design approach rather than a formal theorem. It is usually presented as the counterpart to ACID, and it suits systems that prioritize availability and partition tolerance over strict consistency (the AP side of CAP).
The BASE model is a more relaxed approach compared to the strict ACID properties in traditional databases.

The DB processes each operation as soon as it arrives, without waiting on other operations or on other nodes. The result is eventual consistency: consistency at every moment is not guaranteed.

Explanation:

1. BA: Basically Available

The system guarantees availability. This means that the system will always respond to requests, but the response might not always be immediately consistent.

2. S: Soft State

The state of the system may change over time, even without input. This reflects the fact that the system does not have to be consistent all the time, and the data may be stale.

3. E: Eventually Consistent

The system will eventually become consistent. Given enough time and no new updates, all replicas in the distributed system will converge to the same value.
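
The three properties can be seen together in a small Python sketch. It assumes a last-write-wins rule with integer timestamps, which is one common way to reconcile replicas but is not something this article prescribes. The `Replica` class and the `sync` function are hypothetical names used only for illustration.

```python
class Replica:
    """Stores key -> (timestamp, value); the highest timestamp wins."""

    def __init__(self):
        self.store = {}

    def write(self, key, value, timestamp):
        current = self.store.get(key)
        # Keep the incoming entry only if it is newer than what we hold.
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(left, right):
    """Exchange entries in both directions so the replicas converge."""
    for key, (ts, value) in list(left.store.items()):
        right.write(key, value, ts)
    for key, (ts, value) in list(right.store.items()):
        left.write(key, value, ts)


r1, r2 = Replica(), Replica()

# Basically available: each replica accepts writes on its own.
r1.write("status", "draft", timestamp=1)
r2.write("status", "published", timestamp=2)

# Soft state: the replicas disagree, and r1 serves stale data.
print(r1.read("status"), r2.read("status"))

# Eventually consistent: after a sync with no new writes, both agree.
sync(r1, r2)
print(r1.read("status"), r2.read("status"))
```

Before the sync, a client can get a different answer depending on which replica it asks. After the sync, both replicas hold the entry with the higher timestamp.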

Note: Any system that does not treat strict consistency as a key requirement can consider BASE principles. Examples: logging systems, feed platforms.

Note: NoSQL databases such as MongoDB and Redis are often cited as examples of systems that follow BASE properties.

