An Overview of Databases — Part 7: Distributed DBMS

Saeed Vayghani
18 min read · Aug 1, 2024

--

Part 1: DBMS Flow
Part 2: Non-Relational DB vs Relational

Part 3: CAP and BASE Theorem

Part 4: How to choose a Database?
Part 5: Different Solutions for Different Problems
Part 6: Concurrency Control
Part 7: Distributed DBMS
>> Part 7.1: Distributed DBMS (Apache Spark, Parquet + Pyspark + Node.js)
Part 8: Clocks
>> Part 8.1: Clocks (Causal Consistency With MongoDB)
>> Part 8.2: Clocks (MongoDB Replica and Causal Consistency)
Part 9: DB Design Mastery
Part 10: Vector DB
Part 11: An interesting case, coming soon!

Distributed DBMS

  • A distributed DBMS divides a single logical database across multiple physical resources.
  • An important goal in designing a distributed DBMS is fault tolerance.

A list of the most important topics in Distributed DBMSs:

  1. Two Types of Distributed DBMSs
  2. Distributed System Architecture
  3. Data File Format
  4. Distributed Nodes Diversity
  5. DB Partitioning (Sharding)
  6. Consistency Models
  7. Replication Strategies
  8. Fault Tolerance and Recovery
  9. Transactions

Two Types of Distributed DBMSs:

Parallel Database:

  1. Nodes are physically close to each other.
  2. Nodes are connected via high-speed LAN (fast, reliable).
  3. The communication cost between nodes is small. As such, one does not need to worry about network issues.

Distributed Database:

  1. Nodes can be far from each other.
  2. Nodes are potentially connected via a public network, which can be slow and unreliable.
  3. The communication cost and connection problems cannot be ignored.

Distributed System Architecture

A distributed DBMS’s system architecture specifies the location of the database’s data files. There are three approaches:

1. Push Query to Data

  • Send the query (or a portion of it) to the node that contains the data.
  • Perform as much filtering and processing as possible where the data resides before transmitting results over the network.
  • Examples: Google BigTable, Apache Cassandra, Amazon Redshift

2. Pull Data to Query

  • Bring the data to the node that is executing a query that needs it for processing.
  • This is necessary when there are no compute resources available where the database files are located.
  • Examples: Apache Spark SQL; filtering and retrieving data with Amazon S3 Select.

3. Hybrid

Example: Elasticsearch uses a hybrid approach:

  • Push Query to Data: Queries are distributed to the shards where data resides. Shards process these queries locally, minimizing data transfer.
  • Partial Pull for Aggregation: For complex aggregations or sorting, intermediate results are pulled to the coordinating node, which performs the final aggregation.

Let's take a look at a sample Elasticsearch configuration and a simple Node.js app that demonstrates the hybrid approach:

# Elasticsearch Configuration File (elasticsearch.yml)

cluster.name: my-cluster
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1
http.port: 9200

# Define the number of shards and replicas
# (in recent Elasticsearch versions, index-level settings like these must be
# set via index templates or the create-index API rather than elasticsearch.yml)
index.number_of_shards: 3
index.number_of_replicas: 1

# Enable debugging to see the distribution of queries
logger.action.level: debug

// app.js: Node.js client using @elastic/elasticsearch
const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

async function main() {
  // 1. Index sample documents (indexing implicitly creates the 'logs' index)
  await client.index({
    index: 'logs',
    document: {
      timestamp: new Date(),
      user: 'john wick',
      message: 'Looking for freedom'
    }
  });

  await client.index({
    index: 'logs',
    document: {
      timestamp: new Date(),
      user: 'Winston',
      message: 'Looking for revenge'
    }
  });

  await client.indices.refresh({ index: 'logs' });

  // 2. Push Query to Data: the match query and sort execute locally on each shard
  const result = await client.search({
    index: 'logs',
    body: {
      query: {
        match: { message: 'looking' }
      },
      sort: [{ timestamp: { order: 'desc' } }]
    }
  });

  console.log('Search Results:', result.hits.hits);

  // 3. Partial Pull for Aggregation: shard-level results are pulled to the
  // coordinating node, which computes the final terms aggregation
  const aggregationResult = await client.search({
    index: 'logs',
    body: {
      aggs: {
        users: {
          terms: {
            field: 'user.keyword'
          }
        }
      }
    }
  });

  console.log('Aggregation:', aggregationResult.aggregations.users.buckets);
}

main().catch(console.log);

Data File Format

Choosing the right data file format is crucial in a distributed DBMS because it impacts performance, scalability, interoperability, data integrity, security, cost efficiency, and query optimization.

A list of well-known file formats:

  1. Apache Parquet: Compressed columnar storage, originally from Twitter.
  2. Apache ORC: Compressed columnar storage from Apache Hive.
  3. Apache Iceberg: Table format from Netflix that supports schema evolution.
  4. HDF5: Multi-dimensional arrays for scientific workloads.
  5. Apache Arrow: In-memory columnar format for analytics and data interchange.

Let's look at some scenarios that highlight the importance of file formats:

1: Performance Optimization (Efficient Storage and Retrieval)

  • Columnar formats are ideal for analytical queries, allowing efficient access to specific columns without reading the entire row.
  • Row-based formats are ideal for transactional queries where entire rows are frequently accessed.

2: Scalability (Handling Large Datasets)

  • Formats like Parquet and ORC are designed to handle petabyte-scale datasets, making them suitable for distributed systems.
  • Data file formats that support partitioning enable parallel processing.

3: Security (Encryption and Access Control)

  • Some file formats support encryption, ensuring data is secure both at rest and in transit.

4: Query Optimization (Indexing and Predicate Pushdown)

  • This optimization reduces the amount of data read from disk and transferred over the network.
  • Tip: PostgreSQL leverages predicate pushdown primarily through foreign data wrappers and index scans, allowing it to optimize query execution.
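
To make the columnar-format point concrete, here is a minimal sketch that writes and reads a Parquet file from Node.js. It assumes the third-party parquetjs package; the schema, field names, and file name are illustrative only.

// A minimal Parquet round trip (assumes: npm install parquetjs)
const parquet = require('parquetjs');

async function demo() {
  // Column-oriented schema: values are stored (and compressed) per column.
  const schema = new parquet.ParquetSchema({
    user: { type: 'UTF8' },
    amount: { type: 'DOUBLE' },
    timestamp: { type: 'TIMESTAMP_MILLIS' }
  });

  // Write a few rows; Parquet groups them into column chunks on disk.
  const writer = await parquet.ParquetWriter.openFile(schema, 'events.parquet');
  await writer.appendRow({ user: 'john wick', amount: 10.5, timestamp: new Date() });
  await writer.appendRow({ user: 'Winston', amount: 99.0, timestamp: new Date() });
  await writer.close();

  // Read it back; an analytical engine would scan only the columns it needs.
  const reader = await parquet.ParquetReader.openFile('events.parquet');
  const cursor = reader.getCursor();
  let row;
  while ((row = await cursor.next())) {
    console.log(row);
  }
  await reader.close();
}

demo().catch(console.error);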

Distributed Nodes Diversity

In a distributed database management system (DDBMS), nodes are the individual machines or instances that collectively form the entire system. The diversity of these nodes can be categorized into two types: homogeneous and heterogeneous nodes.

Choosing between homogeneous and heterogeneous nodes depends on the specific requirements and constraints of your distributed system. Understanding the trade-offs between these two approaches helps in designing a distributed database system that best meets your organization’s needs.

1. Homogeneous Nodes (Uniform Nodes or Symmetric Nodes):

  • Every node in the cluster can perform the same set of tasks (albeit on potentially different partitions of data).
  • Makes provisioning and failover easier.
  • Ideal for environments prioritizing simplicity, uniformity, and predictability.
  • Example: An organization uses a distributed MySQL database on multiple servers, all with the same CPU, memory, and storage configurations.

2. Heterogeneous Nodes (Non-uniform Nodes or Asymmetric Nodes):

  • Nodes are assigned specific tasks.
  • Allows for the integration of various systems and optimization for specific tasks or workloads.
  • Ideal for environments needing flexibility, optimized resource utilization, and the ability to handle diverse workloads efficiently.
  • Example: A distributed database system where some nodes run on powerful cloud servers with high CPU and memory for computational tasks, while others run on storage-optimized machines with large disk capacity for storage-oriented tasks.

DB Partitioning (Sharding)

It is a simple concept: split the database across multiple resources (disks, nodes, processors). The DBMS executes query fragments on each partition and then combines the results to produce a single answer.

The DBMS can partition a database physically (shared nothing) or logically (shared disk). Three different approaches:

1. Naive partitioning

  • Assign an entire table to a single node.
  • Ideal if queries never join data across tables stored on different nodes and access patterns are uniform.
  • When quick and simple partitioning is needed without in-depth analysis of data access patterns.

2. Vertical partitioning

  • Split a table’s attributes into separate partitions.
  • Suited to column-specific workloads, where certain columns are frequently accessed together while other columns are rarely accessed.
  • Useful when dealing with very wide tables that have many columns.

3. Horizontal partitioning

  • When the dataset grows beyond the capacity of a single database server, horizontal partitioning allows the data to be distributed across multiple servers.
  • Split a table’s tuples into disjoint subsets based on some partitioning key and scheme.
  • When the workload needs to be balanced across multiple databases.
(Image: AWS DocumentDB sharding)

Data Distribution Strategies

In a distributed database system, how data is distributed across multiple nodes significantly impacts performance, fault tolerance, and scalability. The following are three primary strategies for data distribution:

1. Hash Partitioning

Hash partitioning involves distributing data across nodes based on the hash value of a key (typically a primary key or a unique identifier). A hash function is applied to the key to determine the partition or node where the data should reside.

How it works:

  • A hash function takes the key as input and produces a hash value.
  • The hash value is then mapped to one of the available partitions/nodes.
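
A minimal sketch of that mapping step, assuming a fixed list of illustrative node names and a simple hash; real systems usually use consistent hashing to limit rebalancing when nodes are added or removed:

const crypto = require('crypto');

const nodes = ['node-1', 'node-2', 'node-3']; // illustrative node names

// Hash the partition key, then map the hash value onto one of the nodes.
function nodeForKey(key) {
  const digest = crypto.createHash('md5').update(String(key)).digest();
  const hashValue = digest.readUInt32BE(0);
  return nodes[hashValue % nodes.length];
}

console.log(nodeForKey('account-42')); // always the same node for the same key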

Advantages:

  • Uniform Distribution: If the hash function is well-designed, data is evenly distributed across all nodes, balancing the load.
  • Scalability: Adding or removing nodes is relatively straightforward, though it may require rehashing of data.
  • Efficient Lookups: Hash partitioning allows quick lookups since the location of any data item can be directly computed.

Disadvantages:

  • Rebalancing Overhead: When nodes are added or removed, data might need to be redistributed, which can be costly.
  • Complex Range Queries: Handling range queries (e.g., finding all records between two values) is more complex compared to range partitioning, as it may involve querying multiple nodes.

2. Range Partitioning

Description: Range partitioning distributes data across nodes based on ranges of values of a partition key. Each node is responsible for a specific range of values.

How it works:

  • The data space is divided into continuous, non-overlapping ranges.
  • Each range is assigned to a specific node.
  • For example, if partitioning based on customer IDs, one node might store IDs from 1 to 1000, another from 1001 to 2000, and so on.
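
A minimal sketch of that lookup, using the illustrative customer-ID ranges above:

// Each entry covers an inclusive range of customer IDs and the node that owns it.
const ranges = [
  { min: 1, max: 1000, node: 'node-1' },
  { min: 1001, max: 2000, node: 'node-2' },
  { min: 2001, max: 3000, node: 'node-3' }
];

function nodeForCustomerId(id) {
  const entry = ranges.find((r) => id >= r.min && id <= r.max);
  if (!entry) throw new Error(`No partition covers customer ID ${id}`);
  return entry.node;
}

// A range query only needs to touch the nodes whose ranges overlap the interval.
function nodesForRange(from, to) {
  return ranges.filter((r) => r.max >= from && r.min <= to).map((r) => r.node);
}

console.log(nodeForCustomerId(1500)); // 'node-2'
console.log(nodesForRange(900, 1200)); // ['node-1', 'node-2']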

Advantages:

  • Efficient Range Queries: Range queries are more efficient because the query can be directed to the specific nodes that contain the required ranges.
  • Predictable Data Distribution: Easy to predict and understand where specific data items will be located based on their key values.

Disadvantages:

  • Skewed Data Distribution: If the data is not uniformly distributed, some nodes may become hotspots, leading to load imbalance.
  • Rebalancing Complexity: Adding or removing nodes requires adjusting the ranges, which can be complex and may require moving significant amounts of data.

3. Round-Robin Partitioning

Description: Round-robin partitioning distributes data evenly across nodes in a cyclic manner, regardless of the data content.

How it works:

  • Data items are assigned to nodes in a rotating sequence.
  • For instance, the first data item goes to node 1, the second to node 2, and so on. After the last node, the assignment cycles back to the first node.
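
A minimal sketch of the cyclic assignment, again assuming three illustrative nodes:

const nodes = ['node-1', 'node-2', 'node-3'];
let counter = 0;

// Assign each incoming item to the next node in the cycle, ignoring its content.
function nextNode() {
  const node = nodes[counter % nodes.length];
  counter += 1;
  return node;
}

console.log(nextNode()); // 'node-1'
console.log(nextNode()); // 'node-2'
console.log(nextNode()); // 'node-3'
console.log(nextNode()); // 'node-1' again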

Advantages:

  • Simple and Uniform Distribution: Ensures a uniform distribution of data across all nodes, which helps balance the load evenly.
  • No Need for Complex Hashing or Ranges: Simplifies the partitioning logic and avoids issues with hash functions or range definitions.

Disadvantages:

  • Inefficient for Range Queries: Range queries are inefficient since data is not stored in any specific order, requiring queries to scan multiple nodes.
  • Limited Flexibility: Lacks flexibility for more sophisticated distribution strategies based on data characteristics or access patterns.

Choosing the Right Strategy

Selecting the appropriate data distribution strategy depends on the specific requirements and characteristics of the application:

  • Hash Partitioning is suitable for applications requiring uniform distribution and quick lookups.
  • Range Partitioning works well for applications with frequent range queries and predictable data distribution.
  • Round-Robin Partitioning is ideal for applications needing simple and even load distribution without complex partitioning logic.

Consistency Models

Consistency models define the rules for how and when changes to the database become visible to users and other parts of the system. These models are critical in distributed systems as they determine the behavior and guarantees provided by the database in terms of data consistency, availability, and partition tolerance. Here are the main consistency models:

1. Eventual Consistency

Eventual consistency ensures that, if no new updates are made to a given piece of data, all replicas of that data will eventually converge to the same value. This model does not guarantee immediate consistency but promises that, given enough time, all updates will propagate through the system and all replicas will become consistent.

How it works:

  • When a change is made to the data, it is propagated to all replicas asynchronously.
  • During this propagation period, different replicas may return different values for the same data.
  • Eventually, all replicas will be updated, and any read operation will return the same value.

Advantages:

  • High Availability: Provides high availability since updates are accepted even if some replicas are unavailable.
  • Scalability: Well-suited for large-scale systems where immediate consistency is less critical.

Disadvantages:

  • Temporary Inconsistency: Users may see stale data until all replicas converge.
  • Complexity in Conflict Resolution: Handling conflicts and ensuring data convergence can be complex.

Use Cases:

  • Social media feeds (e.g., posts, likes).
  • DNS systems.

2. Strong Consistency

Strong consistency guarantees that all operations are completed in a strictly serializable order, meaning that once an update is acknowledged, all subsequent reads will reflect that update. This model ensures that any read operation returns the most recent write for a given piece of data.

How it works:

  • A write operation is propagated to all replicas synchronously.
  • The write is acknowledged only after all replicas have applied the update.
  • Any subsequent read operation will return the value of the most recent write.

Advantages:

  • Immediate Consistency: Users always see the most recent data.
  • Simplified Application Logic: Developers do not need to handle stale data scenarios.

Disadvantages:

  • Higher Latency: Synchronous replication can introduce higher latency.
  • Reduced Availability: Requires all replicas to be available for writes, which can reduce availability in the face of network partitions.

Use Cases:

  • Banking systems.
  • E-commerce transactions.
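
To contrast the first two models in practice, here is a hedged sketch using the MongoDB Node.js driver against a replica set. The connection string, database, and collection names are illustrative; the write/read concern options are standard driver options, but the exact guarantees you get depend on your deployment.

const { MongoClient } = require('mongodb');

async function demo() {
  const client = new MongoClient('mongodb://localhost:27017/?replicaSet=rs0'); // illustrative URI
  await client.connect();
  const db = client.db('shop');

  // Eventual-consistency style: acknowledge after one node, read from a secondary.
  // A read here may not yet reflect the latest write.
  const relaxed = db.collection('orders', {
    writeConcern: { w: 1 },
    readPreference: 'secondaryPreferred'
  });
  await relaxed.insertOne({ orderId: 1, status: 'created' });

  // Strong-consistency style: majority write plus a linearizable read on the primary.
  // Higher latency, but the read reflects the most recent acknowledged write.
  const strict = db.collection('orders', {
    writeConcern: { w: 'majority' },
    readConcern: { level: 'linearizable' },
    readPreference: 'primary'
  });
  await strict.updateOne({ orderId: 1 }, { $set: { status: 'paid' } });
  console.log(await strict.findOne({ orderId: 1 }));

  await client.close();
}

demo().catch(console.error);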

3. Causal Consistency

Causal consistency ensures that operations that are causally related are seen by all nodes in the same order. Causal relationships between operations are preserved, but operations that are not causally related can be seen in different orders by different nodes.

How it works:

  • Operations are tagged with causal metadata that tracks dependencies between operations.
  • When an operation is performed, its causal dependencies are propagated to all replicas.
  • Replicas apply operations in a way that respects the causal order.

Advantages:

  • Maintains Causal Relationships: Ensures that causally related operations are seen in a consistent order.
  • Balance Between Consistency and Availability: Offers a middle ground between eventual consistency and strong consistency.

Disadvantages:

  • Complex Metadata Management: Requires tracking and managing causal dependencies, which can be complex.
  • Partial Ordering: Only ensures partial ordering, so some inconsistencies may still be observed.

Use Cases:

  • Collaborative editing (e.g., document editing).
  • Messaging systems where order of messages matters.
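
As a small preview of Part 8.1, here is a hedged sketch of causal consistency using a MongoDB client session: operations tagged with the same causally consistent session are observed in order. Names and the connection string are illustrative.

const { MongoClient } = require('mongodb');

async function demo() {
  const client = new MongoClient('mongodb://localhost:27017/?replicaSet=rs0'); // illustrative URI
  await client.connect();
  const messages = client.db('chat').collection('messages');

  // Operations in this session carry causal metadata (cluster/operation time),
  // so the read below is guaranteed to observe the preceding write.
  const session = client.startSession({ causalConsistency: true });
  try {
    await messages.insertOne({ thread: 42, text: 'hello' }, { session });
    console.log(await messages.findOne({ thread: 42 }, { session }));
  } finally {
    await session.endSession();
  }
  await client.close();
}

demo().catch(console.error);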

Choosing the Right Consistency Model

The choice of consistency model depends on the specific requirements of the application and the trade-offs between consistency, availability, and performance. Here are some considerations:

  • Eventual Consistency: Ideal for applications where high availability and partition tolerance are more important than immediate consistency.
  • Strong Consistency: Suitable for applications where correctness and immediate consistency are critical, even at the cost of higher latency and reduced availability.
  • Causal Consistency: Works well for applications that need to maintain causal relationships between operations but can tolerate some level of inconsistency.

Replication Strategies

Replication is a critical feature in distributed databases, designed to enhance data availability, fault tolerance, and performance by copying data across multiple nodes. Here are the two primary replication strategies:

1. Master-Slave Replication

In the master-slave replication model, a single node (the master) handles all write operations. The master then propagates changes to one or more secondary nodes (the slaves). The slaves typically handle read operations to balance the load and improve performance.

How it works:

  • Write Operations: All write operations are directed to the master node.
  • Propagation: The master node synchronizes changes to the slave nodes, usually through asynchronous replication.
  • Read Operations: Read operations are distributed among the slave nodes to balance the load and improve read performance.
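
At the application layer this often shows up as simple read/write splitting. Below is a minimal sketch of that routing idea; the master and replica connection handles are hypothetical stand-ins for whatever driver or pool you actually use.

// Hypothetical connection handles; in practice these come from your DB driver or pool.
const master = { query: async (sql, params) => ({ node: 'master', sql, params }) };
const replicas = [
  { query: async (sql, params) => ({ node: 'replica-1', sql, params }) },
  { query: async (sql, params) => ({ node: 'replica-2', sql, params }) }
];

let rr = 0;

// Writes always go to the master; reads are spread over the replicas round-robin.
function routeQuery(sql, params = []) {
  const isWrite = /^\s*(insert|update|delete)\b/i.test(sql);
  if (isWrite) return master.query(sql, params);
  const replica = replicas[rr++ % replicas.length];
  return replica.query(sql, params);
}

routeQuery('SELECT * FROM users WHERE id = $1', [7]).then(console.log); // hits a replica
routeQuery('UPDATE users SET name = $1 WHERE id = $2', ['neo', 7]).then(console.log); // hits the master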

Advantages:

  • Simplicity: The master-slave model is relatively simple to implement and manage.
  • Read Scalability: By offloading read operations to slave nodes, the system can handle a higher number of read requests.
  • Data Integrity: Ensures data consistency since all writes are centralized at the master node.

Disadvantages:

  • Single Point of Failure: The master node is a single point of failure; if it goes down, write operations cannot be performed until the master is restored or a new master is designated.
  • Write Scalability: Write operations are limited by the capacity of the master node, which can become a bottleneck.
  • Latency: There can be a delay in propagating changes from the master to the slave nodes, leading to temporary read inconsistencies.

2. Multi-Master Replication

In the multi-master replication model, multiple nodes (masters) can accept write operations and propagate changes to each other. This model allows for a more distributed and decentralized approach to data management.

How it works:

  • Write Operations: Write operations can be performed on any master.
  • Synchronization: Changes are propagated among all master nodes to ensure consistency. This can be done synchronously or asynchronously.
  • Conflict Resolution: Mechanisms must be in place to handle conflicts that arise from concurrent writes on different master nodes.
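
Conflict resolution is the hard part of this model. Here is a toy sketch of one common policy, last-write-wins, where concurrent versions of a record are merged by comparing timestamps; real systems often use vector clocks or CRDTs instead, and the record shape here is purely illustrative.

// Two masters accepted concurrent updates to the same record.
const versionFromMasterA = { id: 'user-7', email: 'a@example.com', updatedAt: 1722500000000 };
const versionFromMasterB = { id: 'user-7', email: 'b@example.com', updatedAt: 1722500004200 };

// Last-write-wins: keep the version with the newest timestamp.
// Simple and deterministic, but it silently discards the losing write.
function resolveLastWriteWins(a, b) {
  return a.updatedAt >= b.updatedAt ? a : b;
}

console.log(resolveLastWriteWins(versionFromMasterA, versionFromMasterB)); // keeps master B's version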

Advantages:

  • High Availability: Multiple master nodes ensure that the system remains operational even if one or more nodes fail.
  • Write Scalability: Write operations are distributed across multiple nodes, reducing the load on any single node and enhancing scalability.
  • Flexibility: Supports geographically distributed nodes, allowing write operations to be performed closer to the user, reducing latency.

Disadvantages:

  • Complexity: Managing and synchronizing multiple master nodes is more complex, especially in terms of conflict resolution.
  • Consistency Challenges: Ensuring data consistency across all master nodes can be challenging, particularly with asynchronous replication.
  • Potential for Conflicts: Concurrent writes to different master nodes can lead to conflicts that must be resolved, adding to the complexity.

Use Cases:

  • Applications requiring high availability and write scalability, such as globally distributed applications, collaborative platforms, and online transaction processing (OLTP) systems.

Choosing the Right Replication Strategy

The choice of replication strategy depends on the specific requirements of the application, such as read/write load, availability needs, geographical distribution, and tolerance for complexity and conflicts:

  • Master-Slave Replication: Suitable for applications with high read-to-write ratios and where simplicity and ease of management are priorities.
  • Multi-Master Replication: Ideal for applications requiring high availability, write scalability, and support for geographically distributed nodes.

Fault Tolerance and Recovery

Fault tolerance and recovery mechanisms are essential for ensuring the reliability and availability of distributed database systems. These mechanisms help the system to continue functioning in the presence of failures and recover data to a consistent state after a failure. Two primary techniques used for fault tolerance and recovery are checkpointing and logging/journaling.

1. Checkpointing

Checkpointing involves periodically saving the state of the database to a stable storage. In the event of a failure, the system can be restored to the state captured at the last checkpoint, thereby minimizing data loss and recovery time.

How it works:

  • Periodic Snapshots: At regular intervals, a snapshot of the entire database or specific critical parts is taken and stored.
  • Consistency: The system ensures that the snapshot represents a consistent state of the database.
  • Recovery: In case of a failure, the system can be restored to the state captured in the most recent checkpoint, and any transactions that occurred after the checkpoint can be reapplied using logs.

Advantages:

  • Reduced Recovery Time: Restoring from a checkpoint is faster compared to reconstructing the entire database from logs.
  • Simplicity: Checkpointing is straightforward to implement and manage.
  • Efficiency: Only the changes since the last checkpoint need to be reapplied, reducing the amount of work required during recovery.

Disadvantages:

  • Storage Overhead: Requires additional storage to save periodic snapshots of the database.
  • Performance Impact: Taking a checkpoint can introduce performance overhead, especially in large databases.
  • Data Loss: Any transactions performed after the last checkpoint and before the failure may be lost and need to be reapplied from logs.

Use Cases:

  • Systems where quick recovery is essential, such as online transaction processing (OLTP) systems and financial databases.

2. Logging and Journaling

Logging (or journaling) involves recording every change made to the database in a log file. This log file can be used to reconstruct the state of the database in the event of a failure, ensuring that all committed transactions are preserved.

How it works:

  • Write-Ahead Logging (WAL): Before any changes are applied to the database, they are first written to a log file.
  • Transaction Logs: Each transaction is recorded in the log file, including details of the changes made and the transaction’s commit status.
  • Recovery Process: In case of a failure, the system uses the log file to redo committed transactions and undo incomplete transactions, ensuring data consistency.

Advantages:

  • Fine-Grained Recovery: Allows precise recovery of the database to the exact point of failure.
  • Data Integrity: Ensures that all committed transactions are preserved and incomplete transactions are rolled back.
  • Minimal Data Loss: Only uncommitted transactions are lost, minimizing data loss during recovery.

Disadvantages:

  • Performance Overhead: Writing to the log file introduces additional write operations, which can impact performance.
  • Complexity: Managing and replaying logs can be complex, especially in high-throughput systems.
  • Storage Overhead: Requires additional storage for log files, which can grow large in active systems.

Use Cases:

  • High-availability systems where preserving every committed transaction is crucial, such as banking systems and e-commerce platforms.

Combining Checkpointing and Logging

In many systems, checkpointing and logging are used together to provide a robust fault tolerance and recovery mechanism:

  • Checkpointing: Provides a base state to quickly restore the database.
  • Logging: Ensures that all changes since the last checkpoint are preserved and can be reapplied to bring the database to its most recent consistent state.

Recovery Process:

  1. Restore from Checkpoint: The system is restored to the state captured in the last checkpoint.
  2. Reapply Logs: Changes recorded in the log file since the last checkpoint are reapplied to bring the database to its current state.

This combination leverages the strengths of both techniques, ensuring quick recovery with minimal data loss and ensuring data consistency and integrity.
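
Here is a toy Node.js sketch of that combination: an in-memory key-value store with a JSON checkpoint file and an append-only write-ahead log (file names and data shapes are illustrative).

const fs = require('fs');

const CHECKPOINT_FILE = 'checkpoint.json'; // illustrative file names
const WAL_FILE = 'wal.log';

// Write-ahead: append the change to the log before applying it in memory.
function set(state, key, value) {
  fs.appendFileSync(WAL_FILE, JSON.stringify({ key, value }) + '\n');
  state[key] = value;
}

// Checkpoint: persist a consistent snapshot, then truncate the log.
function checkpoint(state) {
  fs.writeFileSync(CHECKPOINT_FILE, JSON.stringify(state));
  fs.writeFileSync(WAL_FILE, '');
}

// Recovery: restore the last checkpoint, then replay everything logged after it.
function recover() {
  const state = fs.existsSync(CHECKPOINT_FILE)
    ? JSON.parse(fs.readFileSync(CHECKPOINT_FILE, 'utf8'))
    : {};
  if (fs.existsSync(WAL_FILE)) {
    for (const line of fs.readFileSync(WAL_FILE, 'utf8').split('\n')) {
      if (!line) continue;
      const { key, value } = JSON.parse(line);
      state[key] = value;
    }
  }
  return state;
}

// Usage: write, checkpoint, write again, "crash", then recover both changes.
const db = {};
set(db, 'balance:acct1', 100);
checkpoint(db);
set(db, 'balance:acct1', 90);
console.log(recover()); // { 'balance:acct1': 90 }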

Transactions

Handling transactions in distributed databases is more complex than in single-node databases due to the need to ensure ACID properties across multiple nodes.

1. Two-Phase Commit Protocol (2PC)

The Two-Phase Commit Protocol is a distributed algorithm used to ensure all participating nodes in a distributed transaction agree on whether to commit or abort the transaction. It is the most common protocol for ensuring atomicity in distributed transactions. PostgreSQL exposes the participant side of 2PC through PREPARE TRANSACTION and COMMIT PREPARED (note that max_prepared_transactions must be set to a non-zero value for this to work):

-- Begin transaction
BEGIN;

-- Perform operations
UPDATE table_1 SET balance = balance - 10 WHERE account_id = 'account_id_1';
UPDATE table_2 SET balance = balance + 10 WHERE account_id = 'account_id_2';

-- Prepare transaction
PREPARE TRANSACTION 'my_distributed_transaction';

-- Assuming everything is correct, commit the transaction
COMMIT PREPARED 'my_distributed_transaction';

How it works:

Phase 1: Prepare:

  • The coordinator node sends a prepare request to all participant nodes, asking them to prepare to commit the transaction.
  • Each participant node executes the transaction up to the point where it is ready to commit and writes the changes to a local log but does not make the changes permanent.
  • Each participant replies to the coordinator with a vote: either “ready” (if it can commit) or “abort” (if it cannot commit).

Phase 2: Commit:

  • If all participants vote “ready,” the coordinator sends a commit request to all participants.
  • Each participant then commits the transaction and sends an acknowledgment to the coordinator.
  • If any participant votes “abort,” the coordinator sends an abort request to all participants, and each participant rolls back the transaction.

Advantages:

  • Atomicity: Ensures that the transaction is either fully committed across all nodes or fully aborted, maintaining atomicity.
  • Simplicity: Provides a straightforward mechanism to achieve distributed consensus.

Disadvantages:

  • Blocking: If the coordinator fails during the protocol, participants may be left in a blocking state, waiting for the coordinator to recover.
  • Performance Overhead: The two-phase process introduces latency and can impact performance.
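
The SQL snippet earlier shows the participant side in PostgreSQL; the coordinator's voting logic looks roughly like the sketch below, assuming each participant exposes hypothetical async prepare/commit/rollback calls.

// Simplified 2PC coordinator. participant.prepare/commit/rollback are
// hypothetical async calls to each node's transaction manager.
async function twoPhaseCommit(participants, txId) {
  // Phase 1: ask every participant to prepare and collect the votes.
  const votes = await Promise.all(
    participants.map((p) => p.prepare(txId).then(() => 'ready').catch(() => 'abort'))
  );

  // Phase 2: commit only if every vote is "ready", otherwise abort everywhere.
  if (votes.every((v) => v === 'ready')) {
    await Promise.all(participants.map((p) => p.commit(txId)));
    return 'committed';
  }
  await Promise.all(participants.map((p) => p.rollback(txId)));
  return 'aborted';
}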

2. Three-Phase Commit Protocol (3PC)

The Three-Phase Commit Protocol is an extension of the 2PC designed to reduce the blocking issues by adding an additional phase to the commit process.

How it works:

Phase 1: Prepare:

  • Similar to 2PC, the coordinator sends a prepare request to all participants, who then vote “ready” or “abort.”

Phase 2: Pre-Commit:

  • If all participants vote “ready,” the coordinator sends a pre-commit message.
  • Participants acknowledge receipt of the pre-commit message and prepare to commit but do not commit yet.

Phase 3: Commit:

  • The coordinator sends a commit message, and participants commit the transaction.
  • If any participant votes “abort,” the coordinator sends an abort message, and participants roll back the transaction.

Advantages:

  • Non-blocking: Designed to avoid blocking by ensuring participants do not wait indefinitely if the coordinator fails.
  • Improved Fault Tolerance: The additional phase reduces the likelihood of participants being left in an uncertain state.

Disadvantages:

  • Increased Complexity: More complex than 2PC and requires more messaging between nodes.
  • Performance Overhead: Additional phase increases the number of messages and the overall transaction time.

3. Distributed Lock Management

Description: Distributed lock management ensures that transactions acquire the necessary locks on data across multiple nodes to maintain isolation and consistency.

How it works:

  • Lock Acquisition: Before a transaction can proceed, it must acquire locks on all necessary data items across the distributed nodes.
  • Lock Coordination: A distributed lock manager coordinates the acquisition and release of locks to prevent conflicts and ensure isolation.
  • Deadlock Detection: Mechanisms are in place to detect and resolve deadlocks that may occur when multiple transactions compete for the same resources.

Advantages:

  • Consistency and Isolation: Ensures that transactions do not interfere with each other, maintaining consistency and isolation.
  • Scalability: Distributed lock management can scale with the system, allowing multiple transactions to proceed concurrently.

Disadvantages:

  • Complexity: Implementing and managing distributed locks is complex and requires efficient coordination.
  • Performance Impact: Locking can introduce contention and reduce system performance, especially under high load.
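
As a simplified illustration (not a full distributed lock manager), here is a sketch of acquiring and releasing a lock via Redis SET with NX and an expiry, assuming the node-redis client and a single Redis instance; production setups need fencing tokens, atomic release (e.g. a Lua script), and more careful failure handling.

const { createClient } = require('redis');
const crypto = require('crypto');

async function withLock(lockKey, ttlMs, criticalSection) {
  const redis = createClient(); // assumes a local Redis instance
  await redis.connect();
  const token = crypto.randomUUID(); // identifies this lock holder

  // SET key token NX PX ttl: succeeds only if the key does not already exist.
  const acquired = await redis.set(lockKey, token, { NX: true, PX: ttlMs });
  if (acquired !== 'OK') {
    await redis.quit();
    throw new Error(`Could not acquire lock ${lockKey}`);
  }

  try {
    return await criticalSection();
  } finally {
    // Best-effort release: only delete the key if we still own it.
    if ((await redis.get(lockKey)) === token) {
      await redis.del(lockKey);
    }
    await redis.quit();
  }
}

withLock('lock:account:42', 5000, async () => {
  console.log('holding the lock, doing the transfer...');
}).catch(console.error);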

4. Timestamp Ordering

Description: Timestamp ordering assigns timestamps to transactions and uses these timestamps to order operations, ensuring consistency and isolation without the need for locks.

How it works:

  • Timestamp Assignment: Each transaction is assigned a unique timestamp based on its start time.
  • Operation Ordering: Transactions are ordered based on their timestamps, ensuring that older transactions are processed before newer ones.
  • Conflict Resolution: If a conflict is detected (e.g., a newer transaction tries to access data being modified by an older transaction), the conflict is resolved based on the timestamps.
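
A toy sketch of the basic timestamp-ordering check: each item remembers the largest read and write timestamps that touched it, and an operation arriving from an "older" transaction is rejected, which in a real system means aborting and restarting that transaction with a new timestamp.

// Per-item bookkeeping: the largest transaction timestamps that read/wrote it.
const items = new Map(); // key -> { readTS, writeTS, value }

function getItem(key) {
  if (!items.has(key)) items.set(key, { readTS: 0, writeTS: 0, value: undefined });
  return items.get(key);
}

// A read by transaction ts is rejected if a newer transaction already wrote the item.
function read(ts, key) {
  const item = getItem(key);
  if (ts < item.writeTS) throw new Error(`abort T${ts}: item already written by T${item.writeTS}`);
  item.readTS = Math.max(item.readTS, ts);
  return item.value;
}

// A write by transaction ts is rejected if a newer transaction already read or wrote it.
function write(ts, key, value) {
  const item = getItem(key);
  if (ts < item.readTS || ts < item.writeTS) {
    throw new Error(`abort T${ts}: conflicts with newer T${Math.max(item.readTS, item.writeTS)}`);
  }
  item.writeTS = ts;
  item.value = value;
}

write(10, 'x', 'A'); // ok
read(12, 'x'); // ok, readTS(x) is now 12
try { write(11, 'x', 'B'); } catch (e) { console.log(e.message); } // aborted: too old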

Advantages:

  • No Locking Overhead: Eliminates the need for locks, reducing contention and improving performance.
  • Simplified Deadlock Handling: Avoids deadlocks since transactions are ordered and cannot wait for each other.

Disadvantages:

  • Potential for Abort: Transactions may need to be aborted and restarted if conflicts cannot be resolved based on timestamps.
  • Complexity in Implementation: Requires precise timestamp management and conflict detection mechanisms.

What other topics do we need to consider when designing a distributed database system?

  1. Distributed Query Processing
  2. Concurrency Control Mechanisms
  3. Data Placement and Fragmentation
  4. Scalability and Load Balancing
  5. Monitoring and Performance Tuning
  6. Distributed Indexing
  7. Schema Management in Distributed Databases
  8. Security in Distributed DBMS
  9. PostgreSQL Partitioning and Locks + TOAST Tables

I will cover them later in some extra blog posts…

--

Saeed Vayghani

Software engineer and application architect. Interested in free and open source software.