Concurrency Control Mechanisms in Distributed Systems

Roopa Kushtagi
4 min read · Jun 17, 2023



Concurrency is the ability of a system to make progress on multiple tasks during overlapping time periods, whether or not those tasks execute at the same instant.

Many of us confuse concurrency with parallelism, but concurrency is not parallelism: concurrency is about structuring a system to deal with many tasks at once, while parallelism is about executing many tasks at the same instant.

There are two main mechanisms for concurrency control.

· Optimistic Concurrency Control (OCC):

OCC is a concurrency control mechanism that allows transactions to execute concurrently without acquiring locks upfront. It assumes that conflicts between transactions are infrequent, so transactions proceed optimistically. Conflicts are detected during the commit phase, and when one is found, the affected transaction is typically aborted and retried.

In a distributed system, OCC can be implemented by maintaining version information for each data item. Each transaction reads a consistent snapshot of the database at the beginning, and during the commit phase, it checks if any other transaction has modified the same data items it has read. If conflicts are detected, the transaction is rolled back and retried with a new snapshot.
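The read, validate, and retry cycle described above can be sketched in a few lines. This is a minimal single-process illustration, not a production design: `VersionedStore`, `Transaction`, `ConflictError`, and `run_with_retry` are hypothetical names, and a real system would have to make the validate-and-write step atomic across nodes.

```python
class ConflictError(Exception):
    """Raised when commit-time validation finds a stale read."""


class VersionedStore:
    """Hypothetical in-memory store mapping key -> (value, version)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Keys that were never written read as (None, version 0).
        return self._data.get(key, (None, 0))

    def commit(self, read_versions, writes):
        # Validation phase: every item we read must still be at the version we saw.
        # (A concurrent implementation must run validation and writes atomically.)
        for key, seen in read_versions.items():
            if self.read(key)[1] != seen:
                raise ConflictError(key)
        # Write phase: install the new values and bump their versions.
        for key, value in writes.items():
            self._data[key] = (value, self.read(key)[1] + 1)


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version observed on first read
        self.writes = {}         # buffered until commit

    def get(self, key):
        if key in self.writes:   # read your own buffered write
            return self.writes[key]
        value, version = self.store.read(key)
        self.read_versions.setdefault(key, version)
        return value

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        self.store.commit(self.read_versions, self.writes)


def run_with_retry(store, body, max_attempts=5):
    """Run body(txn) in a fresh transaction, retrying when validation fails."""
    for _ in range(max_attempts):
        txn = Transaction(store)
        result = body(txn)
        try:
            txn.commit()
            return result
        except ConflictError:
            continue
    raise ConflictError("gave up after retries")
```

No locks are taken while a transaction runs; the cost of a conflict is paid only by the transaction that loses validation and has to redo its work.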

· Pessimistic Concurrency Control (PCC):

PCC is a concurrency control mechanism that assumes conflicts are likely and prevents them by acquiring locks on resources before accessing them. A transaction that holds an exclusive lock keeps other transactions from reading or modifying the resource until the lock is released; shared (read) locks, where supported, let several readers proceed together.

In a distributed system, PCC can be implemented by using distributed locks or lock managers. When a transaction wants to access a resource, it requests a lock on that resource from the lock manager. If the lock is available, it is granted, and the transaction proceeds. If the lock is not available, the transaction waits until the lock is released.
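The request, grant, and wait behaviour described above could look roughly like the sketch below. It is a toy, in-process stand-in: `LockManager` and its methods are hypothetical names, and a real distributed lock manager would sit behind a network API and also deal with node failures and lock expiry. The `timeout` parameter is an addition of this sketch so that a waiter does not block forever.

```python
import threading


class LockManager:
    """Toy in-process lock manager (exclusive locks only)."""

    def __init__(self):
        self._guard = threading.Condition()
        self._owners = {}  # resource -> id of the transaction holding it

    def acquire(self, resource, owner, timeout=None):
        """Wait until `resource` is free, then grant it to `owner`.

        Returns False if the lock was not granted within `timeout` seconds.
        """
        with self._guard:
            granted = self._guard.wait_for(
                lambda: self._owners.get(resource) in (None, owner), timeout
            )
            if granted:
                self._owners[resource] = owner
            return granted

    def release(self, resource, owner):
        with self._guard:
            if self._owners.get(resource) != owner:
                raise RuntimeError(f"{owner} does not hold {resource}")
            del self._owners[resource]
            # Wake every waiter so each can re-check whether its resource is free.
            self._guard.notify_all()
```

A caller would wrap its work between `acquire` and `release`, and treat a `False` return as a signal to back off or abort.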

Concurrency Control Mechanisms

Optimistic Concurrency Control (OCC):

1. Snapshot Isolation: Each transaction sees a consistent snapshot of the database as of the moment the transaction started. MVCC and timestamp ordering are commonly used to implement snapshot isolation.

2. MVCC: Multi-Version Concurrency Control maintains multiple versions of each data item, so reads do not block writes and transactions can proceed without acquiring locks upfront. Example: In a banking system, multiple users can concurrently transfer funds between accounts without blocking each other. Each transaction operates on its own version of the account balances, and conflicting updates to the same account are detected when the transaction commits.

3. Timestamp Ordering: Assigns a unique timestamp to each transaction and ensures that conflicting operations take effect in timestamp order; a transaction that arrives too late to respect that order is aborted and restarted. Example: In a distributed system, transactions for processing customer orders are assigned timestamps. The system ensures that order processing follows the order of the timestamps to prevent conflicts and maintain consistency.

4. CRDT: A Conflict-Free Replicated Data Type is a data structure replicated across nodes that accepts concurrent updates without centralized coordination or consensus algorithms. Its merge rules are designed so that replicas that have received the same updates converge to the same state, even when multiple users concurrently modify the same piece of data. One common use case is collaborative real-time editing, where multiple users simultaneously edit a shared document.
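To make the CRDT idea from the list above concrete, here is a sketch of one of the simplest CRDTs, a grow-only counter (G-Counter). It is much simpler than the structures used for collaborative editing and is shown only to illustrate coordination-free merging; the class and method names are illustrative.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments known to come from that replica

    def increment(self, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # max is commutative, associative, and idempotent, so replicas converge
        # no matter in which order, or how many times, they exchange state.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)
```

Two replicas can count independently and exchange state whenever they like; after merging in both directions they report the same total.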

Pessimistic Concurrency Control (PCC):

1. Two-Phase Locking (2PL): A transaction acquires locks in a growing phase and releases them in a shrinking phase; once it has released any lock, it may not acquire another. Example: In a shared database, when a user wants to update a specific row of data, 2PL ensures that other users cannot access or modify the same row until the lock is released, preventing conflicts.

2. Strict Two-Phase Locking (Strict 2PL): A variant of 2PL in which a transaction holds its locks (at minimum its write locks) until it is committed or rolled back, so no other transaction can read or overwrite uncommitted changes. Example: In a distributed database, a transaction locks the resources it needs (e.g., tables, rows) as it accesses them and holds those locks until it completes, ensuring no other transaction can modify the locked resources in the meantime.

3. Multiple Granularity Locking: Allows locks to be acquired at different levels of granularity, such as table level, page level, or row level. Coarse locks are cheaper to manage but block more work, while fine-grained locks allow more concurrency. Example: In a database system, a transaction can acquire a row-level lock on the specific record it wants to update, preventing other transactions from modifying that record while still allowing concurrent access to other records in the table.

4. Distributed Lock Manager (DLM): A service that coordinates locks on shared resources across multiple nodes in a network. Example: In a distributed file system, which provides access to files across multiple nodes, the DLM ensures that only one client holds an exclusive lock on a file at a time, avoiding data corruption or inconsistencies caused by concurrent modifications.
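As a rough illustration of the two-phase rule from the list above, the sketch below lets a transaction take locks only until it starts releasing them, and releases everything at the end (the strict flavour). It runs in a single process with `threading.Lock`; `TwoPhaseTransaction`, `transfer`, and the account names are made up for the example.

```python
import threading


class TwoPhaseTransaction:
    """Tracks the locks of one transaction and enforces the two-phase rule."""

    def __init__(self, lock_table):
        self._lock_table = lock_table  # resource name -> threading.Lock
        self._held = []
        self._shrinking = False

    def lock(self, resource):
        # Growing phase: new locks are allowed only before the first release.
        if self._shrinking:
            raise RuntimeError("2PL violation: cannot lock after releasing")
        if resource not in self._held:
            self._lock_table[resource].acquire()
            self._held.append(resource)

    def finish(self):
        # Shrinking phase. Releasing everything here, at commit or abort,
        # is what makes this the strict variant.
        self._shrinking = True
        while self._held:
            self._lock_table[self._held.pop()].release()


def transfer(balances, lock_table, src, dst, amount):
    """Move `amount` from src to dst; returns False if funds are insufficient."""
    txn = TwoPhaseTransaction(lock_table)
    try:
        # Locking in a fixed (sorted) order is one common way to avoid deadlock.
        for account in sorted((src, dst)):
            txn.lock(account)
        if balances[src] < amount:
            return False
        balances[src] -= amount
        balances[dst] += amount
        return True
    finally:
        txn.finish()
```

Because both accounts stay locked until `finish`, no other transfer can observe one balance updated and the other not.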

The choice between OCC and PCC depends on factors such as workload characteristics, contention level, and the desired level of concurrency and performance. OCC is often favored when conflicts are expected to be infrequent, allowing for greater concurrency, while PCC is preferred when conflicts are anticipated to be frequent, at the cost of potentially more locking and blocking. For instance, an e-commerce solution may opt for OCC under normal conditions and switch to PCC when there is a burst in demand for an item on sale (i.e., a hot SKU), or use PCC only when inventory for an item drops below a certain threshold.
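The e-commerce example could be expressed as a small policy function. This is only a sketch of the idea: the function name, the `recent_conflict_rate` signal, and both threshold values are assumptions chosen for illustration, not recommendations.

```python
def choose_strategy(stock, recent_conflict_rate, low_stock=10, max_conflict_rate=0.2):
    """Pick a concurrency control strategy for one item (a SKU).

    stock: units currently available.
    recent_conflict_rate: fraction of recent optimistic commits that were aborted.
    low_stock, max_conflict_rate: hypothetical thresholds for this sketch.
    """
    if stock <= low_stock or recent_conflict_rate >= max_conflict_rate:
        # Hot or nearly sold-out item: retries would be frequent, so lock instead.
        return "pessimistic"
    return "optimistic"
```

In practice the thresholds would be tuned from observed abort rates rather than fixed in code.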

Must READ for Continuous Learning:

· System Design: https://bit.ly/3S05RGS
