Understanding and Mitigating Common Concurrency Issues in Databases

Priya Patidar
The Developer’s Diary
7 min read · May 19, 2024

In our previous articles, we explored the basics of transactions, Read Committed Isolation, and Snapshot Isolation. These isolation levels are essential for maintaining data integrity in a multi-transaction environment. In this article, we will delve into common concurrency issues that can arise and how different isolation levels address these challenges.

Introduction

Concurrency issues occur when multiple transactions interact with the same data simultaneously, leading to potential inconsistencies and data integrity problems. Understanding these issues is crucial for designing robust database systems. Here, we will discuss some of the most common concurrency issues: lost updates, write skew, and phantom reads.

Preventing Lost Updates

So far, we have discussed guarantees about what read-only transactions can see in the presence of concurrent writes. But what happens when two transactions write concurrently? One type of issue that arises is called a write-write conflict, specifically a lost update.

Lost Updates

A lost update occurs when an application reads a value from the database, modifies it, and writes it back — a read-modify-write cycle. If two transactions perform this cycle concurrently, one of the modifications can be lost. This can happen in various scenarios:

  1. Incrementing Counters or Updating Account Balances: Transactions read an initial value, modify it (e.g., incrementing a counter or adjusting a balance), and write it back. If two transactions do this at the same time, one update might overwrite the other.
  2. User Editing the Same Document: When multiple users edit the same document concurrently, the changes made by one user might be overwritten by another user’s changes, leading to data loss.
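The interleaving behind a lost update can be made concrete with a small sketch. The snippet below (plain Python, no database) simulates two transactions that both read a shared counter before either writes, which is exactly the race described above; the interleaving is hard-coded so the loss is deterministic.

```python
# Deterministic illustration of a lost update: two simulated
# transactions interleave their read-modify-write cycles on a
# shared counter. Real systems interleave nondeterministically.

counter = {"value": 0}

def read(store):
    return store["value"]

def write(store, v):
    store["value"] = v

# Both transactions read BEFORE either one writes.
v1 = read(counter)       # T1 reads 0
v2 = read(counter)       # T2 reads 0
write(counter, v1 + 1)   # T1 writes 1
write(counter, v2 + 1)   # T2 writes 1 -- T1's increment is lost

print(counter["value"])  # 1, even though two increments ran
```

The final value is 1, not 2: the second write silently overwrote the first, because each transaction based its write on a stale read.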

This is a common problem, and there are various solutions to it. One such solution is atomic write operations.

Atomic Write Operations

An atomic write operation is a simple and effective way to prevent lost updates by ensuring that a write operation is completed as a single, indivisible unit of work. Here’s how it works:

  1. Single Write Action: Instead of reading, modifying, and then writing data in separate steps, an atomic write operation combines these steps into a single action. This ensures that the update is applied entirely or not at all, preventing concurrent transactions from interfering with each other.
  2. Example: Consider a counter that needs to be incremented. Instead of:
  • Reading the current value.
  • Incrementing the value in the application.
  • Writing the new value back to the database.

Use an atomic increment operation:

UPDATE counters SET value = value + 1 WHERE id = 1;

This SQL statement ensures that the value is incremented in one atomic step, preventing other transactions from reading the value in an intermediate state.

Downside: The main downside of atomic write operations is that they are not suitable for all types of updates. For complex modifications that involve multiple steps or conditional logic, atomic write operations might not be feasible. Additionally, not all databases support atomic operations for all types of data modifications, limiting their applicability.
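As a runnable sketch of the atomic form, the snippet below uses Python's built-in sqlite3 module with an in-memory database (the `counters` table name mirrors the SQL example above and is purely illustrative). The whole read-modify-write happens inside a single UPDATE statement, so no other statement can observe an intermediate value.

```python
import sqlite3

# Illustrative sketch: atomic increment with sqlite3 (in-memory DB).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO counters (id, value) VALUES (1, 0)")

for _ in range(3):
    # The database applies the whole expression in one atomic step;
    # there is no separate read the application could act on stalely.
    conn.execute("UPDATE counters SET value = value + 1 WHERE id = 1")
conn.commit()

value = conn.execute("SELECT value FROM counters WHERE id = 1").fetchone()[0]
print(value)  # 3
```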

Explicit Locking

Explicit locking involves manually acquiring locks on data before performing read-modify-write operations to ensure that no other transaction can modify the same data concurrently. Here’s how it works:

  1. Acquire Locks: Before a transaction reads and modifies data, it explicitly acquires a lock on the data. This lock prevents other transactions from acquiring a conflicting lock, so they cannot modify the data until the lock is released.
BEGIN TRANSACTION;
SELECT * FROM inventory WHERE product_id = 1 FOR UPDATE;

2. Perform Operations: The transaction performs the read-modify-write cycle while holding the lock.

UPDATE inventory SET stock = stock - 10 WHERE product_id = 1;

3. Release Locks: Once the transaction completes the modification, it releases the lock, allowing other transactions to access the data.

COMMIT;

In this example, the SELECT ... FOR UPDATE statement acquires a lock on the row for the specified product. This lock ensures that no other transaction can modify the stock level of the same product until the transaction releases the lock by committing.

Downsides

  1. Reduced Concurrency: Explicit locking can reduce the level of concurrency in the database because transactions must wait for locks to be released before they can proceed.
  2. Deadlocks: If transactions acquire locks in different orders, they can end up waiting for each other to release locks, leading to a deadlock. Deadlock detection and resolution mechanisms are necessary to handle this situation.
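The three steps above can be sketched with sqlite3. SQLite does not support `SELECT ... FOR UPDATE`, so this sketch uses `BEGIN IMMEDIATE`, which plays a similar role by taking the database's write lock before the read-modify-write cycle starts; the `inventory` schema is illustrative.

```python
import sqlite3

# Sketch of the explicit-locking pattern, adapted to SQLite.
conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transactions
conn.execute("CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute("INSERT INTO inventory (product_id, stock) VALUES (1, 100)")

conn.execute("BEGIN IMMEDIATE")  # 1. acquire the write lock up front
stock = conn.execute(
    "SELECT stock FROM inventory WHERE product_id = 1"
).fetchone()[0]
conn.execute(                    # 2. read-modify-write while holding the lock
    "UPDATE inventory SET stock = ? WHERE product_id = 1", (stock - 10,)
)
conn.execute("COMMIT")           # 3. releases the lock

remaining = conn.execute(
    "SELECT stock FROM inventory WHERE product_id = 1"
).fetchone()[0]
print(remaining)  # 90
```

In databases with row-level locks (e.g., PostgreSQL or MySQL/InnoDB), `SELECT ... FOR UPDATE` locks only the selected rows rather than the whole database, which preserves more concurrency.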

Compare and Set (CAS)

Compare and Set (CAS) is an atomic operation that helps prevent lost updates by ensuring that a value is updated only if it has not been modified by another transaction since it was last read. CAS is widely used in distributed systems and concurrent programming to handle conflicts efficiently.

How Compare and Set Works

  1. Read Current Value: The transaction reads the current value of the data item.
SELECT stock FROM inventory WHERE product_id = 1;

2. Modify Value: The transaction computes the new value based on the current value.

-- Assuming the current stock is 100 and the transaction
-- wants to decrement it by 10, the application computes:
-- new_stock = 100 - 10 = 90

3. Compare and Set: The transaction attempts to update the data item using an atomic compare and set operation. The update succeeds only if the current value has not changed since it was last read by the transaction.

UPDATE inventory
SET stock = 90
WHERE product_id = 1 AND stock = 100;

In this example, the UPDATE statement includes a condition (AND stock = 100) to ensure that the update is applied only if the stock has not changed since it was last read. If another transaction has modified the stock in the meantime, the condition fails, and the update does not occur.

Downsides

  1. Retry Logic: If the CAS operation fails due to a conflict, the transaction must retry the operation, which can lead to increased complexity in the application logic.
  2. Limited to Simple Updates: CAS is suitable for simple updates but can be challenging to apply in complex transactions that involve multiple data items or intricate business logic.
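The retry logic mentioned above can be sketched as a small loop: read the current value, attempt the conditional UPDATE, and check the affected-row count to see whether the compare-and-set won. The sqlite3 code below is an illustrative sketch (the `inventory` schema and `decrement_stock` helper are made up for this example).

```python
import sqlite3

# Sketch of compare-and-set with retries, using sqlite3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER)")
conn.execute("INSERT INTO inventory (product_id, stock) VALUES (1, 100)")
conn.commit()

def decrement_stock(conn, product_id, amount, max_retries=5):
    for _ in range(max_retries):
        current = conn.execute(
            "SELECT stock FROM inventory WHERE product_id = ?", (product_id,)
        ).fetchone()[0]
        cur = conn.execute(
            "UPDATE inventory SET stock = ? WHERE product_id = ? AND stock = ?",
            (current - amount, product_id, current),  # CAS condition
        )
        conn.commit()
        if cur.rowcount == 1:  # our conditional write won
            return current - amount
        # Another transaction changed the stock in between; retry
        # with a fresh read.
    raise RuntimeError("too much contention")

print(decrement_stock(conn, 1, 10))  # 90
```

The key detail is checking `rowcount`: a count of zero means the `AND stock = ?` condition failed, i.e., someone else modified the row since our read, so we loop and try again.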

Write Skew

Write skew is a concurrency anomaly that can occur in databases under certain isolation levels. Unlike lost updates, which involve conflicts over the same piece of data, write skew involves transactions that read overlapping sets of data and then make conflicting updates based on those reads. Write skew can lead to inconsistencies in the database, even if each individual transaction appears correct when viewed in isolation.

Example of Write Skew

Consider a hospital database that must ensure at least one doctor is on call at all times. The doctors’ schedules are stored in a table:

CREATE TABLE doctors_on_call (
    doctor_id INT PRIMARY KEY,
    on_call BOOLEAN
);

Assume there are two doctors, A and B, both initially on call:

INSERT INTO doctors_on_call (doctor_id, on_call) VALUES (1, TRUE), (2, TRUE);

Two transactions are executed to update the on-call status:

Transaction 1: Doctor A decides to go off call.

BEGIN;
SELECT on_call FROM doctors_on_call WHERE doctor_id = 2; -- Sees Doctor B is on call
UPDATE doctors_on_call SET on_call = FALSE WHERE doctor_id = 1;
COMMIT;

Transaction 2: Doctor B decides to go off call.

BEGIN;
SELECT on_call FROM doctors_on_call WHERE doctor_id = 1; -- Sees Doctor A is on call
UPDATE doctors_on_call SET on_call = FALSE WHERE doctor_id = 2;
COMMIT;

Each transaction reads the current status of the other doctor and sees that they are on call. Both transactions then decide to go off call, leading to a situation where no doctor is on call, violating the intended constraint.

Preventing Write Skew

To prevent write skew, stronger isolation levels or additional mechanisms are needed:

  1. Serializable Isolation: The strictest isolation level, Serializable, ensures that transactions are executed in a way that produces the same effect as if they were executed serially. This prevents write skew by ensuring that no two transactions can make conflicting decisions based on the same initial state.
  2. Explicit Locks: Using explicit locks on the data being read and written can prevent write skew by ensuring that transactions do not concurrently read and update overlapping sets of data.
  3. Application-Level Checks: Adding application-level checks can help prevent write skew by enforcing constraints before committing transactions.
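One practical variant of the application-level approach is to fold the constraint into the UPDATE itself, so a doctor can only go off call while at least one other doctor remains on call. The sqlite3 sketch below (the `go_off_call` helper is made up for this example) does this with a conditional UPDATE; note that this works when the database evaluates the condition against current committed data, as a single-writer system like SQLite does, while under snapshot isolation in other systems the subquery may still read stale data.

```python
import sqlite3

# Sketch: enforce "at least one doctor on call" inside the UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctors_on_call (doctor_id INTEGER PRIMARY KEY, on_call BOOLEAN)"
)
conn.executemany(
    "INSERT INTO doctors_on_call VALUES (?, ?)", [(1, True), (2, True)]
)
conn.commit()

def go_off_call(conn, doctor_id):
    cur = conn.execute(
        """UPDATE doctors_on_call SET on_call = 0
           WHERE doctor_id = ?
             AND (SELECT COUNT(*) FROM doctors_on_call WHERE on_call = 1) > 1""",
        (doctor_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # True if the request was granted

print(go_off_call(conn, 1))  # True: doctor 2 is still on call
print(go_off_call(conn, 2))  # False: doctor 2 must stay on call
```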

Phantom Reads

Phantom reads occur when a transaction reads a set of rows that satisfy certain conditions and, during the transaction, another transaction inserts or deletes rows that affect the result set. This leads to inconsistencies when the initial transaction re-executes the query and sees different results.

Example of Phantom Reads

Consider a transaction that retrieves all orders above a certain amount:

BEGIN;
SELECT * FROM orders WHERE amount > 100;

While this transaction is still in progress, another transaction inserts a new order with an amount of 150:

BEGIN;
INSERT INTO orders (order_id, amount) VALUES (101, 150);
COMMIT;

If the initial transaction re-executes the query, it will see the newly inserted order, leading to a phantom read.
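The scenario above can be sketched with two sqlite3 connections to the same database file (a hypothetical `orders` table). The reader issues the same query twice with no snapshot held between them, so each SELECT runs against the latest committed data, and a row inserted in between appears out of nowhere.

```python
import os
import sqlite3
import tempfile

# Sketch of a phantom read: two connections to the same database file.
path = os.path.join(tempfile.mkdtemp(), "orders.db")
reader = sqlite3.connect(path)
writer = sqlite3.connect(path)
reader.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount INTEGER)")
reader.commit()

# First read: no orders above 100 yet.
first = reader.execute(
    "SELECT COUNT(*) FROM orders WHERE amount > 100"
).fetchone()[0]

# Another transaction inserts a qualifying row and commits.
writer.execute("INSERT INTO orders (order_id, amount) VALUES (101, 150)")
writer.commit()

# Second read: the same query now returns an extra, "phantom" row.
second = reader.execute(
    "SELECT COUNT(*) FROM orders WHERE amount > 100"
).fetchone()[0]
print(first, second)  # 0 1
```

An isolation level that gives the transaction a stable snapshot (such as Snapshot Isolation or Serializable) would make both reads return the same result.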

Comparing Write Skew and Phantom Reads

  • Write Skew: Occurs when two transactions read overlapping data and make conflicting updates based on their initial reads, leading to an inconsistent state.
  • Phantom Reads: Occur when a transaction reads a set of rows that satisfy a condition, and another transaction modifies the data set by inserting or deleting rows that change the query results, leading to inconsistencies.

Both write skew and phantom reads are concurrency issues that can lead to data inconsistencies, but they involve different scenarios and require different approaches to prevent. Stronger isolation levels, such as Serializable, can help mitigate these issues by ensuring a more consistent and isolated view of the data for each transaction.

Conclusion

Concurrency issues such as lost updates, write skew, and phantom reads pose significant challenges in maintaining data integrity in a multi-transaction environment. By understanding these issues and implementing appropriate solutions, such as atomic write operations, explicit locking, compare and set, and stronger isolation levels, we can ensure robust and reliable database operations.

In the next article, we will delve into serializability, the strongest isolation level, and explore how it guarantees the highest level of data consistency by ensuring transactions execute as if they were run sequentially. Stay tuned!

Ref: This article was inspired by the book “Designing Data-Intensive Applications” by Martin Kleppmann, which provides in-depth insights into the complexities of database design and transaction management.
