Balancing Concurrency: Exploring Optimistic vs. Pessimistic Concurrency Control in Application Development

Samarendra Kandala
Published in Fission Labs · May 30, 2023

In the rapidly evolving landscape of AI-driven application development, developers often overlook the importance of data integrity, particularly when it comes to concurrency control. This oversight can stem from the fast-paced nature of development, insufficient knowledge of sound application development practices, or excessive reliance on AI systems that lack a full understanding of the use case’s context.

Concurrency poses a substantial risk to data integrity: problems arise when multiple users access and update the same data at the same time. Unfortunately, many engineers rely on the default concurrency controls offered by the database without addressing anomalies such as the lost update and write skew. While this might be acceptable for non-critical data in many applications, these anomalies must be handled with care in critical applications.

Concurrency controls are implemented at the database level or at the application level to ensure conflicts are handled. Before delving into them, it is essential to understand the lost update anomaly.

The Lost Update Anomaly

The lost update anomaly occurs when multiple transactions or users concurrently attempt to modify the same data, but their changes conflict with each other, resulting in the loss of some updates.

In a scenario where two users are attempting to update the same data simultaneously through a web application:

  • User1 wants to change the name of the organization from “Fission” to “Fission Labs”.
  • User2 wants to change the name of the organization from “Fission” to “Fission Computer Labs”.
  • If both users initiate their updates at the same time using separate HTTP requests, both updates will be processed concurrently.
  • In this situation, both requests will initially retrieve the data associated with ID 1. Subsequently, each request will independently attempt to update the value in its own transaction without any awareness of the other transactions taking place.
  • The final value of the data can either be “Fission Labs” or “Fission Computer Labs” depending on which update statement is executed last.
  • In the given scenario, the update from transaction 2, which set the value to “Fission Computer Labs”, is lost; only “Fission Labs” remains, because transaction 1’s update statement executed last.

This situation is known as a lost update anomaly, where one update overwrites the changes made by another. While this may be acceptable in non-critical transactions, it must be handled carefully in critical applications to preserve data consistency and integrity.

Let’s consider another example where the lack of proper concurrency control can result in data corruption and unintended consequences.

For instance, in an e-commerce application, suppose two users make a purchase of the same item simultaneously, and the system needs to decrement the inventory count by one.

  • Both User1 and User2 read the current data for item ID 1, which has an initial value of 10.
  • Independently and without awareness of each other, both users attempt to reduce the item count by one and update the value.
  • Since these actions occur concurrently without proper concurrency controls, the final value becomes inconsistent.
  • In this particular case, the final value ends up being 9, whereas it should actually be 8.

Even though the first transaction reads the data with a SELECT, that read does not prevent the second transaction from also reading and updating the same value, leaving the final data state inconsistent. This behaviour is observed in MySQL at the “repeatable read” isolation level and at every level less strict than it. It’s worth noting that “repeatable read” is the default isolation level for MySQL transactions, so many applications will experience this issue if the default settings are not altered.
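To make the failure concrete, here is a minimal JDBC sketch of the read-modify-write pattern described above. The table name items and the connection handling are assumptions for illustration; num_of_items matches the column referenced later in this article. Two threads or processes running decrementItem concurrently can both read 10 and both write 9, losing one update:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class NaiveDecrement {
    // Read-modify-write with no concurrency control: two callers that both
    // read 10 will both write 9, silently losing one of the decrements.
    static void decrementItem(Connection conn, int itemId) throws SQLException {
        int count;
        try (PreparedStatement sel = conn.prepareStatement(
                "SELECT num_of_items FROM items WHERE id = ?")) {
            sel.setInt(1, itemId);
            try (ResultSet rs = sel.executeQuery()) {
                rs.next();
                count = rs.getInt(1);      // both transactions can read 10 here
            }
        }
        try (PreparedStatement upd = conn.prepareStatement(
                "UPDATE items SET num_of_items = ? WHERE id = ?")) {
            upd.setInt(1, count - 1);      // both compute 9 and overwrite each other
            upd.setInt(2, itemId);
            upd.executeUpdate();
        }
    }
}
```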

Pessimistic Concurrency Control

Pessimistic concurrency control is employed in situations where the probability of multiple transactions occurring concurrently on the same resource is significant. Typically, this approach utilizes lock mechanisms to manage concurrency effectively.

By Using DB Transaction Isolation Level (Serializable)

When employing the highest transaction isolation level, known as “Serializable,” concurrent updates are prevented, ensuring that transactions occur sequentially without conflicts. However, a potential downside of using this level of isolation is an increased likelihood of deadlocks occurring between concurrent transactions.
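In JDBC, for instance, the isolation level is raised on the connection before the transaction runs. A minimal sketch, with hypothetical connection details and reusing the naive decrement logic from the earlier example:

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class SerializableDecrement {
    public static void main(String[] args) throws Exception {
        // Connection URL and credentials are placeholders for illustration.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/shop", "app", "secret")) {
            conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            conn.setAutoCommit(false);              // open an explicit transaction
            // Same read-modify-write as before, but under SERIALIZABLE a
            // conflicting transaction fails with a deadlock error instead of
            // silently losing an update.
            NaiveDecrement.decrementItem(conn, 1);
            conn.commit();
        }
    }
}
```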

Under this isolation level, both User1 and User2 acquire a shared read lock when reading the data in their own transactions. To update the data, each needs an exclusive write lock, which it cannot acquire because the other’s shared read lock blocks it; neither transaction can make the change until one of them releases its shared read lock.

In this use case,

  • When User2 executes its update statement, it blocks until it can acquire an exclusive write lock.
  • When User1 then executes its update statement, the database detects a deadlock; User1’s transaction fails with a deadlock error and releases its shared read lock.
  • User2 can then acquire the exclusive write lock and its update executes. In this particular case, the final value is updated to 9, as expected.

Each relational database management system (RDBMS) offers its own transaction isolation levels to address concurrency issues. It is advisable to consult the documentation specific to the RDBMS you are using to understand the available isolation levels and their implications.

While the RDBMS manages the low-level aspects of concurrency control, it is the responsibility of the application developer to catch and handle deadlock errors appropriately. By implementing a well-designed retry mechanism, the application can gracefully handle deadlock situations by retrying the transaction after a certain interval, thus increasing the chances of successful execution.
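A minimal sketch of such a retry wrapper, assuming the decrement logic from the earlier example; MySQL reports deadlocks as error code 1213 with the standard SQLSTATE 40001 for serialization failures:

```java
import java.sql.Connection;
import java.sql.SQLException;

public class RetryingDecrement {
    // Retry the whole transaction a few times when the database aborts it
    // with a deadlock, backing off briefly between attempts.
    static void decrementWithRetry(Connection conn, int itemId)
            throws SQLException, InterruptedException {
        for (int attempt = 1; attempt <= 3; attempt++) {
            try {
                conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
                conn.setAutoCommit(false);
                NaiveDecrement.decrementItem(conn, itemId);
                conn.commit();
                return;                                   // success
            } catch (SQLException e) {
                conn.rollback();
                boolean deadlock = "40001".equals(e.getSQLState())
                        || e.getErrorCode() == 1213;      // MySQL's deadlock error code
                if (!deadlock || attempt == 3) {
                    throw e;                              // not retryable, or out of attempts
                }
                Thread.sleep(100L * attempt);             // simple linear backoff
            }
        }
    }
}
```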

By Using SELECT … FOR UPDATE

This SQL statement locks the rows returned by the SELECT. SELECT … FOR UPDATE only has its locking effect inside an explicit transaction; executed outside one, it runs in auto-commit mode, the lock is released as soon as the statement completes, and it won’t block other transactions.

Transactions are initiated by both users concurrently, but only one user can acquire the lock on the selected rows. As a result, the other user must wait until the first transaction either commits or rolls back before proceeding.

In our scenario, User2’s select statement is blocked until User1’s transaction is either committed or rolled back. Once User1’s transaction is committed, User2’s select statement retrieves the updated value of 9 instead of 10, effectively eliminating the lost update anomaly.
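A minimal JDBC sketch of this flow, reusing the assumed items table; the row lock taken by FOR UPDATE is held until the transaction commits or rolls back:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LockingDecrement {
    static void decrementItem(Connection conn, int itemId) throws SQLException {
        conn.setAutoCommit(false);   // FOR UPDATE only locks inside a transaction
        try {
            int count;
            try (PreparedStatement sel = conn.prepareStatement(
                    "SELECT num_of_items FROM items WHERE id = ? FOR UPDATE")) {
                sel.setInt(1, itemId);
                // executeQuery blocks here if another transaction holds the row
                // lock; once that transaction commits, we read its committed value.
                try (ResultSet rs = sel.executeQuery()) {
                    rs.next();
                    count = rs.getInt(1);
                }
            }
            try (PreparedStatement upd = conn.prepareStatement(
                    "UPDATE items SET num_of_items = ? WHERE id = ?")) {
                upd.setInt(1, count - 1);
                upd.setInt(2, itemId);
                upd.executeUpdate();
            }
            conn.commit();           // releases the row lock for waiting transactions
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}
```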

When employing this type of pessimistic concurrency control, it is important to avoid queries that involve range scans or full table scans, as they would result in locking all the scanned rows. To mitigate such side effects, it is advisable to include an indexed column in your WHERE clause, which helps prevent table locking and other potential issues.
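For instance, locking rows by a non-indexed column would force a scan that locks every row it examines. A hypothetical index (the name column and index name here are assumptions for illustration) confines the lock to the matching rows:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class ItemNameIndex {
    // Without an index on `name`, "SELECT ... WHERE name = ? FOR UPDATE"
    // scans the table and locks every row it examines; with the index,
    // only the matching rows are locked.
    static void createIndex(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute("CREATE INDEX idx_items_name ON items (name)");
        }
    }
}
```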

Optimistic Concurrency Control

Optimistic concurrency control operates under the assumption of a low likelihood of concurrent transactions on a single resource. In this approach, the cost associated with acquiring locks is considered to be significantly higher than the cost of retrying transactions. Rather than relying on locks, optimistic concurrency control allows conflicts to arise at commit time and provides mechanisms to handle them.

One common technique in optimistic concurrency control is to introduce a version column in the resource table. This column is incremented each time the resource is updated, and the previously read value of the column is included in the WHERE clause of subsequent UPDATE or DELETE statements on the resource.

By using this approach, conflicts are detected at the time of committing the transaction, based on the comparison of the expected and actual values of the custom column. If a conflict is detected, appropriate actions can be taken to resolve it, such as rolling back the transaction or retrying it with updated values.

Both users retrieve the values of num_of_items and version from the database using a SELECT query. When updating, each user increments the version column and attempts to execute the update query. Since the update operation is atomic, only one of the two update queries will succeed; the other UPDATE PreparedStatement will return 0, indicating that no record was changed. In that case, the flow must be retried from the SELECT, as sketched below.
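A minimal JDBC sketch of this flow, assuming the items table now carries an integer version column:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OptimisticDecrement {
    static void decrementItem(Connection conn, int itemId) throws SQLException {
        while (true) {
            int count;
            int version;
            try (PreparedStatement sel = conn.prepareStatement(
                    "SELECT num_of_items, version FROM items WHERE id = ?")) {
                sel.setInt(1, itemId);
                try (ResultSet rs = sel.executeQuery()) {
                    rs.next();
                    count = rs.getInt(1);
                    version = rs.getInt(2);   // remember the version we read
                }
            }
            try (PreparedStatement upd = conn.prepareStatement(
                    "UPDATE items SET num_of_items = ?, version = version + 1 "
                            + "WHERE id = ? AND version = ?")) {
                upd.setInt(1, count - 1);
                upd.setInt(2, itemId);
                upd.setInt(3, version);
                if (upd.executeUpdate() == 1) {
                    return;                   // our version still matched; update won
                }
                // 0 rows changed: a concurrent writer bumped the version first,
                // so loop back and retry from the SELECT with fresh values.
            }
        }
    }
}
```

In production code you would bound this loop and fail after a few attempts, just as with the deadlock retry shown earlier.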

It’s important to note that optimistic concurrency control is an application practice rather than a built-in feature of the database. The logic for concurrency control is implemented within the domain entities and application logic. The column used for optimistic concurrency can be a simple integer value or a date value, depending on the structure of the application.

By incorporating optimistic concurrency control into your application, you can effectively manage concurrent updates and handle conflicts in a controlled manner, ensuring data integrity and consistency.

Conclusion

Pessimistic concurrency control operates at the database level and requires an active database connection to hold locks for the duration of the transaction. In contrast, optimistic concurrency control does not need to hold locks on an open connection, so it can function across multiple transactions or even multiple requests.

Pessimistic concurrency control, particularly the “SELECT … FOR UPDATE” approach, is employed in highly concurrent scenarios where the cost of rolling back and retrying transactions outweighs the cost of acquiring locks.

Optimistic concurrency is the preferred option when your application logic requires concurrency controls across multiple databases or when having a dedicated database connection to manage locks or transactions is not feasible. In such cases, optimistic concurrency provides a suitable approach to handle concurrency by allowing multiple transactions to proceed concurrently and handling conflicts at the commit stage.

When dealing with the need to lock a range of rows, two options are available. One is to utilize a sophisticated form of optimistic locking, enabling concurrent execution of multiple transactions. The other option is to employ pessimistic concurrency control using locks, which explicitly prevents concurrent access to the locked rows. However, it’s crucial to consider that without properly indexed columns, using locks in this way may lead to locking the entire table and impacting performance.

When selecting a concurrency mechanism, careful analysis of application requirements and performance metrics is crucial. Consider factors like concurrency levels, conflict frequency, and long-term implications. Assess the impact on application logic and the potential need for code modifications. Also, consider maintainability, scalability, and compatibility with future changes. A comprehensive analysis will inform the decision and help choose the most suitable concurrency mechanism.
