Conflicts and Concurrency in Distributed Systems — Part 3

Ravish Goel
4 min readApr 6, 2023

--

Write Skews and Phantoms

In the previous article, we discussed transactions, concurrency in transactions, weak isolation (read committed and snapshot isolation) and its race conditions (lost updates). In this article we will see another race condition that can happen with snapshot isolation called write skew.

Consider the following scenario:

  1. There is a meeting room (single room) booking system (with Snapshot Isolation guarantee). Two meetings cannot overlap.
  2. Andy wants to book the meeting room between 12 noon and 1 PM on a particular date. Brian also wants to book the same meeting room between 12:30PM and 1:30PM on the same date. Please note that the two meetings are overlapping.
  3. Andy’s transaction begins at 10ms. Andy checks for any rows (meeting id, start time, end time etc.) that match with his query (tell me if a meeting is already happening within the timeslot that I prefer). If the database doesn’t return anything then it is clear that the meeting room is free. And so it happens! The slot is available. This read completes by 30ms.
  4. Brian’s transaction, same as step 3 above, begins at 20ms. Brian also sees that the room is vacant. Brian’s read completes by 40ms.
  5. Nothing is returned from the database in both the cases. You cannot attach a lock on a record that hasn’t been created yet. Database also will not detect any conflict in the future automatically, because there is no data that could be identified as stale at some later point in time. Snapshot isolation with automatic detection of lost updates (our most advanced weapon) is thus useless!
  6. Based on the results received, their write requests start processing. Andy’s request begins at 50ms and completes by 60ms. A new record is created in the database. Brian’s request begins at 60ms and finishes by 70ms. Brian’s request also creates a new record. And this creates two conflicting meetings. Both of their transactions commit successfully at some later point in time. This is a write skew.

Now, let’s look at some interesting outcomes. There are three techniques that can help with the detection of this type of conflict:

  1. Serial Execution
  2. 2 Phase Locking
  3. Serializable Snapshot Isolation

Let’s discuss those in detail.

Serial Execution

Execute one transaction at a time. This could seriously hamper the database’s performance even if there are 1000 nodes (running the same copy of the database) because you are limiting the throughput of the entire database to just one CPU core. A simple solution is to partition the data (each partition can be replicated though) so that each CPU core is given its own partition. That will scale your throughput linearly with the number of cores.

2 Phase Locking

It makes the lock requirements much stronger. Writers not only just block other writers, they also block the readers (by using exclusive locking). Readers cannot block other readers though (they can use the lock in shared mode). This can be implemented using two techniques.

  1. Predicate Locks
  2. Index-Range Locks

Predicate Locks can be applied to the records that do not yet exist, but which might be created in the future. Such records are called Phantoms. In the context of the example above, both Andy and Brian locked the phantom rows (the bookings that haven’t happened yet) using the predicate lock in shared mode. Now, Andy’s write will not take place (it has to wait) because it matches with the phantom(s) locked by Brian’s predicate lock. You can also say that Andy will not be able to acquire exclusive lock on the phantom(s) because of the existing shared lock on the same phantom(s)(by Brian). To me it is a deadlock! By the way, deadlocks occur very frequently under 2PL. One has to abort!

Index-Range Locks are based on approximation. In the context of the example above, it means locking all the meetings for the room entirely by attaching a shared lock to its index after the read is successful. Something tangible! This lock is shared between Andy and Brian indicating, “Both Andy and Brian have searched this meeting room”. In this case also, Andy’s write will not take place (it has to wait) because while doing so it encounters the shared lock on the index held by Brian. Again a deadlock!

Serializable Snapshot Isolation

Enough of the deadlocks! Andy’s booking has to happen. In serializable snapshot isolation, instead of blocking, transactions continue. But if transaction manager detects that something bad had already happened before committing the transaction, then the transaction has to be aborted and retried. In other words, transactions cannot act on an outdated premise. How does it happen!

  1. Detecting writes that affect prior reads: In the context of the example above, Andy’s write will not block. It will happen. But Andy’s write will outdate Brian’s premise (there is no overlapping booking). Brian’s transaction has no other option rather than to be aborted and retried.
  2. Detecting stale MVCC reads: In an MVCC database, when a transaction starts, it reads from a consistent snapshot. Then, it ignores the writes that are made by any other transactions that aren’t yet committed. Let’s make a small adjustment to the timeline in our example above. Brian’s transaction begins after Andy’s transaction has created the booking in the system but not yet committed. Then at some point, Andy’s transaction is committed (Brian’s transaction is not yet committed though). Brian’s write also executes without any failure but when his transaction is about to commit, the transaction manager detects that his premise is already outdated. And so, Brian’s transaction is aborted and retried.

This is it guys! Andy has successfully booked the meeting room.

Hope you enjoyed this article. Till then!

--

--