System Design: Database Transactions, Isolation Levels, Concurrency Control Cont’d. (Part 2)

Pulkit Gupta
9 min read · Apr 24, 2020


This is the 5th blog in the system design series I am currently working on.

This blog is a continuation of my previous blog on Database Transactions & ACID compliance.

If you haven’t read my previous blog on this topic, I highly recommend reading it here.

So in this blog, we are going to cover the following topics in depth.

  • A quick recap of the Read Committed isolation level.
  • What is Repeatable Read? What are the issues with it?
  • An algorithm to implement Repeatable Read
  • What is the Serializable isolation level? What are the issues with it?
  • What are optimistic & pessimistic concurrency control?
  • The two-phase locking (2PL) mechanism

Just before starting, I would like to add that this blog is going to be a little longer than usual, so please bear with me till the end. But yes, I can assure you it’s worth reading.

ISOLATION LEVELS

As we discussed earlier, there are a total of 4 isolation levels.

  • Read Uncommitted (Weakest)
  • Read Committed
  • Snapshot Isolation or Repeatable Read (the default in some databases, e.g. MySQL)
  • Serializable (Strongest)

I am not going to discuss the Read Uncommitted isolation level because it’s rarely used in practice.

Caution: Folks can skip this part and jump directly to the Repeatable Read isolation level if they have already gone through the Read Committed isolation level in my previous blog here.

Let’s recap :P

Read Committed (Recap)

Now let’s recap Read Committed isolation. Read Committed is the most basic level of transaction isolation.

So this level provides two guarantees:

  • When reading from the database, the transaction can only see data that has been committed (no dirty reads).
  • When writing to the database, the transaction can only overwrite data that has been committed (no dirty writes).

So, in short, if two transactions named T1 & T2 are executing concurrently, then transaction T1 can see or read the changes (writes/updates) made by transaction T2 only after T2 has committed.

So what’s a Dirty Read & a Dirty Write?

A dirty read occurs when a transaction is allowed to read data from a row that has been modified by another running transaction and not yet committed.
A dirty write occurs when a transaction is allowed to overwrite data from a row that has been modified by another running transaction and not yet committed.
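
Here’s a minimal sketch of both guarantees in action, assuming a hypothetical accounts table and two concurrent sessions in PostgreSQL (where Read Committed is the default level; comments mark the interleaving):

/* Session A */
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;   -- not committed yet

/* Session B, running concurrently at READ COMMITTED */
BEGIN;
SELECT balance FROM accounts WHERE id = 1;   -- still returns the old, committed balance (no dirty read)
/* If Session B instead tried to UPDATE the same row, it would block until Session A commits or aborts (no dirty write). */

/* Session A */
COMMIT;

/* Session B */
SELECT balance FROM accounts WHERE id = 1;   -- now returns the new, committed balance
COMMIT;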

Implementing Read Committed

Read Committed is a very popular isolation level; it was the default setting in databases like Oracle 11g, SQL Server 2012, and MemSQL. You can check it here for Oracle 11g & here for MemSQL.

I’m skipping some further implementation details to keep the length of this blog in check. You can check them here.

Issues in Read Committed

Read Committed Isolation suffers from the problem of Non-Repeatable read or Read skew.

Now let’s understand the issue with a real-world scenario.

  • Let’s say there is a customer named Alice who has a total of $1,000 in her two accounts at a bank, and the amount is split across those two accounts with $500 each.
  • Now Alice initiates a Transfer-Transaction of $100 from one account to the other. This transaction consists of two steps: first, add $100 to Account 1, and second, deduct $100 from Account 2.
  • Just before the Transfer-Transaction, Alice also initiates a Read-Transaction (just to verify the balances, as we all do :P), which first reads the balance of Account 1 and then the balance of Account 2.
  • So the Read-Transaction is initiated before the Transfer-Transaction, but it commits after the Transfer-Transaction.
  • Now, according to the rules of Read Committed isolation, since the Transfer-Transaction commits before the Read-Transaction finishes, the Read-Transaction can see the changes made by the Transfer-Transaction in its later reads.
  • So she may see the balance of Account 1 before the incoming payment has arrived (a balance of $500), and the balance of Account 2 after the outgoing transfer has been made (the new balance of $400).
  • So to Alice, it now appears as if she only has a total of $900 in her accounts ($500 in the 1st read of Account 1 & $400 in the 2nd read of Account 2). It seems that $100 has vanished into thin air.
  • This anomaly is called a Non-Repeatable Read. This temporary inconsistency can’t be tolerated in some situations, like backups and analytics. (A minimal SQL sketch of this interleaving follows below.)
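
Here’s that interleaving sketched in SQL, assuming a hypothetical accounts table where accounts 1 and 2 each start with a balance of 500, and the Read-Transaction runs under READ COMMITTED (PostgreSQL-style syntax):

/* Read-Transaction (Alice checks her balances) */
BEGIN;
SELECT balance FROM accounts WHERE id = 1;   -- returns 500

/* Transfer-Transaction runs and commits in between */
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE id = 1;
UPDATE accounts SET balance = balance - 100 WHERE id = 2;
COMMIT;

/* Read-Transaction continues */
SELECT balance FROM accounts WHERE id = 2;   -- returns 400: Alice sees 500 + 400 = 900 in total
COMMIT;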

So what’s a Non-Repeatable Read?

A non-repeatable read occurs when one transaction reads the same data twice while another transaction updates that data between the first and second read of the first transaction.
You can even guess this from the name: we repeat the read but don’t get the same result within a transaction (Non + Repeatable + Read).

Also, I personally believe that in computer science, you can guess the definition of almost every term from its name.

Snapshot Isolation & Repeatable Read

In Snapshot Isolation, each transaction reads from a consistent snapshot of the database; that is, the transaction sees all the data that was committed in the database at the start of the transaction.
Even if the data is subsequently changed by another transaction, each transaction sees only the old data from that particular point in time.

Snapshot isolation is a boon for long-running, read-only queries such as backups and analytics. It’s better than the Read Committed isolation level in terms of avoiding concurrency issues.

No baby you are not going to get blocked in SnapShot Isolation :P

The key principle of snapshot isolation is Readers never block Writers, and Writers never block Readers. This allows a database to handle long-running read queries on a consistent snapshot at the same time as processing writes normally, without any lock contention between the two.

Implementing snapshot isolation

The database has to keep several different committed versions of objects (row, or tables, or documents), because various in-progress transactions may need to see the state of the database at different points in time. Because it maintains several versions of an object (row, or tables, or documents) side by side, this technique is known as Multi-Version Concurrency Control (MVCC).

In PostgreSQL when a transaction is started, it is given a unique transaction ID. Whenever a transaction writes anything to the database, the data it writes is tagged with the transaction ID of the writer. When a transaction reads from the database, transaction IDs are used to decide which objects it can see and which it can’t see.
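
You can actually peek at this bookkeeping in PostgreSQL through the hidden system columns xmin (the ID of the transaction that created a row version) and xmax (the ID of the transaction that deleted or superseded it, or 0 if the version is still current). A quick sketch, again assuming the hypothetical accounts table:

SELECT xmin, xmax, id, balance FROM accounts;

/* Each row version records which transaction wrote it; readers use these IDs, together with the set of transactions that were in progress when their snapshot was taken, to decide which row versions are visible to them. */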

So in layman’s terms, your transaction T1 can only see those objects created or updated by other transactions that had already committed before T1 started.

So by using this approach of snapshots for every transaction, we solve the problem of the Non-Repeatable Read that we faced above with Alice (in Issues in Read Committed).
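
A minimal sketch of Alice’s scenario again, this time with the Read-Transaction running at PostgreSQL’s REPEATABLE READ (snapshot isolation) level:

/* Read-Transaction under snapshot isolation */
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;   -- returns 500; this first query also fixes the snapshot

/* Transfer-Transaction runs and commits in between, exactly as before */

SELECT balance FROM accounts WHERE id = 2;   -- still returns 500, read from the same snapshot
COMMIT;

/* A new transaction started after the transfer sees the committed values 600 and 400 */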

Our confused Morty

Repeatable Read & naming confusion

In Oracle, this snapshot-isolation implementation is called Serializable, while in PostgreSQL and MySQL it is called Repeatable Read.

For PostgreSQL, you can check it here & for MySQL, you can check it here.

The reason for this naming confusion is that the SQL standard doesn’t have the concept of snapshot isolation, because the standard is based on System R’s 1975 definition of isolation levels and snapshot isolation hadn’t yet been invented then.

Let me tell you about an incident from when I had just started my journey in the IT world. Since my college days, I have been interested in this System Design stuff. So back then, when I joined an MNC as a Software Engineer, I was assigned to a project in the Energy & Petrochemical domain which used the well-known MarkLogic NoSQL database. And they had a separate team of experienced developers for MarkLogic.

So one day that team was giving a knowledge transfer (KT) session to us (new joiners), and as a newbie, I was very curious and asked about each & every doubt. One person from that team said, “Our MarkLogic is 100% ACID compliant”, so I asked, “Which isolation level are we currently using?”

But sadly I didn’t get any satisfactory response. When I got back home, I researched this topic and got to know that MarkLogic also uses the MVCC technique to implement Snapshot Isolation. You can read more about it here.

So I personally feel that every developer, especially experienced or senior ones, should be aware of which isolation level the database in their project is using.

This also answers one of the questions I asked in my previous blog:

Why do some people say NoSQL is faster than Relational Database?

That’s because relational databases in their early days used only the Serializable isolation level, while these NoSQL databases trade off concurrency safety for performance by using the Snapshot Isolation level.
Although, this alone is not a silver bullet that makes NoSQL faster. You will get to know more reasons when I write a blog on that too.

So, as a consultant, I would suggest you check which isolation levels a NoSQL database supports before choosing it for your project, because if it doesn’t support the Serializable isolation level, I don’t think it’s 100% ACID compliant.

Issues in Snapshot Isolation

Snapshot Isolation suffers from the problem of Write Skew Anomaly.

Let’s understand this with a simple example of a meeting room booking system. As the product owner of this system, you want to enforce that there cannot be two bookings for the same meeting room at the same time. When someone wants to make a booking, you first check for any conflicting bookings (i.e., bookings for the same room with an overlapping time range), and if none are found, you create the booking.

A meeting room booking system tries to avoid double-booking (not safe under snapshot isolation)

BEGIN TRANSACTION;

/* Check for any existing bookings that overlap with the period 10am-11am */

SELECT COUNT(*) FROM bookings WHERE room_id = 123 AND end_time > '2020-04-20 10:00' AND start_time < '2020-04-20 11:00';

/* If the previous query returned zero, create the booking */

INSERT INTO bookings (room_id, start_time, end_time, user_id) VALUES (123, '2020-04-20 10:00', '2020-04-20 11:00', 666);

COMMIT;

Now, what if another employee runs the same transaction at the same time, for the same room and the same time slot? Unfortunately, Snapshot Isolation does not prevent another user from concurrently inserting a conflicting meeting. This race condition is known as the Write Skew anomaly.

To guarantee you won’t get scheduling conflicts, you need Serializable isolation, which is the strongest isolation level discussed so far.
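
As a rough sketch of the fix, assuming PostgreSQL (where Serializable is implemented as Serializable Snapshot Isolation), you run the same booking transaction at the Serializable level and retry it if the database aborts it:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;

/* Same overlap check and conditional insert as above */
SELECT COUNT(*) FROM bookings WHERE room_id = 123 AND end_time > '2020-04-20 10:00' AND start_time < '2020-04-20 11:00';

INSERT INTO bookings (room_id, start_time, end_time, user_id) VALUES (123, '2020-04-20 10:00', '2020-04-20 11:00', 666);

COMMIT;

/* If another transaction books the same slot concurrently, PostgreSQL aborts one of the two with a serialization failure (SQLSTATE 40001), and the application is expected to retry it. */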

So what’s a Write Skew Anomaly?

A Write Skew anomaly occurs when two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other.
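
In SQL terms, the pattern looks roughly like this (a sketch assuming a hypothetical vals table holding rows named 'V1' and 'V2', with both sessions running under snapshot isolation / Repeatable Read):

/* Transaction T1 */
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT name, value FROM vals WHERE name IN ('V1', 'V2');   -- reads both values
UPDATE vals SET value = 0 WHERE name = 'V1';                -- decides based on what it just read

/* Transaction T2, running concurrently */
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT name, value FROM vals WHERE name IN ('V1', 'V2');   -- reads both values from its own snapshot
UPDATE vals SET value = 0 WHERE name = 'V2';                -- a disjoint update, so no write-write conflict

/* Both transactions commit successfully; neither saw the other's write, so an application rule that depends on V1 and V2 together can now be silently violated. */
COMMIT;   -- T1
COMMIT;   -- T2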

This again answers another question that I asked in the previous blog:

Have you ever wondered why in the Banking domain, NoSQL databases aren’t recommended?

Since most NoSQL databases support at most Snapshot or Repeatable Read isolation, and in banking we can’t afford any kind of race condition in our database, they aren’t recommended there.

Now again, as I have to keep the length of this blog in check, I have written the remaining part here, which discusses the most important Serializable isolation level & the algorithms to implement it.

Moreover, if you have any doubts you can connect with me on Linkedin, Github, Twitter, Facebook. And if you liked this article then please share and follow.
