Database Transaction Isolation Levels

Published in

Geek Culture

10 min readJul 3, 2021

Isolation levels basically come into play for the “I” property in ACID, namely the Isolation.

Let’s have a quick recap of ACID properties.

A — Atomicity: It says that all instructions in a transaction should happen in an atomic manner. Atomicity simply means that it can not be broken down, so a transaction executes as if it’s a single step.
E.g. A wants to transfer money(Rs. 500) to B. Call it a Transaction.
Atomicity guarantees that A’s balance will get deducted by Rs. 500, and B’s balance will get credited by Rs. 500. So both this credit and deduction happen to be a single step.
Just imagine the chaos if it was not atomic e.g. A’s money got deducted and then Transaction failed, and B was never credited.

C — Consistency: This ensures that the database is always in a consistent or in valid state.
Using the same example as above, say before Transaction, A’s Balance was Rs. 1000 and B’s Balance was Rs. 2000.
After Transaction(of Rs. 500 from A → B), we expect A’s balance=500Rs and B’s balance=Rs.2500
If we think about it, both these states are valid. It must not happen that A’s balance got updated to Rs. 500 but B’s balance is other than Rs. 2500.
So here, the sum of their balance can be thought of as a state, and it should remain consistent, before and after Transaction.

I — Isolation: This property prevents issues if multiple transactions are running concurrently by ensuring these are not affected by each other. i.e. the result should be the same as the result obtained if the transactions were running sequentially.
To take an example, say there are 2 concurrent transactions going on:
A → B and A → C transfer of Rs. 500 in each transaction.
There can be multiple scenarios where concurrency might cause problems like T1 to read A’s balance as 1000, and also meanwhile T2 read A’s balance as 1000. Both deducted 500 to send to B and C respectively. This causes the problem as now both will update A’s balance as 500.
Since A’s balance is now Rs. 500, it contradicts Consistency, as A’s expected balance should be 0 after these 2 transactions.
We have several Isolation levels, depending on how strict we want these to be, and those are the main focus of this article.

D — Durability: It ensures that once the Transaction has committed, we should not lose its state, and must be persisted. E.g. A had Rs. 1000, and it transferred Rs. 500 to B. Now every next time we query A’s balance, we should get the latest value, and must not lose these details.

Let’s talk about the isolation levels and why are those required in the first place in a concurrent environment.

Problems in a concurrent environment:

1. Dirty Reads:

Dirty refers to wrong or unintentional data which may have never existed in DB.

Say we have 2 transactions T1 and T2, running concurrently.
Now if T1 inserts/updates some rows and T2 reads these rows before T1 has committed.
T2 did a Dirty read here as T1 may decide to rollback/abort, and never commits, So what T2 read never existed.

E.g.
A’s balance before any Transaction = Rs. 1000.
T1 begins
T1 Reads A‘s Balance=Rs. 1000
T1 set A’s Balance=Rs. 500 (maybe to send it to B, etc.)
T2 begins
T2 Read A’s Balance=Rs. 500 [DIRTY READ]
T1 Rollback.

Here, T2 read A’s Balance=Rs. 500, which is dirty read, as T1 transaction to send Rs. 500 from A → B was aborted, meanwhile, T2 read this wrong value of A’s balance.
Consider the chaos, if T2 was meant to send Rs. 800 to C. It will see that A only has Rs. 500 and will return “insufficient balance” error, even though in actual A has Rs. 1000 that time(as T1 never happened/Rolled-Back).

2. Dirty Writes:

Synonymous to Dirty reads, Dirty writes can happen when T1 is going on, and T2 writes some value. This means when the T1 commits, it will also commit T2's change as well, which T2 would have rolled back.
This will lead to unintentional write to the DB.

E.g.
T1 begins
T1 Reads A‘s Balance=Rs. 1000
T1 set A’s Balance=Rs. 500 (maybe to send it to B, etc.)
T2 begins
T2 Reads A‘s Balance=Rs. 500 ( as T1 has updated it already)
T2 set A’s Balance=Rs. 300 (maybe to send 200 to C, etc.) [DIRTY WRITE]
T1 Commit. (Commits A’s Balance = 300)
T2 Rollback. (Meaning A → C of Rs. 200 never happened).

So here, since A → C never happened, it was only A → B of Rs. 500, so expected A’s balance = 500, but due to Dirty Write, A’s balance is wrongly set to Rs. 300.

3. Non Repeatable Reads:

This can happen when a Transaction tries to read the DB row multiple times and gets different results each time e.g. if T1 reads a DB row at 2 different times, and in between these 2 reads, T2 updates the row.

Consider an example:
T1 begins
T1 Read A‘s Balance=Rs. 1000 [Read 1]
T2 begins
T2 Read A‘s Balance=Rs. 1000
T2 Write A‘s Balance=Rs. 500 (Say A → B Rs. 500)
T2Commit.
T1 Read A‘s Balance=Rs. 500. [Read 2] — Issue.

So as the name suggests, when a Transaction does repeatable reads, it gets different values.

4. Phantom Reads:

As the name suggests, it means some phantom/ghostly read is done. This can happen if T1 queries some range of rows(say N rows), and meanwhile, T2 inserts an extra row matching the same query conditions of T1.
Then if T1 searches again, it will get an additional row(phantom read).

E.g.
T1 begins
T1 queries: select * from Table where X>2 → Say it returns 100 rows.
T2 begins
T2 inserts a row with X=150
T1 searches the same query, and this time it will get 101 rows.

So as seen above, we can have the above types of concurrency issues, and there are 4 Isolation levels to handle these problems.

Before going into isolation levels, let’s understand locks on Databases.

Read(shared) Lock: If T1 holds a Read Lock on a row, then T2 can still read that row.
Meaning both T1 and T2 can read(shared lock) on the same row.
Also since T1 holds Read lock, and “Readers don’t block Writers” T2 can still update that row by acquiring the Write Lock.
Write(exclusive) Lock: If T1 holds a Write Lock on a row, then T2 can NOT read or write to that row. (Writers Block Readers).
Meaning if Write Lock is set on a Row, no other T can read/Write on that row.

Isolation Levels:

1. Read Uncommitted Isolation Level:

This provides 0% isolation, as it allows to read the uncommitted data too.
At such an Isolation level, all the above concurrency issues exist.

2. Read Committed Isolation Level:

It provides isolation to only allow the committed data to be read.
Let’s see which all problems can it solve.

Dirty Read: Solved.

T1 begins
T1 Reads A‘s Balance=Rs. 1000
T1 set A’s Balance=Rs. 500 (maybe to send it to B, etc.)
T2 begins
T2 Read A’s Balance= ? [BLOCKED] (Will not execute until T1 completes)

So since Transactions can only read committed data, it prevents Dirty reads.

How does this work?
When T1 Reads A’s balance, it acquires Shared/Read lock.
Then T1 Writes/Updates A’s balance, and it gets Write Lock.

Now when another Transaction tries to read the value held under Write lock, it won't be allowed, and it waits until the Write Lock is release or until T1 completes.

So once T1 Commits or Rollbacks, the write lock is released, and T2 will be unblocked and will read A’s balance=500 which is correct, and not Dirty.

Similarly, T2 cant update A’s balance, until T1 has committed/rolled back, so it also prevents Dirty Writes.

Non-Repeatable Reads? Still There.

T1 begins
T1 Reads A‘s Balance=Rs. 1000 [Read-1]
T2 begins
T2 Read A’s Balance= Rs. 1000 (Can read as Row= Read Locked)
(T1 reads and releases Read Lock ASAP, and the row is unlocked now)
T2 Update A’s Balance= Rs. 500 (Can Write as Write lock is free for this row)
T2 Commit. (Releases Write Lock, now row is free to Read/Write)
T1 Reads A‘s Balance=Rs. 500 [Read-2] [Different Result] [non-Repeatable]

So as shown, although T1 ensures it always reads the committed values, other Transactions can still update this row which means if T1 reads the same row again, it will get a different result.

How it works:
T1 gets Read Lock, reads, and releases the Read lock asap once the read is done.
T2 can execute and update the row as it is unlocked now.
So When T1 tries to re-read the same row, it will get a different result.

Phantom Reads? Still There.

T1 begins
T1Queries select * from Tbl where X>100 → 3 rows
T2 begins
T2 Inserts 1 additional row with X=150.
T2 Commits.
T1 Queries select * from Tbl where X>100 → 4rows [Phantom Read].

So as shown, although T1 ensures it always reads the committed values, other Transactions can insert other rows which might affect the reads counts of T1.

How it works:
T1 again takes a Write Lock on the Rows it’s updating.
However as explained above, Other transactions can still update other rows which are unlocked.

3. Repeatable Reads Isolation Level:

This adds another layer of isolation on top of read-Committed, to further prevent the repeatable-reads problem.
This is achieved by the “Readers can block Writers” principle, which is against the general Read Lock behavior.

How it works:
As discussed above the Read-Committed Isolation level, T1 gets Read Lock and releases it asap once the read is done, and then T2 can come and acquire Write lock to update it.
What if the Read lock held by T1 is not released asap, and it also prevents Other Transactions to acquire Write lock?
Then while T1 is reading(any number of times) no other transaction can update this row, hence preventing the non-repeatable read problem.

T1 begins
T1 Reads A‘s Balance=Rs. 1000 [Read-1] {R Lock held on this row}
T2 begins
T2 Read A’s Balance= Rs. 1000 (Can read as Row= Read Locked)
T2 Update A’s Balance= Rs. 500 [BLOCKED][WaitingforLock]
T1 Reads A‘s Balance=Rs. 1000 [Read-2] [Same Result]

Please note that,
T1 Read → Gets Read lock.
T2 Reads → Allowed, as Read = shared lock. Multiple reads allowed.
T2 Writes → Not Allowed here, as this row is Locked, and T2 will be held under X-WAIT status.
X means Write(or mutual Exclusive Lock) — Wait means waiting for Write Lock.
So T2 cant acquire a lock to update this row, and hence when T1 reads the same row again, it will still get the same result.
Once T1 Completes, the lock is released, and T2 can then get this lock to update.

Phantom Reads? Still There.
It does not allow any updates on the rows it is interested in. However it does not prevent any phantom reads, other Transactions can still insert new rows.

T1 begins
T1Queries select * from Tbl where X>100 → 3 rows (R Lock held on 3 rows)
T2 begins
T2 Inserts 1 additional row with X=150
T2 Commits.
T1 Queries select * from Tbl where X>100 → 4rows [Phantom Read].

So as per above, although T2 can not update any of the rows queried by T1, since it can insert new rows, the problem of phantom reads still exists.

4. Serializable Reads Isolation Level:

This is the most strict level of isolation, where the problem of phantom reads is also prevented.

How it works:
When T1 queries a range or records, it acquires a different type of lock, signifying that it belongs to the range.
This lock is called Range-lock, (Range S-S is its status) instead of S for Read lock and X for the Write lock.
So when T1 queries a range, all the rows are range-locked.
If T2 tries to insert a new row, which might affect this range, then T2 will be blocked until T1 completes and releases the Range lock.
However, T2 can read the rows, as Range-lock allowed shared reads, but prevents certain Writes.

T1 begins
T1Queries select * from Tbl where X>100 → 100 rows
(Range S-S Lock held on 3 rows)
T2 begins
T2 Inserts 1 additional row with X=150 [Blocked]
T1 Queries select * from Tbl where X>100 → 100 rows [Same result].

So as we can infer from this, it solves all the above concurrency problems.