A-Z of Database Transactions (Part-2)

Sandeep Verma
6 min readFeb 24, 2019

--

Isolation level decides database’s behaviour during concurrent transactions

In Part-1 on of this series, I have explained in detail the definitions of Atomicity, Consistency, Isolation, and Durability guarantees of A.C.I.D

In Part-3 I have elucidated issues with concurrent writes and how databases handle those anomalies

Isolation Levels

Read Uncommitted

This is weakest level of isolation level. If two concurrent clients are reading/modifying the same data (A row in RDBMS or JSON document in Document databases), then the client is not blocked from a dirty read.

In figure-1, ‘Client-1’ attempts to update the value of row with ‘user_id’ =123. He credits ₹ 10000 to his balance of ₹ 20000. But later his transaction was rolled back (Due to process crashed or some other constraint violation).

‘Client-2’ read this dirty value(A write which is not yet committed by ‘Client-1’) and reports ₹ 30000 balance in the user’s account. If he takes any other decision based on this dirty read later in the same transaction, he will make the user row to go in some inconsistent state. Though Read Uncommitted isolation level allows dirty reads, it provides the highest level of performance when compared to other isolation levels as the database doesn’t need to remember the last commit or maintain different versions of the commits of the same object(MVCC)

Read Uncommitted prevents dirty writes (Figure-2) by taking a lock on the row which it is modifying. Any other transaction trying to modify the same row is forced to wait till this transaction either commits or rolled back.

  • SQL Server Supports Read Uncommitted Isolation level
Figure 1. Dirty Read. When a transaction read a write by another transaction which is still not committed
Figure 2. Preventing Dirty Write. The transaction from client-1 has taken write lock and forced the transaction to wait till it gets committed

Read Commited

This isolation level is better than “Read Uncommitted” as it also averts dirty reads. Read Committed isolation level render two guarantees :

  • Only committed data is read by a transaction (No dirty reads)
  • Only committed data is overwritten by a transaction (No dirty writes)

‘Read Committed’ is the default isolation level in the following databases :

  • Oracle 11g, PostgreSQL, SQL Server 2012 and MemSQL

These Databases prevent dirty reads by remembering old committed value corresponding to every object that is written. All ongoing transactions are given this old value until a new value is committed. Transaction switch to new value only when its made available by any of the ongoing transaction commits.

Snapshot Isolation (Repeatable Read)

‘Read Committed’ isolation level can abort a transaction (deliver atomicity), arrests reading from an incomplete transaction(No dirty reads), halts concurrent data modification(No dirty writes), however, it doesn’t prevent a transaction from non-repeatable read or skew read.

A non-repeatable read occurs, when during the course of a transaction, a row is retrieved twice and the values within the row differ between reads.

In Figure-3, ‘client-1’ aims to move ₹ 10000 from ‘account-2’ to ‘account-1’. Just before fund movement, his total balance in each account was ₹ 20000.

If he attempts to get check his balance such that one account reports just before fund movement and another report just after fund movement got completed, then he is bound to get an inconsistent view of his accounts (Account-1 reporting ₹ 20000 and account-2 reporting ₹ 10000). This aberration (Read skew) is accepted under “Read Committed” isolation level but is arrested by “Snapshot Isolation

Figure 3. Non repeatable read or Skew read

Snapshot isolation is the solution to skew reads. Here each transaction gets a consistent snapshot of the database.

This means each transaction sees all the data that was committed at the start of the transaction. If data changes while the transaction is under process (Commits in the same object by different transaction), undergoing transaction still reports the old committed snapshot(Committed object at the start of this transaction)

Snapshot Isolation also takes write locks to prevents dirty writes, however, it doesn’t take read lock (Just like Read Committed). A key principle that snapshot isolation follows is:

The reader never blocks writers and writers never block readers.

This prevents lock contention between reading and writing.

Multi-Version Concurrency Control (MVCC)

Databases working at isolation level keeps several committed versions of an object side by side to prevent dirty reads. This methodology is known as MVCC, multi-version concurrency control. Several committed versions are required to ensure that each concurrent transaction gets a view to most recent commit.

In Figure-4, let’s say the current version of an object is 3 (Version-3). If two concurrent transaction(Transaction-1 and Transaction-2) start reading this object, then both of them will get version-3 as an old snapshot. Now if one of the transaction commits (Transaction-2), then a version-4 of the object will be created.

Ongoing transaction(Transaction-1) still sees version-3. If a new transaction starts now (Transaction-3), then it will take version 4 as an old snapshot.

Later on, Garbage collection process cleans any version of the object that is not in use by any ongoing transaction.

Figure -4. MVCC. Transaction-1 and Transaction-2 start with old snapshot version-3. Meanwhile, Transaction-2 commits and creates version-4.Transaction-3 sees this new version at its startup (Version-4). Transaction-1 keeps version-3 until it commits/aborts.

Serializable

Serializable isolation guarantee is the strongest safety guarantee amongst all isolation level. Even if transactions are executing in parallel, the end result is the same as if they executed in serial order without any race conditions. This guarantees that transactions behave correctly whether they run individually or concurrently.
Though this guarantee comes at the cost of performance which we will see in Part-3 of this series. Snapshot isolation doesn’t forbid phantom reads, which serializable does.

Serialization acquires read and write locks on a range (On data returned by SELECT query) and release it when the transaction gets either committed or rolled back. Range-locks prevents phantom reads.

Figure 5. Phantom reads. First Select query by client-1 gave count as 2, whereas the second query gave count as 1, as some data in range got modified.

Figure-5, shows a phantom read. Snapshot isolation prevents non-repeatable reads by taking a snapshot of the committed row. However, It cannot block other rows which take part in range queries.

A phantom read occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first.

Naming confusion of Snapshot Isolation/Repeatable Read/Seriazibility?

Snapshot isolation is quite useful in read-only transactions. However, different databases that implement it, refer it by a different name. Oracle calls it Serializable, In PostgreSQL and MySQL refer it as Repeatable read. What’s really astonishing is that IBM DB2 refers ‘repeatable read’ as ‘Serializability’

The SQL standard doesn’t have the concept of snapshot isolation, instead, they define repeatable read, which looks superficially similar to snapshot isolation. PostgreSQL and MySQL call their snapshot isolation as repeatable read because it meets the requirement of being SQL standard compliant.
In absence of a standard definition, even if a database vendor asserts ‘Repeatable Read’ guarantee, under the hood it can be very different from snapshot isolation.

Summary

  • Read UnCommitted doesn't block dirty reads, but prevent dirty writes
  • “Read Committed” arrests dirty read and dirty write, but fails to hinder non-repeatable reads and phantom reads
  • “Snapshot Isolation” Or “Repeatable reads” prevents non-repeatable reads along with that, it provides safety guarantees as by claimed by “Read Committed” isolation level. It doesn't obstruct phantom reads
  • “Serializability” provides the strongest isolation level guarantee. It prevents phantom reads along with that, provides safety guarantees as by “Repeatable Reads”
Figure 6. Isolation levels vs read phenomena. Source: Wikipedia

That’s all folks from Part-2 of this series.
To know more about A.C.I.D safety guarantees, read Part-1. To upgrade your knowledge on write anomalies in a concurrent environment, read Part-3.

Don’t forget to clap if you liked this article!

--

--