Isolation Levels in a Database

Sindhura Venkatesh
Designing distributed systems
3 min read · Dec 30, 2019

Consistency and isolation are two fundamental properties of any database system. Depending on the database’s architecture, they can significantly affect correctness, latency, and throughput. Let us dive in and take a closer look at what they are and how they factor into a DB’s performance.

Isolation levels and their impact on performance

What is Isolation?

Isolation is the ability of a database to execute a given transaction as if no other transactions are concurrently running, even though, in reality, there might be thousands of such transactions.

Why is Isolation necessary?

Consider two transactions executing on a key-value store, both trying to modify the same key. If we let them interleave freely, without any isolation guarantee, both could read the same initial value, and the key would end up with whatever the later transaction wrote, as if the other transaction never happened. This anomaly, known as a lost update, severely compromises the correctness of the database. The sketch below simulates this race.
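Here is a minimal sketch of that race in plain Python, with threads standing in for transactions and a dict standing in for the key-value store (the names and the artificial delay are purely illustrative):

```python
import threading
import time

# A toy key-value store with no isolation guarantee.
store = {"counter": 0}

def add_one():
    # A "transaction": read, do some work, write back.
    current = store["counter"]        # both threads read 0
    time.sleep(0.01)                  # simulated work between read and write
    store["counter"] = current + 1    # the later write clobbers the earlier one

t1 = threading.Thread(target=add_one)
t2 = threading.Thread(target=add_one)
t1.start(); t2.start()
t1.join(); t2.join()

# A serial execution would print 2; this prints 1, because one
# transaction's update was lost.
print(store["counter"])
```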

To avoid such situations, we need to execute these concurrent transactions as if they were executing serially (one after another). In other words, we need to guarantee that our schedule of execution results in the same final DB state as some serial execution of the transactions. This isolation guarantee is called “perfect isolation” or serializability. It can be achieved through different mechanisms, such as locking and validation; the lock-based version of our toy example is sketched below.
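One way to get serializability in the toy example above is pessimistic locking: hold a lock across the entire read-modify-write so the two transactions run one after another. A sketch, reusing the names from the previous snippet:

```python
import threading
import time

store = {"counter": 0}
lock = threading.Lock()  # guards the whole read-modify-write

def add_one():
    with lock:                         # acquired before the read,
        current = store["counter"]     # released only after the write
        time.sleep(0.01)
        store["counter"] = current + 1

t1 = threading.Thread(target=add_one)
t2 = threading.Thread(target=add_one)
t1.start(); t2.start()
t1.join(); t2.join()

print(store["counter"])  # always 2: the executions are serialized
```

Validation-based (optimistic) schemes take the opposite approach: transactions run without blocking and are aborted at commit time if a conflict is detected.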

Why do we need other isolation levels when we have the perfect one?

Imagine implementing serializability through locking in a file system. Each time you read from or wrote to a file, you would block everyone else from accessing it, releasing the lock only once you were done. This works fine with a few users, but what happens when there are hundreds? Thousands? Your latency, throughput, and scalability take a huge hit.

Imagine not being able to access your Facebook wall on your birthday because a dear friend of yours is busy writing a very long birthday wish. A lot of people would be angry at Facebook.

In essence, we often can’t afford to trade performance for correctness, and implementing serializability is non-trivial in its own right. Hence, databases offer the following weaker isolation levels in exchange for higher performance.

Other Isolation levels

Read Uncommitted: The lowest isolation level. It allows a transaction to read data modified by another transaction before that transaction commits. These “dirty reads” can expose intermediate, inconsistent database states.

Read Committed: Any data read from the database has already been committed; a transaction never sees intermediate, uncommitted data. However, other anomalies, such as non-repeatable reads and phantom reads, can still occur.

Repeatable Read: In addition to the guarantees offered by “Read Committed,” this ensures that data, once read by a transaction, will not change for the remainder of that transaction. Hence, repeated reads within the transaction see the same value, as the sketch below demonstrates.
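To make the non-repeatable read anomaly concrete, here is a purely illustrative simulation: Python threads stand in for transactions, the latest committed data lives in a dict, and a dict copy stands in for a snapshot. Real databases implement Repeatable Read very differently (often with multi-version concurrency control), so treat this only as a sketch of the visible behavior:

```python
import threading
import time

# 'committed' holds only committed data; the writer below commits
# a change while a reader transaction is still in flight.
committed = {"balance": 100}

def reader_read_committed():
    # Read Committed: each read sees the latest committed data,
    # so two reads inside one transaction can disagree.
    first = committed["balance"]
    time.sleep(0.02)                  # another transaction commits here
    second = committed["balance"]
    print(f"read committed:  {first} then {second}")   # 100 then 150

def reader_repeatable_read():
    # Repeatable Read (snapshot-style): take a snapshot at transaction
    # start and serve every read from it.
    snapshot = dict(committed)
    first = snapshot["balance"]
    time.sleep(0.02)
    second = snapshot["balance"]
    print(f"repeatable read: {first} then {second}")   # 100 then 100

def writer():
    time.sleep(0.01)
    committed["balance"] = 150        # commits mid-flight

for reader in (reader_read_committed, reader_repeatable_read):
    committed["balance"] = 100        # reset between runs
    t1 = threading.Thread(target=reader)
    t2 = threading.Thread(target=writer)
    t1.start(); t2.start()
    t1.join(); t2.join()
```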

By implementing Repeatable Read, we’ve come pretty close to serializability. In addition to offering the guarantees of Repeatable Read, Serializability goes a step further by also preventing phantom reads, where a repeated query sees new rows committed by another transaction, and by guaranteeing that the final outcome matches some serial order of the transactions.
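In practice you do not implement these levels yourself; you request one from the database. As an illustration, here is how a client might ask PostgreSQL for serializable transactions using the psycopg2 driver (the connection parameters and the accounts table are hypothetical):

```python
# Assumes a running PostgreSQL server and the psycopg2 driver;
# the database name, user, and table are placeholders.
import psycopg2

conn = psycopg2.connect(dbname="mydb", user="myuser")
conn.set_session(isolation_level="SERIALIZABLE")

with conn:                       # commits on success, rolls back on error
    with conn.cursor() as cur:
        # Every statement here runs under SERIALIZABLE. If two such
        # transactions conflict, PostgreSQL aborts one with a
        # serialization error, and the application should retry it.
        cur.execute("SELECT balance FROM accounts WHERE id = %s", (1,))
        row = cur.fetchone()
        print(row)

conn.close()
```

The retry loop this requires is part of the performance cost that the weaker levels above avoid.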

The above post was just a 101 on different isolation levels and their significance in the performance of a database system. Choosing which level to adopt is a judgment call that depends on how much weight your application gives to correctness versus performance. I hope this post helped make your tech life a little less stressful. Watch out for more posts on databases and distributed systems!
