SQL Database

System Design
SystemDesign.us Blog
5 min readAug 14, 2022

Visit systemdesign.us for System Design Interview Questions tagged by companies and their Solutions. Follow us on YouTube, Facebook, LinkedIn, Twitter, Medium, Notion, Quora

What is SQL Database? — Basic Definition

SQL databases have been used widely for decades to store and maintain relational data. An SQL database is a collection of tables to store structured data and their refashions. If you have various entities that are tied together with some forms of relations, you can represent them in a SQL database. All the entities have specific structure to it. You can add, remove, modify and delete entries in each entity.

Where is SQL Database used? — Real World Use Cases

Majority of the distributed systems use SQL database in some form or the other. Here are some real world examples.

Airbnb — For the Airbnb payment system how their team solved Double payment problem. Airbnb Blog

Dropbox — Given the amount of data Dropbox has, they need to maintain transactions across shards at the rate of 10M requests per seconds. Dropbox Link

How does SQL Database work? — Design & Architecture

Each entity is represented as a table and each attribute in an entity is a column. Table can have multiple rows which represent instances of such entities. Attribute type remains the same across all the entity instances. Each instance needs to represent all the attributes of an entity though attributes values may be same, different or null in different cases.

Database can have multiple of such entities. All of these entities are related to each other in some way or other. There may be some entities that are not related but can still be part of the same database.

In the simplest form you can perform Insert, Update, Delete and Read operations.

Multiple operations can be performed on the database as a part of single unit of work. These units of work are called transactions. If there is a failure during this unit of work, database should not get into a bad state. So, transactions have certain ACID properties.

For example, let’s take a system that tracks all the shipping orders. ORDER is an entity which represents all the orders that have been placed in our system and orderId, orderName etc are the attributes of this ORDER entity.

You can have multiple orders in your system. Each row in the table can uniquely represent a particular order.

How tables allow you to access rows efficiently?

An index allows you to access data faster from the table. There are two types of indices clustered and non-clustered. Clustered index decides how the data gets stored and sorted physically so there can only be one clustered index.

Non clustered index creates a separate index while the data is stored somewhere else. If you want to lookup a row using non clustered index, we first lookup the address where the row is stored and then get other necessary columns. This is the reason non clustered index is slower than clustered one.

Key and index are different concepts where key is used to identify stored data with no relation to how the data is stored. In SQL server cases creating Primary key also creates a clustered index.

All operations performed on the database are done as transactions. Transaction is a basic unit of work. Database ensures that all transactions follow ACID properties.

Let’s go through an example,

You have two accounts, both account A and B have balance of $1000. When you transfer $500 from account A to B. After the transaction is complete, expectation is account A will have $500 and account B will have $1500.

Now let’s look at ACID properties,

Atomicity

Atomicity means that a transaction is fully complete or not at all. With the above example atomicity guarantees that either $500 are transferred from A to B account and the amounts reflect accurately. In case of failure the amount in both accounts remain the same.

Consistency

Consistency ensures that the transaction will always bring the database state from one consistent state to another. $500 amount cannot be added or removed without doing the corresponding increment or deduction.

Isolation

Isolation means that a transaction is isolated from other concurrent processes. For example, when the amount is deducted from account A and another transaction reads account A information as $500 but the first transaction is aborted, then the second transaction has read the dirty data. This will take the database in an inconsistent state.

Durability

Durability ensures that even in case of system failures or any crashes the transactions that are committed will be persisted. In case of a failure during the transaction when the server is recovered it will come back to the previous consistent state. This is possible with the help of write ahead log where the committed transactions are recorded in the log file. You can read more about it here, https://social.technet.microsoft.com/wiki/contents/articles/51379.sql-server-transaction-log-internal-architecture-facts-every-sql-server-database-professional-should-know.aspx

What are the benefits of SQL Database? — Advantages

Here are some of the benefits of SQL databases,

Query Flexibility — One of the benefits of SQL database is that you have relations between entities and these entities are structured. This makes it simple to perform complex queries between different tables.

Normalization — In relational databases, process of structuring data is called normalization. With this process you can reduce the data redundancy and improve data integrity.

Transactions — ACID properties of transactions: Atomicity, Consistency, Integrity and Durability guarantee valid transactions. In simple terms either the entire transaction on the database is completed or not at all while keeping the other properties intact.

What are the downsides of SQL Database? — Disadvantages

Rigid Structure — SQL databases have a rigid schema structure. Because of this, system needs to be carefully designed up front.

Scalability — As SQL database does not provide horizontal scalability on its own or it is supported in some ad-hoc way. Scalability in SQL databases usually means you may have to do some sort of sharding manually.

References

  1. https://github.com/donnemartin/system-design-primer
  2. https://www.sqlshack.com/what-is-the-difference-between-clustered-and-non-clustered-indexes-in-sql-server
  3. https://social.technet.microsoft.com/wiki/contents/articles/51484.sql-server-concurrency-control-models-acid-properties-and-transaction-isolation-levels.aspx
  4. https://aws.amazon.com/rds/
  5. https://www.ibm.com/cloud/blog/sql-vs-nosql

Visit systemdesign.us for System Design Interview Questions tagged by companies and their Solutions. Follow us on YouTube, Facebook, LinkedIn, Twitter, Medium, Notion, Quora

--

--