Essentials of SQL Server Performance
If you want to know why SQL Server Performance is so important, take a look at my last story (https://medium.com/@ruimigueltcarvalho/why-sql-server-performance-is-important-273baa61bc9f).
To understand better how the SQL Engine works so we can do a better job on performance, I will explain some table and index concepts.
What you need to know about Tables:
We all have worked with SQL tables but many of us have not given so much attention to how SQL server stores the data inside the table.
Typically, the data in the SQL server is stored in the form of a cluster index structure. The data in the table will be stored in a tree-like structure as I show you below:
At the bottom of this tree, we have the leaf nodes and this is where the data rows for the table are stored. Each of these leaf nodes is also known as a page with an 8kb size. Depending on the size of the rows, each page will store an average of thousand rows of data inside. Is important to note that this data is stored in sorted order which determines the cluster key of the table which usually is the primary key of the table.
By having the data in sorted order in these leaf nodes, the SQL server can maintain the tree structure as you see over the top of the data. The top nodes allow SQL Server to go through the tree and find the data we are looking for.
This tree structure that the SQL server uses to store data is very efficient as long we are looking at the data by how the data is stored in the table, the primary key, this structure allows us to find the data with a primary key value very quickly, but what if we want to search by another attribute of the table where I don´t have a clustered index?
As long we are searching for our table ID we don´t need to look at the entire table structure since the data is stored and ordered by ID. But if we are looking at another attribute, the SQL server needs to perform a Scan of the entire data structure, it will read all data pages. These Scan operations are usually expensive in terms of IO and CPU.
To solve this issue we can create an Index on our table to help SQL Server find the data more efficiently.
How to work with Indexes?
A database index helps us to quickly find data in a table when we are searching for data by criteria other than the cluster key that our table is organized by.
For the example that we´ve talked about before, we want to create an index on the table over the attribute that we want to look for. This Index will have the same tree-like structure but will be ordered by the attribute which is the index key. However, this index will not contain data rows, instead, the leaf nodes will contain row pointers back to the table which where the corresponding data in the table is located:
With this, we can have multiple indexes on a table in order to cover the different ways our app might search for data in the table.
So, use an index when your statement needs to search for data by criteria other than the primary key. This is the way to take advantage of this tree-like structure to find the data we need faster, more than scanning every single row.
I hope with this story you have now a better understanding of how SQL server stores data and how you can build indexes to optimize your statements.
I will talk more about SQL performance in the next Storys so make sure to follow and subscribe.
See you in the next one.