Tackling SQL Query Performance

A quick guide to improving the speed of your SQL queries

Diego Garber
Slalom Build
14 min read · Mar 18, 2021


If you are a back-end developer, you may have noticed some of your queries run slowly. Perhaps you don’t write SQL yourself but rather use an ORM (Object Relational Mapper) that is querying the database in a non-optimal fashion.

I will show you why your queries are not executing quickly, and how to solve some of the most common issues you will encounter. This is not intended to be a deep dive, but hopefully this guide will help you tackle some troublesome areas.

Seeks and Scans

The most important concept when analyzing performance is the difference between seeks and scans.

We perform a Scan when we read a table sequentially. It can be a full table scan (checking every single row in the table) or a partial scan. If you are familiar with algorithm complexity, a sequential scan is O(n), which can be time consuming for a large data set.

For example, in a million-row table, a scan averages about 500,000 reads to find a single row, as the target is equally likely to be near the beginning or the end of our search.

In contrast, we perform a Seek by searching for a particular key using a binary search algorithm. In many cases, this will save a lot of time as the algorithm will run much faster than a scan.

Let’s say I have a table that looks like this:

If I don’t know how the table is sorted and I’m looking for a value, I have no better option than to read sequentially. In this table, it can take up to 10 reads as it has 10 rows. This changes when the table is sorted. Consider the following table and imagine that we’re looking for the value 567:

The binary search algorithm starts from the middle of the table (let’s say, row 6) and checks that row’s value. That value is 256. The algorithm knows that the value it is searching for (567) is bigger than 256, and it knows that the table is sorted by this column. The target value is not in rows 1 through 6.

The algorithm’s next step will be to pick the middle row between rows 7 and 10; rounding up, it picks row 9, which contains the desired value. You can see how this algorithm is much faster. The resulting complexity of the binary search is O(log(n)); it is much quicker than a sequential search.

For example, in a million-row table, a binary search needs only about 20 reads on average (log2(1,000,000) ≈ 20).

Most efforts to eliminate scans and utilize seeks should focus on tables with large data sets where the more complex algorithm will have a greater impact.

A Database Example

To highlight query optimization further, I will use the database of a fictitious company called Northwind Traders. The Northwind database is an example database provided by Microsoft and modified by me to highlight performance optimization.

Let’s imagine that I want to know how many orders are being shipped to Argentina. The query is very simple:

select count(*) from orders where ShipCountry = 'Argentina'

What is this query doing, and why doesn’t it return almost instantly when only a small fraction of orders are shipped to Argentina? This query is actually doing a scan, which can be slow on big tables for the reasons I mentioned previously.

Visualizing the Northwind database

We can visualize the scan using a database desktop client. I will use SQL Server Management Studio for the purpose of this guide. Click the button labeled “Include Actual Execution Plan”. After you execute a query, you will see a new tab called “Execution Plan”:

If you are unfamiliar with execution plans: an execution plan is the set of steps the database engine will follow to produce the result we need. Here you can see the execution plan and observe that 99% of the cost is going into a Clustered Index Scan:

If you mouse over the clustered index scan, you can see the Number of Rows Read is 844,960. This represents the number of rows that the engine read for this query. It actually evaluated every single row in that table!

We can fix this query by adding an index on the column ShipCountry. This immediately makes the query execute faster. Execution now takes 2ms instead of 50ms and our new execution plan reveals why:
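The index creation statement itself is not shown above; a minimal version (the index name is my own) would be:

create index IX_Orders_ShipCountry on Orders (ShipCountry)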

Here it is, the magic word we’re looking for: “Seek”. If we check, we see Number of Rows Read is now just 16,256. That beats our previous number (more than 800,000) by a lot!

So, what’s an index?

You can think of an index as a smaller copy of the table sorted by the columns specified on the index definition. Indexes can be used to perform seeks instead of scanning full tables. In our previous example, the index saves a copy of ShipCountry and a pointer to the original row in the table.

Keep in mind that write operations will suffer a small performance impact as this extra table needs to be maintained and updated when there is a change to the physical table.

Clustered Indexes

While a “regular” index saves data separated from the actual table, a clustered index does not. A clustered index sorts the table physically. That’s why each table can only have one clustered index, as it’s defining how the data will be actually stored in the table itself.

By default, the table’s primary key acts as the clustered index. This means that the table will physically be sorted by the primary key. So, when you see “Clustered Index Scan”, it means the engine is going to the physical table and scanning its rows.

Setting clustered indexes that reflect the way you’ll access the data can convert scans into seeks, as explained earlier.

Multiple Indexes versus More Complex Indexes

A common misconception is that we need as many indexes as columns being searched. Consider the following query:
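The query appeared as an image in the original; based on the surrounding text, it filtered on both ShipCountry and ShipPostalCode, something like this (the postal code value is illustrative):

select count(*) from Orders where ShipCountry = 'Argentina' and ShipPostalCode = '8000'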

The database already comes with an index on ShipPostalCode, so there is no need to scan the whole table. Let’s take a look at the execution plan:

The execution plan hints that this query can be improved. There are two seeks happening in parallel: one over the index for ShipPostalCode and another over the index for ShipCountry. The query then merges the two results to return only the rows present in both searches.

The query will read 16,256 rows twice (once for each index), merge them and produce 16,256 rows as they are the same rows. How can we make this faster? If we had an index that contained both columns, we would not need to go through two searches. I’ll recreate my index so it has both columns by running the following command:
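The command was shown as an image in the original; a sketch of recreating the index with both columns (index name assumed) is:

create index IX_Orders_ShipCountry on Orders (ShipCountry, ShipPostalCode) with (drop_existing = on)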

We don’t necessarily want to put as many columns as we can into an index. The more complex the index, the more overhead there will be on every write to the table. Keep these tradeoffs in mind when creating indexes.

When Full Scans are Helpful

Even though the rule of thumb is to avoid full scans, there are situations where they are helpful or unavoidable.

Very small tables

Let’s say we have a lookup table that only has 2 records; this type of table usually contains just an ID and a description. If we query this table, we will see a full scan in our execution plan, but the actual number of rows scanned will be low. For a table of this size, an index seek is probably slower than a full scan, which is why the table’s data cardinality matters so much.

Query fetches all data

Consider the following query:
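The query was shown as an image in the original; since the point is a query with no filters at all, it was presumably of this shape:

select * from Orders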

If there are no filters (no where clauses or joins), there will be no seeks because your query needs to read every row in the table no matter what.

Including Columns in Indexes

When we are looking for a particular row in a table, an index on the columns we’re searching for will be enough to locate that row via an Index Seek. We can then perform an additional lookup on the main table if additional information is needed.

But what happens when we are looking for multiple rows or sorting the information? Using the index would mean that every row found in the index requires a separate lookup against the main table.

This can be extremely expensive; we can mitigate it by including columns in our indexes that are not used for searching but which are returned as part of the output. We use these to avoid lookups on the main table or avoid unnecessary steps.

Consider the following query:
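The query appeared as an image in the original; based on the discussion that follows, it presumably filtered on ShipCountry and returned orders sorted by date, something like:

select ShipCountry, OrderDate from Orders where ShipCountry = 'Argentina' order by OrderDate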

Without any changes, this query will produce the following execution plan:

Now, let’s recreate our index and include the OrderDate column:
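The command itself was shown as an image; a sketch (index name assumed) is:

create index IX_Orders_ShipCountry on Orders (ShipCountry) include (OrderDate) with (drop_existing = on)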

We can now execute the prior query and see the difference:

Success! We did not need a sort process after getting our data, keeping our query to around a third of the complexity it had without the include in our index.

Optimizing Index Usage in the Database Engine

Even with these index optimizations, the database engine might not apply an index to a query. Following are some of the reasons why this might happen.

Function applied to the indexed column

Consider this example:
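The query appeared as an image in the original; from the expression quoted below, it wrapped the indexed column in a function:

select count(*) from Orders where isnull(ShipCountry, '') = 'Argentina'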

This query will produce an index scan when we want a seek. Check the execution plan and we see 844k reads again!

This scan occurs because our index is not sorted by isnull(ShipCountry, ''). The query can still use the index, since the data being read from the table is still ShipCountry, but it cannot perform a seek. Remember that the data needs to be sorted by the expression we are searching on for a seek to be possible.

This query can be more efficient by taking a different approach:
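Since isnull(ShipCountry, '') can only equal 'Argentina' when ShipCountry itself is 'Argentina', we can drop the function entirely:

select count(*) from Orders where ShipCountry = 'Argentina'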

You want to make sure you are not applying a function over a column being used by an index, or you’ll miss the opportunity to use a seek operation.

Using a mutable function

Consider the following function:
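The function was shown as an image in the original; a hypothetical scalar UDF of the kind described (name and body are my own) might look like:

create function dbo.DefaultShipCountry()
returns nvarchar(15)
as
begin
    return 'Argentina'
end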

And the following queries:
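The queries were shown as images; a pair matching the description below, where the first calls a hypothetical UDF and the second uses the literal value, could be:

select count(*) from Orders where ShipCountry = dbo.DefaultShipCountry()

select count(*) from Orders where ShipCountry = 'Argentina'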

These queries will not do the same thing despite looking similar. The first one will perform an index scan and the second one an index seek. This happens because the first query contains a user-defined function that SQL Server treats as nondeterministic and evaluates on every row.

We can avoid this problem by calculating the value once, saving it to a variable and using that variable to filter the results:
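Assuming a scalar UDF such as dbo.DefaultShipCountry() (the name is illustrative), the rewrite looks like:

declare @country nvarchar(15) = dbo.DefaultShipCountry()
select count(*) from Orders where ShipCountry = @country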

This prevents the database engine from evaluating the function multiple times.

Ordering Data

Retrieving data ordered by a specific field (or multiple fields) kicks off a sorting process. This sorting process is expensive — if you think about it, most sorting algorithms have a complexity of O(n * log(n)).

Consider the following query:
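The query appeared as an image in the original; from the discussion that follows, it was presumably:

select * from Orders order by EmployeeID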

Currently, we have an index on EmployeeID, but whether it gets used depends on what we are trying to retrieve. The query is trying to retrieve all the columns in the table, but our index on EmployeeID only contains EmployeeID and OrderID (the primary key).

The engine would have to scan the index, and then for each row on the index, look at the table for the data (this is called a lookup) to be able to retrieve the other columns. At this point, it may be easier to just sort the table in memory. That’s why this query results in this execution plan:

By contrast, select EmployeeId from orders order by EmployeeID results in a totally different execution plan. It just does an Index Scan, which saves a lot of time and cost.

This last query only needs the index on EmployeeID to get all our data. You can also order by OrderID and still use this index as OrderID is the Clustered Key.

Joining Tables

Taking it a step further, let’s say we are trying to retrieve information from multiple tables in a single query. Let’s imagine that we are trying to select all the orders from our table and joining them with their order details. We will see something like this.
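Such a query, shown as an image in the original, was presumably of this shape (Northwind’s order details table is named “Order Details”):

select * from Orders o inner join [Order Details] od on od.OrderID = o.OrderID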

We are executing a clustered index scan in this case, effectively getting all the information in both tables. Instead, let’s imagine that we are only interested in data for one order:
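The query was shown as an image; a version filtering on a single order (the order ID value is illustrative) would be:

select * from Orders o inner join [Order Details] od on od.OrderID = o.OrderID where o.OrderID = 10248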

The resulting execution plan would be:

This plan makes sense based on what we saw before: the execution planner (the part of SQL Server that decides what to execute and in which order) can use the primary key (PK) of Orders and part of the PK of Order Details to execute a seek.

Let’s mouse over the clustered index seek on Order Details, where we can see the predicate (the actual search):

But what happens when we don’t have an index on the column we are searching for? What will we get if we perform the following query?
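The query was shown as an image; from the predicate quoted later in this section, it joined Order Details to Products and filtered on the product name:

select od.* from [Order Details] od inner join Products p on p.ProductID = od.ProductID where p.ProductName = 'Perth Pasties'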

Interesting! We are getting a bitmap and a probe as well as a suggestion to create an index.

We’ll ignore the index, and try to understand what a bitmap is and how it works with the rest of our query plan to get us the results we’re looking for.

Bitmaps

A bitmap is a structure similar to a Dictionary in C# or a HashMap in Java. The engine builds it by hashing a key and saving, under that key, the rest of the information it needs; in this case, the output list that can be seen in the following screenshot:

So, is it saving the whole table in memory? No. It’s saving only what actually fulfills the original filter p.ProductName = ‘Perth Pasties’.

Later, when the clustered index scan runs over Order Details, it can probe that Bitmap and for each row identify if the product belongs to the list of products for the name “Perth Pasties”.

The takeaway is that the “Bitmap and Probe” operation can only be applied because the Products table has a primary key defined that matches a column in the Order Details table. By creating the correct primary key (in this case, ProductID), we allow the database engine to reduce complexity, saving us time and processing power.

Data Cardinality

Query planners use information about the cardinality of your data to decide how they are going to execute a particular query. In the previous example, we observed that instead of searching for the product, the query planner created a bitmap on the Products table and then executed an index seek over the Order Details table.

If the query planner found that the IX_Order Details index, which contains ProductID, had many distinct values, it could have used an index seek instead of a clustered index scan with a bitmap probe. Let’s think about the following example:

Let’s say I have a People table with 1 million rows and a CountryID column that relates to a Country table with only 3 rows. Think for a second: what would be faster, a bitmap/probe or a clustered index seek? This is the perfect example for a bitmap/probe, because the bitmap will be very small (only 3 records) and it will be probed many times (1 million). That’s why knowing the cardinality of your data is so important!

The database engine keeps information about our tables, indexes and statistics. Statistics saved by the engine have information about the distribution of the data. I will not go into detail on statistics, and you can read more about them here.

The database engine may not even use an index on a column with few distinct values, because it would have to:

  • search the index, returning a large number of rows (since the column contains few distinct values)
  • look up each of those rows in the main table to retrieve the remaining columns (those not contained in the index)

It is safe to say that most engines update statistics automatically once enough data has changed since the last update. Still, many database administrators also update statistics periodically, either manually or on a schedule.

To update the statistics of the whole database in SQL Server, you can execute:

exec sp_updatestats

Please be aware that the first query after the statistics are updated will be slower, so use the time of the second run for your metrics.

Heaps

A heap is a table without a clustered index. Instead of locating rows by the clustered key, the engine locates them by RID (Row ID).

Using heaps, the engine can’t perform a seek as the data has no predefined order. It also cannot use bitmaps as it does not have any way to uniquely identify information.

A rule of thumb for heaps: if you are not intentionally trying to create one, avoid it. The only case I can think of for using a heap is storing data where you will always retrieve the whole set (table).

The easiest way to avoid it is to create a clustered primary key. You can find out more about heaps here.

Conclusion

You don’t need to understand every type of data access to write performant queries but you can definitely benefit from understanding the basics. In my experience, more than 95% of under-performing queries are slow because they are performing sequential scans over millions of records.

This article does not cover everything that can impact performance but if you are familiar with these concepts you will save yourself headaches in the future.

Glossary

Binary Search Algorithm:

https://www.upgrad.com/blog/binary-search-algorithm/

Heaps

https://docs.microsoft.com/en-us/sql/relational-databases/indexes/heaps-tables-without-clustered-indexes?view=sql-server-ver15

InnoDB vs MyISAM

Even though this was written for MariaDB, it applies similarly to MySQL.

Northwind Database

You can download the Northwind Database here. You can find instructions in that link. In particular I used this one in the article. In that particular database, I executed the following script to add much more records so performance can be demonstrated more easily:
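The script itself is not reproduced here; it is not the author’s actual script, but a sketch of the idea (repeatedly doubling the Orders table, with the column list abbreviated to a few of Northwind’s nullable columns) could look like:

declare @i int = 0
while @i < 10
begin
    insert into Orders (CustomerID, EmployeeID, OrderDate, ShipCity, ShipPostalCode, ShipCountry)
    select CustomerID, EmployeeID, OrderDate, ShipCity, ShipPostalCode, ShipCountry from Orders
    set @i += 1
end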

Statistics

https://docs.microsoft.com/en-us/sql/relational-databases/statistics/statistics?view=sql-server-ver15


Diego Garber is a Solutions Architect for Slalom Build Chicago, currently focused on cloud technologies and .NET Core.