SQL Server checklist for better performance

Agus Velazquez
7 min read · Jan 25, 2022

Before scratching your head, see if you check all the buckets!

Many times I've found myself writing long stored procedures or aggregations using multiple tables, joins, summarizations, window functions, etc., and once I finish I end up with a looong query that takes forever to run, so I start adding WHERE clauses everywhere to limit the data just to see what it is actually doing.

Before anything else, invest some time in optimizing your query performance; it will save you time down the road.

Here are some bullet points from a rookie that will save you some time.

Where to look at first

So the first one is the most intuitive: you need to know where your biggest wait times are. 80/20 rule!!

1. Understand top resources consumers.
There are a couple of ways to do this, but I think the most user-friendly is the Query Store, which you can enable in the database properties. By default, the Query Store has a view of the top 25 resource consumers for the database by different metrics, including duration.
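If you prefer T-SQL over the properties dialog, you can turn the Query Store on directly. A minimal sketch (the database name is a placeholder):

```sql
-- Enable the Query Store so SQL Server starts capturing query runtime stats
ALTER DATABASE MyDatabase
SET QUERY_STORE = ON (OPERATION_MODE = READ_WRITE);
```

After some workload has run, the "Top Resource Consuming Queries" report appears under the Query Store node in SSMS.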

Understand the data model

2. Every table should have a primary key.
By default, I think every table should have a primary key. A PK is a unique identifier: it enforces a constraint, allows you to create FKs, does not allow NULL values, and by default creates a clustered index (unless you explicitly specify the NONCLUSTERED option). All of this combined may improve the performance of your queries. You can use a natural key or a surrogate one, but make sure it doesn't take up too much space, since the clustering key is carried in every non-clustered index too.

Even though the main benefit is data integrity (a PK does not allow duplicates or NULL values), this also helps when the data quality is not great.
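As a quick sketch with a hypothetical table, a surrogate PK that creates the default clustered index looks like this:

```sql
-- Surrogate key; PRIMARY KEY creates a clustered index by default
CREATE TABLE dbo.Customers (
    CustomerId INT IDENTITY(1,1) NOT NULL,
    Email      NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (CustomerId)
);
```

The narrow INT key keeps every non-clustered index on this table small too.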

3. Set tables relationship using a foreign key.
FKs may help the optimizer build more efficient query plans because they are pre-declared rules that give it extra information. Note that SQL Server does not automatically create an index on an FK.

FK constraints also help catch missing data issues when doing refreshes or loading new data.
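Declaring the relationship is one statement; the table and column names below are hypothetical, continuing the example above:

```sql
-- Declare the relationship so the optimizer (and loads) can rely on it
ALTER TABLE dbo.Orders
ADD CONSTRAINT FK_Orders_Customers
    FOREIGN KEY (CustomerId) REFERENCES dbo.Customers (CustomerId);
```

Remember the FK column still has no index at this point; see the indexing item further down.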

4. Use NOT NULL and UNIQUE CONSTRAINTS if possible.
They give extra information to the Query Optimizer and can reduce logical and physical reads. You may also explore other constraints like DEFAULT or CHECK.

5. A highly normalized database may slow down performance.
Normalization is necessary and can have many advantages but in some cases, a certain amount of denormalization may improve database speed by reducing the number of tables and joins.

6. Use column-store index on large fact tables.
Use this type of index on fact tables where you perform aggregations or select only a few columns. You can also have row-store indexes on a column-store table, so you can still perform efficient seeks.
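A minimal sketch, assuming a hypothetical large fact table dbo.FactSales:

```sql
-- Clustered columnstore: great compression and fast aggregations
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;

-- Optional row-store index on top for efficient point lookups
CREATE NONCLUSTERED INDEX IX_FactSales_OrderId
ON dbo.FactSales (OrderId);
```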

Fine-tune queries

7. Replace SELECT * with only the columns you need.
It may be a massive waste of resources if you only require a few columns.

When doing an INSERT … SELECT, a SELECT INTO (or CTAS on platforms that support it), or even using views, you don't control the number and order of the columns that get selected and inserted, and some of these processes may fail silently if someone changes the table definition.

8. Avoid using wildcards.
If you can't avoid them, try to use them only at the end of the search string; a trailing wildcard still allows the optimizer to use an index on the searched column, while a leading one does not.
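To illustrate with a hypothetical Customers table that has an index on LastName:

```sql
-- Sargable: trailing wildcard can use an index seek on LastName
SELECT CustomerId FROM dbo.Customers WHERE LastName LIKE 'Vel%';

-- Not sargable: leading wildcard forces a scan of the index or table
SELECT CustomerId FROM dbo.Customers WHERE LastName LIKE '%quez';
```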

9. Avoid functions that convert mismatched data types on columns.
Every time you wrap a column in a function like CAST or CONVERT, the predicate applies to the result of the function instead of the column itself, so an index on that column can't be used.
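A common case is filtering a datetime column by day. The table below is hypothetical; the point is moving the conversion off the column:

```sql
-- Non-sargable: the function hides the column from the index
SELECT OrderId FROM dbo.Orders
WHERE CONVERT(DATE, OrderDate) = '2022-01-25';

-- Sargable rewrite: compare the raw column against a range instead
SELECT OrderId FROM dbo.Orders
WHERE OrderDate >= '2022-01-25' AND OrderDate < '2022-01-26';
```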

10. Avoid using SELECT DISTINCT.
DISTINCT is an expensive operation. Check whether you can redesign your data model; it can also sometimes be replaced by a GROUP BY, which may be faster.

11. Use UNION ALL instead of UNION if possible.
UNION retrieves only distinct records from all queries or tables, whereas UNION ALL returns all the records retrieved by queries. If you don’t need to check for duplicates just use UNION ALL.

12. Use NOT EXISTS/EXISTS instead of NOT IN/IN.
NOT EXISTS and NOT IN do not behave the same way when NULLs are involved. And even when the columns allow NULLs but contain none, NOT IN can perform significantly worse than NOT EXISTS.
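A sketch of the NOT EXISTS form, using the hypothetical tables from earlier:

```sql
-- Customers with no orders; behaves correctly even if CustomerId were nullable
SELECT c.CustomerId
FROM dbo.Customers AS c
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerId = c.CustomerId
);
```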

13. Limit the use of temporary tables.
Execution time is negatively impacted because of the extra I/O operations to READ the initial data, WRITE the data to the temporary table, READ data from the temporary table, and lastly drop the temporary table. Nonetheless, they may be a good option to replace CTEs that are used multiple times.

Execution plans and indexing strategy

Once you have identified the queries you want to optimize, run an Estimated Execution Plan and then an Actual Execution Plan. You need to concentrate your efforts on the costly operator based on the relative percentage.

Inputs and outputs of the Query Optimizer

14. Avoid Table or Index Scans.
It may represent unnecessary scanning of data, although an index scan is at least more efficient than a table scan. Look for Index Seek operators instead; they are the fastest way to retrieve data.

15. Avoid Sort operators if not necessary.
ORDER BY may increase the cost of a query considerably. Avoid them if not necessary or use the order of the clustered index if possible.

16. Create a clustered index.
When you create a PK, SQL Server automatically creates a clustered index on it, so you should already be covered if you checked the previous item on PKs.

17. Create a non-clustered index.
If your query joins on a PK and FK, you should create an index on the FK. Columns used in WHERE, ORDER BY, or GROUP BY clauses are also candidates for an index.
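Continuing the hypothetical Orders example, the FK column from item 3 finally gets its index here:

```sql
-- Support joins from Orders to Customers on the FK column
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId
ON dbo.Orders (CustomerId);
```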

18. Use covering index.
Use the INCLUDE clause to add more columns if performance is still not good enough. A covering index, by design, includes attributes that are not used as predicates but satisfy the data requirements of the query. Just bear in mind that indexes require maintenance and slow down write operations.
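A sketch with hypothetical columns: the key column serves the predicate, the INCLUDE columns cover the SELECT list so no key lookup is needed:

```sql
-- Covers: SELECT CustomerId, TotalAmount FROM dbo.Orders WHERE OrderDate = ...
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate)
INCLUDE (CustomerId, TotalAmount);
```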

19. If the index is a composite (multi-column) index, make sure you reference the leading (leftmost) column of the index.
Otherwise, the Query Optimizer may not use the index at all and fall back to a full scan.

20. Set up a job to rebuild/reorganize your indexes on a regular basis.
An index may become fragmented, so queries run poorly; the index gets rebuilt and they run well again, but then it gets fragmented again, and so on. Inserts can fragment an index quickly. A common rule of thumb is to rebuild when fragmentation exceeds 30 percent (and reorganize below that). You can check it in sys.dm_db_index_physical_stats.
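A sketch of the check and the fix (the index name in the ALTER is a placeholder):

```sql
-- Fragmentation per index in the current database
SELECT OBJECT_NAME(ips.object_id)       AS TableName,
       i.name                           AS IndexName,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30;

-- Then rebuild the offenders, for example:
ALTER INDEX IX_Orders_CustomerId ON dbo.Orders REBUILD;
```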

21. Drop unused index.
Query the DMV sys.dm_db_index_usage_stats and replace or drop the indexes that don't show any seeks, scans, or lookups.
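One way to sketch this: truly unused indexes may have no row in the DMV at all, so a LEFT JOIN from sys.indexes catches both cases. Note the stats reset on every instance restart, so check over a representative period.

```sql
-- Non-clustered indexes with no seeks, scans, or lookups since last restart
SELECT OBJECT_NAME(i.object_id) AS TableName, i.name AS IndexName
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
  ON s.object_id = i.object_id AND s.index_id = i.index_id
 AND s.database_id = DB_ID()
WHERE i.type_desc = 'NONCLUSTERED'
  AND ISNULL(s.user_seeks, 0) + ISNULL(s.user_scans, 0) + ISNULL(s.user_lookups, 0) = 0;
```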

22. Set up a job to update statistics on a regular basis.
By default, SQL Server updates statistics automatically (you can check whether the option is enabled), but only after a considerable number of changes. It is advisable to have a job that updates stats on a schedule as well.
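The job body can be as simple as this (the table in the second statement is a placeholder):

```sql
-- Update out-of-date statistics across the whole database
EXEC sp_updatestats;

-- Or target a specific table with a full scan for maximum accuracy
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;
```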

Extra

23. Select an appropriate SQL Edition.
Check the documentation to see which SQL edition you have. In Azure SQL Database, within each edition you also have a Basic, Standard, or Premium tier, and each has a different allowable database size and performance level.

24. Create jobs using SQL Server Agent.
Depending on your edition, you can create simple jobs using the SQL Server Agent to run stored procedures on a set schedule, either every day or during off-hours. You can also do this with almost any cloud provider, such as GCP, AWS, Azure, etc.

25. Check for blocks in your queries.
Sometimes a query may be blocked by another user or program. If that is the case, you won't be able to run it until the lock is released. There are many ways to identify blocking; one is to query sys.dm_exec_requests where blocking_session_id <> 0.
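That check looks like this:

```sql
-- Sessions currently blocked, and which session is blocking them
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
```

The blocking_session_id column tells you whom to chase down (or whose transaction to wait out).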

The goal of this post was to give you a comprehensive list of things to check or apply to improve your T-SQL query performance, based on my experience. The intention was not to cover all the theory behind each bucket, but rather to give you headlines as reminders or topics to explore further.

I am, by no means, a SQL Expert. Just wanted to summarize my experience using SQL as a Data Analyst or Engineer (sometimes).

Spend some time on these items; it's worth a shot!

Stay cool,
Agustin.

Here is the cheat sheet/bucket list:

Agus Velazquez

Industrial Engineer | MSc Statistics | Analytics Engineer