Member-only story
SQL — Understand how indices work under the hood to speed up your queries
No more waiting for slow queries to finish
Ever wondered how a database engine can return data for your queries that fast? How it can search through many tables, and millions of records in a flash? This article explores how the database engine works under the hood and sheds light on how to design your tables and indices in the most optimal way. No more waiting for queries to finish!
As usual we’ll first set up an example, creating some sample data so we have something to work with. Then we’ll check out how the database engine performs without indices. Then we’ll add indices to speed up our queries, demonstrating how you can too. At the end of this article you’ll:
- understand what an index is
- understand the types of indices and their differences
- understand how the indices work
- know in which situations to use which type of index
Note that in this article we’re using SQL Server, but the principle apply to many other relational databases like Postgres and MySQL e.g. The syntax might differ though. Let code!
Setup
For illustrating the code in this article I’ve created a table that a lot of applications use. It holds 20 milion records of User information and can be used to register new users, check passwords when logging in and changing user information. I’ve generated the table with Python, using the superfast insertion method described in this article. The code for creating the table looks like this:
CREATE TABLE [dbo].[Users] (
[Created] DATETIME NOT NULL DEFAULT GETUTCDATE()
, [Modified] DATETIME NOT NULL DEFAULT GETUTCDATE()
, [FirstName] NVARCHAR(100) NOT NULL
, [LastName] NVARCHAR(100) NOT NULL
, [UserName] NVARCHAR(100) NOT NULL
, [Password] NVARCHAR(100) NOT NULL
, [Age] INT NULL
);
As you can see it contains some columns that relate to the user (the last 5) and two columns that keep track of when the record is created and…