Enhancing Performance in our Database with Indexing
By the end of this blog post you should have familiarity with the following: What an index is and when to use them. Let’s get started.
An index is a sorted list of data that points to where something exists. To say it another way, it’s a copy of selected columns of data from a table that can be searched quickly and efficiently. A playful example will help illustrate this point. Let’s imagine Larry and Tom each have a separate collection of books about rare pet lizards. Since veterinarians don’t have working knowledge about rare pet lizards both Larry and Tom must understand how to take care of their lizards as thoroughly as possible. Luckily for Larry, instead of having to go through every page of every book he has on rare pet lizards to find the relevant content, he has indexed shelves by subject, author, and title on rare pet lizards and is able to quickly find what he needs. Tom, however, doesn’t have his books on rare pet lizards sorted in any particular way and consequently is highly inefficient at finding relevant information to help his lizard. Yikes, what a headache!
On a basic level, this is how database indexes work. Indexes are used to filter data. We often will use SQL queries to get a subset of data from our tables and can use an index to get that information at a fast rate, relative to if we had just used an integer value to track that attribute. Indexing seems pretty good, right? Well there’s a catch. Work is a finite resource and every bit of work your database is doing not serving queries is work it can’t spend serving queries. This is may seem obvious, but is all to critical a consideration when thinking about what is worth indexing. Putting an index on specific information is great for quickly accessing said information from our database, aka reading from our database, but can become really slow when adding more information to our database with highly specific parameters, aka writing to our database. So how do you know what should be indexed? As a general rule, we like to index foreign keys on our tables that have information we’ll frequently want to pull from our database. If you’re going to be doing sorts on data in the database, make sure to use indexing! This will make it so your Active Record and SQL queries don’t have to do extra lifting in order to sort the data. A simple example of how an Index can help is included in the following query: Person.order(“added_at DESC”).limit(10)
By indexing the added_at attribute on the Person table, we save our database from having to sort each row as part of the query. Every row is already sorted for that attribute allowing us to very quickly retrieve the data we want.