Indexing MySQL with Examples
Yesterday one of the mobile engineer in Skyfish asked me how indexes work in RDBMS, SQLite to be exact. He was looking for ways to shave time from a chat app he’s building. I told him the only explanation that makes sense for me is that an index equals the index in any book you’ve read.
I also told him that I don’t know about SQLite that much but I know MySQL. So the conversation continued with some writings on a paper. For this blog post, let’s say that we have an imaginary table of full names like below:
Let’s populate the table with some names.
The only index in that table is the default, the Primary Key.
Let’s do a quick glance on how the EXPLAIN statement in MySQL will help you.
The DESCRIBE and EXPLAIN statements are synonyms. In practice, the DESCRIBE keyword is more often used to obtain information about table structure, whereas EXPLAIN is used to obtain a query execution plan (that is, an explanation of how MySQL would execute a query) — Link
Pretty simple query right? Let’s make it more complex.
Still simple right? Now let’s add the EXPLAIN statement.
The output is saying to us that there are now index keys used and it scanned through the whole table (10 rows) to satisfy the WHERE clause. Let’s optimize this by giving an index to the last name.
Let’s re-run the EXPLAIN query.
As you can see, the query only scanned through 2 rows instead of 10 rows pre-index. It’s also saying that it’s using the last_names index we created for this query.
Question: how do we make sure that our queries are set up for success with the right indexes? To answer this, let’s take a look at another term you should be familiar with.
Cardinality is a property which affects the ability to cluster, sort and search data. It is therefore an important measurement for the query planners in DBs, it is a heuristic which they can use to choose the best plans — Link
So what’s the cardinality of our last_names index then?
The output says 9 while we actually have 10 rows. But we have row 1 and row 2 with the same last_name value: Harahap. This means that other than the unique value Harahap, there are 8 other values for the column. This is why on our last EXPLAIN query, MySQL only scanned through 2 rows which are in fact the only rows with the Harahap value.
However, low or high cardinalities don’t necessarily mean that your indexes are set for success (low latency reads). The success of an index relies more on the dataset and the queries themselves but I’m gonna save it for another day.
As software engineers are taking up more roles in the engineering process, one must be able to appreciate the roles they are taking up.