Using EXPLAIN in Mysql to analyze and improve query performance
Mysql EXPLAIN
statement lets us understand efficiency of our queries and grasp ideas on possible ways to optimize it. Just creating indexes is not sufficient to improve performace, using EXPLAIN
is mandatory to understand if indexes are used correctly and efficiently.
Long story short, EXPLAIN
shows how Mysql executes given query:
EXPLAIN SELECT * FROM c LIMIT 1
This will give us the following output:
From all columns, most important is rows
column, because it shows how many rows Mysql had to process to find answer to our query. Also, key
and key_len
columns will show info on an index used for the query. Let’s take a look on several examples.
Primary key selects
Since we use INNODB
engine and our table has ID
columns which is a primary key, query execution on primary key results in a single row scan:
As we can see, Mysql was able to find our row in a single step by using PRIMARY
key (that’s because clustered indexes are good for primary key lookups). We can also see, that key_len
gives us 8
which is the length of our primary key (because it has single 8-byte bigint
column in it).
Single column queries and indexes
If Mysql was unable to use index for the query, key
column will be empty under EXPLAIN
:
EXPLAIN SELECT count(*) FROM c WHERE country = 'Ukraine'
Which gives us the following:
Here we can see that Mysql scanned almost 100k
rows (which is the size of our table) and used no index to filter data.
Creating single column index
Let’s create an index on a country
column to see what happens with our query:
CREATE INDEX country on c(country);
Now let’s examine EXPLAIN
of the same query:
Now we can see country
(the name of our newely created index) under key
column which is good, because Mysql successfully used this index for query. It resulted in 14
rows being scanned instead of 100k
which is enormous resources saving.
Composite indexes
Let’s now take this query with filtering on 2 columns:
EXPLAIN SELECT * FROM c
WHERE country = 'Ukraine' AND industry = 'banking'
This gives us the following output:
Here we can see, that Mysql used the same index on country
column for the query. But we can see, that it actually selected all 14
rows again and then additionally filtered all of them to see if they meet industry
condition:
That’s why we see Using index condition
under Extra
column of EXPLAIN
. It means Mysql used index only to get first result set, and then had to iterate through each row in that set to check condition on second column.
Creating composite index
We can make more efficient index that Mysql can use for this case — composite index on both columns:
CREATE INDEX country_industry on c(country, industry);
Now EXPLAIN
will give us the following for our previous query:
Mysql used newely created index for this query and it ended up in scanning only 3
rows to get results for the query.
Possible keys and choosing indexes
Mysql will show possible indexes it analyzes before executing query in possible_keys
column of EXPLAIN
statement:
This means, 2 indexes were considered to be used:
country
which was not selected to use,country_industry
which was selected to use.
Mysql does the analysis on its own, but it’s pretty good in picking indexes that will yield in the best query performance.
Partial index usage
Let’s remove country
index (the one built for the single country
column):
DROP INDEX country on c;
Now let’s check how our first query performs:
EXPLAIN SELECT count(*) FROM c WHERE country = 'Ukraine'
And we can see that Mysql was still able to use country_industry
index to get results based on a single country
column value only:
That’s because Mysql is able to use partial index prefix for queries:
But note that it only works if query uses columns from index prefix. In case used columns are not prefixes, Mysql is unable to use index:
EXPLAIN SELECT * FROM c WHERE industry = 'banking'
Which will result in full table scan:
That’s because industry
column is in index, but is not an index prefix:
Composite indexes can be built using any number of columns:
CREATE INDEX filter
ON c(created, shard_page, shard_letter, country, industry)
It is a good idea consider building bigger (containing more columns) indexes to cover more query cases.
Length of used index
We can understand if Mysql used only part of an index by checking key_ley
column value of EXPLAIN
statement. For example, our query that uses both columns of an index:
EXPLAIN SELECT * FROM c
WHERE country = 'Ukraine' AND industry = 'banking';
This will return 1030
for key_len
column which is a full length of an index:
Our other query will return different key_len
value:
EXPLAIN SELECT * FROM c WHERE country = 'Ukraine';
And we can see:
That’s how we know if full index was used for the query or only a part of it.
Partial values usage
Mysql is so cool it can even use a part of a column value in index for queries execution. Let’s assume following query:
EXPLAIN SELECT * FROM c WHERE country LIKE 'Uk%'
As we can see, Mysql is still able to use index:
In this case Mysql will filter results based on partial value of an index column. But the same rule applies here, partial column value should be a prefix or otherwise index can’t be used:
EXPLAIN SELECT * FROM c WHERE country LIKE '%ne'
This will lead to full table scan:
Sorting
Let’s examine ordering query:
EXPLAIN SELECT * FROM c ORDER BY address DESC LIMIT 10
As we can see Mysql is unable to use any indexes for that:
When we see Using filesort
it doesn’t mean that Mysql actually does something with files, it’s just a name of a sorting operation used. But seeing this means Mysql is sorting results on its own (which is usually bad).
Indexes can be used for sorting as well as filtering, so creating the following index will help:
CREATE INDEX address on c(address)
And now query execution is efficient since Mysql used index for sorting:
We can as well mix index column usage in filtering and sorting:
SELECT * FROM c WHERE country = 'Ukraine' ORDER BY industry
And Mysql will be able to use our country_industry
index for that:
Summary
Using EXPLAIN
can help understand why your query is underperforming and check if your indexes are used in the most efficient way. You should first of all check rows
column to understand how much rows Mysql needs to scan to get results. Second is to check key
and key_len
columns to understand if index is used for the query and if full index was used.