Best practices for multi-column indexes in relational databases

Michael Lapin
Relbis Labs
Published in
3 min readJun 21, 2022

Multi-column indexes are also known as composite, combined, or concatenated indexes. They are used to improve the speed of data retrieval operations in relational databases. Multi-column indexes are more complex compared to single-column indexes and therefore require extra attention when defining them. Here are the best practices for creating and working with such indexes.

The right column order

Choose a column order that allows an index to be used by as many queries as possible. A query uses a multicolumn index whenever leading columns are included into the search.

As an example, let’s look at the table that contains a telephone directory and is defined as follows:

This table has a multicolumn index on Last_Name and First_Name columns:

Now, let’s look at three queries that use columns on which the index is based:

The first two queries benefit from the index since both of them use the first column that is included in the index.

The last query is not able to use the index because it filters data by the last column.

Just like using a real phone book that has subscribers listed in the alphabetical order first by last name and then by first name. It is easy to find subscribers by last name. It is also easy to find them by last name and first name together. But when it is necessary to find all subscribers with a specific first name, it will be required to scan through all phone book. It will not make any difference whether the first name that is used for a search is unique or not.

The most selective columns first myth

There is a popular belief that the most selective columns must always be the leading columns in an index. In other words, columns with a higher cardinality must appear before columns with lower cardinality in the definition of an index.

This is not true. As discussed earlier, the most important consideration when choosing the order of columns for an index is how this index will be used by queries.

Performance and storage tradeoffs

Database indexes bring improved speed of querying the data at the cost of additional writes and storage space that is used to maintain the data structure of these indexes. Each new index slows down insert, update and delete operations because the system has to keep indexes consistent with data modifications that are made to tables.

This is why it is recommended to create the minimum possible number of indexes with the smallest number of columns.

Originally published at https://labs.relbis.com.

--

--