Problems with concurrent Postgres indexes (and how to solve them)
Since I read the article linked below:
How to Create Postgres Indexes Concurrently in ActiveRecord Migrations
By default, Postgres locks writes (but not reads) to a table while creating…
I started using indexes in Postgres using the `algorithm: :concurrently` flag and always thought ‘why is the concurrently flag not the default used by Postgres?’ Now, months after the above post, I found the answer.
In January, we had an increased load on the database which led us to start a book club here in carwow called ‘Postgres Connoisseurs’ (thanks to Pericles, the genius behind the name). Since then, I’ve started reading ‘PostgreSQL 9.0 High Performance’; a slightly outdated book but still the best in its genre.
What does concurrently do?
An index created concurrently will not lock the table against writes, which means you can still insert or update new rows. In order to execute the (slightly slower) index creation, Postgres will do the following:
- Scans the table once to build the index;
- Runs the index a second time for things added or updated since the first pass.
The above should generally work fine, but what happens if we run the following index and we insert a duplicate between the two steps?
CREATE UNIQUE INDEX CONCURRENTLY index_on_users ON users (email)
Postgres will stop the creation of the index and it will be marked as ‘INVALID’; meaning that it won’t be used for queries, but it will still be updated like any other index.
How can I detect invalid indexes?
You can find all invalid indexes by running the following query:
SELECT * FROM pg_class, pg_index WHERE pg_index.indisvalid = false AND pg_index.indexrelid = pg_class.oid;
How can I rebuild invalid indexes?
You have two choices:
- Use the REINDEX command (not suggested);
- Drop the index and try to re-build it again (suggested).
REINDEX won’t re-create the index concurrently, it will lock the table for writes while the index is being created. This is the reason why this solution is not optimal.
With the second option you should drop the index, try to solve the problem that occurred upon index creation (e.g. remove any duplicates) and then re-build the index using the CONCURRENTLY flag again.