PostgreSQL at Scale: Database Schema Changes Without Downtime

Braintree Payments uses PostgreSQL as its primary datastore. We rely heavily on the data safety and consistency guarantees a traditional relational database offers us, but these guarantees come with certain operational difficulties. To make things even more interesting, we allow zero scheduled functional downtime for our main payments processing services.

Several years ago we published a blog post detailing some of the things we had learned about how to safely run DDL (data definition language) operations without interrupting our production API traffic.

Since that time PostgreSQL has gone through quite a few major upgrade cycles — several of which have added improved support for concurrent DDL. We’ve also further refined our processes. Given how much has changed, we figured it was time for a blog post redux.

In this post we’ll address the following topics:

First, some basics

For all code and database changes, we require that:

For all DDL operations we require that:

Transactionality

PostgreSQL supports transactional DDL. In most cases, you can execute multiple DDL statements inside an explicit database transaction and take an “all or nothing” approach to a set of changes. However, running multiple DDL statements inside a transaction has one serious downside: if you alter multiple objects, you’ll need to acquire exclusive locks on all of those objects in a single transactions. Because locks on multiple tables creates the possibility of deadlock and increases exposure to long waits, we do not combine multiple DDL statements into a single transaction. PostgreSQL will still execute each separate DDL statement transactionally; each statement will be either cleanly applied or fail and the transaction rolled back.

Note: Concurrent index creation is a special case. Postgres disallows executing CREATE INDEX CONCURRENTLY inside an explicit transaction; instead Postgres itself manages the transactions. If for some reason the index build fails before completion, you may need to drop the index before retrying, though the index will still never be used for regular queries if it did not finish building successfully.

Locking

PostgreSQL has many different levels of locking. We’re concerned primarily with the following table-level locks since DDL generally operates at these levels:

Note: “Concurrent DDL” for these purposes includes VACUUM and ANALYZE operations.

All DDL operations generally necessitate acquiring one of these locks on the object being manipulated. For example, when you run:

PostgreSQL attempts to acquire an ACCESS EXCLUSIVE lock on the table foos. Attempting to acquire this lock causes all subsequent queries on this table to queue until the lock is released. In practice your DDL operations can cause other queries to back up for as long as your longest running query takes to execute. Because arbitrarily long queueing of incoming queries is indistinguishable from an outage, we try to avoid any long-running queries in databases supporting our payments processing applications.

But sometimes a query takes longer than you expect. Or maybe you have a few special case queries that you already know will take a long time. PostgreSQL offers some additional runtime configuration options that allow us to guarantee query queueing backpressure doesn’t result in downtime.

Instead of relying on Postgres to lock an object when executing a DDL statement, we acquire the lock explicitly ourselves. This allows us to carefully control the time the queries may be queued. Additionally when we fail to acquire a lock within several seconds, we pause before trying again so that any queued queries can be executed without significantly increasing load. Finally, before we attempt lock acquisition, we query pg_locks¹ for any currently long running queries to avoid unnecessarily queueing queries for several seconds when it is unlikely that lock acquisition is going to succeed.

Starting with Postgres 9.3, you adjust the lock_timeout parameter to control how long Postgres will allow for lock acquisition before returning without acquiring the lock. If you happen to be using 9.2 or earlier (and those are unsupported; you should upgrade!), then you can simulate this behavior by using the statement_timeout parameter around an explicit LOCK <table> statement.

In many cases an ACCESS EXCLUSIVE lock need only be held for a very short period of time, i.e., the amount of time it takes Postgres to update its "catalog" (think metadata) tables. Below we'll discuss the cases where a lower lock level is sufficient or alternative approaches for avoiding long-held locks that block SELECT/INSERT/UPDATE/DELETE.

Note: Sometimes holding even an ACCESS EXCLUSIVE lock for something more than a catalog update (e.g., a full table scan or even rewrite) can be functionally acceptable when the table size is relatively small. We recommend testing your specific use case against realistic data sizes and hardware to see if a particular operation will be "fast enough". On good hardware with a table easily loaded into memory, a full table scan or rewrite for thousands (possibly even 100s of thousands) of rows may be "fast enough".

Table operations

Create table

In general, adding a table is one of the few operations we don’t have to think too hard about since, by definition, the object we’re “modifying” can’t possibly be in use yet. :D

While most of the attributes involved in creating a table do not involve other database objects, including a foreign key in your initial table definition will cause Postgres to acquire a SHARE ROW EXCLUSIVE lock against the referenced table blocking any concurrent DDL or row modifications. While this lock should be short-lived, it nonetheless requires the same caution as any other operation acquiring such a lock. We prefer to split these into two separate operations: create the table and then add the foreign key.

Drop table

Dropping a table requires an exclusive lock on that table. As long as the table isn’t in current use you can safely drop the table. Before allowing a DROP TABLE ... to make its way into our production environments we require documentation showing when all references to the table were removed from the codebase. To double check that this is the case you can query PostgreSQL's table statistics view pg_stat_user_tables² confirming that the returned statistics don't change over the course of a reasonable length of time.

Rename table

While it’s unsurprising that a table rename requires acquiring an ACCESS EXCLUSIVE lock on the table, that's far from our biggest concern. Unless the table is not being read from or written to, it's very unlikely that your application code could safely handle a table being renamed underneath it.

We avoid table renames almost entirely. But if a rename is an absolute must, then a safe approach might look something like the following:

Other approaches involving views and/or RULEs may also be viable depending on the performance characteristics required.

Column operations

Note: For column constraints (e.g., NOT NULL) or other constraints (e.g., EXCLUDES), see Constraints.

Add column

Adding a column to an existing table generally requires holding a short ACCESS EXCLUSIVE lock on the table while catalog tables are updated. But there are several potential gotchas:

Default values: Introducing a default value at the same time of adding the column will cause the table to be locked while the default value in propagated for all rows in the table. Instead, you should:

Note: In the recently release PostgreSQL 11, this is no longer the case for non-volatile default values. Instead adding a new column with a default value only requires updating catalog tables, and any reads of rows without a value for the new column will magically have it “filled in” on the fly.

Not-null constraints: Adding a column with a NOT NULL constraint is only possible if there are no existing rows or a DEFAULT is also provided. If there are no existing rows, then the change is effectively equivalent to a catalog only change. If there are existing rows and you are also specifying a default value, then the same caveats apply as above with respect to default values.

Note: Adding a column will cause all SELECT * FROM ... style queries referencing the table to begin returning the new column. It is important to ensure that all currently running code safely handles new columns. To avoid this gotcha in our applications we require queries to avoid * expansion in favor of explicit column references.

Change column type

In the general case changing a column’s type requires holding an exclusive lock on a table while the entire table is rewritten with the new type.

There are a few exceptions:

Note: Even though one of the exceptions above was added in 9.1, changing the type of an indexed column would always rewrite the index even if a table rewrite was avoided. In 9.2 any column data type that avoids a table rewrite also avoids rewriting the associated indexes as long as Postgres can verify that the logical sort order remains the same (for example, a collation change on a text column will still require rebuilding indexes). If you’d like to confirm that your change won’t rewrite the table or any indexes, you can query pg_class³ and verify the relfilenode column doesn't change.

If you need to change the type of a column and one of the above exceptions doesn’t apply, then the safe alternative is:

Drop column

It goes without saying that dropping a column is something that should be done with great care. Dropping a column requires an exclusive lock on the table to update the catalog but does not rewrite the table. As long as the column isn’t in current use you can safely drop the column. It’s also important to confirm that the column is not referenced by any dependent objects that could be unsafe to drop. In particular, any indexes using the column should be dropped separately and safely with DROP INDEX CONCURRENTLY since otherwise they will be automatically dropped along with the column under an ACCESS EXCLUSIVE lock. You can query pg_depend for any dependent objects.

Before allowing a ALTER TABLE ... DROP COLUMN ... to make its way into our production environments we require documentation showing when all references to the column were removed from the codebase. This process allows us to safely roll back to the release prior to the one that dropped the column.

Note: Dropping a column will require that you update all views, triggers, function, etc. that rely on that column.

Index operations

Create index

The standard form of CREATE INDEX ... acquires an ACCESS EXCLUSIVE lock against the table being indexed while building the index using a single table scan. In contrast, the form CREATE INDEX CONCURRENTLY ... acquires an SHARE UPDATE EXCLUSIVE lock but must complete two table scans (and hence is somewhat slower). This lower lock level allows reads and writes to continue against the table while the index is built.

Caveats:

Drop index

The standard form of DROP INDEX ... acquires an ACCESS EXCLUSIVE lock against the table with the index while removing the index. For small indexes this may be a short operation. For large indexes, however, file system unlinking and disk flushing can take a significant amount of time. In contrast, the form DROP INDEX CONCURRENTLY ... acquires a SHARE UPDATE EXCLUSIVE lock to perform these operations allowing reads and writes to continue against the table while the index is dropped.

Caveats:

Note: DROP INDEX CONCURRENTLY ... was added in Postgres 9.2. If you're still running 9.1 or prior, you can achieve somewhat similar results by marking the index as invalid and not ready for writes, flushing buffers with the pgfincore extension, and the dropping the index.

Rename index

ALTER INDEX ... RENAME TO ... requires an ACCESS EXCLUSIVE lock on the index blocking reads from and writes to the underlying table. However a recent commit expected to be a part of Postgres 12 lowers that requirement to SHARE UPDATE EXCLUSIVE.

Reindex

REINDEX INDEX ... requires an ACCESS EXCLUSIVE lock on the index blocking reads from and writes to the underlying table. Instead we use the following procedure:

However in Postgres 12+ CONCURRENTLY support has been added to REINDEX which essentially implemented the procedure outline above into a single command. The same caveats apply as the CONCURRENTLY variants of CREATE INDEX and DROP INDEX.

Note: If the index you need to rebuild backs a constraint, remember to re-add the constraint as well (subject to all of the caveats we’ve documented.)

Constraints

NOT NULL Constraints

Removing an existing not-null constraint from a column requires an exclusive lock on the table while a simple catalog update is performed.

In contrast, adding a not-null constraint to an existing column requires an exclusive lock on the table while a full table scan verifies that no null values exist. Instead you should:

Bonus: There is currently a patch in the works (and possibly it will make it into Postgres 12) that will allow you to create a NOT NULL constraint without a full table scan if a CHECK constraint (like we created above) already exists.

Foreign keys

ALTER TABLE ... ADD FOREIGN KEY requires a SHARE ROW EXCLUSIVE lock (as of 9.5) on both the altered and referenced tables. While this won't block SELECT queries, blocking row modification operations for a long period of time is equally unacceptable for our transaction processing applications.

To avoid that long-held lock you can use the following process:

Check constraints

ALTER TABLE ... ADD CONSTRAINT ... CHECK (...) requires an ACCESS EXCLUSIVE lock. However, as with foreign keys, Postgres supports breaking the operation into two steps:

Uniqueness constraints

ALTER TABLE ... ADD CONSTRAINT ... UNIQUE (...) requires an ACCESS EXCLUSIVE lock. However, Postgres supports breaking the operation into two steps:

Note: If you specify PRIMARY KEY instead of UNIQUE then any nullable columns in the index will be made NOT NULL. This requires a full table scan which currently can't be avoided. See NOT NULL Constraints for more details.

Exclusion constraints

ALTER TABLE ... ADD CONSTRAINT ... EXCLUDE USING ... requires an ACCESS EXCLUSIVE lock. Adding an exclusion constraint builds the supporting index, and, unfortunately, there is currently no support for using an existing index (as you can do with a unique constraint).

Enum Types

CREATE TYPE <name> AS (...) and DROP TYPE <name> (after verifying there are no existing usages in the database) can both be done safely without unexpected locking.

Modifying enum values

ALTER TYPE <enum> RENAME VALUE <old> TO <new> was added in Postgres 10. This statement does not require locking tables which use the enum type.

Deleting enum values

Enums are stored internally as integers and there is no support for gaps in the valid range, removing a value would currently shifting values and rewriting all rows using those values. PostgreSQL does not currently support removing values from an existing enum type.

Announcing Pg_ha_migrations for Ruby on Rails

We’re also excited to announce that we have open-sourced our internal library pg_ha_migrations. This Ruby gem enforces DDL safety in projects using Ruby on Rails and/or ActiveRecord with an emphasis on explicitly choosing trade-offs and avoiding unnecessary magic (and the corresponding surprises). You can read more in the project’s README.

Footnotes

[1] You can find active long-running queries and the tables they lock with the following query:

[2] You can see PostgreSQL’s internal statistics about table accesses with the following query:

[3] You can see if DDL causes a relation to be rewritten by seeing if the relfilenode value changes after running the statement:

[4] You can find objects (e.g., indexes) that depend on a specific column by running the statement:

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store