Yet another Postgres migration: updating primary key type to bigint

Alex Limontov
Published in PandaDoc Tech Blog
7 min read · Mar 10, 2022

Problem statement

You have a table with a primary key constraint on a single integer column (let it be id), and the number of records is close to (or will soon reach) ~2b — the limit for a 32-bit integer is 2,147,483,647. In this case, the column will soon overflow, and your database will start rejecting new insertions:

The types smallint, integer, and bigint store whole numbers, that is, numbers without fractional components, of various ranges. Attempts to store values outside of the allowed range will result in an error.
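
For example (using a throwaway table), trying to store a value above the integer range fails:

-- integer columns hold values only up to 2147483647
CREATE TEMP TABLE overflow_demo (id integer);
INSERT INTO overflow_demo VALUES (2147483647); -- ok, the maximum value
INSERT INTO overflow_demo VALUES (2147483648); -- ERROR: integer out of range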

To avoid this scenario, you have to migrate the column type to bigint. If the column has a default produced by a sequence (like the default auto-increment sequence), you have to migrate the sequence as well.

Apart from migrating with no downtime, it is a good practice to implement your scripts in a “pause & resume” manner for:

  • easier testing,
  • and the ability to safely rerun it (it may take hours or days to finish, so we can pause it during high-load periods to avoid overwhelming database resources).

This can be achieved by using … IF NOT EXISTS … when creating objects, dropping existing objects only if they exist, and so on. You will see such blocks in the examples below.

Testing

For testing purposes, it’s good to reproduce the same conditions as in production:

  1. Run the script against the same (or almost the same) table volume.
  2. Run the script against the table under load (i.e. while it is being used by an app).

The suggestion is to use the following snippets:

1. Create a primitive table (we will run our migration against it later):
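
A minimal table like the following is enough (the value column is just a hypothetical payload; test_table is the name used throughout the examples):

CREATE TABLE IF NOT EXISTS test_table (
    id    serial PRIMARY KEY,
    value text
);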

2. Populate the table with enough data (e.g. 10m records will be enough):
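
One way to generate synthetic rows is generate_series (using the sketch table above):

-- ~10m rows of random payload
INSERT INTO test_table (value)
SELECT md5(random()::text)
FROM generate_series(1, 10000000);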

3. Run a script that reads from and inserts into the table quite often. The script should show the moments in time when a request took too long to process or simply failed:
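
Any client that keeps reading and writing and logs slow or failed requests will do. A rough plpgsql stand-in (PostgreSQL 11+, which allows COMMIT inside DO blocks) could look like this; note that it only tracks latency and doesn’t catch errors:

DO $$
DECLARE
    started timestamptz;
    elapsed interval;
BEGIN
    LOOP
        started := clock_timestamp();
        -- one write and one read per iteration
        INSERT INTO test_table (value) VALUES (md5(random()::text));
        PERFORM id FROM test_table ORDER BY id DESC LIMIT 1;
        COMMIT; -- give each iteration its own transaction
        elapsed := clock_timestamp() - started;
        IF elapsed > interval '1 second' THEN
            RAISE NOTICE 'slow request at %: took %', now(), elapsed;
        END IF;
        PERFORM pg_sleep(0.1);
    END LOOP;
END;
$$;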

Straightforward solution

A straightforward solution would be to alter the existing column to the new type. It doesn’t work for big tables, as such an operation blocks the table against writes:
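
Something along these lines, using the test_table name from the testing section:

-- rewrites the whole table and blocks writes while doing so
ALTER TABLE test_table ALTER COLUMN id TYPE bigint;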

This command will have to rewrite the whole table, because bigint takes 8 bytes of storage rather than the 4 of an integer. The table will be locked from concurrent access while this is taking place, so with a big table you should be prepared for a longer downtime.

If you expect that this could be necessary, perform the change as soon as possible, while the table is still small.

Note about serial types

PostgreSQL documentation mentions SERIAL types in the following way:

The data types smallserial, serial and bigserial are not true types, but merely a notational convenience for creating unique identifier columns (similar to the AUTO_INCREMENT property supported by some other databases). In the current implementation, specifying:

CREATE TABLE tablename (
    colname SERIAL
);

is equivalent to specifying:

CREATE SEQUENCE tablename_colname_seq AS integer;
CREATE TABLE tablename (
    colname integer NOT NULL DEFAULT nextval('tablename_colname_seq')
);
ALTER SEQUENCE tablename_colname_seq OWNED BY tablename.colname;

As “serial” is just notation that unwraps into the code above at column creation time, we cannot simply make an existing column “serial”. And adding a new “serial” column instead would block the table against writes, as the database would physically write the new sequence’s default values into every existing row.

In place migration

A possible option to migrate the type is to split the required changes into small parts and handle them separately.

The algorithm consists of four steps:

  1. Create a new column with the required type (bigint). This can be done in two ways:
    — with a NOT NULL DEFAULT <CONST> directive - this works with no downtime on PostgreSQL 11+ (we will go with this approach in this article),
    — make the column nullable, which will later require an additional “not null” constraint for the primary key constraint.
  2. Duplicate id values to the new column.
    — For new insertions, we can set up a trigger or handle this in application code.
    — For existing values, we will do this in batches with a sleep in between (why you need batches).
  3. Create a unique index on the new column concurrently. This will be required to create a primary key constraint in the future.
  4. Promote the new column as a new primary key.

New column creation

Usually, if we want to make a column non-nullable or apply a unique constraint, we modify the application code to handle such cases (e.g. provide the same value for the target column as for the source one). But for an autogenerated primary key this may not be possible, since the database handles value generation. We could call the underlying sequence or function to generate the primary key value before inserting new records, but this results in an additional round trip to the database.

Another possible approach is to create a trigger on insert and update operations that duplicates the primary key value into the target column as soon as the new column is created. This removes the need to modify the application code at all.
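
In sketch form (the new_id column name and the duplicate_id function and trigger names are illustrative):

-- a constant default avoids a full table rewrite on PostgreSQL 11+
ALTER TABLE test_table
    ADD COLUMN IF NOT EXISTS new_id bigint NOT NULL DEFAULT 0;

CREATE OR REPLACE FUNCTION duplicate_id() RETURNS trigger AS $$
BEGIN
    NEW.new_id := NEW.id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- DROP + CREATE keeps the step rerunnable
DROP TRIGGER IF EXISTS duplicate_id_trigger ON test_table;
CREATE TRIGGER duplicate_id_trigger
    BEFORE INSERT OR UPDATE ON test_table
    FOR EACH ROW EXECUTE FUNCTION duplicate_id();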

A trigger function to duplicate a primary key value into a new column. We use “IF NOT EXISTS” construction for “pause & resume”

Column values duplication

Updating in batches is pretty straightforward:
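
A sketch of such an update (the batch size, the sleep interval, and the assumption of a roughly dense id range are arbitrary choices here):

DO $$
DECLARE
    batch_size constant bigint := 10000;
    last_id    bigint := 0;
    max_id     bigint;
BEGIN
    SELECT max(id) INTO max_id FROM test_table;
    WHILE last_id < coalesce(max_id, 0) LOOP
        UPDATE test_table
        SET new_id = id
        WHERE id > last_id AND id <= last_id + batch_size;
        last_id := last_id + batch_size;
        COMMIT;              -- release locks between batches (PostgreSQL 11+)
        PERFORM pg_sleep(1); -- throttle to keep the load manageable
    END LOOP;
END;
$$;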

The main point here is the ability to rerun this step efficiently, as it is the longest part of the migration. Without any specific preparation, rerunning the migration will not reuse previous work and thus will not reduce its execution time.

There are a couple of ways of implementing this:

  • The first approach relies on some configuration parameter (e.g. an environment variable provided by a developer, or a value temporarily stored in the database) to know the last updated record and start the update from it.
  • The second approach doesn’t require any variables: it runs through the table records in ascending primary key order and finds the first row whose new column is NULL (if the column is nullable) or still equal to the default constant specified at column creation time (such a default constant must not collide with any existing values!). Here we scan the same amount of data as was already updated, but we do a faster “read” operation instead of an “update” (see the sketch after this list).
  • Anything else that can distinguish updated from non-updated records.
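
For instance, for the second approach, assuming the column was created with DEFAULT 0 and 0 can never be a real id, the resume point can be found with a plain read:

-- first row that has not been copied yet; the query walks the primary key
-- in ascending order, skipping the rows that were already updated
SELECT id
FROM test_table
WHERE new_id = 0 -- or: new_id IS NULL, if the column is nullable
ORDER BY id
LIMIT 1;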

Unique index creation

A primary key constraint can be split into two constraints:

  • NOT NULL
  • UNIQUE

The non-null part is already enforced by the column creation (or by adding a separate “not null” constraint if the column was created without a constant default). The remaining condition is value uniqueness.

Zero-downtime unique index creation is done via the following construction:
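
With the names used above (the index name is arbitrary), roughly:

CREATE UNIQUE INDEX CONCURRENTLY IF NOT EXISTS test_table_new_id_idx
    ON test_table (new_id);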

If a previous attempt left an invalid index behind, it is dropped concurrently so that the drop doesn’t lock the table against writes:
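
Using the same hypothetical index name:

DROP INDEX CONCURRENTLY IF EXISTS test_table_new_id_idx;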

Drop the index without locking out concurrent selects, inserts, updates, and deletes on the index’s table. A normal DROP INDEX acquires an ACCESS EXCLUSIVE lock on the table, blocking other accesses until the index drop can be completed. With this option, the command instead waits until conflicting transactions have completed.

Concurrent index creation waits until the whole table has been scanned twice and until transactions that could use the index have finished. Verifying in a loop that the index ended up valid (and dropping and recreating it if it didn’t) is necessary, because a failed concurrent build, for example one interrupted while waiting out long-running transactions that started before the index creation, leaves an invalid index behind. More on that can be found in the “Building Indexes Concurrently” section of the documentation.
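
A validity check against the system catalogs can look like this (again with the hypothetical index name):

SELECT i.indisvalid
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE c.relname = 'test_table_new_id_idx';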

Promote new primary key

The final puzzle piece is promoting the new column to be the new primary key. In general, we would like to end up with the same column name (id) for the primary key. To do so, we should perform the following steps:

  1. Clean up temporary objects (e.g. the trigger).
  2. Rename the new column to id.
  3. Alter the default sequence to the bigint type.
  4. Add a primary key constraint on the fresh id column using the unique index (and a “not null” constraint if you chose to create the new column without a constant default value).

Here is a SQL script for this (note that it should be executed in a single transaction):
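
A sketch of what it could look like, assuming the default object names from the examples above (test_table_id_seq, test_table_pkey, test_table_new_id_idx):

BEGIN;

-- 1. Drop the temporary trigger and its function.
DROP TRIGGER IF EXISTS duplicate_id_trigger ON test_table;
DROP FUNCTION IF EXISTS duplicate_id();

-- 2. Swap the column names.
ALTER TABLE test_table RENAME COLUMN id TO old_id;
ALTER TABLE test_table RENAME COLUMN new_id TO id;

-- 3. Move the sequence to bigint and attach it to the new column.
ALTER SEQUENCE test_table_id_seq AS bigint OWNED BY test_table.id;
ALTER TABLE test_table
    ALTER COLUMN id SET DEFAULT nextval('test_table_id_seq');

-- 4. Replace the primary key using the prebuilt unique index.
ALTER TABLE test_table DROP CONSTRAINT test_table_pkey;
ALTER TABLE test_table
    ADD CONSTRAINT test_table_pkey PRIMARY KEY USING INDEX test_table_new_id_idx;

-- the old integer column can be dropped now or later
ALTER TABLE test_table DROP COLUMN old_id;

COMMIT;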

Post release

When the database updates so many records, it creates new row versions and leaves the old ones behind as “dead” tuples; the space they occupy is not automatically handed back for new data. To let the database reuse this space, we need to execute the VACUUM command, which reclaims the storage occupied by dead tuples and makes it available for new insertions. This operation doesn’t block regular database operations.

To help the query planner build better plans, we should run ANALYZE as well, which refreshes the table statistics. VERBOSE will additionally print a detailed vacuum activity report for each table.

And we can combine these two:

VACUUM (VERBOSE, ANALYZE) test_table;

Immutable migration

Another possible way of solving this problem is an immutable migration. Immutable here means that we don’t touch the existing table at all; instead, we create a new table (a deep clone of the original one) with the required parameters (in our case, a bigint primary key) and then switch the tables by swapping their names.

The algorithm consists of the following steps:

  1. Create a new table with the required changes.
  2. Duplicate new insertions/updates/deletes on the original table into the new table using triggers.
  3. Copy the rows of the original table that existed before the two steps above into the target table.
  4. Swap table names.

For append-only tables this may be even easier — the only type of trigger we need to create is an insertion one.
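
A condensed sketch of the append-only case (test_table_new and the trigger names are illustrative; the sequence type and its ownership need the same care as in the in place migration):

-- 1. A clone of the table structure with a bigint primary key.
CREATE TABLE IF NOT EXISTS test_table_new (LIKE test_table INCLUDING ALL);
ALTER TABLE test_table_new ALTER COLUMN id TYPE bigint;

-- 2. Mirror new insertions into the clone.
CREATE OR REPLACE FUNCTION mirror_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO test_table_new SELECT NEW.*;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS mirror_insert_trigger ON test_table;
CREATE TRIGGER mirror_insert_trigger
    AFTER INSERT ON test_table
    FOR EACH ROW EXECUTE FUNCTION mirror_insert();

-- 3. Copy the pre-existing rows (in batches in a real migration).
INSERT INTO test_table_new
SELECT * FROM test_table
ON CONFLICT (id) DO NOTHING;

-- 4. Swap the names in a single transaction.
BEGIN;
ALTER TABLE test_table RENAME TO test_table_old;
ALTER TABLE test_table_new RENAME TO test_table;
COMMIT;

-- afterwards: drop the mirror trigger and function, re-own the sequence
-- to the new table, and drop test_table_old when it is no longer needed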

Example table migration

Note: this may not work in all cases, e.g. with very specific constraints that can prevent the trigger from duplicating operations that succeeded on the original table or from copying rows.

A benchmark showed that this type of migration is about as fast as the in place one: running the immutable migration took roughly as long as updating the rows in place, while the latter still has the index build in its final steps. So this may be a good alternative.

Happy migration!
