Dynamic partitioning in PostgreSQL

Timothy Carbone
4 min readSep 10, 2019

--

PostgreSQL 11 comes with a lot of improvements on partitioning like index keys now propagating from master to partitions table. It makes it easier to dynamically generate partitions which will be read transparently for the client.

The problem: default partition accumulating too much data

If we focus on range partitioning for this time, it’s clear that each partition should contain the data that belongs to a specific range of IDs. IDs not belonging to any partition are to be stored in a default partition.

One of the property of IDs is that they’re usually always growing and increasing. When we create partitions, we’re able to cover the whole ID spectrum (from min(t) to max(t)) at the moment of partitioning t. One month later (t+1), it’s likely that the maximum id has increased by a lot and that max(t+1) >> max(t). All the ids between max(t) and max(t+1) are stored in the default partition. This means that if we don’t re-generate new partitions, our default partition will accumulate more and more data and it will ultimately defeat the purpose of having partitions at all.

The solution: distribute default partition rows to new partitions

Once in a while, we need to shrink our default partition by creating new partitions and moving the data out of the default partition and into the corresponding partitions.

We can achieve this either by pulling partitions from the information_schema.tables table or by maintaining a manifest table. For example, the manifest would contain the IDs delimiting all the partitions. When a new ID meets the criteria to form a new partition (ex: ID is 20k IDs away from the maximum partitioned id max(t)), we create a new partition and insert the ID in the manifest. Note that you can use a single manifest to organize multiple partitioned tables by simply adding a field that identifies the table.

The process

Here’s the balancing process. Note that this is a sample with pseudo code but the process can be fully automated through code.

1* Figure out if new partitions need to be created:
We query the manifest and the users table to understand how many partitions are already created and what are the partitions that would need to be created.

WITH max_t (
SELECT MAX(user_id) as user_id
FROM manifest
)
SELECT u.user_id
FROM users u, max_t m
WHERE u.user_id > m.user_id
AND u.user_id % 20000 = 1;

2* Copy the default partition up to max(t+1) to a temporary table:
Every ID in the default table up to max(t+1) will be stored in a newly created partition. We copy them to a temporary table.

CREATE temp_partition AS
(SELECT * FROM partition_default WHERE id < max(t+1);

3* Delete from the default partition where id < max(t+1):
To prevent duplicates and partition collisions, we delete the data that we just copied from the default partition. We’re not allowed to have data that should belong to a partition into another partition. If we don’t delete the data in the default partition, we won’t be able to create the new partitions as they should hold data that belong to the default partition.

DELETE FROM partition_default WHERE id < max(t+1);

4* Create the new partitions:
We create the partitions that need to be created.

CREATE TABLE partition_80000_100000 PARTITION OF master_table;
CREATE TABLE partition_100000_120000 PARTITION OF master_table;
...

Note that since PostgreSQL 11, partitions automatically inherit indexes and primary keys from the master table.

5* Insert the delimiting ids into the manifest table:
We update the manifest with the delimiting IDs of the new partition. If you’re confident that your partitioning strategy is stable and incompressible, you could also only store the max ID.

INSERT INTO manifest VALUES (100000), (120000);

6* Copy the data from the temp table into the corresponding partitions:
We can now copy the rows from the temporary table to their corresponding partitions.

INSERT INTO partition_80000_100000
(SELECT * FROM temp_partition WHERE id BETWEEN 80000 AND 99999);
INSERT INTO partition_100000_120000
(SELECT * FROM temp_partition WHERE id BETWEEN 100000 AND 119999);
...

7* Drop the temporary table:
To go back to a clean state, we drop the temporary table.

DROP TABLE temp_partition;

That’s it, rows from our default partition that should be stored in partitions have been moved. The default partition drastically decreases in volume and queries searching for these rows should now be faster.

Querying dynamic partitions

From a reading client perspective, nothing changes at all. The client will hit the same query on the master table but the performance will be increased because the default partition is now balanced into newly created ranged partitions.

SELECT * FROM master_table WHERE id = 81000 AND id = 104000;

The query planner is smart enough to only hit the corresponding partitions.

--

--