
Optimize PostgreSQL: Minor Version Upgrade Guide

A Journey Through PostgreSQL’s Minor Upgrade Landscape

Anas Anjaria
6 min read · Mar 3, 2024

This story is about how I upgraded our self-managed production system running PostgreSQL with Timescale database.

When it comes to upgrading a PostgreSQL database, there are two ways to handle the upgrade:

  1. Minor version upgrade.
  2. Major version upgrade.

Minor version upgrades are comparatively easy and backward compatible, whereas major version upgrades are complex because they change the internal data storage format. This also means that major version upgrades are not backward compatible.

This post focuses on minor version upgrade. There is a dedicated post for major version upgrade 😉.

Cluster Overview

High-level overview of our PostgreSQL cluster

Our cluster consists of 3 nodes in AWS cloud. Each node runs the following components:

  1. PostgreSQL with Timescale DB — Manages time-series data.
  2. Patroni — Ensures high availability with automatic replication and failover.
  3. HAProxy — Acts as a load balancer, distributing write workloads to the leader node and read workloads to replica nodes (see the health-check sketch below).

etcd nodes are dedicated AWS EC2 instances.
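For context, HAProxy can route traffic without knowing which node is the leader: Patroni exposes role-based health-check endpoints on port 8008 that HAProxy can probe. A minimal sketch (the exact haproxy.cfg is out of scope here; localhost stands in for a node's address):

# Returns HTTP 200 only on the current leader, used by the write backend.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/primary

# Returns HTTP 200 only on a healthy replica, used by the read backend.
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8008/replica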

Now that we understand our cluster's architecture, let's dive into the upgrade process.

Prerequisites

Before embarking on the upgrade process, it’s imperative to address several key aspects:

  1. Cluster ground-truth — Compile a comprehensive list of all potential use cases supported by the cluster to validate functionality during testing.
  2. Review release notes — Even though minor version upgrades are typically backward-compatible, reviewing release notes is essential to be aware of any changes that could impact our production system.
  3. Compatibility matrix — Ensure compatibility among the various components within the cluster during the upgrade process.
  4. Code changes — Implement necessary code changes to maintain backward compatibility.
  5. Optimal Timeframe — Choose an opportune time for the upgrade to minimize disruptions to our production system.
  6. Inform Customer Success Team — Keep the support team informed ahead of time to manage customer expectations and avoid critical activities during the upgrade.

Compatibility matrix

Our production system initially operated on Spilo release tag 2.0-p7. The goal was to utilize the latest version for all components in our setup.

Preparing this matrix was instrumental in guiding the upgrade process.

| Spilo release tag | PostgreSQL                  | Timescale DB | Patroni | WAL-G |
| ----------------- | --------------------------- | ------------ | ------- | ----- |
| 2.0-p7 | 13.3 | 2.3.1 | 2.0.2 | 1.x |
| 2.1-p1 | 13.4 | 2.4.1 | 2.1.1 | 1.1 |
| 2.1-p2 | 13.4, 14.0 (default) | 2.4.2 | 2.1.1 | 1.1 |
| 2.1-p3 | 13.4, 14.X (default) | 2.5.0 | 2.1.1 | 1.1 |
| 2.1-p4 | 13.6, 14.2 (default) | 2.5.2 | 2.1.3 | 1.1 |
| 2.1-p5 | 13.6, 14.2 (default) | 2.6.1 | 2.1.3 | 1.1 |
| 2.1-p6 | 13.7, 14.4 (default) | 2.7.0 | 2.1.4 | 2.0.0 |
| 2.1-p7 | 13.8, 14.5 (default) | 2.7.2 | 2.1.4 | 2.0.1 |
| 2.1-p9 | 13.9, 15.1 (default) | 2.9.1 | 2.1.5 | 2.0.1 |
| 3.0-p1 | 13.10, 15.2 (default) | 2.10.0 | 3.0.1 | 2.0.1 |

The matrix makes it clear that we can upgrade all components to the latest available versions without a major PostgreSQL version upgrade.

Theoretical knowledge — Typical workflow

Before we get hands-on with the upgrade, let’s understand the theoretical workflow involved:

  1. Rolling Out Desired Minor Version with Replica Nodes — Start deploying the desired minor version with replica nodes.
  2. Activating Write Lock on a Leader Node — To prevent any changes in the database, enforce a write lock on a leader node.
  3. Performing Switchover — Execute a switchover, switching to a new leader node.
  4. Deactivating Write Lock — After a successful switchover, deactivate the write lock.
  5. Dropping Old Leader Node — Remove the node running the old minor version.

IMPORTANT NOTE:

When new replica nodes are rolled out, they still follow the leader node; the minor version upgrade is not complete until the switchover is performed.

Practical knowledge — Upgrade in Action

This section provides practical insights and applies the aforementioned workflow to enhance understanding.

Source code: docker-compose.yaml

Form a test cluster

Let’s assume we have a cluster of 2 nodes:

  1. Leader
  2. Replica

Both nodes run Postgres version 13.10, and our plan is to upgrade to version 13.13.

For the sake of simplicity, I assume that all the prerequisites are met, and we are good to go.

We can simply form a test cluster by executing the following command.

docker-compose up -d node-1 node-2

We can check the cluster’s health using:

curl localhost:8008/cluster | jq '.'

A healthy cluster consists of a leader node and a replica node.

{
  "members": [
    {
      "name": "a9b6739bc2e5",
      "role": "leader",
      "state": "running",
      "api_url": "http://172.22.0.3:8008/patroni",
      "host": "172.22.0.3",
      "port": 5432,
      "timeline": 1
    },
    {
      "name": "ba60eb3f9110",
      "role": "replica",
      "state": "streaming",
      "api_url": "http://172.22.0.4:8008/patroni",
      "host": "172.22.0.4",
      "port": 5432,
      "timeline": 1,
      "lag": 0
    }
  ]
}
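To script this check (handy in CI or during the rollout), we can let jq's exit code tell us whether the cluster looks healthy. A small sketch based on the output above:

# Exit 0 only if there is exactly one leader and every replica is
# streaming with zero lag (jq -e sets the exit code from the result).
curl -s localhost:8008/cluster | jq -e '
  ([.members[] | select(.role == "leader")] | length == 1)
  and ([.members[] | select(.role == "replica")]
       | all(.state == "streaming" and .lag == 0))
'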

Start rolling out desired minor version with replica nodes

We now roll out replica nodes running the desired minor version 13.13 and, in parallel, remove the outdated replica nodes (i.e. the ones running the old minor version).

docker-compose up -d node-3 node-4

Since it's our test cluster, I am adding two nodes right away.

However, on a production system we don't follow this practice. Typically, we add or remove one node at a time so as not to stress the overall health of the cluster.

Remove an old replica node running the outdated minor version:

docker stop node-2 && docker rm node-2

Now, the cluster has three nodes:

  1. A leader node running the old minor version.
  2. Two replica nodes running the desired minor version.
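Before moving on, it is worth confirming which minor version each node actually runs. A sketch, assuming the Spilo containers allow local psql access as the postgres user:

# node-1 should still report 13.10; node-3 and node-4 should report 13.13.
for node in node-1 node-3 node-4; do
  echo -n "$node: "
  docker exec "$node" su postgres -c "psql -tAc 'SHOW server_version;'"
done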

Activate write lock on a leader node — To prevent any changes in the database

For our test setup, this step is optional. However, for our production system, it’s a crucial step.

After enforcing the write lock, we need to wait until replication lag is zero. This ensures that all replicas are in sync with the leader node.

You can enforce a write lock via Patroni’s config endpoint.

curl --request PATCH \
  --url http://localhost:8008/config \
  --header 'Content-Type: application/json' \
  --data '{
    "postgresql": {
      "parameters": {
        "default_transaction_read_only": "on"
      }
    }
  }'
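Patroni applies this dynamic configuration within one loop (a few seconds). To avoid switching over too early, we can poll the cluster endpoint until every replica reports zero lag, and optionally confirm that the lock took effect on the leader; a sketch:

# Wait until all replicas have caught up with the leader.
until curl -s localhost:8008/cluster \
  | jq -e '[.members[] | select(.role == "replica")] | all(.lag == 0)' > /dev/null
do
  echo "replicas still catching up..."
  sleep 2
done

# Confirm the write lock is active on the leader (expects "on").
docker exec node-1 su postgres -c "psql -tAc 'SHOW default_transaction_read_only;'"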

Perform switchover (switching the leader node)

Initiate a switchover to transition to a new leader node. This is done using Patroni's switchover endpoint:

curl -s http://localhost:8008/switchover \
-XPOST -d '{"leader":"f70e1077d88b"}'
Successfully switched over to "254500f41069"
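After the switchover, it is worth double-checking that one of the new replicas has been promoted and that the timeline has advanced:

# The leader should now be one of the nodes running 13.13,
# and the timeline should have increased.
curl -s localhost:8008/cluster | jq '.members[] | {name, role, state, timeline}'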

Deactivate write lock now

Now, let's deactivate the write lock.

curl --request PATCH \
  --url http://localhost:8008/config \
  --header 'Content-Type: application/json' \
  --data '{
    "postgresql": {
      "parameters": {
        "default_transaction_read_only": "off"
      }
    }
  }'
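Since this parameter is managed cluster-wide through Patroni's dynamic configuration, we can verify on any member that writes are allowed again (it may take one Patroni loop to apply); for example:

# Expects "off" again.
docker exec node-3 su postgres -c "psql -tAc 'SHOW default_transaction_read_only;'"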

Drop old leader node running old version

Remove the node running the old minor version:

docker stop node-1 && docker rm node-1
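As a final check, every remaining member should now report the new minor version; a quick sketch:

# Both remaining nodes should report 13.13.
for node in node-3 node-4; do
  echo -n "$node: "
  docker exec "$node" su postgres -c "psql -tAc 'SHOW server_version;'"
done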

Congratulations 🎉, you have successfully performed a minor version upgrade.

Testing strategy

To ensure a seamless upgrade process, a meticulous testing strategy is imperative.

Step 1 — Fail fast approach

Test the upgrade process by simply adjusting the Docker image and checking whether the CI/CD pipeline works as expected.

This helps in identifying any necessary code changes and ensures migration scripts function correctly.

Step 2 — Using small setup

Conduct testing on a small setup locally, similar to the one mentioned in this guide. Once satisfied, replicate the approach on a production-size test cluster.

Step 3 — Using production size test cluster

We store backups in the AWS cloud, so we form a production-size test cluster by restoring those backups.

It’s completely isolated and does not interfere with our production system in any way.
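The exact restore mechanism depends on your backup tooling; since our stack ships WAL-G, a heavily simplified sketch of seeding a fresh node from the latest base backup could look like this (the S3 prefix is a placeholder, and in practice Spilo/Patroni orchestrate the restore for us):

# Fetch the latest base backup into an empty data directory.
export WALG_S3_PREFIX="s3://my-backup-bucket/spilo/my-cluster/wal"   # placeholder
export PGDATA=/var/lib/postgresql/data

wal-g backup-fetch "$PGDATA" LATEST

# WAL segments are then replayed via the restore_command, e.g.:
#   restore_command = 'wal-g wal-fetch %f %p'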

Finally, if everything works as expected, we apply the upgrade process to our production system.

Conclusion

Upgrading our PostgreSQL with Timescale DB cluster, even for minor versions, is a meticulous yet rewarding process.

By ensuring our cluster’s compatibility, preparing a comprehensive matrix, and executing the upgrade with precision, we not only stay abreast of the latest advancements but also fortify our system with crucial security patches and optimizations.

Remember, the journey doesn’t end with this minor version upgrade. Stay tuned for an upcoming post where we’ll unravel the intricacies of major version upgrades.

If you have any questions or insights to share, feel free to engage in the comment section below.

Happy upgrading! 😄.
