Optimize PostgreSQL: Minor Version Upgrade Guide
A Journey Through PostgreSQL’s Minor Upgrade Landscape
This story is about how I upgraded our self-managed production system running PostgreSQL with Timescale DB.
When it comes to upgrading a PostgreSQL database, there are two ways to handle an upgrade:
- Minor version upgrade.
- Major version upgrade.
Minor version upgrades are comparatively easy and backward compatible, whereas major version upgrades are complex because they change the internal data storage format, which also makes them backward incompatible.
This post focuses on minor version upgrade. There is a dedicated post for major version upgrade 😉.
Cluster Overview
Our cluster consists of 3 nodes in the AWS cloud. Each node runs the following components:
- PostgreSQL with Timescale DB — Manages time-series data.
- Patroni — Ensures high availability with automatic replication and failover.
- HAProxy — Acts as a load balancer, distributing write workloads to the leader node and read workloads to replica nodes.
The etcd nodes, which Patroni uses as its distributed configuration store, are dedicated AWS EC2 instances.
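A quick way to see this topology at a glance is Patroni's CLI. A minimal sketch, assuming patronictl is available on one of the nodes (the config path is illustrative and should point at your Patroni configuration):
# Show all cluster members with their role, state, and replication lag
patronictl -c /etc/patroni/patroni.yml list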
Now that we understand our cluster’s architecture, let’s dive into the upgrade process.
Prerequisites
Before embarking on the upgrade process, it’s imperative to address several key aspects:
- Cluster ground-truth — Compile a comprehensive list of all potential use cases supported by the cluster to validate functionality during testing.
- Review release notes — Even though minor version upgrades are typically backward-compatible, reviewing release notes is essential to be aware of any changes that could impact our production system.
- Compatibility matrix — Ensure compatibility among the various components within the cluster during the upgrade process.
- Code changes — Implement necessary code changes to maintain backward compatibility.
- Optimal Timeframe — Choose an opportune time for the upgrade to minimize disruptions to our production system.
- Inform Customer Success Team — Keep the support team informed ahead of time to manage customer expectations and avoid critical activities during the upgrade.
Compatibility matrix
Our production system initially operated on Spilo release tag 2.0-p7. The goal was to utilize the latest version for all components in our setup.
Preparing this matrix was instrumental in guiding the upgrade process.
| Spilo release tag | PostgreSQL | Timescale DB | Patroni | WAL-G |
| ----------------- | --------------------------- | ------------ | ------- | ----- |
| 2.0-p7 | 13.3 | 2.3.1 | 2.0.2 | 1.x |
| 2.1-p1 | 13.4 | 2.4.1 | 2.1.1 | 1.1 |
| 2.1-p2 | 13.4, 14.0 (default) | 2.4.2 | 2.1.1 | 1.1 |
| 2.1-p3 | 13.4, 14.X (default) | 2.5.0 | 2.1.1 | 1.1 |
| 2.1-p4 | 13.6, 14.2 (default) | 2.5.2 | 2.1.3 | 1.1 |
| 2.1-p5 | 13.6, 14.2 (default) | 2.6.1 | 2.1.3 | 1.1 |
| 2.1-p6 | 13.7, 14.4 (default) | 2.7.0 | 2.1.4 | 2.0.0 |
| 2.1-p7 | 13.8, 14.5 (default) | 2.7.2 | 2.1.4 | 2.0.1 |
| 2.1-p9 | 13.9, 15.1 (default) | 2.9.1 | 2.1.5 | 2.0.1 |
| 3.0-p1            | 13.10, 15.2 (default)       | 2.10.0       | 3.0.1   | 2.0.1 |
The matrix makes it clear that we can upgrade every component to its latest available version without changing the PostgreSQL major version.
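To build such a matrix for your own cluster, first check what each node currently runs. A small sketch, assuming psql access on a node plus the Patroni and WAL-G binaries (connection details are placeholders):
# PostgreSQL server version
psql -U postgres -c "SELECT version();"
# TimescaleDB extension version
psql -U postgres -c "SELECT extversion FROM pg_extension WHERE extname = 'timescaledb';"
# Patroni and WAL-G versions
patronictl version
wal-g --version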
Theoretical knowledge — Typical workflow
Before we get hands-on with the upgrade, let’s understand the theoretical workflow involved:
- Rolling Out Desired Minor Version with Replica Nodes — Start deploying the desired minor version with replica nodes.
- Activating Write Lock on a Leader Node — To prevent any changes in the database, enforce a write lock on a leader node.
- Performing Switchover — Execute a switchover, switching to a new leader node.
- Deactivating Write Lock — After a successful switchover, deactivate the write lock.
- Dropping Old Leader Node — Remove the node running the old minor version.
IMPORTANT NOTE:
When the new replica nodes are rolled out, they simply stream from the existing leader;
the cluster does not start serving from the new minor version until the switchover is performed.
Practical knowledge — Upgrade in Action
This section applies the workflow above to a small test cluster to make it concrete.
Source code: docker-compose.yaml
Form a test cluster
Let’s assume we have a cluster of 2 nodes:
- Leader
- Replica
running Postgres version 13.10. Our plan is to upgrade it to version 13.13.
For the sake of simplicity, I assume that all the prerequisites are met, and we are good to go.
We can simply form a test cluster by executing the following command.
docker-compose up -d node-1 node-2
We can check the cluster’s health using:
curl localhost:8008/cluster | jq '.'
A healthy cluster consists of a leader node and a replica node.
{
"members": [
{
"name": "a9b6739bc2e5",
"role": "leader",
"state": "running",
"api_url": "http://172.22.0.3:8008/patroni",
"host": "172.22.0.3",
"port": 5432,
"timeline": 1
},
{
"name": "ba60eb3f9110",
"role": "replica",
"state": "streaming",
"api_url": "http://172.22.0.4:8008/patroni",
"host": "172.22.0.4",
"port": 5432,
"timeline": 1,
"lag": 0
}
]
}
Start rolling out desired minor version with replica nodes
We now roll out the desired minor version 13.13 on new replica nodes while, in parallel, removing the outdated replica nodes (i.e. the ones running the old minor version).
docker-compose up -d node-3 node-4
Since this is our test cluster, I am adding two nodes right away.
In a production system, however, we don’t follow this practice: we add/remove one node at a time so we don’t stress the overall health of the cluster.
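For a production rollout, each removal can be gated on the new replica being fully caught up. A rough sketch using the Patroni cluster endpoint (the node name and polling interval are illustrative):
# Wait until the newly added replica reports state "streaming" with zero lag
NEW_NODE="node-3"   # placeholder; use the member name reported by /cluster
until curl -s localhost:8008/cluster \
  | jq -e --arg n "$NEW_NODE" \
      '.members[] | select(.name == $n and .state == "streaming" and .lag == 0)' \
      > /dev/null; do
  echo "waiting for $NEW_NODE to catch up..."
  sleep 5
done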
Removing an old replica node running the outdated minor version.
docker stop node-2 && docker rm node-2
Now, the cluster has three nodes:
- A leader node running the old minor version.
- Two replica nodes running the desired minor version.
Activate write lock on a leader node — To prevent any changes in the database
For our test setup, this step is optional. However, for our production system, it’s a crucial step.
After enforcing the write lock, we need to wait until lag is zero. This ensures that all data is in sync with the leader node.
You can enforce a write lock via Patroni’s config endpoint.
curl --request PATCH \
--url http://localhost:8008/config \
--header 'Content-Type: application/json' \
--data '{
"postgresql": {
"parameters": {
"default_transaction_read_only": "on"
}
}
}'
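Once the write lock is in place, we can confirm that the setting took effect and that the replicas have caught up. A minimal sketch (connection parameters are assumptions for this test setup):
# New sessions on the leader should now be read-only
psql -h localhost -p 5432 -U postgres -c "SHOW default_transaction_read_only;"
# Every replica should report zero lag before we switch over
curl -s localhost:8008/cluster | jq '.members[] | select(.role == "replica") | {name, state, lag}'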
Perform switchover (switching the leader node)
Initiate a switchover to transition to a new leader node. This is done using Patroni’s switchover endpoint:
curl -s http://localhost:8008/switchover \
-XPOST -d '{"leader":"f70e1077d88b"}'
Successfully switched over to "254500f41069"
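We can confirm the new topology by querying the cluster endpoint again; the former replica should now report itself as leader on a new timeline:
curl -s localhost:8008/cluster | jq '.members[] | {name, role, state, timeline}'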
Deactivate write lock now
Now, let’s deactivate the write lock.
curl --request PATCH \
--url http://localhost:8008/config \
--header 'Content-Type: application/json' \
--data '{
"postgresql": {
"parameters": {
"default_transaction_read_only": "off"
}
}
}'
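A quick throwaway write against the new leader confirms that the cluster accepts changes again (table name and connection details are hypothetical):
# Simple write smoke test against the new leader
psql -h localhost -p 5432 -U postgres -c "CREATE TABLE upgrade_smoke_test (id int); DROP TABLE upgrade_smoke_test;"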
Drop old leader node running old version
Remove the node running the old minor version.
docker stop node-1 && docker rm node-1
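As a final sanity check, the remaining nodes should report the new minor version. A minimal sketch (port mappings depend on your compose file):
# Patroni's status endpoint includes the numeric server version (e.g. 130013 for 13.13)
curl -s localhost:8008/patroni | jq '.server_version'
# Or ask PostgreSQL directly
psql -h localhost -p 5432 -U postgres -c "SELECT version();"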
Congratulations 🎉, you have successfully performed a minor version upgrade.
Testing strategy
To ensure a seamless upgrade process, a meticulous testing strategy is imperative.
Step 1 — Fail fast approach
Test the upgrade process by simply bumping the Docker image tag and checking that the CI/CD pipeline works as expected.
This helps in identifying any necessary code changes and ensures migration scripts function correctly.
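In practice this can be as small as a one-line change. A hypothetical example of bumping the Spilo release tag in the compose file (image name and tags are illustrative, taken from the compatibility matrix above):
# Bump the Spilo release tag and let the CI/CD pipeline run against the change
sed -i 's|spilo-15:2.1-p9|spilo-15:3.0-p1|' docker-compose.yaml
git diff docker-compose.yaml   # review the change before committing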
Step 2 — Using small setup
Conduct testing on a small setup locally, similar to the one mentioned in this guide. Once satisfied, replicate the approach on a production-size test cluster.
Step 3 — Using production size test cluster
We store backups in the AWS cloud, so we form a production-size test cluster by restoring those backups.
It’s completely isolated and does not interfere with our production system in any way.
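How the restore is done depends on your backup tooling; since we use WAL-G, a rough sketch of seeding a test node from the latest base backup looks like this (bucket, paths, and environment are assumptions):
# Assumes WAL-G is already configured, e.g. WALG_S3_PREFIX and AWS credentials are set
export WALG_S3_PREFIX="s3://my-backup-bucket/backups"   # hypothetical bucket
# Inspect available base backups
wal-g backup-list
# Restore the latest base backup into an empty data directory on the test node
wal-g backup-fetch /var/lib/postgresql/data LATEST
# (recovery/WAL-replay configuration is still required before starting Postgres)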
Finally, if everything works as expected, we apply the upgrade process to our production system.
Conclusion
Upgrading our PostgreSQL with Timescale DB cluster, even for minor versions, is a meticulous yet rewarding process.
By ensuring our cluster’s compatibility, preparing a comprehensive matrix, and executing the upgrade with precision, we not only stay abreast of the latest advancements but also fortify our system with crucial security patches and optimizations.
Remember, the journey doesn’t end with this minor version upgrade. Stay tuned for an upcoming post where we’ll unravel the intricacies of major version upgrades.
If you have any questions or insights to share, feel free to engage in the comment section below.
Happy upgrading! 😄.