Adding a Postgres High-Availability database to your Kubernetes cluster

In this article I describe the process for installing a Postgres database in a High Availability (HA) configuration into your Kubernetes cluster using an NFS file share for persistency.

Martin Hodges
14 min read · Jan 25, 2024

PostgreSQL, or just Postgres, is a popular, open-source database. Its feature set is comprehensive enough to use in an enterprise production environment and it is available on most Operating Systems.

In this article I will show you how to install Postgres in a High-Availability (HA) configuration in a Kubernetes cluster. If you have followed my other articles, you will now have a 3-node Kubernetes cluster along with an NFS server and other supporting servers. The servers are all based on the Ubuntu 22.04 Linux Operating System (OS).

What we will create

The picture above shows the Postgres database as a single installation, but this is not what we will deploy. The picture below shows what the High Availability (HA) installation looks like.

Postgres deployment

This High Availability (HA) deployment deploys two copies of Postgres, each with their own storage. One of the copies is designated the Master and provides read/write transactions, whilst the other, the Replica, is read only.

Any updates to the Master are sent over to the Replica to keep it in sync.

If the Master fails, the Replica will be made the Master. The Replica is known as a hot standby and the switch of the Master to the Replica is known as a failover. As this is in Kubernetes, failure of the Master will likely result in a new instance being scheduled (created) and this will then be a Replica as there can only be one Master.
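As an aside, the CloudNativePG operator we install below also lets you trigger this switch manually through its kubectl plugin. A minimal sketch, assuming the cluster and instance names used later in this article:

# Manually promote a replica to become the new Master (a switchover).
# Assumes the cluster created later in this article (my-pgsql-cluster in the
# pg namespace) and that the replica Pod is my-pgsql-cluster-2 - check the
# actual instance names with: kubectl get pods -n pg
kubectl cnpg promote my-pgsql-cluster my-pgsql-cluster-2 -n pg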

Three Kubernetes Services will be created:

  • rw — for all read and write transactions, routed to the Master
  • ro — for read-only transactions, served only by the Replicas
  • r — for read-only transactions, served by any instance

Assuming the applications are microservices, you can decide which Service each microservice connects to. Failure of a Postgres instance automatically updates the rw Service to point to the new Master without the microservice even knowing.
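Inside the cluster, each Service is reachable at a DNS name of the form <service>.<namespace>.svc.cluster.local, so a microservice simply targets the Service that matches its access pattern. A minimal sketch, assuming the cluster name, namespace, database and user we create later in this article:

# Read/write traffic goes through the -rw Service (always the Master);
# read-only traffic can use the -ro or -r Services instead.
# The names below assume the cluster (my-pgsql-cluster), namespace (pg),
# database (my_app_db) and user (app_user) created later in this article.
psql "host=my-pgsql-cluster-rw.pg.svc.cluster.local port=5432 dbname=my_app_db user=app_user"

# A read-only connection for reporting-style queries:
psql "host=my-pgsql-cluster-ro.pg.svc.cluster.local port=5432 dbname=my_app_db user=app_user"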

Now we will look at how to set this up.

CloudNativePG Kubernetes operator

When you deploy applications into a Kubernetes cluster, you need to define a set of manifest files and then apply them to the cluster. To help manage all these files, some applications provide Helm charts that bring all the manifest files together into a single, configurable package.

Even if you use a Helm chart, there is still a need to manage the deployment, upgrades and operational procedures, such as taking backups. A Kubernetes operator is an application, deployed into the cluster, that automates many of the operations you would otherwise undertake yourself.

There are many operators available to manage Postgres deployments in Kubernetes. We will use CloudNativePG, which provides a wide range of options and settings for Postgres.

Krew

Part of the CloudNativePG deployment relies on using a cnpg plugin extension to kubectl. Extensions to kubectl are managed using the krew utility.

You will need to install krew on the same machine you use to run kubectl. For me this is on my k8s-master node.

You can install krew on Ubuntu by running the following:

(
set -x; cd "$(mktemp -d)" &&
OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
KREW="krew-${OS}_${ARCH}" &&
curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
tar zxvf "${KREW}.tar.gz" &&
./"${KREW}" install krew
)

This is copied from the Krew site itself. It is a complicated script but is designed to download and install the correct version for your OS.

Add the install directory to your path by inserting the following into your .bashrc file:

export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"

Reload with:

source ~/.bashrc

Check the installation with:

kubectl krew

If it was installed correctly, it should respond with information about krew.
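You can also list the plugins that krew currently manages (after a fresh install this should only show krew itself):

# List the kubectl plugins installed via krew
kubectl krew list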

cnpg plugin

You can now install the CloudNativePG plugin with:

kubectl krew install cnpg

This plugin can generate a manifest file for the CloudNativePG operator, containing all the various resources the operator needs. We generate it with:

kubectl cnpg install generate -n pg-operator --replicas 1 > pg-operator-manifests.yaml

This creates a manifest file called pg-operator-manifests.yaml. You can see that I have decided:

  1. To only have 1 replica of the operator. For an HA setup with sufficient control-plane/master nodes, you would probably want 3. As I only have one master node, I have elected to have just 1.
  2. To place the operator in its own namespace (pg-operator) to make sure that it is not accidentally deleted.

Install the cnpg operator

You can consider the operator an extension of the Kubernetes control plane. As part of the control plane, you would expect the operator to be deployed to the k8s-master node. This can be achieved by adding a toleration to the Deployment resource. A toleration tells the Kubernetes scheduler that the Pod may be scheduled on a node whose taint would otherwise repel it (it does not, by itself, force the Pod onto that node).

Find the Deployment resource in the pg-operator-manifests.yaml file (search for kind: Deployment). Scroll down to the terminationGracePeriodSeconds: 10 line and then add this in at the same indentation level:

tolerations:
- effect: NoSchedule
  key: node-role.kubernetes.io/master
  operator: Exists
- effect: NoSchedule
  key: node-role.kubernetes.io/control-plane
  operator: Exists

This basically says that the Pod tolerates the NoSchedule taints on the control-plane node, allowing it to be scheduled there even though those taints normally prevent workloads from running on that node.
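If you want to check which taints your control-plane node actually carries (and therefore which tolerations are needed), you can inspect the node. A quick check, assuming your control-plane node is named k8s-master as in my earlier articles:

# Show the taints on the control-plane node - the tolerations added above must match these keys
kubectl describe node k8s-master | grep -A 2 Taints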

Now deploy the manifests and then check the install with:

kubectl apply -f pg-operator-manifests.yaml
kubectl get pods -n pg-operator

You should get a result something like this:

NAME                                       READY   STATUS    RESTARTS   AGE
cnpg-controller-manager-6fbbb565f4-jcdvd   1/1     Running   0          115s
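You can also check which node the operator Pod landed on. Remember that the toleration only allows scheduling on the tainted control-plane node, it does not force it, so it is worth confirming:

# The NODE column shows where the operator Pod was scheduled (k8s-master in my case)
kubectl get pods -n pg-operator -o wide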

Install Postgres

Now we have the operator installed we can use it to install our Postgres HA cluster.

In order to install the cluster, we need to construct a manifest to tell the operator what type of database configuration we want it to create. You will see a lot of options here but most will default. I have included them here for completeness to show you what is possible.

In this definition we will also create user credentials. These will be held as Kubernetes Secrets. Kubernetes holds Secrets as key-value pairs and typically, for credentials, two pairs are required, with keys of username and password. Note that, in a manifest, Kubernetes expects the values to be supplied as base-64 encoded strings. This can be achieved with (the -n stops echo appending a newline, which would otherwise be encoded into your password):

echo -n mypassword | base64

Base-64 encoding is not secure and any base-64 string can be decoded.

You can try this by taking the output of the previous command:

echo bXlwYXNzd29yZA== | base64 -d

The reason it is used is that it allows any binary data to be stored as a string of letters and numbers, allowing your passwords to include any characters without interfering with the syntax of your YAML file.

The resulting base-64 strings should be used in the following manifest file. In the manifest we will include three resources, entered as YAML documents. If you are not familiar with YAML documents, they are the sections separated by a line of three hyphens (---). Each section, or document, defines a separate resource and could be placed in its own file. If you go for separate files, you will need to apply each to your cluster in the right order. For this reason, I prefer the single file.

To help explain the file, I will separate it into blocks, but all these blocks should be entered as one YAML file called pg-config.yml.

The first two sections allow two users to be created:

  • The administrator or superuser for the database (must be postgres)
  • The normal user for the database (app_user)

The superuser account can do anything (like the root user on Linux) and should not be used for normal operations. Instead, you should always use a separate user, with restricted access (app_user) to connect to your cluster.

Superuser

We will now create the credentials of the superuser of the database. Add this document section to your YAML file, replacing the password with something stronger than secret_password (remember to base 64 encode it):

apiVersion: v1
kind: Secret
type: kubernetes.io/basic-auth
data:
  password: c2VjcmV0X3Bhc3N3b3Jk # secret_password
  username: cG9zdGdyZXM= # postgres - Note: it MUST be postgres
metadata:
  name: pg-superuser
  namespace: pg
---

The comment notes that the username in this case must be postgres as that is what the operator is expecting to find. Of course, the password set here (secret_password) should be a secure password of your choice.

The operator connects to the database as postgres and expects to find a database called postgres. Do not delete either of these, or your operator will fail.

Kubernetes Secrets are held by the cluster but can be read by anyone with the right kubectl access, so it is possible for other people to see your credentials. More secure alternatives include applications such as Vault. For now, we will use Kubernetes Secrets.
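To see why this matters, anyone with kubectl access to the namespace can read a Secret back and decode it. For example, once the pg-superuser Secret above has been created:

# Retrieve and decode the superuser password from the Kubernetes Secret
kubectl get secret pg-superuser -n pg -o jsonpath='{.data.password}' | base64 -d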

Also note that this Secret is created in the pg namespace, which is where our database cluster will sit in this example. You will need to create this:

kubectl create namespace pg

As a superuser, the postgres user should only be used for maintenance and configuration of the database.

Normal User

We will now create another secret. This time it is for the normal user that our application will use to access and use the database (app_user).

Add this document section to your YAML file.

apiVersion: v1
kind: Secret
type: kubernetes.io/basic-auth
data:
  password: c2VjcmV0X3Bhc3N3b3Jk # secret_password
  username: YXBwX3VzZXI= # app_user
metadata:
  name: pg-app-user
  namespace: pg
---

Again, you should use a secure password, and it should be different from the admin password you created above.

You can choose a name other than app_user but you will need to change the references to it later in the file.

Setting up the database HA cluster

This next document section is quite large as it contains a lot of tuning parameters for your database. You may not know what values to enter for many of them, but help is at hand (see the pgtune notes below). Let's break it down into sub-sections.

First we add the database cluster. We do this by using the Cluster Custom Resource Definition or CRD provided by cnpg. This means that CloudNativePG has created a Kubernetes resource and associated CRD that allows a Postgres DB cluster to be created.

Add this to your YAML file:

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-pgsql-cluster
  namespace: pg
spec:
  description: "My example pg cluster"
  imageName: ghcr.io/cloudnative-pg/postgresql:16.1
  instances: 2

You can see we are using the cnpg CRD because of the apiVersion on the first line. We are creating a Cluster, which I am calling my-pgsql-cluster, and adding it to the pg namespace.

In the last line you will see I have 2 instances. One will start as the master and the other as a replica. If the master fails, the operator will automatically select the replica to change to be the master. I have chosen 2 in this case as I have a 3-node cluster with one node providing the control plane for Kubernetes, leaving 2 workers.

We now give the operator information on how to manage our cluster. Add the following to the same document section you started previously.

  startDelay: 30
  stopDelay: 100
  primaryUpdateStrategy: unsupervised

The startDelay is the number of seconds before the cluster starts its liveness probe. This allows time for the database to start. Note that you will need to wait this long after starting your cluster before using it.

The stopDelay is the number of seconds the operator waits for the database to shut down before it is finally stopped. This allows time for the database to ensure a clean exit.

These delays can seem like a lifetime when waiting for them but have a bit of patience when starting and stopping the database!

The third line indicates how the operator should execute an update. There are two options:

  • unsupervised — the replicas are updated first, then the operator automatically fails the master over to an updated replica and updates the old master
  • supervised — the old master is not updated until you manually fail it over to a replica

The next document section tunes Postgres to fit your infrastructure. To help with this, the pgtune website can suggest the values for you. I entered:

  • DB version: 16
  • OS type: Linux
  • DB Type: Web Application
  • Total Memory (RAM): 1GB
  • Number of CPUs: 1
  • Data Storage: SSD

This then gave me my set of tuning values (see here for more information on this), which I added under postgresql.parameters in my YAML file:

  postgresql:
    parameters:
      max_connections: '200'
      shared_buffers: '256MB'
      effective_cache_size: '768MB'
      maintenance_work_mem: '64MB'
      checkpoint_completion_target: '0.9'
      wal_buffers: '7864kB'
      default_statistics_target: '100'
      random_page_cost: '1.1'
      effective_io_concurrency: '200'
      work_mem: '655kB'
      huge_pages: 'off'
      min_wal_size: '1GB'
      max_wal_size: '4GB'

Note that all values are strings so do not forget to quote them.
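Once the cluster is up and running (see the later sections), you can check that a tuning value has actually been applied by running psql inside the primary Pod. A quick check, assuming the primary Pod is named my-pgsql-cluster-1 as it is later in this article:

# Confirm a tuning parameter has been applied to the running database
kubectl exec -n pg my-pgsql-cluster-1 -- psql -U postgres -c 'SHOW shared_buffers;'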

Next is the setup of the pg_hba entries. These determine which servers and users can connect to the database and how they authenticate (in this case with a password, using the scram-sha-256 method). In our case we will allow any server in our Virtual Private Cloud (VPC) subnet to access the database.

    pg_hba:
      - host all all 10.240.0.0/16 scram-sha-256

Next, we will ask the operator to create a default database for us called my_app_db and assign it to the normal user we defined credentials for earlier (app_user). As the operator retrieves the credentials from the Secret, the owner here must match the decoded username value held in that Secret:

  bootstrap:
    initdb:
      database: my_app_db
      owner: app_user
      secret:
        name: pg-app-user
      postInitApplicationSQL:
        - create schema my_app

You will see that I have included a postInitApplicationSQL line to create the schema (my_app) for us within the new database. It is useful to do this here as the normal user does not have access to create a schema. If you are using Spring Boot with Hibernate, you will not have to worry about getting the schema to be created with Liquibase or Flyway.

There are other options for creating a new database, including recovering the database from a backup, but I will cover those in a future article.

We also have to tell the operator which secret holds the superuser credentials. The operator assumes the admin user is called postgres.

Add the following to your YAML file:

  superuserSecret:
    name: pg-superuser

We are nearly finished. The only remaining thing is to tell the database where to store its data. If you have been following my articles, you will have seen how Kubernetes uses Persistent Volumes (PVs) and Persistent Volume Claims (PVCs) to manage storage.

In a previous article, I installed an operator to automatically manage the provisioning of PVs based on PVCs and a Network File System (NFS).

I am going to rely on the auto-provisioning of PVs using this operator. This does mean that when you delete the cluster (or the operator), you will not recover the database if you reinstall it, although the files are archived on the NFS server.

Note that NFS is not a suitable option for large databases that require high transactional performance. You can read more about this here. For now I am using NFS to simplify the deployment.

Postgres stores its data in its pgdata folder but also requires storage for its Write-Ahead Log (WAL). This will be stored in pgdata by default but the operator allows for it to be stored separately. In this example we will leave it in pgdata.

There are several ways to define the storage. We will use the simplest by defining the PVC itself.

  storage:
    size: 10Gi
    storageClass: nfs-client

By setting the storageClass to nfs-client, the storage provisioner will automatically create the PV and bind the PVC to it.
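Once the cluster has been created (next section), you can confirm that the PVCs were bound to automatically provisioned PVs:

# Each Postgres instance gets its own PVC; the STATUS column should show Bound
kubectl get pvc -n pg
kubectl get pv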

If you have stuck everything together, you should have the following:

pg-config.yml

apiVersion: v1
kind: Secret
type: kubernetes.io/basic-auth
metadata:
  name: pg-superuser
  namespace: pg
data:
  password: c2VjcmV0X3Bhc3N3b3Jk # secret_password
  username: cG9zdGdyZXM= # postgres - Note: it MUST be postgres
---
apiVersion: v1
kind: Secret
type: kubernetes.io/basic-auth
metadata:
  name: pg-app-user
  namespace: pg
data:
  password: c2VjcmV0X3Bhc3N3b3Jk # secret_password
  username: YXBwX3VzZXI= # app_user
---
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: my-pgsql-cluster
  namespace: pg
spec:
  description: "My example pg cluster"
  imageName: ghcr.io/cloudnative-pg/postgresql:16.1
  instances: 2

  superuserSecret:
    name: pg-superuser
  enableSuperuserAccess: true

  startDelay: 30
  stopDelay: 100
  primaryUpdateStrategy: unsupervised

  logLevel: debug

  postgresql:
    parameters:
      max_connections: '200'
      shared_buffers: '256MB'
      effective_cache_size: '768MB'
      maintenance_work_mem: '64MB'
      checkpoint_completion_target: '0.9'
      wal_buffers: '7864kB'
      default_statistics_target: '100'
      random_page_cost: '1.1'
      effective_io_concurrency: '200'
      work_mem: '655kB'
      huge_pages: 'off'
      min_wal_size: '1GB'
      max_wal_size: '4GB'
    pg_hba:
      - host all all 10.240.0.0/16 scram-sha-256

  bootstrap:
    initdb:
      database: my_app_db
      owner: app_user
      secret:
        name: pg-app-user
      postInitApplicationSQL:
        - create schema my_app

  storage:
    size: 10Gi
    storageClass: nfs-client

For those of you who are observant, I have added two additional lines in this complete version to show you two useful options.

enableSuperuserAccess: true

The operator creates a superuser account (postgres) that cannot log in. This is for the same reason that you do not allow logins to the root account on your server: security. However, this can hinder management. By adding this line, you can log in as the postgres superuser and carry out any maintenance you need to do.
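One way to do that, assuming the primary Pod is my-pgsql-cluster-1, is to open a psql session inside the Pod itself (alternatively, use the password from the pg-superuser Secret with an external client):

# Open an interactive psql session as the postgres superuser inside the primary Pod
kubectl exec -it -n pg my-pgsql-cluster-1 -- psql -U postgres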

logLevel: debug

This changes the cluster log level to debug for additional information should you be running into problems. Logging volumes will increase as a result.

Starting the cluster

Now we have our manifest, we can ask the operator to create it with:

kubectl apply -f pg-config.yml

You can then check your deployment with:

kubectl get all -n pg

On my system this is what the result looks like:

NAME                     READY   STATUS    RESTARTS   AGE
pod/my-pgsql-cluster-1   1/1     Running   0          11m

NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/my-pgsql-cluster-r    ClusterIP   10.106.186.218   <none>        5432/TCP   11m
service/my-pgsql-cluster-ro   ClusterIP   10.108.14.6      <none>        5432/TCP   11m
service/my-pgsql-cluster-rw   ClusterIP   10.101.10.176    <none>        5432/TCP   11m
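The cnpg plugin also gives a cluster-level view, showing which instance is the current primary and the state of streaming replication:

# Show the overall status of the Postgres cluster, including the current primary
kubectl cnpg status my-pgsql-cluster -n pg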

Testing the installation

From the result above, you can see the three services that we talked about earlier:

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
...
my-pgsql-cluster-r    ClusterIP   10.103.50.9      <none>        5432/TCP   125m
my-pgsql-cluster-ro   ClusterIP   10.102.247.142   <none>        5432/TCP   125m
my-pgsql-cluster-rw   ClusterIP   10.111.226.192   <none>        5432/TCP   125m
...

You can see that all three Services are of type ClusterIP, which means they are only reachable from within the cluster. To test the installation we need to access one of them from outside, so we add a port forward (replace the < > field with your own value):

kubectl port-forward svc/my-pgsql-cluster-rw -n pg --address <node IP address> 5432:5432

If you are following my articles, you will have a VPN connection into your cluster network via OpenVPN. You can now connect a database client (such as DBeaver) to the rw Service.

The connection details are:

  • Host: <node IP address>
  • Port: 5432
  • User: app_user
  • Password: <app_user password>

If you connect successfully, you know that your Postgres cluster is up and running.
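If you prefer the command line, the same test can be done with psql through the port forward, assuming the psql client is installed on your local machine (replace the placeholder with your own value):

# Connect through the port forward and run a simple query to prove the database is serving traffic
psql -h <node IP address> -p 5432 -U app_user -d my_app_db -c 'SELECT version();'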

Just a note about DBeaver. When you connect you give it a database to connect to. If you choose to connect to the postgres database you will only see that database unless you select Show all databases under the PostgreSQL tab of the connection editor. If you don’t do this, you will not see your database.

There is much more to the set up and maintenance of a Postgres database within a Kubernetes cluster but that can wait until future articles.

Summary

In this article we have managed to install a High-Availability Postgres database within a Kubernetes cluster.

We started by installing the Krew plugin manager and we then used this to install the CloudNativePG cnpg plugin to kubectl. This allowed us to create the manifest files to install the CloudNativePG Postgres operator.

After installing the operator we used it to create the database cluster, which included the creation of a database, schema and user, leaving us with a functioning database to use.

If you found this article of interest, please give me a clap as that helps me identify what people find useful and what future articles I should write. If you have any suggestions, please add them in the comments section.
