Spring Boot CI/CD on Kubernetes using Terraform, Ansible and GitHub: Part 6

Martin Hodges
10 min read · Nov 6, 2023

Part 6: Creating a Persistent Volume and connecting it to a PostgreSQL database

This is part of a series of articles that creates a project to implement automated provisioning of cloud infrastructure in order to deploy a Spring Boot application to a Kubernetes cluster using CI/CD. In this part we configure a Persistent Volume to store our database data.

Follow from the start — Introduction

Going to the command line

From this point on, the series focuses on doing things with your Kubernetes cluster. Rather than automating these things, I have decided to go back to manual configuration as it gives us a chance to understand more about what is happening beneath the covers. If you wish to automate these steps, you should have a good idea of how to do so.

You can find the code for this part here: https://github.com/MartinHodges/Quick-Queue-IaC/tree/part6

Files can be found under the master_files folder.

As mentioned in the previous article, I find it easier to work from the command line on the master node than on my development machine and so all these commands are issued on the master node. You can access your master node with:

ssh -i ~/.ssh/qq_rsa kates@<master ip address>

Persistent Volumes

Kubernetes works by spinning up pods on nodes. Each node can run any number of pods (including none!). Each pod contains one or more containers and so when a pod is spun up (or scheduled in Kubernetes terminology), the container(s) it represents are started.

Now, if the node, the pod or the container crashes or gets terminated, the Kubernetes scheduler will schedule a new pod to be created. This might be on another node.

The thing is, if you have an application that persists its data to a filesystem (such as a PostgreSQL database), when the pod is rescheduled, it will start with a new container and a clean slate. Your nicely saved data will be lost!

To overcome this, you must create a Persistent Volume (PV). The PV stays around, even when your pod is rescheduled. Your new pod can then use the PV by making a Persistent Volume Claim (PVC) on the PV.
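
As a rough sketch of how the pieces fit together, a Pod references a PVC by name in its volumes section and mounts it into a container. The names here are hypothetical and purely illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data                 # refers to the volume defined below
          mountPath: /var/lib/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-claim     # the PVC; Kubernetes binds it to a PV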

We can demonstrate that with our Spring Boot application and its related database.

First we will create a PV on the master node and then schedule a PostgreSQL container with a PVC to connect to that PV.

Now, you might have noticed that we are creating the PV on the master node. You are probably thinking that if the master node is destroyed, so is the PV, and you would be right. For this project we will accept that risk. There are other options for creating more reliable PVs, or you can rely on backups.

Creating a PV

For permanent storage, there are several options with Kubernetes. In this example we are going to set up a Network File System (NFS) on the master node to act as our permanent storage.

Setup NFS on the master node

First we log in to the master node, update the package cache and load the NFS kernel:

sudo apt update
sudo apt install nfs-kernel-server -y

Now we create a folder to share:

sudo mkdir /opt/dynamic-storage
sudo chown -R nobody:nogroup /opt/dynamic-storage
sudo chmod 2770 /opt/dynamic-storage

And allow it to be shared by adding the following line to the /etc/exports file (using sudo nano), replacing <master private ip subnet> with your VPC subnet (eg: 10.0.240.0):

/opt/dynamic-storage <master private ip subnet>/24(rw,sync,no_subtree_check)

This allows any host on the master’s private subnet to access the NFS folder with read/write access and with all writes to disk being performed synchronously. The no_subtree_check removes the requirement to check the folder structures with each request and enhances performance.
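
For example, assuming a VPC subnet of 10.0.240.0 (yours may differ), the line would look like this:

/opt/dynamic-storage 10.0.240.0/24(rw,sync,no_subtree_check)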

Now enable the shared folder:

sudo exportfs -a
sudo systemctl restart nfs-kernel-server
sudo systemctl status nfs-kernel-server
sudo systemctl enable nfs-kernel-server

The file system is now available to use as a Persistent Volume (PV) through a Persistent Volume Claim (PVC).
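
If you want to confirm the folder is being exported before moving on, you can list the active exports on the master node:

sudo exportfs -v
showmount -e localhost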

Enable NFS access from each node

On each host (master and worker node), enable NFS access with:

sudo apt install nfs-common -y
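
If you want to check that a node can actually reach the share, you can temporarily mount it by hand (replacing <master private ip address> with the master's private IP) and then unmount it again:

sudo mount -t nfs <master private ip address>:/opt/dynamic-storage /mnt
ls /mnt
sudo umount /mnt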

Installing Helm

When configuring Kubernetes there can be a lot of manifest (yaml) files to load. It is important to load the right versions and in the right order. Helm assists with this process. It is effectively a package manager for Kubernetes and, like apt allows you to load and install packages for an OS, Helm loads and installs packages on your cluster, such as the NFS PV provisioner that we will use to provide access to our shared file system. A Helm ‘package’ is known as a Helm chart.

You can install Helm on the master node with the following:

curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm

Check it installed with:

helm version

I get this result:

version.BuildInfo{Version:"v3.13.1", GitCommit:"3547a4b5bf5edb5478ce352e18858d8a552a4110", GitTreeState:"clean", GoVersion:"go1.20.8"}

Create a PersistentVolume provisioner using Helm

With NFS, PVs and PVCs are managed using a provisioner. A provisioner is a Kubernetes component that needs to be loaded. We will load this provisioner using Helm. Just as with apt, you first need to add the repository before installing the chart:

helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner
helm install -n postgres --create-namespace nfs-subdir-external-provisioner \
  nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
  --set nfs.server=<master private ip address> \
  --set nfs.path=/opt/dynamic-storage

Note that we install the provisioner under the postgres namespace so we can use it with our postgres database.

You can check that it loaded correctly with:

kubectl get all -n postgres
kubectl get sc

All Pods should be ready (1/1) and the provisioner should show the details of the type of PV to be created.

With the provisioner set up, when a Pod requires a PV, you only need to create the PVC and the provisioner will create the matching PV automatically.

Installing PostgreSQL on the Cluster

Now that we have set up the PV provisioner, we can set up a postgres database on the cluster.

First we have to set up the PVC for the database.

Create the PVC to claim the required volumes

On the master node, create the following manifest file as pg-pvc.yml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pg-claim
  namespace: postgres
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10G

This creates a PVC within the postgres namespace. The storageClassName identifies the storage class as nfs-client, which is the class created by the provisioner. The ReadWriteOnce access mode means the volume can only be mounted read-write by a single node at a time. Without careful configuration, database deployments should be limited to 1 Pod and this option helps ensure the volume is not shared.

The size of the PVC is 10GB.

You can now install this on the cluster. This is known as applying the manifest files and can be done with:

kubectl create -f pg-pvc.yml 

The -f flag tells kubectl to apply the given manifest file to the cluster.
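
You can check that the claim was created and that the provisioner dynamically created a matching PV for it. The PVC should show a status of Bound (PVs are cluster-scoped, so the second command does not need a namespace):

kubectl get pvc -n postgres
kubectl get pv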

Deleting a PV

NOTE: This will destroy your data and it cannot be recovered.

As the PV is persistent, it will not be destroyed when postgres is deleted. If you need to delete the PV, you can do so with:

kubectl delete -f pg-pvc.yml 

The use of delete means ‘uninstall/unapply from the cluster’.

Installing Postgres

A number of applications (including postgres) can be found as Helm charts on the bitnami.com site.

We first install the bitnami repository:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update

You can check installed Helm charts with:

helm list -A
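
If you want to see which PostgreSQL chart versions are available before installing, you can search the Bitnami repository:

helm search repo bitnami/postgresql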

Create postgres config

Before installing postgres, you should configure it through the creation of a configuration file.

Create pg-config.yml on the master node:

# define default database user, name, and password for PostgreSQL deployment
#auth:
#  enablePostgresUser: true
#  postgresPassword: "StrongPassword"
#  username: "app1"
#  password: "AppPassword"
#  database: "app_db"

# The postgres helm chart deployment will be using PVC pg-claim
primary:
  persistence:
    enabled: true
    existingClaim: "pg-claim"
  initdb:
    scripts:
      sayHello.sh: |
        echo "Starting new DB scripts"
      sayGoodbye.sh: |
        echo "Starting new DB scripts done"

Note that the auth section is commented out. It has been left in here so you can see how you can set up users and databases. We will rely on the defaults (postgres user and postgres database).

Also included as an example are the initdb scripts, which are run the first time the database is initialised. The initialisation state is held on the PV, so to rerun these scripts you need to delete and recreate the PV.

Install postgres

Now that the PVC and config have been set up, it is possible to install postgres on the cluster:

kubectl create namespace postgres
helm install postgres -f pg-config.yml bitnami/postgresql --namespace postgres

As the NFS provisioner was installed with the --create-namespace option, the postgres namespace may already exist. If you get an error stating this, you can safely ignore it.

Note that we are installing it under the postgres namespace.

Kubernetes namespaces are a way of segregating resources within the cluster. If you delete a namespace, you will delete all resources defined within it.
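
You can list the namespaces in your cluster at any time with:

kubectl get namespaces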

After you install postgres, you will get a set of instructions such as the ones shown below:

PostgreSQL can be accessed via port 5432 on the following DNS names from within your cluster:
postgres-postgresql.postgres.svc.cluster.local - Read/Write connection

To get the password for "postgres" run:
export POSTGRES_PASSWORD=$(kubectl get secret --namespace postgres postgres-postgresql -o jsonpath="{.data.postgres-password}" | base64 -d)

To connect to your database run the following command:
kubectl run postgres-postgresql-client --rm --tty -i --restart='Never' --namespace postgres --image docker.io/bitnami/postgresql:15.4.0-debian-11-r39 --env="PGPASSWORD=$POSTGRES_PASSWORD" \
--command -- psql --host postgres-postgresql -U app1 -d app_db -p 5432

> NOTE: If you access the container using bash, make sure that you execute "/opt/bitnami/scripts/postgresql/entrypoint.sh /bin/bash" in order to avoid the error "psql: local user with ID 1001 does not exist"

To connect to your database from outside the cluster execute the following commands:
kubectl port-forward --namespace postgres svc/postgres-postgresql 5432:5432 &
PGPASSWORD="$POSTGRES_PASSWORD" psql --host 127.0.0.1 -U app1 -d app_db -p 5432

Follow the instructions to obtain the postgres password.

Adding secrets to the deployment

We will now add the postgres credentials as a Kubernetes secret. This can be done as follows.

First export the postgres password to an environment variable as it mentions above:

export POSTGRES_PASSWORD=$(kubectl get secret --namespace postgres postgres-postgresql -o jsonpath="{.data.postgres-password}" | base64 -d)

Now use that to create the secret:

kubectl create secret generic db-user-pass \
--from-literal=username=postgres \
--from-literal=password=$POSTGRES_PASSWORD

You now have a secret (db-user-pass) that holds the credentials of your database user and can be used in your Spring Boot application.
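
As a preview of how this secret might be consumed, a Deployment can inject its keys into a container as environment variables. This is only an illustrative sketch: the container name, image and variable names here are hypothetical, and we create the real deployment manifest in a later part. Remember that a secret can only be referenced by Pods in the same namespace it was created in.

# fragment of a hypothetical Deployment Pod spec
containers:
  - name: quick-queue
    image: <your application image>
    env:
      - name: DB_USERNAME
        valueFrom:
          secretKeyRef:
            name: db-user-pass   # the secret created above
            key: username
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: db-user-pass
            key: password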

For security you should delete the environment variable with:

unset POSTGRES_PASSWORD

Services

Although we only have one instance of postgres running in the cluster, it could be running on any host node and could have been assigned any address on the internal network. This makes it hard to connect to.

Kubernetes solves this by creating a Service. A Service is given a fixed internal IP address and routes requests to the service to any available Pod of the required type.

When the postgres database is installed, the Helm charts also set up a service for it. You can see this with:

kubectl get svc -n postgres

This will show two Services: one with a cluster IP address and a headless one without. You should use the one with the internal (cluster) IP address.

Using this Service, you can connect to the database no matter which node your Pod is running on.
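
For example, a Spring Boot application running in the cluster could point its datasource at the Service's cluster DNS name. This is a sketch, assuming the default postgres database and the credentials injected from the db-user-pass secret shown earlier:

spring.datasource.url=jdbc:postgresql://postgres-postgresql.postgres.svc.cluster.local:5432/postgres
spring.datasource.username=${DB_USERNAME}
spring.datasource.password=${DB_PASSWORD}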

Test your database connection

If you want to test your database from your local machine, you will need to set up two port forwards. The first port forwards from the Service on the internal network to the host machine and the second from the host machine to the development machine.

In this case, the host machine is the master node, so we need to set up a port forward from the Service to the host on the master node:

kubectl port-forward --namespace postgres svc/postgres-postgresql 5432:5432 &

This makes port 5432 available on the master node's localhost, connected to the postgres instance.

Now that port needs to be made available on your development machine with the following ssh port forwarding:

ssh -L 54321:localhost:5432 kates@<master public ip address> -i ~/.ssh/qq_rsa

You should now be able to connect a database client (such as DBeaver) to port 54321 locally. I use port 54321 so as not to interfere with any local postgres instance.

You should use the postgres user with the password obtained above.
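
If you have the psql client installed on your development machine, you can also test the connection from the command line through the SSH tunnel (enter the password obtained above when prompted):

psql -h 127.0.0.1 -p 54321 -U postgres -d postgres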

If you can connect, you have successfully installed a database using a Kubernetes PV. You can see this working by deleting the Pod. Find the database Pod with:

kubectl get pods -n postgres

You should see an output like this:

NAME                                               READY   STATUS    RESTARTS   AGE
nfs-subdir-external-provisioner-6d9f9585f9-m6ndd   1/1     Running   0          11m
postgres-postgresql-0                              1/1     Running   0          5m39s

It shows two pods (the PVC provisioner and postgres itself), both ready (1/1).

You can now delete the postgres Pod with:

kubectl delete pod <pod name> -n postgres

If you look at the Pods again (after a short while — wait for it to be Ready 1/1), you will see that, whilst one has been terminated, Kubernetes has automatically scheduled a new one in its place. Once the new pod starts, you will be able to access the database as if nothing happened (note that you may have to restart the port forwards as the port closes when the pod is deleted).
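
You can also confirm that the claim, and hence the data, survived the rescheduling. The PVC should still show as Bound to the same volume:

kubectl get pvc -n postgres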

Summary

In this article we created a Persistent Volume (PV) using a Network File System (NFS) share on the master node. We used a PV provisioner to create the PV and the Persistent Volume Claim (PVC).

Once we created our PV and PVC, we then used them to install postgres using a Helm chart. We then used port forwarding to access the internal service from a database client on our development machine.

Next we will create a Spring Boot application that can be deployed to the cluster and that can access the database we just created.

Series Introduction

Previous — Creating a Kubernetes Cluster

Next — Adding a Spring Boot Application to Your Cluster
