Changing Kubernetes PVC storage class AND downsizing them at the same time

About optimising Kubernetes Persistent Volume costs

Jonathan Tan
SRC Innovations
12 min read · Nov 23, 2023


Ever realised too late that you’ve provisioned an SSD for a pod when you didn’t need it? Or that your drive of 400gb is only 40gb full, and in hindsight, never going to exceed ~60gb?

This blog article is for you!

It shows how to downsize a PVC AND change the storage class at the same time.

Note that this will cause an outage, so prep and time it appropriately. If you’re interested in a way that will let you do this WITHOUT an outage, get in touch. 🙂

Too Long Didn’t Read (TLDR) Version

The TLDR version is:

  1. Unmount the existing PVC + PV from the pod & cluster
  2. Create the new disk of the size & type you want
  3. Mount both the new & the old disk into a compute engine instance
  4. Copy the data from old to new
  5. Mount the new disk as a replacement PV + PVC

The “Ok, I need a bit more info than that” version is:

  1. Take a snapshot — because disaster recovery is important
  2. Create a new disk of the size and storage class that you want
  3. Scale down the Deployment or Stateful Set that controls the pods that use those PVCs
  4. Mount both the original disk AND the new disk into a VM (yes, outside of the Kubernetes cluster)
  5. Copy the data from the original drive to the new drive
  6. Unmount the drives from the VM
  7. Extract the PVC and PV manifests from your cluster
  8. Carefully modify the PVC & PV manifests to use your new drive
  9. Carefully delete the old PVC & PV resources from your cluster
  10. Apply the new PVC & PV resources that refer to the new drive - you should be able to see the PVC bind to the new disk
  11. If you’re using a Stateful Set with a Volume Claim Template, you've got some extra steps: modify the Stateful Set's manifest to use the new disk size AND the new disk storage class, carefully delete the old Stateful Set resource, then reapply the modified manifest - you should see the PVC become bound
  12. Scale up your Stateful Set or Deployment and it should load properly referring to the new PVC, and therefore the new disk

For more details & pictures, read on!

About drive types on GCP

On GCP, there are 3 “typical” storage classes, although you DO need to define some of these in your cluster yourself as StorageClass resources ( https://kubernetes.io/docs/concepts/storage/storage-classes/). (The other major cloud providers offer the same idea, just with slightly different names and costs.)
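For reference, here is a minimal sketch of what such a StorageClass definition might look like on GKE with the Compute Engine Persistent Disk CSI driver. The class name balanced is just an example, chosen to match the class names used later in this article:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: balanced
provisioner: pd.csi.storage.gke.io    # GKE's Persistent Disk CSI driver
parameters:
  type: pd-balanced                   # or pd-standard / pd-ssd
volumeBindingMode: WaitForFirstConsumer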

Their prices are as follows:

(Please go check the up-to-date pricing for your region: https://cloud.google.com/compute/disks-image-pricing#disk)

This table shows the monthly cost of a 400gb and a 100gb disk in each storage class:

Disk type   400gb/month   100gb/month
SSD         $92.00        $23.00
Balanced    $54.00        $13.50
Standard    $21.60        $5.40

As you can see, going from a 400gb SSD to a 400gb Standard drive will go from $92/month to $21.60 — a savings of $70.40

Going from a 400gb SSD to a 100gb SSD will go from $92/month to $23/month — a savings of $69.

i.e. for us, it was a smidgeon cheaper to downgrade our SSD to a standard drive than to reduce the disk size.

Combining BOTH would go from a 400gb SSD at $92/month to a 100gb Standard at $5.40/month, a savings of $86.60/month…

But you’d lose out on all of those fast speeds you get from SSDs.

“What’s this balanced drive”, I hear you ask.

About Balanced Drives

Balanced drives are interesting. The official documentation ( https://cloud.google.com/compute/docs/disks/performance & https://cloud.google.com/compute/docs/disks/performance#n1_vms) says that balanced drives in low-CPU scenarios (i.e. your VM has less than 16 cores), are as fast as an SSD, AND are about half the price.

This next table compares the balanced drive pricing against the 400gb SSD, along with the monthly savings:

Change to         Monthly cost   Monthly savings
400gb Balanced    $54.00         $38.00
100gb Balanced    $13.50         $78.50

Taking the above example, a 400gb balanced drive will be $54/month — a savings each month of $38.

Going from a 400gb SSD to a 100gb balanced drive will be $13.50/month, a savings every month of $78.50.

So for roughly $8/month more than going to a standard 100gb drive, we get a drive that is, in theory and only in “low CPU scenarios”, as fast as an SSD, and still gives us a $78.50/month saving.

Pretty sweet.

So I’m going to show you how we took our 400gb SSDs, and reduced them to 100gb balanced drives.

How to reduce the PVC disk size and change the storage type at the same time

(Note, we’re running Kubernetes on Google Cloud Platform and I use a Mac. So you’re gonna see stuff that may or may not be exactly correct for your cloud provider & your OS. Good luck!)

Find the disk, snapshot it, extract its resource manifests

To find the disk, pop into the Kubernetes cluster and the relevant namespace and simply perform a GET on the PVCs:

$ kubectl get pvc -n <insert the relevant namespace here>

This should give you something like the output below. (Note that I have removed some columns to make it easier to display.)

$ kubectl get pvc -n client-foo-prod
NAME                           STATUS   VOLUME                                        CAPACITY   STORAGECLASS
# <snip>
data-foo-v2-base-zookeeper-4   Bound    pvc-2589c3ae-3f5c-488b-9096-3cc11f9bc520      2Gi        standard
data-foo-v2-index-0            Bound    pvc-drive-that-is-the-wrong-size-and-class    400Gi      ssd

Note on line 5 that the data-foo-v2-index-0 PVC is bound to a Persistent Volume (PV) called pvc-drive-that-is-the-wrong-size-and-class. This is also the name of the corresponding disk outside of the Kubernetes cluster. It is of storage class ssd, and has a capacity of 400Gi.

Really Important — Check your reclaim policy!

If your original drive was a dynamically provisioned volume, its reclaim policy will default to Delete, which means the underlying disk WILL be deleted when the Persistent Volume Claim is deleted…

So it is REALLY crucial that you patch the Persistent Volume BEFORE you do anything else:

$ kubectl patch pv <your-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
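You can quickly confirm that the patch took effect with something like:

$ kubectl get pv <your-pv-name> -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
# should print: Retain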

At this point in time, get BOTH the PVC and the PV manifests and paste them into a text editor. You're gonna need to modify them!

You can easily do this with the following commands:

$ kubectl get pvc <insert PVC name here> -n <namespace> -o yaml | pbcopy

This will get the PVC's manifest in YAML format and then copy it to the pasteboard (i.e. the clipboard).

You should then paste it into a text editor

Then do the same for the PV

$ kubectl get pv <insert PV name here> -o yaml | pbcopy

(Note that there is no need to specify the namespace for the Persistent Volume because those are cluster level resources)

Then paste that into the same text file. (Add the --- separator that YAML uses between documents.)

It should look something like this:
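(This is a hypothetical, heavily trimmed illustration; your real dump will contain many more fields, and different names, IDs and timestamps. The comments point out the bits that matter later on.)

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:                                  # gets deleted later
    pv.kubernetes.io/bind-completed: "yes"
  creationTimestamp: "2023-01-01T00:00:00Z"     # deleted later
  name: data-foo-v2-index-0
  namespace: client-foo-prod
  resourceVersion: "123456"                     # deleted later
  uid: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee     # deleted later
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 400Gi
  storageClassName: ssd
  volumeName: pvc-drive-that-is-the-wrong-size-and-class
status:                                         # the whole status block gets deleted later
  phase: Bound
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-drive-that-is-the-wrong-size-and-class
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 400Gi
  gcePersistentDisk:
    pdName: pvc-drive-that-is-the-wrong-size-and-class
    fsType: ext4                                # remember this for when you format the new disk
  claimRef:
    name: data-foo-v2-index-0
    namespace: client-foo-prod
    resourceVersion: "123456"                   # deleted later
    uid: aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee   # deleted later
  persistentVolumeReclaimPolicy: Retain         # thanks to the earlier patch
  storageClassName: ssd
status:
  phase: Bound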

Save this yaml file for now, you’ll come back to it later.

Note that the PersistentVolume.spec.persistentVolumeReclaimPolicy should now be Retain if you'd followed the instructions from earlier.

Just to reiterate - this is important, or your disk WILL be deleted when the PVC is deleted.

Now to create the snapshot. For something like that, I’m quite happy to use the GCP console.

From your GCP console, go to the Compute Engine, and then to Disks.

Select “Disks” from the GCP Console -> Compute Engine section

Find the disk that you’re after, click on it, and then click the “Create Snapshot” button. You should then see something like the following.

Snapshot your existing disk

Give it a nice name — I tend to recommend including the date + time of the snapshot in the name.

Something along the lines of

<application-name>-<data type>-YYYY-MM-dd-HHmm
# examples
prod-foo-index-node-index-2023-10-24-1149
prod-nginx-access-logs-2023-12-31-2359
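If you prefer the command line over the console, gcloud can take the snapshot too. A sketch, using the disk name from this article's example and placeholders for the rest:

$ gcloud compute disks snapshot pvc-drive-that-is-the-wrong-size-and-class \
    --zone=<your-zone> \
    --snapshot-names=prod-foo-index-node-index-2023-10-24-1149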


After all of this is done, you need to delete the PVC and the PV. Before you do that, though…

I’m going to write this one last time: Check your Persistent Volume‘s reclaim policy!

If it is set to Delete, then when you delete the PVC (or the PV) resource, the underlying disk WILL be deleted too. You do NOT want this to happen! You want it to be set to Retain.
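Also make sure nothing is still using the PVC: scale down the Deployment or Stateful Set that mounts it (step 3 of the overview), otherwise Kubernetes' storage protection will leave the PVC stuck in Terminating until the pods are gone. A sketch, with placeholder names:

$ kubectl scale statefulset <statefulset-name> -n <namespace> --replicas=0
# or, for a Deployment:
$ kubectl scale deployment <deployment-name> -n <namespace> --replicas=0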

In order to delete your PVC & PV

$ kubectl delete pvc <insert-pvc-name-here> -n <insert namespace here>
# with the reclaim policy patched to Retain, the PV is left behind in a Released state,
# so delete it as well. (If the policy were still Delete, this would happen automatically - along with the disk!)
$ kubectl delete pv <insert-pv-name-here>

Now your disk that was claimed by the Kubernetes cluster is no longer claimed, and can be freely mounted elsewhere.

Create the new disk

Presumably you already know:

  • the desired size of the new disk
  • the desired drive type
  • the file system type (in the PV manifest above, .spec.gcePersistentDisk.fsType shows that the existing disk is ext4)

In the GCP console, just click the “Create disk” button

Create a new disk through the GCP console

Configure the desired disk, and then create it. It is important that your disk is in the same region AND the same zone as the disk that you are trying to replace.

Example of the new disk — ensure that it is in the correct region & zone as the original disk
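The command-line equivalent is roughly the following, using the new disk name that appears later in this article and placeholders for the rest:

$ gcloud compute disks create pvc-drive-that-is-good \
    --size=100GB \
    --type=pd-balanced \
    --zone=<same-zone-as-the-original-disk>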

Copy the data using a VM

Now that the original oversized & mis-classed disk is unmounted from the PersistentVolume, and there is a new disk created that is the right size and the right storage class, it is possible to begin the process of transferring the files.

Go to the VM Instances in Compute Engine

Go to the VM Instances section in the GCP Console -> Compute Engine…

Create a Compute Engine instance using the “Create Instance” button

…and create a new VM

You’ll be taken to a whole screen of machine configurations to select from.

Pick a cheap machine for copying stuff around

Key things to note:

  • Your compute engine VM must be in the same region & zone as the original disk, and therefore, the same region & zone as the new disk
  • This is a temporary instance purely to be used for copying the contents of one disk to another — so pick the cheapest machine type — which is currently E2

Choose a spot instance because it is cheaper

Spot instances are nice and cheap

Choose a small boot disk and a cheap storage class

Make sure you’re using a cheap and small boot disk

Then go all the way down to the bottom under “Advanced Options”, and then choose to add the two existing disks

Attach the old oversized disk, as well as the new right-sized disk

You should be able to pick both your unmounted original disk AND your newly created disk here.

Make sure that the Deletion rule is set to "Keep disk"...

Then once that’s all sorted, create your instance!

Make sure it is started, and then connect to it via SSH.

Once the VM is created, connect to it!
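If you'd rather script this part, something along these lines should do the same job. The instance name copy-helper and the machine type are examples I've made up; the disk names are the ones from this article:

$ gcloud compute instances create copy-helper \
    --zone=<same-zone-as-the-disks> \
    --machine-type=e2-small \
    --provisioning-model=SPOT
$ gcloud compute instances attach-disk copy-helper --zone=<same-zone-as-the-disks> \
    --disk=pvc-drive-that-is-the-wrong-size-and-class
$ gcloud compute instances attach-disk copy-helper --zone=<same-zone-as-the-disks> \
    --disk=pvc-drive-that-is-good
$ gcloud compute ssh copy-helper --zone=<same-zone-as-the-disks>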

Copy the files

First off, confirm that the mounts have been attached to the VM using the lsblk command.

In the lsblk output, you can see the 2 attached drives, and that neither has a MOUNTPOINT. So they've been attached, but not mounted. Let's do that next. For me, sdc is the original big drive, and sdb is the new smaller drive; depending on your configuration, you might need to figure out which is which.
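Mounting the original drive looks something like this; the device name sdc and the mount point are from my setup, so adjust them for yours:

$ sudo mkdir -p /mnt/disks/original
$ sudo mount -o ro /dev/sdc /mnt/disks/original   # read-only, since we only copy FROM this disk
$ ls /mnt/disks/original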

As you can see from the lsblk output, the attached disk sdc is now mounted at the /mnt/disks/original mount point. Listing the contents now shows the contents of the original drive that I want to get the files off of.

Now for the newly created drive. If you’d followed my instructions above and created a completely blank drive, it is not going to be formatted. So the first thing to do is to format it. Remember when I mentioned earlier that you needed to know the file system type? In my case, it’s ext4, so the target drive also needs to be ext4.
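Formatting and mounting the new drive might look like this. Again, sdb is from my setup; triple-check the device name before formatting, because mkfs will happily wipe the wrong disk:

$ sudo mkfs.ext4 /dev/sdb
$ sudo mkdir -p /mnt/disks/new
$ sudo mount /dev/sdb /mnt/disks/new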

And again, running lsblk shows both the original disk and the target disk mounted and ready for copying.

Best way to copy the files? Use rsync. It gives you a view of progress, and it also can recover in case the VM dies for whatever reason.
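The invocation I'd reach for is something like the following (the exact flags are a matter of taste; note the trailing slashes, which tell rsync to copy the contents of the directories rather than the directories themselves):

$ sudo rsync -aHAX --info=progress2 /mnt/disks/original/ /mnt/disks/new/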

Start the rsync copy and watch as it performs the copying.

Once it’s finished, check that the copy was successful.
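One simple sanity check is to re-run rsync as a checksumming dry run and compare disk usage; if the dry run reports nothing left to transfer, the two copies match:

$ sudo rsync -aHAXn --checksum --itemize-changes /mnt/disks/original/ /mnt/disks/new/
$ df -h /mnt/disks/original /mnt/disks/new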

Unmount the disks and prepare the new one to go back into the cluster

In the Compute Engine UI, stop the VM, edit it, and then detach the two disks.

Stop and edit the VM to remove the disks
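The command-line equivalent, using the copy-helper instance name from the earlier sketch:

$ gcloud compute instances detach-disk copy-helper --zone=<same-zone-as-the-disks> \
    --disk=pvc-drive-that-is-the-wrong-size-and-class
$ gcloud compute instances detach-disk copy-helper --zone=<same-zone-as-the-disks> \
    --disk=pvc-drive-that-is-good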

Remember those YAML manifests you’d extracted earlier?

Now is the time to edit those so that they refer to your new disk. Here’s what to do:

  • Delete all .status blocks
  • In both .metadata blocks, delete the annotations, creationTimestamp, resourceVersion, and uid fields
  • In the PersistentVolume manifest, delete the .spec.claimRef.resourceVersion and the .spec.claimRef.uid
  • Find all references to the previous disk name and replace them with the new disk name, size, & storage class (find and replace is your friend).

For the PersistentVolumeClaim

  • .spec.resources.requests.storage
  • .spec.storageClassName
  • .spec.volumeName

For the PersistentVolume

  • .metadata.name
  • .spec.capacity.storage
  • .spec.gcePersistentDisk.pdName
  • .spec.storageClassName
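
Putting all of that together, here is roughly what the edited pair of manifests ends up looking like for this article's example (your names, sizes and classes will differ):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-foo-v2-index-0              # unchanged - the pods find the PVC by this name
  namespace: client-foo-prod
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi                      # was 400Gi
  storageClassName: balanced              # was ssd
  volumeName: pvc-drive-that-is-good      # was pvc-drive-that-is-the-wrong-size-and-class
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-drive-that-is-good            # was pvc-drive-that-is-the-wrong-size-and-class
spec:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 100Gi                        # was 400Gi
  gcePersistentDisk:
    pdName: pvc-drive-that-is-good        # the name of the new disk in GCP
    fsType: ext4
  claimRef:
    name: data-foo-v2-index-0
    namespace: client-foo-prod
  persistentVolumeReclaimPolicy: Retain
  storageClassName: balanced              # was ssd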

Then… Save it!

And then apply it!

$ kubectl apply -f <manifest>.yaml

Now check and see if it is all working

$ kubectl get pvc -n client-foo-prod
NAME                           STATUS   VOLUME                                        CAPACITY   STORAGECLASS
# <snip>
data-foo-v2-base-zookeeper-4   Bound    pvc-2589c3ae-3f5c-488b-9096-3cc11f9bc520      2Gi        standard
data-foo-v2-index-0            Bound    pvc-drive-that-is-good                        100Gi      balanced

You should see your PVC’s status become Bound. Once you see that, you know it's all worked and you're done!

Update your Pod Controllers and Verify that they work

For a Kubernetes Deployment, it's easy. Volume linkages for Deployments are via the PVC name, so once it is scaled up, and the PVC is found, the deployment will be sorted.
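Scaling back up is just the reverse of the earlier scale-down, for example:

$ kubectl scale deployment <deployment-name> -n <namespace> --replicas=1
$ kubectl get pods -n <namespace> -w   # watch the pods come up and mount the new volume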

For a Kubernetes Stateful Set, you've got more work to do...

As you probably know, Stateful Sets tend to use dynamic volume provisioning, so they have .spec.volumeClaimTemplates as part of the specification. So if you were to try to scale up the stateful set, it'd be able to find the PVC by name, BUT it will not match by size or by storage class, and then the pods would fail to start.

And since the volumeClaimTemplate is one of the immutable portions of the stateful set resource, you can't just CHANGE it and have it work. What you need to do is to delete the stateful set, and recreate it with the proper volume claim templates.
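As a rough sketch only (the full story has more caveats, which is what the comments invitation below is about): delete the Stateful Set without touching its other children, fix the volume claim template in the manifest, and apply it again.

$ kubectl delete statefulset <statefulset-name> -n <namespace> --cascade=orphan
# edit .spec.volumeClaimTemplates in the manifest to the new size & storage class, then:
$ kubectl apply -f <statefulset-manifest>.yaml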

Let me know below in the comments if you want more info on how to change the StatefulSet and its immutable volumeClaimTemplates.

Minor cleanup tasks

Once your pods have spun up, you’ve tested it, and it all works, you can clean up.

Your oversized & over-specced disk is no longer needed, and can be deleted.

Your VM that you used for copying things in can also be deleted.

Even the snapshot you took can be deleted. (Unless you want to keep it for backup purposes, in which case, please keep it.)

Last Words

Kubernetes is SO easy to muck up, and reasonably complex. Thankfully, a lot of the way it has been built also means that there are ways to fix any previous mistakes, some easier and more obvious than others.

I hope that this one has helped you figure out how to change the storage class of a PVC and reduce its size at the same time. 🙂

Originally published at https://blog.srcinnovations.com.au on November 23, 2023.
