Changing Kubernetes PVC storage class AND downsizing them at the same time
About optimising Kubernetes Persistent Volume costs
Ever realised too late that you’ve provisioned an SSD for a pod when you didn’t need it? Or that your 400GB drive is only 40GB full and, in hindsight, never going to exceed ~60GB?
This blog article is for you!
It shows how to downsize a PVC AND change the storage class at the same time.
Note that this will cause an outage, so prep and time it appropriately. If you’re interested in a way that will let you do this WITHOUT an outage, get in touch. 🙂
Too Long Didn’t Read (TLDR) Version
The TLDR version is:
- Unmount the existing PVC + PV from the pod & cluster
- Create the new disk of the size & type you want
- Mount both the new & the old disk into a compute engine instance
- Copy the data from old to new
- Mount the new disk as a replacement PV + PVC
The “Ok, I need a bit more info than that” version is:
- Take a snapshot, because disaster recovery is important
- Create a new disk of the size and storage class that you want
- Scale down the `Deployment` or `StatefulSet` that controls the pods that use those PVCs
- Mount both the original disk AND the new disk into a VM (yes, outside of the Kubernetes cluster)
- Copy the data from the original drive to the new drive
- Unmount the drives from the VM
- Extract the `PVC` and `PV` manifests from your cluster
- Carefully modify the `PVC` & `PV` manifests to use your new drive
- Carefully delete the old `PVC` & `PV` resources from your cluster
- Apply the new `PVC` & `PV` manifests that refer to the new drive. You should be able to see the `PVC` bind to the new disk
- If you’re using a `StatefulSet` that used a `volumeClaimTemplate`, you’ve got some extra steps. You need to modify your `StatefulSet`’s manifest to use the new disk size AND the new disk storage class, carefully delete the old `StatefulSet` resource, then reapply the modified `StatefulSet` manifest. You should see the PVC become `Bound`
- Scale up your `StatefulSet` or `Deployment`, and it should load properly referring to the new PVC, and therefore the new disk
For more details & pictures, read on!
About drive types on GCP
On GCP, there are three “typical” storage classes, although you DO need to define some of these in your cluster yourself as `StorageClass` resources ( https://kubernetes.io/docs/concepts/storage/storage-classes/ ). (The other major cloud providers offer the same, with slightly different names and costs.)
Their prices are as follows:
(Please go check the up-to-date pricing for your region: https://cloud.google.com/compute/disks-image-pricing#disk)
This table shows the monthly costs of a 400GB SSD, and other variations & savings:

| Drive type | Size | Monthly cost | Savings vs 400GB SSD |
|---|---|---|---|
| SSD | 400GB | $92.00 | - |
| Standard | 400GB | $21.60 | $70.40 |
| SSD | 100GB | $23.00 | $69.00 |
| Standard | 100GB | $5.40 | $86.60 |

As you can see, going from a 400GB SSD to a 400GB Standard drive takes you from $92/month to $21.60/month, a saving of $70.40.
Going from a 400GB SSD to a 100GB SSD takes you from $92/month to $23/month, a saving of $69.
i.e. for us, it was a smidgen cheaper to downgrade our SSD to a standard drive than to reduce the disk size.
Combining BOTH takes you from a 400GB SSD at $92/month to a 100GB Standard drive at $5.40/month, a saving of $86.60/month…
But you’d lose out on all of those fast speeds you get from SSDs.
“What’s this balanced drive”, I hear you ask.
About Balanced Drives
Balanced drives are interesting. The official documentation ( https://cloud.google.com/compute/docs/disks/performance & https://cloud.google.com/compute/docs/disks/performance#n1_vms) says that balanced drives in low-CPU scenarios (i.e. your VM has fewer than 16 cores) are as fast as an SSD, AND are about half the price.
This next table shows a comparison of pricing against the balanced drives, as well as the monthly savings:

| Drive type | Size | Monthly cost | Savings vs 400GB SSD |
|---|---|---|---|
| Balanced | 400GB | $54.00 | $38.00 |
| Balanced | 100GB | $13.50 | $78.50 |

Taking the above example, a 400GB balanced drive will be $54/month, a saving each month of $38.
Going from a 400GB SSD to a 100GB balanced drive will be $13.50/month, a saving every month of $78.50.
So for about $8/month more than going to a standard 100GB drive, we get a drive that is, in theory and only for “low CPU scenarios”, as fast as an SSD, and gives us a $78.50/month saving.
Pretty sweet.
So I’m going to show you how we took our 400GB SSDs, and reduced them to 100GB balanced drives.
How to reduce the PVC disk size and change the storage type at the same time
(Note, we’re running Kubernetes on Google Cloud Platform and I use a Mac. So you’re gonna see stuff that may or may not be exactly correct for your cloud provider & your OS. Good luck!)
Find the disk, snapshot it, extract its resource manifests
To find the disk, you can pop into the Kubernetes cluster and the namespace and simply perform a GET on `PVC`s:
$ kubectl get pvc -n <insert the relevant namespace here>
This should then get you something like the output below (note that I have removed some columns to make it easier to display):
$ kubectl get pvc -n client-foo-prod
NAME STATUS VOLUME CAPACITY STORAGECLASS
# <snip>
data-foo-v2-base-zookeeper-4 Bound pvc-2589c3ae-3f5c-488b-9096-3cc11f9bc520 2Gi standard
data-foo-v2-index-0 Bound pvc-drive-that-is-the-wrong-size-and-class 400Gi ssd
Note on the last line that the `data-foo-v2-index-0` `PVC` is bound to a `PersistentVolume` (`PV`) called `pvc-drive-that-is-the-wrong-size-and-class`. This is also the name of the corresponding disk outside of the Kubernetes cluster. It is of storage class `ssd`, and has a capacity of 400Gi.
Really Important: check your reclaim policy!
If your original drive was a dynamically provisioned volume, the underlying disk WILL be deleted when the `PersistentVolumeClaim` is deleted…
So it is REALLY crucial that you patch the `PersistentVolume` BEFORE you do anything else:
$ kubectl patch pv <your-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
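It may be worth confirming that the patch actually took effect before going any further. A quick sketch, using the same placeholder PV name:

```shell
# Should print "Retain"; if it still says "Delete", stop and fix it first
kubectl get pv <your-pv-name> -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
```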
At this point, get BOTH the `PVC` and the `PV` manifests and paste them into a text editor. You're gonna need to modify them!
You can easily do this with the following command line commands
$ kubectl get pvc <insert PVC name here> -n <namespace> -o yaml | pbcopy
This will get the `PVC`'s manifest in `yaml` format and copy it to the pasteboard (i.e. clipboard).
You should then paste it into a text editor.
Then do the same for the `PV`:
$ kubectl get pv <insert PV name here> -o yaml | pbcopy
(Note that there is no need to specify the namespace for the `PersistentVolume` because those are cluster-level resources.)
Then paste that into the same text file. (Add the `---` separator that `yaml` uses.)
It should look something like this.
Save this yaml file for now, you’ll come back to it later.
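If you'd rather script the extraction, both manifests can be captured into one file in a single go. A sketch, using hypothetical resource names from the example above (substitute your own):

```shell
# Hypothetical names - replace with your own PVC, PV, and namespace
PVC=data-foo-v2-index-0
PV=pvc-drive-that-is-the-wrong-size-and-class
NS=client-foo-prod

# Write both manifests, separated by the yaml "---" divider, into one file
{
  kubectl get pvc "$PVC" -n "$NS" -o yaml
  echo '---'
  kubectl get pv "$PV" -o yaml
} > pvc-and-pv.yaml
```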
Note that the `PersistentVolume.spec.persistentVolumeReclaimPolicy` should be `Retain` if you'd followed the instructions from earlier.
Just to reiterate: this is important, or your disk WILL be deleted when the PVC is deleted.
Now to create the snapshot. For something like that, I’m quite happy to use the GCP console.
From your GCP console, go to the Compute Engine, and then to Disks.
Find the disk that you’re after, click on it, and then click the “Create Snapshot” button. You should then see the snapshot creation form.
Give it a nice name — I tend to recommend including the date + time of the snapshot in the name.
Something along the lines of
<application-name>-<data type>-YYYY-MM-dd-HHmm
# examples
prod-foo-index-node-index-2023-10-24-1149
prod-nginx-access-logs-2024-02-28-2359
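If you'd rather stay on the command line, the same snapshot can be taken with gcloud. A sketch, with hypothetical disk and zone names (substitute your own):

```shell
# Snapshot the original disk, naming it with the current date + time
gcloud compute disks snapshot pvc-drive-that-is-the-wrong-size-and-class \
  --zone=australia-southeast1-a \
  --snapshot-names="prod-foo-index-$(date +%Y-%m-%d-%H%M)"
```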
After all of this… you need to delete the `PVC` and the `PV`.
But first, I’m going to write this one last time: check your `PersistentVolume`'s reclaim policy!
If it is set to `Delete`, then when you delete the `PVC` (or the `PV`) resource, the underlying disk WILL be deleted too. You do NOT want this to happen! You want it to be set to `Retain`.
In order to delete your `PVC` & `PV`:
$ kubectl delete pvc <insert-pvc-name-here> -n <insert namespace here>
# even with the reclaim policy set to Retain, the old PV will hang around in a "Released" state, so delete it too
$ kubectl delete pv <insert-pv-name-here>
Now your disk that was claimed by the Kubernetes cluster is no longer claimed, and can be freely mounted elsewhere.
Create the new disk
Presumably you already know:
- the desired size of the new disk
- the desired drive type
- the file system type (in the above manifest, you can see from `.spec.gcePersistentDisk.fsType` that the existing disk is of type `ext4`)
In the GCP console, just click the “Create disk” button
Configure the desired disk, and then create it. It is important that your disk is in the same region AND the same zone as the disk that you are trying to replace.
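The console's “Create disk” flow has a gcloud equivalent. A sketch, assuming a hypothetical name and the zone from earlier (leave the disk blank; it gets formatted later):

```shell
# pd-balanced is GCP's disk type for "balanced" persistent disks
gcloud compute disks create pvc-drive-that-is-good \
  --size=100GB \
  --type=pd-balanced \
  --zone=australia-southeast1-a
```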
Copy the data using a VM
Now that the original oversized & mis-classed disk is unmounted from the `PersistentVolume`, and there is a new disk created that is the right size and the right storage class, it is possible to begin the process of transferring the files.
Go to the VM Instances in Compute Engine
Create a Compute Engine instance using the “Create Instance” button
You’ll be taken to a whole screen of machine configurations to select from.
Key things to note:
- Your compute engine VM must be in the same region & zone as the original disk, and therefore, the same region & zone as the new disk
- This is a temporary instance purely to be used for copying the contents of one disk to another, so pick the cheapest machine type, which is currently `E2`
- Choose a spot instance because it is cheaper
- Choose a small boot disk and a cheap storage class

Then go all the way down to the bottom under “Advanced Options”, and choose to add the two existing disks.
You should be able to pick your unmounted original disk AND your newly created disk at this location.
Make sure that the `Deletion rule` is set to “Keep disk”...
Then once that’s all sorted, create your instance!
Make sure it is started, and then connect to it via SSH.
Copy the files
First off, confirm that the disks have been attached to the VM using the `lsblk` command.
In the `lsblk` output you should see the two attached drives, and that neither has a `MOUNTPOINT`. So they've been attached, but not mounted. Let's do that next. For me, `sdc` was the original big drive, and `sdb` was the new smaller drive; depending on your configuration, you might need to figure out which is which.
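Mounting the original drive looks something like this. This is a sketch assuming the original disk appeared as `/dev/sdc`, as it did for me; mounting it read-only is my own precaution, since it only needs to be read from:

```shell
# Create a mount point and mount the original disk read-only
sudo mkdir -p /mnt/disks/original
sudo mount -o ro /dev/sdc /mnt/disks/original

# Sanity check: this should list the contents of the original drive
ls /mnt/disks/original
```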
After mounting, `lsblk` shows that the attached disk `sdc` is now mounted to the `/mnt/disks/original` mount point. Listing the contents now shows the contents of the original drive that I want to get the files off of.
Now for the newly created drive. If you'd followed my instructions above and created a completely blank drive, it is not going to be formatted. So the first thing to do is to format it. Remember when I mentioned earlier that you needed to know the file format? In my case, it's `ext4`, so the target drive also needs to be `ext4`.
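A sketch of formatting and mounting the new drive, again assuming the hypothetical device name `/dev/sdb` from my setup:

```shell
# Format the blank new disk as ext4, matching the original's filesystem
sudo mkfs.ext4 /dev/sdb

# Mount it alongside the original
sudo mkdir -p /mnt/disks/new
sudo mount /dev/sdb /mnt/disks/new
```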
And again, you can see it now, both the original disk, and the target disk both mounted and ready for copying.
Best way to copy the files? Use `rsync`. It gives you a view of progress, and it can also recover in case the VM dies for whatever reason.
Start the `rsync` copy and watch as it performs the copying.
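The invocation I'd reach for, assuming the mount points above, is something like:

```shell
# -a preserves permissions, ownership, and timestamps;
# --info=progress2 shows overall progress for the whole transfer.
# The trailing slashes matter: copy the *contents* of original into new.
sudo rsync -a --info=progress2 /mnt/disks/original/ /mnt/disks/new/
```

If the VM dies partway through, re-running the same command picks up roughly where it left off, since rsync skips files that already match.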
Once it’s finished, check that the copy was successful, for example by comparing file counts and sizes between the two mounts.
Unmount the disks and prepare it to go back into the cluster
In the Compute Engine UI, stop the VM, edit it, and then unmount the two disks.
Remember those YAML manifests you’d extracted earlier?
Now is the time to edit those and replace them with your own new disk. Here’s what to do:
- Delete all `.status` blocks
- In both `.metadata` blocks, delete the `annotations`, `creationTimestamp`, `resourceVersion`, and `uid`
- In the `PersistentVolume` manifest, delete the `.spec.claimRef.resourceVersion` and the `.spec.claimRef.uid`
- Find all references to the previous disk name and replace them with the new disk name, size, & storage class (find and replace is your friend).
  For the `PersistentVolumeClaim`:
  - `.spec.resources.requests.storage`
  - `.spec.storageClassName`
  - `.spec.volumeName`
  For the `PersistentVolume`:
  - `.metadata.name`
  - `.spec.capacity.storage`
  - `.spec.gcePersistentDisk.pdName`
  - `.spec.storageClassName`
Then… Save it!
And then apply it!
kubectl apply -f <manifest>.yaml
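If you want extra reassurance first, the API server can validate the edited manifests without persisting anything. A sketch:

```shell
# Server-side dry run: full validation, nothing is actually created
kubectl apply --dry-run=server -f <manifest>.yaml
```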
Now check and see if it is all working
$ kubectl get pvc -n client-foo-prod
NAME STATUS VOLUME CAPACITY STORAGECLASS
# <snip>
data-foo-v2-base-zookeeper-4 Bound pvc-2589c3ae-3f5c-488b-9096-3cc11f9bc520 2Gi standard
data-foo-v2-index-0 Bound pvc-drive-that-is-good 100Gi balanced
You should see your PVC's status become `Bound`. Once you see that, you know it's all worked and you're done!
Update your Pod Controllers and Verify that they work
For a Kubernetes `Deployment`, it's easy. Volume linkages for `Deployment`s are via the `PVC` name, so once it is scaled up and the `PVC` is found, the deployment will be sorted.
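Scaling back up and watching the pod come up looks something like this (placeholder names, substitute your own):

```shell
# Scale the workload back up and watch the pod start against the new disk
kubectl scale deployment <your-deployment> -n <namespace> --replicas=1
kubectl get pods -n <namespace> -w
```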
For a Kubernetes `StatefulSet`, you've got more work to do...
As you probably know, `StatefulSet`s tend to use dynamic volume provisioning, so they have `.spec.volumeClaimTemplates` as part of the specification. So if you were to try to scale up the stateful set, it'd be able to find the PVC by name, BUT it would not match by size, nor by storage class, and the pods would fail to start.
And since `volumeClaimTemplates` is one of the immutable portions of the stateful set resource, you can't just CHANGE it and have it work. What you need to do is delete the stateful set, and recreate it with the proper volume claim templates.
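One way to do the delete-and-recreate, sketched with placeholder names: `--cascade=orphan` tells kubectl to delete only the `StatefulSet` object itself, leaving its pods and PVCs untouched (on older kubectl versions the flag was `--cascade=false`).

```shell
# Delete just the StatefulSet resource, orphaning its pods and PVCs
kubectl delete statefulset <your-statefulset> -n <namespace> --cascade=orphan

# Reapply the manifest with the updated volumeClaimTemplates
kubectl apply -f <updated-statefulset>.yaml
```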
Let me know below in the comments if you want more info on how to change the `StatefulSet` and its immutable `volumeClaimTemplates`.
Minor cleanup tasks
Once your pods have spun up, you’ve tested it, and it all works, you can clean up.
Your oversized & over-specced disk is no longer needed, and can be deleted.
Your VM that you used for copying things in can also be deleted.
Even the snapshot you took can be deleted. (Unless you want to keep it for backup purposes, in which case, please keep it.)
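The cleanup can also be done from the command line. A sketch, with the hypothetical names used throughout this article; double-check each name before deleting!

```shell
# The old disk, now replaced
gcloud compute disks delete pvc-drive-that-is-the-wrong-size-and-class \
  --zone=australia-southeast1-a

# The temporary copy VM
gcloud compute instances delete <your-copy-vm> --zone=australia-southeast1-a

# The snapshot, if you no longer want it as a backup (snapshots are global, so no zone)
gcloud compute snapshots delete prod-foo-index-2023-10-24-1149
```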
Last Words
Kubernetes is reasonably complex, and SO easy to muck up. Thankfully, a lot of the way it has been built also means that there are ways to fix previous mistakes, some easier and more obvious than others.
I hope that this one has helped you figure out how to change the storage class of a PVC and reduce its size at the same time. 🙂
Originally published at https://blog.srcinnovations.com.au on November 23, 2023.