Backup Kubernetes using Velero and CSI volume snapshot
Velero is an open-source backup solution for Kubernetes, maintained by VMware Tanzu.
Velero can be integrated with restic or Kopia for filesystem backup of pod volumes. Because data in a live system keeps changing during a backup, a filesystem backup of a large volume may not be consistent. To solve this problem, we need a point-in-time snapshot of the volume.
CSI storage support has been stable in Kubernetes since version 1.13, and with it, Kubernetes v1.20 and newer have stable support for volume snapshots and restore.
For details on CSI volume snapshots and restore, please check Point in time Snapshot of Persistent Volume Data with Kubernetes’ Volume Snapshots.
Velero backs up Kubernetes resources and PV data to a backup storage location such as an AWS S3 bucket or any other cloud object storage. It uses restic or Kopia (starting from Velero 1.10) to upload the data inside a persistent volume to object storage.
Starting from version 1.9, Velero has stable support for integration with CSI snapshots, which enables the creation of volume snapshots of Persistent Volumes.
CSI volume snapshots are point-in-time copies of Persistent Volumes, which give more consistent data than a filesystem backup.
Velero supports two ways to back up Kubernetes resources and volume data using CSI snapshots:
- Backup Kubernetes resources to Object storage AND Create CSI snapshot of Persistent Volumes
- Backup Kubernetes resources to Object storage AND Create CSI snapshot of Persistent Volumes AND Upload content from CSI snapshots to Object storage
Contents
- Why it works on my machine?
- Pre-requisites
- Deploy Volume Snapshot Class
- Install Velero with CSI feature enabled
- Create Kubernetes backups using velero with CSI snapshot
- Restore from backups
Why it works on my machine?
I am using:
- K8s cluster v1.27
- One control plane and two worker nodes on Ubuntu 22.04
- Containerd CRI version 1.6.22
- csi-driver-nfs for persistent volumes, but any CSI-supported storage should work
- Kubectl v1.27
- Ubuntu 23.10 as Jumpbox to connect to cluster
- Velero v1.12.1
Pre-requisites:
1. External Snapshotter: We should have external-snapshotter deployed on our cluster. Many CSI storage providers deploy external-snapshotter along with their CSI drivers on the cluster; some may make it optional and deploy it only when you explicitly ask for it. Please check the documentation of your storage provider.
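If you are not sure whether the snapshot controller is already running, you can look for it with kubectl. The deployment name and CRD group below are the common defaults from the external-snapshotter project; your distribution may name or place them differently.

```shell
# Look for a snapshot-controller deployment in any namespace
kubectl get deployment -A | grep -i snapshot-controller

# Check that the snapshot CRDs are registered
kubectl get crd | grep snapshot.storage.k8s.io
```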
2. VolumeSnapshotClass CRD: A volume snapshot class is similar to a storage class, but for volume snapshots. The snapshotter uses the volume snapshot class to get details about the CSI driver and the parameters to use when creating volume snapshots. Some Kubernetes distributions may already have it on the cluster.
To check and create the volume snapshot class CRD, run the below commands:
# Check if the volume snapshot class CRD exists
kubectl get crd volumesnapshotclasses.snapshot.storage.k8s.io
# If it does not exist, use the below command to create it
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/release-6.3/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
Deploy Volume Snapshot Class
NOTE: All credentials, buckets and clusters in screenshots used in this demo are created for demo purpose only and are removed after the demo.
To provision a dynamic PVC, we need a storage class. Similarly, to create a CSI snapshot of a PVC, we need a volume snapshot class.
Just like a StorageClass provides a way for administrators to describe the “classes” of storage they offer when provisioning a volume, a VolumeSnapshotClass provides a way to describe the “classes” of storage when provisioning a volume snapshot.
If you want to restore the data back to the same storage provider in which the PVCs were created, the provisioner of the storage class and the driver of the volume snapshot class should be the same, with the same parameters.
In my demo environment, PVCs are created using the below storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-csi
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: nfs.csi.k8s.io # <--------------------------- Provisioner
parameters:
  server: master-node # <--------------------------------- NFS server
  share: /var/nfs/k8s_pvs # <----------------------------- Share Path
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
mountOptions:
- nfsvers=4.1
So, I will deploy a volume snapshot class with similar parameters. You should deploy yours as per your storage class.
We also need to add the label velero.io/csi-volumesnapshot-class=true to the volume snapshot class to make it the default snapshot class for volume snapshots created by Velero.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: nfs-csi
  labels:
    velero.io/csi-volumesnapshot-class: "true" # <----------- Label
driver: nfs.csi.k8s.io # <---------------------------------- Driver
parameters:
  server: master-node # <----------------------------------- NFS Server
  share: /var/nfs/k8s_pvs # <------------------------------- Share Path
deletionPolicy: Delete
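Assuming the manifest above is saved as volumesnapshotclass.yaml (a filename chosen here just for illustration), it can be applied and verified like this:

```shell
# Create the volume snapshot class
kubectl apply -f volumesnapshotclass.yaml

# Verify the class exists and carries the Velero label
kubectl get volumesnapshotclass nfs-csi --show-labels
```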
Install Velero with CSI feature enabled
For my demo, I will be using an AWS S3-compatible bucket (MinIO) to store the backups.
You may use an AWS S3 bucket (or any S3-compatible storage), a GCP Cloud Storage bucket, or Azure Blob Storage as the backup storage. The demo below is performed with an AWS S3 bucket as the storage location.
1. Create a bucket in your cloud provider and grant read, write, and delete object permissions on it to your account (or a service account). I have created a bucket named velero-linuxshots.
2. Generate a credential key for the account which has permission on the bucket. In the case of AWS, generate an Access Key and Secret Access Key for the account and keep them handy and secure.
3. Create a file cloud-credential on your jumpbox which contains the credentials generated in the previous step.
# cat cloud-credential
[default]
aws_access_key_id=<AWS-ACCESS-KEY>
aws_secret_access_key=<AWS-SECRET-ACCESS-KEY>
region=<REGION>
4. Create a file values.yaml on the jumpbox and add the below values. Replace the values as per your environment.
configuration:
  uploaderType: kopia # <----------------------------- Velero uses Kopia to upload data from volume snapshots
  defaultVolumesToFsBackup: false
  features: EnableCSI # <----------------------------- To enable the CSI snapshot feature
  backupStorageLocation:
  - bucket: velero-linuxshots # <--------------------- Name of bucket
    provider: aws
    config:
      region: home-server
      # s3ForcePathStyle: true # <------------------- Only needed when S3-compatible storage is used instead of AWS S3
      # s3Url: https://s3-compatible-storage.url # <-- Only needed when S3-compatible storage is used instead of AWS S3
initContainers:
- name: velero-plugin-for-csi # <--------------------- Velero plugin for CSI snapshots
  image: velero/velero-plugin-for-csi:v0.6.0
  imagePullPolicy: IfNotPresent
  volumeMounts:
  - mountPath: /target
    name: plugins
- name: velero-plugin-for-aws # <--------------------- Velero plugin for AWS S3 bucket
  image: velero/velero-plugin-for-aws:v1.8.0
  imagePullPolicy: IfNotPresent
  volumeMounts:
  - mountPath: /target
    name: plugins
deployNodeAgent: true
5. Add velero helm repository
helm repo add vmware-tanzu https://vmware-tanzu.github.io/helm-charts
helm repo update
6. Install velero server on cluster
helm upgrade --install velero vmware-tanzu/velero \
--create-namespace --namespace velero \
--set-file credentials.secretContents.cloud=/path/to/cloud-credential \
-f /path/to/values.yaml \
--version 5.1.3
7. Check the deployment
helm -n velero ls
kubectl -n velero get po
kubectl get crd | grep velero
Check Velero pods and CRDs
Velero deploys a Velero deployment and a node-agent daemonset. There should be one velero pod, and one node-agent pod on each node.
8. Install velero client on jumpbox
curl -LO https://github.com/vmware-tanzu/velero/releases/download/v1.12.1/velero-v1.12.1-linux-amd64.tar.gz
tar -xvzf velero-v1.12.1-linux-amd64.tar.gz
sudo mv velero-v1.12.1-linux-amd64/velero /usr/bin/velero
velero version
Create Kubernetes backups using velero with CSI snapshot
There are two ways Velero can create backup with CSI snapshot
- Without moving snapshot data to backup storage: Velero backs-up K8s resources to backup storage but do not move data to backup storage. Volume snapshot is created but data is kept in snapshot itself. Some storage provider keeps PV and Volume snapshot in same storage. So, If storage itself is lost, PV along with Snapshot data data will be lost and cannot be restored.
- With moving snapshot data to backup storage: Velero backs-up K8s resources along with snapshot data to backup storage. A volume snapshot will be created for the PVs and the data from snapshot is moved to backup storage using Kopia.
Lets try both backup options in our demo:
Before we create backup, Lets create some test files in pod volume mounted on containers.
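The pod name and mount path below are placeholders for whatever your sample application uses; the idea is simply to write a file into the mounted volume so we can check it again after the restore.

```shell
# Write a test file into the volume mounted in the pod
# (replace my-app-pod and /data with your pod name and mount path)
kubectl exec my-app-pod -- sh -c 'echo "created before backup" > /data/testfile.txt'

# Confirm the file exists
kubectl exec my-app-pod -- cat /data/testfile.txt
```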
1. Create a backup without moving snapshot data to the backup storage
velero backup create <backup-name>
2. Once the backup is completed, describe the backup
velero backup describe <backup-name> --details
You will see the volume snapshot details in the description of the backup.
3. Check the volume snapshots in the cluster. A snapshot for each volume will be present.
kubectl get volumesnapshotcontent
4. Create a backup with moving snapshot data to the backup storage
velero backup create <backup-name> --snapshot-move-data
5. After the backup is completed, describe the backup. You will find details of the snapshot data being uploaded to the backup storage in the backup item operations section.
velero backup describe <backup-name> --details
6. Details of the data upload can be checked from the datauploads CR.
kubectl -n velero get datauploads
kubectl -n velero get datauploads <data-uploads-name> -o yaml
7. You can check the S3 bucket. In the bucket, you will find a backups folder where the backed-up Kubernetes resources are stored. There will be one more directory, kopia, which stores the encrypted snapshot data.
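If you have the AWS CLI configured against the bucket, you can list these prefixes as below. The bucket name matches this demo, and the commented endpoint flag is only needed for S3-compatible storage such as MinIO.

```shell
# List top-level prefixes in the backup bucket
# (add --endpoint-url https://your-minio.url for S3-compatible storage)
aws s3 ls s3://velero-linuxshots/

# Inspect the Kubernetes resource backups
aws s3 ls s3://velero-linuxshots/backups/
```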
Restore from backups
A backup is of no use if it cannot be restored in the time of need. Let's test whether we are able to restore the resources and data using both of the created backups.
I have a sample application deployed on the cluster. We also have some test files in the pod volumes mounted on the containers.
1. Let's delete the deployed application.
The sample application should not be accessible now.
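The resource names below are placeholders for your sample application; deleting the PVC as well makes the test stronger, since the restore then has to bring back both the volume and its data.

```shell
# Delete the sample application and its volume claim
# (replace my-app and my-app-data with your resource names)
kubectl delete deployment my-app
kubectl delete pvc my-app-data
```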
2. Now, let's restore it from the first backup we created
velero restore create <restore-name> --from-backup <source-backup-name>
3. Once the restore is completed, let's verify that the files we created are intact.
The files are restored well.
Our sample application is also back.
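A quick way to verify the restored data is to read the test file back out of the pod; as before, the pod name and mount path are placeholders for your environment.

```shell
# Check that the restored pod has the test file back
# (replace my-app-pod and /data with your pod name and mount path)
kubectl exec my-app-pod -- cat /data/testfile.txt
```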
4. Repeat steps 1 to 3 to test a restore using the second backup.
You should be able to restore using the second backup as well.
I hope you have learned something from this article. For more details on CSI support in Velero, check out the documentation here: https://velero.io/docs/v1.12/csi/
You can also support my work by buying me a cup of coffee on https://www.buymeacoffee.com/linuxshots
Thanks
Navratan Lal Gupta
Linux Shots