Migrating PVC Data Across On-Prem Kubernetes Clusters with rsync
With the increasing adoption of Kubernetes, there are times when one needs to migrate Persistent Volume Claim (PVC) data from one on-prem cluster to another. Such migrations can be daunting, especially when dealing with large datasets. This article presents a solution using a Kubernetes CronJob and the venerable rsync utility to facilitate this migration.
Use-Case Scenario
Consider two on-prem Kubernetes clusters: ClusterA and ClusterB. You need to move data from a PVC in ClusterA to a PVC in ClusterB. Due to specific requirements, the PVC in ClusterB has been mounted directly on a host.
The Solution
We will create a Kubernetes CronJob in ClusterA that uses the rsync utility. This job will sync the data from ClusterA's PVC to the host-mounted PVC in ClusterB.
Here’s a step-by-step guide:
1. Prepare the Destination:
Ensure that the destination PVC in ClusterB is properly mounted on the host and that the host is accessible via SSH; this is what allows rsync to transfer data directly. The details vary by environment, but if you're not relying on a managed storage solution, walk through these steps:
Identify the Node and Mount Path of the PVC:
This step will depend on the specific environment, but as a starting point, if you're using kubectl, you can describe a pod that's using the PVC to find out on which node the pod is running and where the PVC is mounted.
```shell
kubectl describe pod <POD_NAME> -n <NAMESPACE>
```

In the output, look for:

- `Node:` to find out which node the pod is running on.
- Under `Mounts:`, you can see the PVC name and the mount path inside the pod.
SSH into the Node:
```shell
ssh user@node_IP_or_hostname
```

Note: Replace `user` with the appropriate user name for your node and `node_IP_or_hostname` with the IP or hostname of the node from the previous step.

Verify the PVC Mount:

```shell
df -h | grep <MOUNT_PATH_FROM_POD_DESCRIPTION>
```

This command shows mounted filesystems and their details. By using grep, you can filter the results down to the specific PVC mount path.
Try listing the contents of the mount to ensure you have read access:

```shell
ls <MOUNT_PATH_FROM_POD_DESCRIPTION>
```

If you want to test write access, you can try creating a temporary file:

```shell
touch <MOUNT_PATH_FROM_POD_DESCRIPTION>/testfile.txt
```

And then remove it:

```shell
rm <MOUNT_PATH_FROM_POD_DESCRIPTION>/testfile.txt
```

Remember, these are generic steps. Depending on your environment, specifics like SSH access, permissions, or Kubernetes settings might differ.
2. Set Up SSH Key Authentication:
For seamless and secure data transfer, set up SSH key authentication between the source (ClusterA) and the destination host (one of the nodes of ClusterB). Generate a public-private key pair and append the public key to the destination host's authorized_keys file. Keep the private key safe; we'll use it in our CronJob.
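A minimal sketch of this setup, assuming an ed25519 key named `migration_key` and the placeholder destination `admin@192.168.1.100`; the secret name `private-key-secret` and namespace `migration-ns` match what the CronJob manifest in step 5 expects:

```shell
# Generate a dedicated key pair for the migration
# (empty passphrase, since the CronJob runs unattended)
ssh-keygen -t ed25519 -f ./migration_key -N "" -C "pvc-migration"

# Append the public key to the destination node's authorized_keys
ssh-copy-id -i ./migration_key.pub admin@192.168.1.100

# Store the private key in ClusterA as a Secret for the CronJob to mount
kubectl create secret generic private-key-secret \
  --from-file=ssh-privatekey=./migration_key \
  --namespace migration-ns
```

A dedicated key (rather than reusing a personal one) keeps the blast radius small: if the secret leaks, you revoke one authorized_keys entry.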
3. Verify User Permissions On Both Ends:
Once we've verified both the existence of the mounted PVC and the SSH authentication, we can validate user permissions on the mounted path itself. This is crucial because the CronJob's Pod will run as a specific user, one that is configured to access the source PVC. In this demo, I'm opting for a dummy user with UID 9000. Modify and execute these commands to verify that the correct user has sufficient permissions:
Explicitly Specify the User and Group:

```shell
chown -R 9000:9000 /destination-data  # give ownership to the target user and group
chmod u+w /destination-data           # grant the owner write permission
ls -ltr /destination-data             # verify the owner, group, and permissions
```

4. Crafting the Dockerfile:
To facilitate the migration process, we need a Docker image equipped with the rsync utility. In this phase, we will walk through a simple Dockerfile that installs rsync and lays the groundwork for our CronJob. Since we're implementing this solution in a real-life environment, it's crucial to keep the image lightweight and easy to validate, so we're opting for an Alpine base image.
```dockerfile
# Use a specific version of Alpine Linux as our base image
FROM alpine:3.18.3

# Install openssh-client and rsync
# (--no-cache fetches a fresh package index and avoids a stale cache layer)
RUN apk add --no-cache openssh-client rsync

# Create a new group and user with the specified GID and UID
RUN addgroup -S -g 9000 appgroup && \
    adduser -S -D -H -u 9000 -G appgroup appuser

# Switch to the newly created user
USER appuser
```

The key highlights of this Dockerfile:
- Base Image: We're using a specific version, `alpine:3.18.3`, to ensure consistency across builds.
- Installing Required Packages: We're installing `openssh-client` and `rsync`.
- User & Group Creation: Instead of running our container as root (a potential security concern), we create a group (`appgroup`) with GID 9000 and a user (`appuser`) with UID 9000. The container will run as this user, ensuring better security.
- Switching User: The `USER appuser` directive ensures that when the container is run, the default user will be `appuser`.
5. The Kubernetes CronJob:
Here’s a pre-configured manifest:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-migrator
  namespace: migration-ns
spec:
  schedule: "0 0 * * *" # This is set to run daily at midnight.
  startingDeadlineSeconds: 20
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  suspend: false
  concurrencyPolicy: Allow
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: data-migrator-job
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - >-
                  rsync -avzp
                  -e "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /key/ssh-privatekey"
                  --log-file /data/logs/rsync_$(date '+%Y%m%d%H%M%S')
                  /source-data admin@192.168.1.100:/destination-data;
                  rc=$?;
                  if [ "$rc" -eq 0 ] || [ "$rc" -eq 24 ]; then exit 0; else exit 1; fi
              image: rsync-image:v1
              imagePullPolicy: Always
              name: data-migrator-container
              securityContext:
                runAsGroup: 9000
                runAsUser: 9000
              volumeMounts:
                - mountPath: /source-data
                  name: source-data
                - mountPath: /data/logs
                  name: source-data
                  subPath: logs
                - mountPath: /key
                  name: ssh-key
          restartPolicy: OnFailure
          terminationGracePeriodSeconds: 5
          volumes:
            - name: source-data
              persistentVolumeClaim:
                claimName: source-pvc
            - name: ssh-key
              secret:
                secretName: private-key-secret
                defaultMode: 0400
```

A few deliberate choices here: `batch/v1` is the current CronJob API (the `batch/v1beta1` version was removed in Kubernetes 1.25); the source PVC is mounted both at `/source-data` (the rsync source) and at `/data/logs` via `subPath` (where the log files land); the rsync exit code is captured into `rc` before testing it, using POSIX `[` syntax so the command works under `/bin/sh`; and `defaultMode: 0400` restricts the mounted key's permissions, since ssh refuses private keys that are readable by others.

Understanding the Configuration:
- The CronJob named `data-migrator` is designed to run in the `migration-ns` namespace. It kicks off daily at midnight.
- The container within the job uses an image `rsync-image:v1`, which has the `rsync` utility installed.
- It pulls the private SSH key from a secret named `private-key-secret`. This key is used for SSH authentication.
- `rsync` is set to sync data from the directory `/source-data` (which maps to a PVC named `source-pvc` in ClusterA) to a host (with dummy IP `192.168.1.100`) where the destination PVC from ClusterB is mounted.
Points to Note
- This approach assumes that there’s network connectivity for `rsync` to access the destination. In on-prem environments, ensure that firewalls or other network configurations don't block the required traffic.
- SSH key-based authentication provides security. However, ignoring host key checks (`-o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no`) is used here for simplicity and is not recommended for production scenarios due to potential man-in-the-middle attacks.
- Ensure proper permissions for both source and destination directories to prevent any permission-related issues during data sync.
- Monitor the `rsync` logs (which the CronJob saves in `/data/logs`) for any issues or discrepancies in data transfer.
Wrapping Up
Migrating data between PVCs across on-prem Kubernetes clusters doesn’t have to be complex. With tools like rsync and the orchestration capabilities of Kubernetes, even sizable data migrations can be automated efficiently. This article showcased one such solution, and with minor tweaks, it can be adapted to various similar scenarios. Safe migrations!
