Learn how to recreate an existing PVC in a new namespace, reusing the same PV with no data loss

Raphael Moraes
Jan 13 · 9 min read

Anyone who works as a DevOps engineer knows that challenges come up all the time, and that we sometimes have to deal with the limitations of the technologies we work with, forcing us to work hard to overcome obstacles and deliver our tasks. Depending on the problem, this can be frustrating: the time spent solving it may keep a ticket in progress for a long time, impacting the sprint or even blocking other tasks.

I’m here to share an experience from a task I worked on a few weeks ago, involving a limitation I faced around PersistentVolumes in Kubernetes. Before continuing, I’m going to explain some concepts about PersistentVolumes and PersistentVolumeClaims, and how they work in Kubernetes, for a better understanding.

Introduction


Running stateless, containerized web applications on a container orchestration platform is relatively easy. But when you need to run a stateful application, such as Cassandra or MongoDB, setting up a reliable and replicated solution involves some manual or imperative steps.

Storage is usually an external service that does not really live inside the Kubernetes cluster. When working with EKS (the managed Kubernetes service provided by Amazon) on AWS infrastructure, PersistentVolumes are typically backed not by locally-attached storage on a worker node, but by a networked storage system such as EBS or EFS.

The PersistentVolume subsystem abstracts details of how storage is provided and how it is consumed through the two API Resources below:

  • PersistentVolume
  • PersistentVolumeClaim

PersistentVolume (PV): a piece of storage in the cluster that can be provisioned manually by an administrator or dynamically through a StorageClass resource.

PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources, and PVCs consume PV resources.

In the lifecycle context of a Volume and Claim, there are two ways in which PVs may be provisioned:

  • statically
  • dynamically

Static: We define both the PV and PVC manifests and apply them via the kubectl tool.
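A minimal sketch of static provisioning, assuming a pre-existing EBS volume (the volume ID and all names here are placeholders, not from the original article):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0  # placeholder: an existing EBS volume
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
  namespace: staging
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: data-pv  # bind explicitly to the PV defined above
```

Both manifests are applied with `kubectl apply -f <file>.yaml`.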

Dynamic: The cluster tries to provision a volume dynamically, specifically for the PVC. For this to work, we need another important object, the StorageClass, which is the heart of dynamic provisioning, so this object is essential.

It is worth saying a little more about the StorageClass: each StorageClass has a provisioner that determines which volume plugin is used for provisioning PVs. For more details, see the StorageClass page in the official Kubernetes documentation.
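As an illustration, a StorageClass for dynamically provisioned EBS volumes could look like this (a sketch using the in-tree AWS EBS provisioner; the name is a placeholder):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-retain
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  fsType: ext4
reclaimPolicy: Retain
```

Any PVC that references this StorageClass in its `storageClassName` field will get a PV provisioned for it on demand.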

Each PV gets its own set of access modes describing that specific PV’s capabilities, as you can see below:

  • ReadWriteOnce
  • ReadOnlyMany
  • ReadWriteMany

NOTE: The Claims use the same conventions as volumes when requesting storage with specific access modes.

To finish the introduction, there are two other important things you should know about PersistentVolumes:

  • Reclaim Policy
  • Phase

Reclaim Policy: Used to tell the cluster what to do with the volume after releasing its claim. Current reclaim policies are:

  • Retain — manual reclamation
  • Recycle — basic scrub (rm -rf /thevolume/*); this policy is deprecated
  • Delete — associated storage asset such as AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume is deleted

NOTE: It is strongly recommended to use the Retain policy for PVCs that store critical data.

Phase: A volume will be in one of the following phases:

  • Available — a free resource not yet bound to a claim
  • Bound — the volume is bound to a claim
  • Released — the claim has been deleted, but the resource has not yet been reclaimed by the cluster
  • Failed — the volume has failed its automatic reclamation

NOTE: The CLI will show the name of the PVC bound to the PV.

Use Case


I was working on a ticket to implement a database on an EKS cluster, where I spent considerable time restoring the data (3 TB) for each of the three replicas I configured for the StatefulSet, totaling 9 TB of data copied. The data transfer took more than 8 hours to complete, and after it finished, I learned that I should have deployed that StatefulSet in another namespace. Here the challenge began. Why? Keep reading for a better understanding.

Anyone who works with Kubernetes knows that some objects are namespace-scoped and others are cluster-scoped. Since the PVC is a namespace-scoped object, it’s not possible to move a PVC to another namespace, nor to deploy the StatefulSet in namespace A and the PVC in namespace B. In other words, Kubernetes does not allow a Pod to use a PVC in a namespace different from its own: the Pod and the PVC it uses need to be in the same namespace.

At first, this limitation gave me the impression that all the work carried out would be lost. Thanks to my experience with PersistentVolumes, I found a way to save those hours of work spent copying the data.

Considering what I have just explained, the next steps describe how to recreate an existing PVC in a new namespace while still using the same PV.

Main Goal


Delete the PVCs below, which are in the namespace “staging”, and recreate them in the namespace “integration”, reusing the same PVs:
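The original listing was an image; with hypothetical PVC names (placeholders throughout this walkthrough), it would look something like:

```shell
$ kubectl get pvc -n staging
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-mydb-0   Bound    pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb   3000Gi     RWO            gp2            2d
data-mydb-1   Bound    pvc-cccccccc-4444-5555-6666-dddddddddddd   3000Gi     RWO            gp2            2d
data-mydb-2   Bound    pvc-eeeeeeee-7777-8888-9999-ffffffffffff   3000Gi     RWO            gp2            2d
```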

PVs to be reused:
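Again with placeholder names, the PVs bound to those claims can be listed with:

```shell
$ kubectl get pv
NAME                                       CAPACITY   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   AGE
pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb   3000Gi     Delete           Bound    staging/data-mydb-0   gp2            2d
pvc-cccccccc-4444-5555-6666-dddddddddddd   3000Gi     Delete           Bound    staging/data-mydb-1   gp2            2d
pvc-eeeeeeee-7777-8888-9999-ffffffffffff   3000Gi     Delete           Bound    staging/data-mydb-2   gp2            2d
```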

Steps to be followed


Step 1. Patch the PVs to set the “persistentVolumeReclaimPolicy” to “Retain”:
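A sketch of the patch command, repeated for each of the three PVs (the PV name is a placeholder):

```shell
kubectl patch pv pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```

This step is what protects the data: with Retain set, deleting the PVC in step 3 will not delete the underlying EBS volume.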

NOTE: Because the PV is a cluster-scoped object, you don’t need to pass the namespace as a parameter.

Step 2. Export the current PVC objects, because they will be needed to recreate the PVCs at a later stage:
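One way to export them, assuming the placeholder names above:

```shell
kubectl get pvc data-mydb-0 -n staging -o yaml > data-mydb-0.yaml
kubectl get pvc data-mydb-1 -n staging -o yaml > data-mydb-1.yaml
kubectl get pvc data-mydb-2 -n staging -o yaml > data-mydb-2.yaml
```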

Step 3. Delete the current PVCs in the namespace “staging”:
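With the manifests safely exported, the claims can be deleted (placeholder names again):

```shell
kubectl delete pvc data-mydb-0 data-mydb-1 data-mydb-2 -n staging
```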

See that the PV status changes from “Bound” to “Released”:
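With placeholder names, the output would look roughly like:

```shell
$ kubectl get pv
NAME                                       CAPACITY   RECLAIM POLICY   STATUS     CLAIM                 STORAGECLASS   AGE
pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb   3000Gi     Retain           Released   staging/data-mydb-0   gp2            2d
pvc-cccccccc-4444-5555-6666-dddddddddddd   3000Gi     Retain           Released   staging/data-mydb-1   gp2            2d
pvc-eeeeeeee-7777-8888-9999-ffffffffffff   3000Gi     Retain           Released   staging/data-mydb-2   gp2            2d
```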

But note that in the output above, we still have references to the old namespace in the CLAIM column.

So, in the next step, I’m going to edit the PVs to remove those references.

Step 4. Edit each one of the PVs and delete the whole spec.claimRef block, which still points at the deleted PVC in the namespace “staging”. Repeat this for each of the three PVs.
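A sketch of what gets deleted inside each PV (all values are placeholders):

```yaml
# Inside `kubectl edit pv <pv-name>`, remove this whole block:
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: data-mydb-0
    namespace: staging
    resourceVersion: "12345678"
    uid: aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb
```

Once the claimRef is removed, the PV status changes from Released to Available, making it bindable again.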

Now, see that there are no references anymore in the CLAIM column:

Step 5. Edit each one of the files we exported in step 2, removing the cluster-specific fields from the old PVC objects and changing the namespace from “staging” to “integration”. Repeat this step for each of the exported files.
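A sketch of an exported manifest, with comments marking the fields to remove or change (all names and values are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"       # <-- delete
    pv.kubernetes.io/bound-by-controller: "yes"  # <-- delete
  creationTimestamp: "2021-01-13T10:00:00Z"      # <-- delete
  name: data-mydb-0
  namespace: staging                             # <-- change to "integration"
  resourceVersion: "12345678"                    # <-- delete
  uid: aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb      # <-- delete
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3000Gi
  storageClassName: gp2
  volumeName: pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb  # keep: points at the PV to reuse
status: {}                                       # <-- delete the whole status block
```

The crucial field to keep is `volumeName`: it is what tells the new PVC to bind to the existing PV instead of provisioning a fresh one.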

Step 6. Now, I’m going to create those PVCs again, but inside the Namespace “integration”:
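With the edited files from step 5 (placeholder names):

```shell
kubectl apply -f data-mydb-0.yaml -f data-mydb-1.yaml -f data-mydb-2.yaml
```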

Next, check the state of the PVCs, which will probably be in a Pending state:

After a few seconds, all the PVCs should be in a Bound state:
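With the placeholder names, the final check would look something like:

```shell
$ kubectl get pvc -n integration
NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-mydb-0   Bound    pvc-aaaaaaaa-1111-2222-3333-bbbbbbbbbbbb   3000Gi     RWO            gp2            1m
data-mydb-1   Bound    pvc-cccccccc-4444-5555-6666-dddddddddddd   3000Gi     RWO            gp2            1m
data-mydb-2   Bound    pvc-eeeeeeee-7777-8888-9999-ffffffffffff   3000Gi     RWO            gp2            1m
```

At this point, the StatefulSet can be deployed in the “integration” namespace and will find its data intact on the reused volumes.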

Conclusion

Like any other technology, Kubernetes has its limitations. In this article, I presented an example of a restriction we can face in Kubernetes, specifically around persistent volumes, and a way to work around it.

Thank you for reading this article!
