Kubernetes Storage: PV, PVC and Storage Class

Samuel Kadima
9 min read · Nov 3, 2023

In the ever-evolving landscape of container orchestration, Kubernetes stands as the undisputed leader, providing developers and operations teams with a powerful platform for managing, scaling, and deploying containerized applications. However, as the complexity of applications running on Kubernetes continues to grow, so does the need for efficient and reliable storage solutions.

Persistent Volumes (PV), Persistent Volume Claims (PVC), and Storage Classes are the cornerstones of Kubernetes storage management. In this article, we look at these fundamental components that underpin Kubernetes storage, empowering you to make informed decisions and maximize the potential of your containerized applications.

Kubernetes storage is not merely an afterthought; it’s a critical aspect of your infrastructure that can impact performance, reliability, and scalability. This article serves as your guide to demystify these concepts, showcasing their importance in the world of container orchestration and how they work together to ensure your applications have access to the right storage resources at the right time.

Why should you use a PV, PVC or a Storage Class?

In Kubernetes, Persistent Volumes (PV), Persistent Volume Claims (PVC), and Storage Classes are essential components that work together to address various storage needs and provide a robust storage management solution.

Here’s why we use each of these components:

Persistent Volumes (PV)

Persistent Volumes are a way to abstract and represent physical or networked storage resources in a cluster. They serve as the “backend” storage configuration in a Kubernetes cluster. PVs are crucial for the following reasons:

1. Resource Abstraction

PVs abstract the underlying storage, making it easier to manage storage resources independently of the applications that use them.

2. Resource Management

PVs enable administrators to allocate and manage storage resources, allowing for better utilization and optimization of storage hardware.

3. Data Persistence

PVs ensure data persistence, even if pods or containers are recreated or rescheduled. This is vital for stateful applications and databases.

4. Access Control

PVs define access modes (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany) and security settings for how pods can access the storage, ensuring data integrity and security.

Persistent Volume Claims (PVC)

Persistent Volume Claims act as requests for storage by pods. They are used by developers to specify their storage requirements. Here’s why PVCs are crucial:

1. Resource Request

PVCs allow developers to request storage resources without needing to know the underlying infrastructure details, making it easier to scale applications.

2. Dynamic Provisioning

When configured with a Storage Class, PVCs can dynamically provision storage resources based on predefined policies, simplifying the provisioning process.

3. Isolation

PVCs isolate storage-related concerns from application code, improving maintainability and portability of applications.
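
As a rough illustration of the dynamic provisioning case, the claim below names a StorageClass and lets the cluster create a matching PV on demand. The claim name dynamic-claim and the class name standard are assumptions made for this sketch; substitute a class that kubectl get storageclass lists in your cluster.

```yaml
# Sketch only: a PVC that relies on dynamic provisioning.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard   # assumed class; it must exist and have a provisioner
```

Leaving storageClassName out entirely makes the claim fall back to the cluster's default StorageClass, if one is configured.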

Storage Classes

Storage Classes are an abstraction layer over the underlying storage infrastructure. They define the properties and behavior of PVs dynamically provisioned from them. Storage Classes are valuable for the following reasons:

1. Dynamic Provisioning

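They enable automatic provisioning of PVs with specified characteristics, such as storage type (e.g., SSD, HDD), reclaim policy, volume binding mode, and other provisioner-specific parameters, simplifying storage management. Access modes and size are requested on the PVC rather than on the Storage Class.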

2. Resource Optimization

Storage Classes facilitate the utilization of different storage resources based on application requirements, ensuring that workloads receive the right storage configuration.

3. Scaling and Automation

They allow administrators to set up policies and rules for storage allocation, promoting scalability and automation in storage management.
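
A StorageClass is itself a small manifest. The sketch below declares a class for manually created local volumes, following the pattern used in the Kubernetes documentation. The name manual-local is an assumption for illustration, and the walkthrough later in this article does not require this object to exist.

```yaml
# Sketch only: a StorageClass for pre-created (statically provisioned) volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: manual-local                          # hypothetical name
provisioner: kubernetes.io/no-provisioner     # no dynamic provisioning for this class
volumeBindingMode: WaitForFirstConsumer       # delay binding until a pod uses the claim
```

With WaitForFirstConsumer, a claim that references this class stays Pending until a pod that uses it is scheduled. A class backed by a real provisioner would name that provisioner instead and usually add provisioner-specific parameters.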

Kubernetes Storage Lifecycle

The Kubernetes storage lifecycle involves a series of stages that storage resources, such as Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), go through from their creation to their eventual removal.

Here’s a high-level overview of the Kubernetes storage lifecycle:

i) Creation

  1. PV Creation: A Persistent Volume (PV) is created by a cluster administrator or through dynamic provisioning based on a StorageClass. The PV represents a physical or networked storage resource available to the cluster.
  2. PVC Creation: A developer creates a Persistent Volume Claim (PVC) to request storage resources for their application. The PVC specifies access modes, storage size, and can reference a specific StorageClass.

ii) Binding

Binding PVC to PV: The Kubernetes control plane attempts to bind the PVC to an available PV that matches the PVC’s requirements. To match, the PV must support the requested access mode, offer at least the requested capacity, and belong to the same StorageClass. Binding is exclusive: once bound, a PV serves only that one PVC.
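
The matching rules can be read directly off a claim. In the annotated sketch below, every name and value (example-claim, manual, example-pv) is hypothetical and exists only to show which field is compared against the PV.

```yaml
# Sketch only: the PVC fields the control plane compares against candidate PVs.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce              # the PV must list this access mode
  resources:
    requests:
      storage: 200Mi             # the PV's capacity must be at least this large
  storageClassName: manual       # must equal the PV's storageClassName (assumed value)
  volumeName: example-pv         # optional: pin the claim to one named PV (hypothetical)
```

The volumeName field is optional. When it is omitted, the control plane picks any available PV that satisfies the other three fields.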

iii) Use

Pod Deployment: Pods are deployed in the cluster, referencing the PVC as a volume source. The pod’s containers can read and write data to the PVC mounted within the pod.

iv) Data Access

Data Usage: The PVC is used by pods for read and write operations, and data is persisted on the underlying storage.

v) Deletion

Pod Termination: When a pod is deleted or terminated, the associated PVC and the data it contains are not deleted. They remain in place until the PVC itself is removed.

vi) PVC Removal

PVC Deletion: When a developer or administrator deletes a PVC, the bound PV is released from the claim. What happens to the PV and its data next depends on the PV’s reclaim policy.

vii) PV Reclaim Policy

Retain, Delete, or Recycle: Every PV has a reclaim policy. With Retain, the PV and its data are kept for manual cleanup. With Delete, the PV and the underlying storage asset are removed. With Recycle, the volume is scrubbed and made available for a new claim; this policy is deprecated in favor of dynamic provisioning.
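
For statically created PVs, the policy is set in the PV’s persistentVolumeReclaimPolicy field. For dynamically provisioned PVs, it is inherited from the StorageClass, which defaults to Delete. A minimal sketch, assuming a hypothetical class name and a placeholder provisioner:

```yaml
# Sketch only: PVs provisioned from this class inherit the Retain policy.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: keep-on-release                        # hypothetical name
provisioner: example.com/hypothetical-driver   # placeholder, not a real driver
reclaimPolicy: Retain                          # keep the volume and data after the PVC is deleted
```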

viii) Manual Cleanup (if required)

Data Removal: With the Retain policy, the PV moves to the Released state and keeps its data. Administrators must then manually remove the data from the storage resource or delete the PV when it is no longer needed.

ix) Reprovisioning (if required)

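PV Reuse: A Released PV is not bound to new claims automatically. Once an administrator has cleaned it up and made it Available again (for example, by deleting and recreating the PV), it can be bound to a new PVC to satisfy the storage requirements of another application.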

Steps

This section assumes that you already know how to create a Kubernetes cluster and have created one.

Let us start by creating a Persistent Volume.

Navigate to your working directory and create a file named pv.yaml. You can use the following command to create it.

touch pv.yaml

Open the file in your favorite editor and paste the following YAML configuration

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 500Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  hostPath:
    path: /mnt/pv-data

Let’s break down the configuration:

apiVersion: v1 and kind: PersistentVolume indicate that this YAML file describes a Kubernetes resource of kind “PersistentVolume” using the Kubernetes API version 1.

Under the metadata section, name: local-pv assigns a name to the PV resource, which can be used to reference it within the cluster.

In the spec section

capacity: storage: 500Mi specifies the storage capacity of the PV, which is set to 500 mebibytes (MiB).

volumeMode: Filesystem defines the volume mode as “Filesystem,” indicating that this PV is intended for file-based storage.

accessModes specifies the access modes that pods can use when mounting this PV. In this case, it is set to ReadWriteOnce, meaning that the PV can be mounted as read-write by a single node at a time.

persistentVolumeReclaimPolicy: Retain defines the policy for what happens to the PV after a PVC that uses it is deleted. “Retain” means that the PV’s data is retained even if the associated PVC is deleted, and manual cleanup is required.

storageClassName: local-storage assigns this PV to a class named “local-storage.” StorageClasses are normally used to dynamically provision PVs based on defined policies, but this PV is created manually. Here the class name only has to match the one requested by the PVC so that the two can bind.

hostPath is used to specify the actual location on the host machine where the storage for this PV is located. In this case, it is set to /mnt/pv-data. The data lives at this path on whichever node runs the pod, which is why hostPath volumes are suited to single-node test clusters only.
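
On a cluster with more than one node, a local volume pinned to a specific node is the usual alternative to hostPath. The following is a sketch under stated assumptions, not part of this walkthrough: the PV name local-pv-pinned and the node name example-node are hypothetical, and the directory must already exist on that node.

```yaml
# Sketch only: a local PV tied to one node through nodeAffinity.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-pinned          # hypothetical name
spec:
  capacity:
    storage: 500Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/pv-data           # must already exist on the node
  nodeAffinity:                  # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - example-node   # hypothetical node name
```

The nodeAffinity block lets the scheduler place any pod that uses this volume on the node that actually holds the data.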

Let’s now create the above PersistentVolume using the following command

kubectl apply -f pv.yaml

You can view the created PersistentVolume using the following command

kubectl get pv

You should see the following output on your terminal

NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
local-pv   500Mi      RWO            Retain           Available           local-storage            11s

The status of our PV indicates that it is available as shown in the terminal. It has not yet been claimed.

We can now create a PersistentVolumeClaim.

Create a file named pvc.yaml . You can use the following command to achieve this

touch pvc.yaml

Copy and paste the following YAML configuration into the pvc.yaml file

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
  storageClassName: local-storage

Let’s break down the configuration:

apiVersion: v1 and kind: PersistentVolumeClaim indicate that this YAML file describes a Kubernetes resource of kind “PersistentVolumeClaim” using the Kubernetes API version 1.

Under the metadata section, name: my-pvc assigns a name to the PVC resource, allowing you to reference it within the cluster.

In the spec section:

accessModes: - ReadWriteOnce specifies the access modes that the PVC requires. In this case, it’s set to ReadWriteOnce, indicating that the PVC can be mounted as read-write by a single node at a time.

resources specifies the resources requested by the PVC.
requests: storage: 200Mi requests storage capacity of 200 mebibytes (MiB). This is the minimum storage capacity that the PVC requires from a Persistent Volume (PV). Because a claim binds to a whole PV, a claim bound to a larger PV reports that PV’s full capacity.

storageClassName: local-storage asks for a PV of the class named “local-storage.” A StorageClass normally defines how the requested storage should be provisioned, but no dynamic provisioning is involved here: the name simply matches the manually created PV.

We can create the PVC using the following command

kubectl apply -f pvc.yaml

You can view the created PVC using the following command

kubectl get pvc

You should see the following output on the terminal

NAME     STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
my-pvc   Bound    local-pv   500Mi      RWO            local-storage   116s

Now let us verify that our PVC claimed the PV successfully. Run the following command

kubectl get pv

You should see the following output on the terminal

NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS    REASON   AGE
local-pv   500Mi      RWO            Retain           Bound    default/my-pvc   local-storage

Note that the status has changed from Available to Bound indicating that our PV was successfully claimed by the PVC.

Next we are going to create a deployment that will reference the PVC.

Create a file named deployment.yaml and paste the following YAML configuration into the file

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nginx-persistent-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: nginx-persistent-storage
          persistentVolumeClaim:
            claimName: my-pvc

Let’s describe the file:

apiVersion: apps/v1 and kind: Deployment indicate that this YAML file describes a Kubernetes resource of kind “Deployment” using the Kubernetes API version “apps/v1.”

Under the metadata section, name: nginx-deployment assigns a name to the Deployment resource.

In the spec section

replicas: 1 specifies that the Deployment should manage one replica (pod) of the application.

selector is used to specify a label selector for the pods managed by this Deployment. In this case, it selects pods with the label app: nginx.

template defines the pod template used to create the pods. It includes the following:
metadata section assigns labels to the pods, with app: nginx label matching the selector defined earlier.
spec section defines the pod’s specifications, including the container(s) to run.
containers specifies the containers within the pod. In this case, there’s one container named “nginx,” which uses the “nginx:latest” Docker image and exposes port 80.
volumeMounts allows you to mount volumes to the container. It specifies that the “nginx-persistent-storage” volume should be mounted at the path “/usr/share/nginx/html” within the container.

volumes defines the volumes available to the pod. In this case, there’s a single volume named “nginx-persistent-storage,” and it is sourced from a PersistentVolumeClaim (PVC).
persistentVolumeClaim specifies that the volume is created using the PersistentVolumeClaim named “my-pvc.” This means the pod will have access to the storage resources requested by the PVC.

To create the deployment, run the following command

kubectl apply -f deployment.yaml

The deployment has been created and the pod is running. You can verify this by running the following command

kubectl get pods

You should see the following output

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-84cff9d74d-m5fjz   1/1     Running   0          6m52s

Now let’s get into the container running in the pod and add a file to confirm persistence. Run the following command, replacing the pod name with your own pod name

kubectl exec -it nginx-deployment-84cff9d74d-m5fjz -- /bin/bash

Once in the container, navigate to the mount path. This is the mount path that was indicated in the deployment spec. You can use the following command to navigate

cd /usr/share/nginx/html

Create a file named sample.txt in this location. Use the following command

touch sample.txt

We know that pods are ephemeral in nature, so if we delete this pod and create a new one, the new pod by itself should not contain the sample.txt file. The file can only survive if the pod was successfully linked to the PV through the PVC.

Run the following command to view more details about your current deployment

kubectl get deployments 

You should see the following output on your terminal

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           26m

Delete the deployment using the following command

kubectl delete deployment nginx-deployment

After the deployment has been deleted, create it again. This results in the creation of a new pod. Use the following command

kubectl apply -f deployment.yaml

Once the deployment has been created you can verify that the pods are running using the following command

kubectl get pods

Access the container and navigate to the mount path to verify that the data was persisted and can be accessed in this new container.

You should see the sample.txt file that you created earlier.
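
As an alternative check that does not depend on the nginx pod, a throwaway pod can mount the same claim and list its contents. This is a minimal sketch: the pod name pvc-inspector and the mount path /data are assumptions for illustration, and on a multi-node cluster with a hostPath volume the pod would have to land on the node that holds the data.

```yaml
# Sketch only: a one-shot pod that lists the files stored behind my-pvc.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector            # hypothetical name
spec:
  restartPolicy: Never           # run once and stop
  containers:
    - name: inspector
      image: busybox:1.36
      command: ["ls", "-l", "/data"]
      volumeMounts:
        - name: data
          mountPath: /data
          readOnly: true         # inspect only, never modify
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc        # the claim created earlier in this article
```

After applying the manifest with kubectl apply -f, the directory listing appears in the output of kubectl logs pvc-inspector. Delete the pod afterwards with kubectl delete pod pvc-inspector.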

Great! I hope this article was helpful.

If you have any questions or comments, you can leave them in the comments section.

Follow, Share and Subscribe
