Kubernetes Storage: PV, PVC and Storage Class
In the ever-evolving landscape of container orchestration, Kubernetes stands as the undisputed leader, providing developers and operations teams with a powerful platform for managing, scaling, and deploying containerized applications. However, as the complexity of applications running on Kubernetes continues to grow, so does the need for efficient and reliable storage solutions.
Persistent Volumes (PV), Persistent Volume Claims (PVC), and Storage Classes are the cornerstones of Kubernetes storage management. In this article, we look at these fundamental components that underpin Kubernetes storage, empowering you to make informed decisions and maximize the potential of your containerized applications.
Kubernetes storage is not merely an afterthought; it’s a critical aspect of your infrastructure that can impact performance, reliability, and scalability. This article serves as your guide to demystify these concepts, showcasing their importance in the world of container orchestration and how they work together to ensure your applications have access to the right storage resources at the right time.
Why should you use a PV, PVC or a Storage Class?
In Kubernetes, Persistent Volumes (PV), Persistent Volume Claims (PVC), and Storage Classes are essential components that work together to address various storage needs and provide a robust storage management solution.
Here’s why we use each of these components:
Persistent Volumes (PV)
Persistent Volumes are a way to abstract and represent physical or networked storage resources in a cluster. They serve as the “backend” storage configuration in a Kubernetes cluster. PVs are crucial for the following reasons:
1. Resource Abstraction
PVs abstract the underlying storage, making it easier to manage storage resources independently of the applications that use them.
2. Resource Management
PVs enable administrators to allocate and manage storage resources, allowing for better utilization and optimization of storage hardware.
3. Data Persistence
PVs ensure data persistence, even if pods or containers are recreated or rescheduled. This is vital for stateful applications and databases.
4. Access Control
PVs define access modes (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany) and security settings for how pods can access the storage, ensuring data integrity and security.
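As a rough illustration of how access modes appear on a PV, the sketch below defines a volume backed by an NFS export that several nodes can mount read-write at once. The server address nfs.example.internal, the export path /exports/shared, and the name shared-nfs-pv are placeholders invented for this example, not values used elsewhere in this article.

```yaml
# Hypothetical NFS-backed PV; server, path, and name are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany          # many nodes may mount the volume read-write
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal
    path: /exports/shared
```

Whether a given access mode actually works depends on the storage backend; NFS is one that supports ReadWriteMany, while many block-storage backends only support ReadWriteOnce.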
Persistent Volume Claims (PVC)
Persistent Volume Claims act as requests for storage by pods. They are used by developers to specify their storage requirements. Here’s why PVCs are crucial:
1. Resource Request
PVCs allow developers to request storage resources without needing to know the underlying infrastructure details, making it easier to scale applications.
2. Dynamic Provisioning
When configured with a Storage Class, PVCs can dynamically provision storage resources based on predefined policies, simplifying the provisioning process.
3. Isolation
PVCs isolate storage-related concerns from application code, improving maintainability and portability of applications.
Storage Classes
Storage Classes are an abstraction layer over the underlying storage infrastructure. They define the properties and behavior of PVs dynamically provisioned from them. Storage Classes are valuable for the following reasons:
1. Dynamic Provisioning
They enable automatic provisioning of PVs with specified characteristics, such as storage type (e.g., SSD, HDD), reclaim policy, volume binding mode, and other provisioner-specific parameters, simplifying storage management.
2. Resource Optimization
Storage Classes facilitate the utilization of different storage resources based on application requirements, ensuring that workloads receive the right storage configuration.
3. Scaling and Automation
They allow administrators to set up policies and rules for storage allocation, promoting scalability and automation in storage management.
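To make dynamic provisioning more concrete, here is a minimal sketch of a StorageClass and a PVC that uses it. It assumes the cluster runs the AWS EBS CSI driver (provisioner ebs.csi.aws.com); on any other platform the provisioner and its parameters would differ. The names fast-ssd and fast-claim are made up for this example.

```yaml
# Assumes the AWS EBS CSI driver is installed; swap the provisioner
# and parameters for whatever driver your cluster actually runs.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd               # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                    # driver-specific parameter
reclaimPolicy: Delete          # remove the volume when the claim is deleted
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim             # hypothetical name
spec:
  storageClassName: fast-ssd   # triggers provisioning through the class above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

With WaitForFirstConsumer, the PV is only provisioned once a pod that uses the claim is scheduled, which lets the provisioner pick a volume location that suits that pod.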
Kubernetes Storage Lifecycle
The Kubernetes storage lifecycle involves a series of stages that storage resources, such as Persistent Volumes (PVs) and Persistent Volume Claims (PVCs), go through from their creation to their eventual removal.
Here’s a high-level overview of the Kubernetes storage lifecycle:
i) Creation
- PV Creation: A Persistent Volume (PV) is created by a cluster administrator or through dynamic provisioning based on a StorageClass. The PV represents a physical or networked storage resource available to the cluster.
- PVC Creation: A developer creates a Persistent Volume Claim (PVC) to request storage resources for their application. The PVC specifies access modes, storage size, and can reference a specific StorageClass.
ii) Binding
Binding PVC to PV: The Kubernetes control plane attempts to bind the PVC to an available PV that matches the PVC’s requirements. The PV must offer the requested access modes, belong to the same StorageClass, and have at least the requested capacity. Binding is one-to-one: once bound, the whole PV is reserved for that PVC, even if it is larger than the request.
iii) Use
Pod Deployment: Pods are deployed in the cluster, referencing the PVC as a volume source. The pod’s containers can read and write data to the PVC mounted within the pod.
iv) Data Usage
Data Usage: The PVC is used by pods for read and write operations, and data is persisted on the underlying storage.
v) Deletion
Pod Termination: When a pod is deleted or terminated, the associated PVC and the data it contains are not deleted. The PVC stays bound to its PV and can be mounted by a new pod.
vi) PVC Removal
PVC Deletion: When a developer or administrator deletes a PVC, the associated PV is released from the claim. What happens to the PV and the data on the underlying storage next depends on the PV’s reclaim policy.
vii) PV Reclaim Policy
Retain, Delete, or Recycle: Each PV has a “reclaim policy.” With Retain, the PV and its data are kept for manual cleanup. With Delete, the PV and the underlying storage asset are removed. With Recycle (deprecated in favor of dynamic provisioning), the volume is scrubbed and made available for a new claim.
viii) Manual Cleanup (if required)
Data Removal: With the Retain policy, the PV moves to the Released state and keeps its data. Administrators need to manually remove data from the storage resource and delete the PV when it is no longer needed.
ix) Reprovisioning (if required)
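PV Reuse: A Released PV is not bound to a new PVC automatically. To reuse the storage, an administrator cleans up the data and then either recreates the PV or clears its claim reference, after which it can be bound to a new PVC to satisfy the storage requirements of another application.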
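The binding stage above can also be steered by hand. The following sketch shows one way to reserve a PV for a single claim, using the claimRef field on the PV and the volumeName field on the PVC. All names (reserved-pv, reserved-claim) and the hostPath location are hypothetical and only serve the illustration.

```yaml
# Hypothetical names throughout; the hostPath is only for illustration.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: reserved-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""         # no class, so no dynamic provisioning
  claimRef:                    # reserve this PV for one specific claim
    namespace: default
    name: reserved-claim
  hostPath:
    path: /mnt/reserved-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reserved-claim
  namespace: default
spec:
  storageClassName: ""
  volumeName: reserved-pv      # ask for this exact PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

Setting claimRef on the PV keeps other claims from taking it, while volumeName on the PVC keeps the claim from binding to some other matching volume.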
Steps
This section assumes that you already know how to create a Kubernetes cluster and have created one.
Let us start by creating a Persistent Volume.
Navigate to your working directory and create a file named pv.yaml. You can use the following command to create it.
touch pv.yaml
Open the file in your favorite editor and paste the following YAML configuration.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 500Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  hostPath:
    path: /mnt/pv-data
Let’s break down the configuration.
apiVersion: v1 and kind: PersistentVolume indicate that this YAML file describes a Kubernetes resource of kind “PersistentVolume” using version v1 of the core Kubernetes API.
Under the metadata section, name: local-pv assigns a name to the PV resource, which can be used to reference it within the cluster.
In the spec section:
capacity: storage: 500Mi specifies the storage capacity of the PV, which is set to 500 mebibytes (MiB).
volumeMode: Filesystem defines the volume mode as “Filesystem,” indicating that the volume is mounted into pods as a directory with a filesystem rather than exposed as a raw block device.
accessModes specifies the access modes that pods can use when mounting this PV. In this case, it is set to ReadWriteOnce, meaning that the PV can be mounted as read-write by a single node at a time.
persistentVolumeReclaimPolicy: Retain defines the policy for what happens to the PV after a PVC that uses it is deleted. “Retain” means that the PV’s data is retained even if the associated PVC is deleted, and manual cleanup is required.
storageClassName: local-storage assigns this PV to a class named “local-storage.” StorageClasses are normally used to dynamically provision PVs based on defined policies, but this PV is created manually, so the name acts as a matching label: a PVC must request the same class name to bind to this PV.
hostPath specifies the location on the node’s filesystem that backs this PV. In this case, it is set to /mnt/pv-data. Because hostPath uses a directory on whichever node the pod runs on, it is only suitable for single-node test clusters.
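The walkthrough never creates a StorageClass object named local-storage, and static binding works without one because the PV and PVC are matched on the class name alone. If you would like the class to exist as a real object anyway, a minimal sketch could look like the following. The Immediate binding mode is chosen here so that the claim binds right away, as in the outputs shown in this article; setups that use local volumes often prefer WaitForFirstConsumer instead.

```yaml
# Optional: a StorageClass object matching the name used by the PV.
# kubernetes.io/no-provisioner means PVs are never created automatically.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate   # bind as soon as a matching PVC appears
```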
Let’s now create the above PersistentVolume using the following command
kubectl apply -f pv.yaml
You can view the created PersistentVolume using the following command
kubectl get pv
You should see the following output on your terminal
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
local-pv   500Mi      RWO            Retain           Available           local-storage            11s
The STATUS column shows Available, which means the PV has not yet been claimed.
We can now create a PersistentVolumeClaim.
Create a file named pvc.yaml. You can use the following command to achieve this.
touch pvc.yaml
Copy and paste the following YAML configuration into the pvc.yaml file.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Mi
  storageClassName: local-storage
Let’s break down the configuration:
apiVersion: v1 and kind: PersistentVolumeClaim indicate that this YAML file describes a Kubernetes resource of kind “PersistentVolumeClaim” using version v1 of the core Kubernetes API.
Under the metadata section, name: my-pvc assigns a name to the PVC resource, allowing you to reference it within the cluster.
In the spec section:
accessModes specifies the access modes that the PVC requires. In this case, it’s set to ReadWriteOnce, indicating that the volume can be mounted as read-write by a single node at a time.
resources specifies the resources requested by the PVC. requests: storage: 200Mi requests a storage capacity of 200 mebibytes (MiB). This is the minimum storage capacity that the PVC requires from a Persistent Volume (PV); a larger PV, such as our 500Mi one, can still satisfy the claim.
storageClassName: local-storage restricts the claim to PVs of the class named “local-storage.” A StorageClass normally defines how the requested storage should be provisioned, but no dynamic provisioning is involved here, so the claim simply binds to the manually created PV that carries the same class name.
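To see why the requested size is a minimum rather than an exact match, consider a hypothetical claim that asks for more than our PV offers. The name too-big-pvc is invented for this sketch and is not used elsewhere in the walkthrough.

```yaml
# Hypothetical claim that asks for more than local-pv offers (500Mi).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: too-big-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi             # larger than any PV in this walkthrough
  storageClassName: local-storage
```

Since no PV in the local-storage class provides at least 1Gi, a claim like this would be expected to stay in the Pending state until a large enough PV is created.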
We can create the PVC using the following command
kubectl apply -f pvc.yaml
You can view the created PVC using the following command
kubectl get pvc
You should see the following output on the terminal
NAME     STATUS   VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
my-pvc   Bound    local-pv   500Mi      RWO            local-storage   116s
Now let us verify that our PVC claimed the PV successfully. Run the following command
kubectl get pv
You should see the following output on the terminal
NAME       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            STORAGECLASS    REASON   AGE
local-pv   500Mi      RWO            Retain           Bound    default/my-pvc   local-storage
Note that the status has changed from Available to Bound, indicating that our PV was successfully claimed by the PVC.
Next, we are going to create a Deployment that references the PVC.
Create a file named deployment.yaml and paste the following YAML configuration into the file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
          volumeMounts:
            - name: nginx-persistent-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: nginx-persistent-storage
          persistentVolumeClaim:
            claimName: my-pvc
Let’s walk through the file:
apiVersion: apps/v1 and kind: Deployment indicate that this YAML file describes a Kubernetes resource of kind “Deployment” using the Kubernetes API version “apps/v1.”
Under the metadata section, name: nginx-deployment assigns a name to the Deployment resource.
In the spec section:
replicas: 1 specifies that the Deployment should manage one replica (pod) of the application.
selector specifies a label selector for the pods managed by this Deployment. In this case, it selects pods with the label app: nginx.
template defines the pod template used to create the pods. It includes the following:
- The metadata section assigns labels to the pods, with the app: nginx label matching the selector defined earlier.
- The spec section defines the pod’s specification, including the container(s) to run.
- containers specifies the containers within the pod. In this case, there’s one container named “nginx,” which uses the “nginx:latest” image and exposes port 80.
- volumeMounts mounts volumes into the container. It specifies that the “nginx-persistent-storage” volume should be mounted at the path “/usr/share/nginx/html” within the container.
- volumes defines the volumes available to the pod. In this case, there’s a single volume named “nginx-persistent-storage,” and it is sourced from a PersistentVolumeClaim (PVC).
- persistentVolumeClaim specifies that the volume is backed by the PersistentVolumeClaim named “my-pvc.” This means the pod will have access to the storage resources requested by the PVC.
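One optional tweak, which is not part of the manifest above: because the claim is ReadWriteOnce, a rolling update that starts the replacement pod on a different node could leave that pod unable to mount the volume while the old pod still holds it. A possible way around this, sketched below as a fragment to merge into deployment.yaml, is to switch the update strategy to Recreate.

```yaml
# Fragment of the Deployment spec only; merge into deployment.yaml.
spec:
  replicas: 1
  strategy:
    type: Recreate   # stop the old pod before starting its replacement
```

The trade-off is a short period with no running pod during each update, which is usually acceptable for a single-replica test setup like this one.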
To create the deployment, run the following command
kubectl apply -f deployment.yaml
The deployment has been created and the pod is running. You can verify this by running the following command
kubectl get pods
You should see the following output
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-84cff9d74d-m5fjz   1/1     Running   0          6m52s
Now let’s open a shell in the container running in the pod and add a file to confirm persistence. Run the following command, replacing the pod name with your pod name
kubectl exec -it nginx-deployment-84cff9d74d-m5fjz -- /bin/bash
Once in the container, navigate to the mount path. This is the mount path that was specified in the Deployment spec. You can use the following command to navigate
cd /usr/share/nginx/html
Create a file named sample.txt in this location. Use the following command
touch sample.txt
Type exit to leave the container shell.
We know that pods are ephemeral in nature, so if we delete this pod and create a new one, the new pod on its own would not contain the sample.txt file. The file will only survive if the pod was successfully linked to the PV through the PVC.
Run the following command to list your current deployments
kubectl get deployments
You should see the following output on your terminal
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           26m
Delete the deployment using the following command
kubectl delete deployment nginx-deployment
After the deployment has been deleted, create it again. This will create new pods. Use the following command
kubectl apply -f deployment.yaml
Once the deployment has been created you can verify that the pods are running using the following command
kubectl get pods
Open a shell in the new container as before and navigate to the mount path to verify that the data was persisted and can be accessed in this new container.
You should see the sample.txt file that you created earlier.
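If you would rather not open a shell in the nginx container, another option is a short-lived pod that mounts the same claim and lists its contents. The sketch below is an assumption-laden example: the pod name pvc-inspector and the busybox image are choices made for this illustration, and it relies on a single-node cluster, where a ReadWriteOnce volume can be shared by pods on the same node.

```yaml
# Hypothetical throwaway pod that lists the contents of the claim.
apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspector
spec:
  restartPolicy: Never
  containers:
    - name: inspector
      image: busybox:1.36
      command: ["sh", "-c", "ls -l /data"]
      volumeMounts:
        - name: data
          mountPath: /data
          readOnly: true       # inspect only, never modify the data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc
```

After applying it, kubectl logs pvc-inspector should show the directory listing, and kubectl delete pod pvc-inspector removes the pod again without touching the claim or its data.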
Great! I hope this article was helpful