The Nightmare of Persistent Volumes in a Multi-Availability-Zone Kubernetes Cluster

Calvine Otieno
3 min read · Jan 14, 2024


If you have been running a Kubernetes cluster in AWS with EBS volumes backing your Persistent Volumes (PVs), you have probably faced the issue of pods getting stuck after being rescheduled onto another node.

EBS volumes in AWS are zone-specific resources: they are attached over the network and tied to a single availability zone.

Running a multi-AZ Kubernetes setup with Persistent Volumes will give you a nightmare when pods get rescheduled to a node that is not in the same AZ where the volume was originally created. The pods will remain in a Pending state because they cannot reach the Persistent Volume.

You will see this example error:

Kubernetes Pod Warning: 1 node(s) had volume node affinity conflict

Solution

nodeSelector is the simplest form of node selection constraint in Kubernetes. The scheduler will only place Pods on Nodes that carry the labels you specify.

The most reliable fix is to make sure every pod that uses a PV has a nodeSelector specified. This pins the pod to a fixed availability zone, so even if it is rescheduled to another node, it will land in the same availability zone as the volume. There is one catch: the availability zone you specify must always have node(s) with enough resources to run your application, or your pods will again remain Pending.

StorageClass

The example here is for AWS, but this should work on any cloud provider. One important thing in the StorageClass is the volumeBindingMode: WaitForFirstConsumer parameter, which delays the binding and provisioning of a PersistentVolume until a Pod that uses it is created.

AWS StorageClass
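A minimal sketch of such a StorageClass for the EBS CSI driver. The class name ebs-sc and the gp3 volume type are placeholders for illustration; the provisioner name and volumeBindingMode are the standard ones:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc                        # placeholder name
provisioner: ebs.csi.aws.com          # AWS EBS CSI driver
volumeBindingMode: WaitForFirstConsumer  # delay provisioning until a pod is scheduled
parameters:
  type: gp3                           # example EBS volume type
```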

If you are using AWS EBS CSI Driver, this parameter is set by default when you install the driver.

Persistent Volume Claim

Reference the StorageClass in your PersistentVolumeClaim.

Example PVC
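A minimal PVC sketch. The claim name app-data, the StorageClass name ebs-sc, and the 10Gi size are assumptions for illustration:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                # placeholder name
spec:
  accessModes:
    - ReadWriteOnce             # EBS volumes attach to a single node
  storageClassName: ebs-sc      # must match your StorageClass
  resources:
    requests:
      storage: 10Gi             # example size
```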

Deployment

Set a nodeSelector on the Deployment's Pod template that specifies the availability zone the pods will be scheduled into.

Example Deployment
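A sketch of a Deployment pinned to one zone via nodeSelector. The well-known label topology.kubernetes.io/zone is standard Kubernetes; the app name, image, zone eu-west-1a, and claim name app-data are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      nodeSelector:
        # pin pods to the AZ where the EBS volume lives
        topology.kubernetes.io/zone: eu-west-1a   # example zone
      containers:
        - name: my-app
          image: nginx:1.25     # example image
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data # example PVC name
```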

This should help you fix the issue with PVs in a multi-AZ Kubernetes cluster. There are cases where you want to run multiple replicas across different availability zones; that is not covered in this article, and I will cover it in a follow-up.

This is all for now. I hope you have learnt something and enjoyed reading the article. Follow me on GitHub for more about DevOps, DevSecOps and Platform Engineering.

References

Official Kubernetes Documentation on this Limitation:

Thanks for reading. Let’s connect on Twitter and LinkedIn 😁.

