Safeguarding Your Kubernetes Deployments with a PodDisruptionBudget

Clearwater Analytics Engineering (cwan-engineering) · Jul 24, 2024

In the ever-growing world of containerized applications, continuous availability of critical applications and workloads is paramount. Enter PodDisruptionBudget (PDB), a powerful Kubernetes feature that plays a pivotal role in safeguarding applications against disruptions and downtime.

Disruptions Are Normal in the Kubernetes World

Have you ever found yourself scratching your head, wondering where your pod disappeared to when it was there just moments ago? These mysterious vanishing acts are what we refer to as involuntary disruptions — unexpected events that abruptly bring down your application’s pod. Some instances include a hardware failure of the physical machine backing the node or the eviction of a pod due to the node being out of resources.

There are other voluntary disruptions as well, such as the following (example commands that trigger each one are sketched after the list):

· Updating a deployment’s pod template, causing a restart
· Directly deleting a pod (e.g., by accident)
· Deleting the deployment or other controller that manages the pod
· Draining the node for repair or upgrade
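For illustration, here are commands that would trigger each of these voluntary disruptions for the mysql-database deployment used later in this article (the image tag and pod name are hypothetical):

# Update the pod template, causing a rolling restart
kubectl set image deployment/mysql-database mysql-container=mysql:8.4.1

# Directly delete a pod
kubectl delete pod mysql-database-7db76854ff-95cwp

# Delete the controller that manages the pod
kubectl delete deployment mysql-database

# Drain the node for repair or upgrade
kubectl drain worker-0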

What is PDB?

A PDB is a Kubernetes resource used to ensure high availability and reliability of application-critical workloads during planned disruptions, such as maintenance or upgrades, and unplanned events, such as node failures.

Creating a PodDisruptionBudget YAML File

As an application owner, you can create a PDB for each application. A PDB limits the number of pods of a replicated application that are down simultaneously from voluntary disruptions. Let us first create the persistent volume (PV) and persistent volume claim (PVC) bound to it.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv
spec:
  capacity:
    storage: 512M
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/data/database"

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  storageClassName: ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 400M
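Assuming these manifests are saved as mysql-pv.yaml and mysql-pvc.yaml (hypothetical file names), applying them and verifying that the claim binds to the volume looks roughly like this:

kubectl apply -f mysql-pv.yaml
kubectl apply -f mysql-pvc.yaml

# The PVC should show STATUS Bound once it is matched to the PV
kubectl get pv,pvc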

Now create the stateful MySQL Deployment, and let's say its pod is running on one of the worker nodes that is scheduled for maintenance. We have to shut down that node for the upgrade work, which would take MySQL down and impact the entire application.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-database
spec:
  selector:
    matchLabels:
      app: database
  replicas: 1
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: mysql-container
          image: mysql:8.4.0
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          ports:
            - containerPort: 3306
          volumeMounts:
            - mountPath: /var/lib/mysql
              name: database-data
      volumes:
        - name: database-data
          persistentVolumeClaim:
            claimName: mysql-pvc
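Assuming the Deployment above is saved as mysql-deployment.yaml (a hypothetical file name), we can apply it and check which worker node the pod landed on:

kubectl apply -f mysql-deployment.yaml

# -o wide includes the NODE column, e.g. worker-0 in this scenario
kubectl get pods -l app=database -o wide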

Now, as a cluster admin, I want to run some updates on the node, bringing the node down for some time. I will do the following:

kubectl drain worker-0
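In practice, a drain is often run with a couple of extra flags; a typical invocation might look like the sketch below (both flags exist in kubectl, but whether you need them depends on your workloads):

# --ignore-daemonsets: DaemonSet pods cannot be rescheduled elsewhere, so skip them
# --delete-emptydir-data: allow evicting pods that use emptyDir volumes (that data is lost)
kubectl drain worker-0 --ignore-daemonsets --delete-emptydir-data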

Let’s understand how this drain activity works, step by step.

1. The drain operation first cordons the node (applying the node.kubernetes.io/unschedulable:NoSchedule taint) so that new pods won’t be scheduled on it.

2. Once the node is cordoned, drain starts evicting the pods running on it. For each pod, it asks the API server for an eviction (a minimal example of such an eviction request is sketched further below), and the API server checks whether evicting that pod would drop the application below its configured PDB. If no PDB is configured, the eviction goes through and the pod is deleted.

3. To prevent this accidental deletion of MySQL, we can create the PDB for MySQL.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  selector:
    matchLabels:
      app: database
  minAvailable: 1

Here, we need a selector to match the pods the PDB applies to, plus one of the following two fields.

minAvailable is the number of pods that must always remain available (running and healthy). It can be an absolute number, as above, or a percentage:

minAvailable: '50%'

maxUnavailable is the number of pods from that set that can be unavailable after the eviction. It can be either an absolute number or a percentage.

We can specify either minAvailable or maxUnavailable, but not both.
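For example, a PDB that tolerates at most one pod of the set being down at any time could be written with maxUnavailable instead (a sketch; it is not equivalent to the minAvailable: 1 budget above):

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  selector:
    matchLabels:
      app: database
  maxUnavailable: 1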

4. Once the PDB is configured, the drain behaves differently. After applying the PodDisruptionBudget, when you try to drain the node, you will get the following error:

evicting pod default/mysql-database-7db76854ff-95cwp
error when evicting pods/"mysql-database-7db76854ff-95cwp" -n "default" (will retry after 5s):
Cannot evict pod as it would violate the pod's disruption budget.
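You can see why the eviction is blocked by inspecting the PDB status: with a single replica and minAvailable: 1, the budget allows zero disruptions. The output below is illustrative:

kubectl get pdb mysql-pdb

NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
mysql-pdb   1               N/A               0                     5m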

As per the Kubernetes docs:

Not all voluntary disruptions are constrained by PDBs. For example, deleting deployments or pods bypasses PDBs.
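The reason a drain is blocked while a plain delete is not is that drain uses API-initiated eviction rather than deleting pods directly. Roughly, it submits an Eviction object like the sketch below for each pod, and the API server rejects the request if it would violate the PDB:

apiVersion: policy/v1
kind: Eviction
metadata:
  name: mysql-database-7db76854ff-95cwp
  namespace: default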

All this is fine, but what if all of the application's pods are running yet unhealthy? Starting with Kubernetes 1.27, the unhealthyPodEvictionPolicy field was introduced to handle this situation.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  selector:
    matchLabels:
      app: database
  minAvailable: 1
  unhealthyPodEvictionPolicy: IfHealthyBudget

There are two policies to choose from: IfHealthyBudget and AlwaysAllow.

IfHealthyBudget: This policy gives the running pods of an already disrupted application the best chance to become healthy. For instance, let's say minAvailable is set to 2 for the above example, and 2 MySQL pods are running but are not healthy for some reason.

Calling drain on the node running the 2 MySQL instances will result in a PDB violation error. Even though the pods are not healthy, the drain will wait for them to become healthy before evicting them.

AlwaysAllow: With this policy, running pods of a disrupted application might not get a chance to become healthy. Taking the same example as above, but with the AlwaysAllow policy set, the drain will evict the unhealthy pods and empty the node without waiting for them to become healthy.
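A sketch of the same PDB using the AlwaysAllow policy would look like this:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  selector:
    matchLabels:
      app: database
  minAvailable: 1
  unhealthyPodEvictionPolicy: AlwaysAllow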

PDB in conjunction with HPA

If using a PDB together with the Horizontal Pod Autoscaler (HPA), ensure that minAvailable is less than the minimum replica count at any point in time, irrespective of the HPA configuration.

Let’s take an example of how the PDB configuration would look when HPA is configured for the application.

Consider a Kubernetes cluster with 3 worker nodes spread across 3 availability zones (AZs). We have the MySQL application deployed with 2 replicas, and the minAvailable field of the PDB is also set to 2. With this configuration, no disruptions are allowed for the pods: when the replica count of a deployment equals the minAvailable value of the PDB, there is no headroom for the application to sustain a disruption, so nodes cannot be cordoned and drained.

Now consider another scenario where the HPA's minReplicas is 3 and the PDB's minAvailable is 2. This means the application can afford to lose one pod. Disruptions are only possible when the application's replica count is higher than the minAvailable field of the PDB.
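As a sketch of this second scenario, the HPA and PDB could look like the following (the HPA target and thresholds are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mysql-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mysql-database
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  selector:
    matchLabels:
      app: database
  minAvailable: 2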

Conclusion

As we’ve seen from the examples, PodDisruptionBudget is a crucial tool in the Kubernetes toolbox, offering a proactive approach to managing disruptions and safeguarding application stability.

About the Author

Aditya Gupta is a Senior Software Development Engineer at Clearwater Analytics, bringing over 13 years of expertise in crafting distributed, scalable solutions and specializing in AWS, Kubernetes, and core Java development. Outside of coding, Aditya finds joy in exploring new destinations, experimenting in the kitchen, and hitting the gym.
