Vertical Autoscaler (VPA) Explained — Kubernetes

Kiranms
Cloudnloud Tech Community
Mar 18, 2023

Hello reader! Continuing from my previous blog, “Scale and Autoscale explained — Kubernetes”, where you learned about the HPA, in this blog you will learn about the Vertical Pod Autoscaler (VPA) in Kubernetes.

Let’s jump into how VPA works, followed by a step-by-step hands-on implementation on a Minikube cluster.

The Kubernetes Vertical Pod Autoscaler is a tool that helps us optimize the use of our cluster’s resources. It does this by automatically adjusting the amount of CPU and memory our applications request, freeing up resources for other applications. In this guide we will learn how to set up VPA and verify that it scales resources up and down as expected.

When you deploy an application in Kubernetes, you need to specify the CPU and memory requirements for that application. However, these requirements may not always be accurate, and your application may end up using more or less resources than you anticipated. This can lead to inefficient use of resources, and even performance issues.

VPA addresses this problem by automatically adjusting your application’s CPU and memory requests based on its actual resource usage. It “right-sizes” your application by scaling its requests up or down as needed, which can free up resources for other applications in the cluster and improve overall cluster resource utilization.

In simpler terms, VPA helps us make sure our applications use just the right amount of resources that they need, and not more than that, so that our cluster can work more efficiently.

Let me illustrate with a fun example. Imagine I have a car that I use to drive to work every day. Sometimes I drive alone, but other times I might have passengers or need to carry a lot of stuff with me. To make the most efficient use of my car, I might adjust the seat positions, air conditioning, and other features to fit my needs at any given time.

In the same way, VPA adjusts the resources allocated to each application based on its actual usage, just like adjusting the features of my car to fit my needs. This helps ensure that resources are used efficiently and that other applications in the cluster have access to the resources they need.

Pre-Requisites

  1. A Minikube or Kubernetes cluster
  2. Metrics Server: the metrics-server addon enabled on Minikube, or Metrics Server deployed on your Kubernetes cluster.

You can refer to my previous blog, “Scale and Autoscale explained — Kubernetes”, where I showed how to enable Metrics Server in Minikube to get the metrics data.
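As a quick sanity check before continuing, the prerequisites above can be verified from the shell. This is only a minimal sketch: `require_cmds` is a helper name I made up for illustration, and the cluster commands run only when `minikube` and `kubectl` are installed.

```shell
# Hypothetical helper: succeeds only if every named command is on PATH.
require_cmds() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      return 1
    fi
  done
}

if require_cmds minikube kubectl; then
  # Enable the metrics-server addon on Minikube.
  minikube addons enable metrics-server
  # Once metrics are being collected, this prints node CPU and memory usage.
  kubectl top nodes
else
  echo "Install minikube and kubectl before continuing."
fi
```

If `kubectl top nodes` reports that metrics are not available yet, give Metrics Server a minute or two and try again.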

Now let’s deploy the Vertical Pod Autoscaler in the Minikube cluster.

  1. Clone the Vertical Pod Autoscaler source code from the official Kubernetes GitHub repository and navigate to the autoscaler/vertical-pod-autoscaler/hack directory.

# git clone https://github.com/kubernetes/autoscaler.git

# cd autoscaler/vertical-pod-autoscaler/hack

Screenshot: cloning the repo from the Kubernetes GitHub repository

2. Deploy the Vertical Pod Autoscaler components on your cluster with the command below.

# ./vpa-up.sh

Screenshot: running vpa-up.sh

3. Verify that the VPA components were deployed successfully with either of the commands below.

# kubectl get pods -n kube-system

or

# kubectl get pods -A

In the screenshot below you will see that the VPA components are deployed.

Screenshot: VPA controller status
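If the pod list is long, it can help to filter it down to just the VPA pods. The sketch below assumes the components were installed into kube-system, as in the command above; `list_vpa_pods` is a helper name I made up for illustration.

```shell
# Hypothetical helper: print only the VPA pods from the kube-system namespace.
list_vpa_pods() {
  kubectl get pods -n kube-system --no-headers | grep '^vpa-'
}

if command -v kubectl >/dev/null 2>&1; then
  list_vpa_pods
  # The VPA custom resource definitions should be registered as well.
  kubectl get crd | grep verticalpodautoscaler
fi
```

You should see one pod each for vpa-admission-controller, vpa-recommender and vpa-updater in the Running state.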

OK, wait a minute. I have some questions here, and you probably do too.

What do the vpa-admission-controller, vpa-recommender and vpa-updater pods actually do, and why are they needed?

Let’s look at them one by one.

VPA Admission Controller pod: The VPA Admission Controller is the VPA component that plugs into the Kubernetes API server as an admission webhook. It automatically sets resource requests and limits on the containers of Pods that belong to a workload such as a Deployment, StatefulSet or DaemonSet, based on recommendations derived from historical usage.

Let’s say a new Pod is created. The VPA Admission Controller intercepts the request, looks up the recommendation that the VPA Recommender produced from the historical resource usage of the workload, and sets the appropriate resource requests and limits on the containers in the Pod. It does not change Pods that are already running. When a running Pod needs new values, the VPA Updater evicts it and the Admission Controller applies the recommendation to the replacement Pod.

VPA Recommender pod: The VPA Recommender is the core part of the VPA extension. It monitors the resource usage of the Pods and computes the recommended CPU and memory for their containers, so that Pods are sized correctly for their actual needs. It works together with the VPA Admission Controller and the VPA Updater, which apply its recommendations to the workload.

VPA Updater: The VPA Updater pod is the component that brings running Pods in line with the recommendation, while trying to limit disruption to the workload.

The VPA Updater periodically checks the recommendations provided by the VPA Recommender and compares them to the current resource requests and limits of the Pods. If a Pod’s current settings differ from the recommendation, the VPA Updater evicts the Pod so that it is recreated with the recommended requests and limits.
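To make the Updater’s comparison concrete, here is a simplified sketch of the idea. It assumes the check is made against the lower and upper bounds of the recommendation, and the function name and millicore values are made up for illustration. The real Updater applies additional rules, so treat this as a mental model only.

```shell
# Simplified illustration, not the Updater's actual code.
# All values are CPU requests in millicores (for example 250 means 250m).
needs_update() {
  current=$1; lower=$2; upper=$3
  if [ "$current" -lt "$lower" ] || [ "$current" -gt "$upper" ]; then
    echo "update"
  else
    echo "keep"
  fi
}

needs_update 100 200 800   # request is below the lower bound: prints "update"
needs_update 500 200 800   # request is inside the range: prints "keep"
```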

Now let us create a VPA object and then an nginx deployment to check the VPA functionality.

Create your Vertical Pod Autoscaler by running the below command.

# kubectl create -f my-nginx-vpa.yaml

Screenshot: VPA deployed
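The contents of my-nginx-vpa.yaml are not listed in this blog, so here is a minimal sketch of what such a file could look like. The object name and the target Deployment name (nginx-vpa, guessed from the pod name that appears later) are my assumptions; adjust them to match your own workload.

```shell
# Write a hypothetical VPA manifest; the names are assumptions.
cat > my-nginx-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  # The workload whose pods this VPA should manage.
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-vpa
  updatePolicy:
    updateMode: "Auto"
EOF
```

The targetRef ties the VPA to one workload, and updateMode controls whether recommendations are only reported or also applied.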

Now let’s deploy the sample nginx deployment and see how VPA scales its resource requirements.

# kubectl create -f my-nginx-vpa-deploy.yaml
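The deployment manifest is not listed here either, so below is a minimal sketch of what my-nginx-vpa-deploy.yaml could contain. The labels, image, replica count and the deliberately small starting requests are all my assumptions. I use two replicas because, as far as I know, the Updater by default does not evict the only pod of a single-replica workload.

```shell
# Write a hypothetical nginx Deployment with small initial requests.
cat > my-nginx-vpa-deploy.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-vpa
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-vpa
  template:
    metadata:
      labels:
        app: nginx-vpa
    spec:
      containers:
        - name: nginx
          image: nginx
          resources:
            requests:
              cpu: 50m
              memory: 50Mi
EOF
```

Starting with low requests makes it easier to see VPA change them later.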

Screenshot: VPA recommendation

Here you will see “Update Policy” with “Update Mode” set to “Auto”. This means your VPA is set up in Auto mode: it will continuously monitor the resource utilisation of your workload and adjust its resources according to the update policy.

There are 4 values that can be set in “Update Mode”, as listed below.

  1. “Off”: VPA only computes recommendations and does not apply them. Use this mode when you want to set the resource requirements manually.
  2. “Auto”: VPA applies the recommended resources automatically, both when pods are created and during their lifetime.
  3. “Recreate”: VPA terminates existing pods whose resources differ from the recommendation, so that new pods are created with the updated resource requirements.
  4. “Initial”: the recommended resources are applied only when a pod is first started, not while the pod is running.
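If you want to try a different mode on an existing VPA object, one option is to patch it in place. This is a minimal sketch: `set_vpa_mode` is a helper name I made up, and nginx-vpa in the example is an assumed object name.

```shell
# Hypothetical helper: switch the update mode of an existing VPA object.
set_vpa_mode() {
  name=$1; mode=$2
  case "$mode" in
    Off|Initial|Recreate|Auto) ;;
    *) echo "invalid update mode: $mode" >&2; return 1 ;;
  esac
  # VPA is a custom resource, so a JSON merge patch is used.
  kubectl patch vpa "$name" --type merge \
    -p "{\"spec\":{\"updatePolicy\":{\"updateMode\":\"$mode\"}}}"
}

# Example (needs a cluster): only show recommendations, do not apply them.
# set_vpa_mode nginx-vpa Off
```

Switching to “Off” is a safe way to look at recommendations without any pods being restarted.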

You will also see “Recommendation” suggested by VPA.

Here, a “Recommendation” is a suggested change to the resource requirements of a pod, based on the current demand and usage patterns of the application running in the pod.

You will notice that the recommendation contains “Lower Bound”, “Target”, “Uncapped Target” and “Upper Bound”.

  1. “Lower Bound”: the minimum recommended CPU request and memory request for the container.
  2. “Target”: the recommended CPU request and memory request for the container.
  3. “Uncapped Target”: the most recent recommendation computed by the autoscaler based only on actual resource usage, before any minimum or maximum set in the VPA resource policy is applied.
  4. “Upper Bound”: the maximum recommended CPU request and memory request for the container.
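The same four values can also be read straight from the status of the VPA object instead of from the describe output. A minimal sketch, where `vpa_recommendation` is a helper name I made up and nginx-vpa is an assumed object name:

```shell
# Hypothetical helper: print one recommendation field for the first container.
# field is one of: lowerBound, target, uncappedTarget, upperBound
vpa_recommendation() {
  name=$1; field=$2
  kubectl get vpa "$name" \
    -o jsonpath="{.status.recommendation.containerRecommendations[0].$field}"
}

# Example (needs a cluster):
# vpa_recommendation nginx-vpa target
```

Right after the VPA is created the recommendation may still be empty, because the Recommender needs some usage data first.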

You will now notice that the pods of your deployment are terminating and being recreated with the latest resource requests recommended by VPA.

Let us describe the pod and see the new resource requests set by VPA.

Under “Requests” you can see that VPA has scaled the “nginx-vpa-5bc64579bf-bcg74” pod to the new “Target” recommendation.
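Instead of scanning the full describe output, you can print just the requests of each container in a pod. A minimal sketch, where `pod_requests` is a helper name I made up for illustration:

```shell
# Hypothetical helper: print each container's name and its resource requests.
pod_requests() {
  kubectl get pod "$1" \
    -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources.requests}{"\n"}{end}'
}

# Example (needs a cluster), using the pod name from this walkthrough:
# pod_requests nginx-vpa-5bc64579bf-bcg74
```

Your pod names will differ, so take one from the output of kubectl get pods.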

Please clean up all the resources that you have deployed.

# Clean-up nginx and nginx VPA

# kubectl delete -f .

Then clean up the VPA components that you deployed at the start.

Navigate to the “/autoscaler/vertical-pod-autoscaler/hack” path that we cloned.
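The exact command for this step is not shown here. The hack directory ships a vpa-down.sh script next to vpa-up.sh, so a minimal sketch of the teardown could look like this. `teardown_vpa` is a helper name I made up, and the example path assumes the repository was cloned into the current directory.

```shell
# Hypothetical helper: run vpa-down.sh from the given hack directory.
teardown_vpa() {
  hack_dir=$1
  if [ ! -x "$hack_dir/vpa-down.sh" ]; then
    echo "vpa-down.sh not found in $hack_dir" >&2
    return 1
  fi
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$hack_dir" && ./vpa-down.sh)
}

# Example (needs a cluster and the cloned repository):
# teardown_vpa autoscaler/vertical-pod-autoscaler/hack
```

Afterwards, kubectl get pods -n kube-system should no longer list the vpa pods.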

Happy Learning !!!!
