AKS Cluster NodePool Resizing

Arun Singh
KPMG UK Engineering
4 min read · Jun 12, 2023

Overview

This document provides step-by-step instructions for resizing the node pool in an Azure Kubernetes Service (AKS) cluster. This method is specific to virtual machine scale set-based AKS clusters.

When Node Pool Resizing Is Needed

Here are some common reasons for resizing an AKS cluster node pool:

  1. Performance Optimisation: Increasing the size of the nodes in the node pool can improve the performance of applications running in the AKS cluster. This is especially relevant if you notice resource constraints or performance issues caused by inadequate compute resources.
  2. Cost Optimisation: Downscaling the node pool by reducing the VM size or the number of nodes can help optimise costs, especially if the cluster is consistently underutilised or if resource requirements have decreased.
  3. Application Scaling: As your application workload grows, you may need to scale up the AKS cluster by adding larger VMs or additional nodes to accommodate the increased demand and maintain performance levels.
  4. Resource Requirements: Changing the node pool size might be necessary if your application’s resource requirements have changed.
  5. Workload Variation: If your application experiences fluctuating workloads, consider automating node pool resizing with tools like the Kubernetes Horizontal Pod Autoscaler (HPA), which scales the number of pod replicas, combined with the Azure Kubernetes Service (AKS) Cluster Autoscaler, which automatically adjusts the number of nodes based on workload demands, ensuring optimal resource allocation (a sketch of enabling it follows this list).
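For reference, here is a minimal sketch of enabling the AKS Cluster Autoscaler on an existing node pool. The resource names match those used in the steps below; the minimum and maximum node counts are illustrative assumptions:

az aks nodepool update \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool \
--enable-cluster-autoscaler \
--min-count 1 \
--max-count 5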

Prerequisites

  • Access to the Azure portal with appropriate permissions to manage the AKS cluster.
  • Kubectl tool installed on your local machine, with access to the AKS cluster (a quick check is sketched after this list).
  • Azure CLI installed on your local machine.
  • Familiarity with AKS concepts and basic command-line operations.
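Before starting, you can confirm CLI and cluster access with a quick check (a sketch, assuming the cluster and resource group names used in the steps below):

az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
kubectl get nodes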

Steps to Resize AKS Cluster NodePool

Step 1: Create a new node pool with the desired SKU

To resize an existing node pool (here assumed to be named “nodepool1”) from the Standard_DS2_v2 SKU to Standard_DS3_v2, we first create a new node pool (here assumed to be named “mynodepool”) using the Standard_DS3_v2 SKU.

Using CLI

az aks nodepool add \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool \
--node-count 3 \
--node-vm-size Standard_DS3_v2 \
--mode System \
--no-wait

Note: Every AKS cluster is required to have at least one system node pool with at least one node. In this scenario, the “--mode” parameter is set to “System” because the cluster is assumed to have only one node pool, and a system node pool is needed to replace it. The mode of a node pool can be changed at any time.
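For illustration, here is a sketch of how a node pool’s mode could later be switched (this assumes the cluster has another system node pool, since at least one must always remain):

az aks nodepool update \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool \
--mode User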

Using Portal

You can also create the node pool from the Azure portal. After a few minutes, you can observe that the new node pool has been created.
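You can also confirm from the command line by listing the cluster’s node pools (assuming the same names as above):

az aks nodepool list \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--output table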

Step 2: Cordon the existing Node Pool

Cordoning is a process that designates specific nodes as unschedulable, effectively preventing any additional pods from being scheduled onto those nodes.

To begin, use the command “kubectl get nodes” to obtain the names of the nodes you wish to cordon. The output will resemble the following format:

NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-31721111-vmss000000   Ready    agent   7d21h   v1.21.9
aks-nodepool1-31721111-vmss000001   Ready    agent   7d21h   v1.21.9
aks-nodepool1-31721111-vmss000002   Ready    agent   7d21h   v1.21.9

Subsequently, utilise the “kubectl cordon <node-names>” command to specify the desired nodes, providing their names in a list separated by spaces.

kubectl cordon aks-nodepool1-31721111-vmss000000 aks-nodepool1-31721111-vmss000001 aks-nodepool1-31721111-vmss000002
Output

node/aks-nodepool1-31721111-vmss000000 cordoned
node/aks-nodepool1-31721111-vmss000001 cordoned
node/aks-nodepool1-31721111-vmss000002 cordoned

With the nodes now cordoned, no additional pods will be scheduled onto them.
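You can confirm this by running “kubectl get nodes” again; the cordoned nodes should now report a status along these lines:

NAME                                STATUS                     ROLES   AGE     VERSION
aks-nodepool1-31721111-vmss000000   Ready,SchedulingDisabled   agent   7d21h   v1.21.9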

Step 3: Drain the existing Node Pool

Draining a node evicts the pods running on it and recreates them on other schedulable nodes, which in our case are the nodes in mynodepool. To drain nodes, use the command “kubectl drain <node-names> --ignore-daemonsets --delete-emptydir-data” with a space-separated list of node names.

Note: Using --delete-emptydir-data is required to evict the AKS-created coredns and metrics-server pods. If this flag isn't used, an error is expected.

kubectl drain aks-nodepool1-31721111-vmss000000 aks-nodepool1-31721111-vmss000001 aks-nodepool1-31721111-vmss000002 --ignore-daemonsets --delete-emptydir-data

Once the drain operation is completed, all pods, except those controlled by daemon sets, will be running on the new node pool.
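To verify, you can list the pods along with the nodes they are scheduled on; everything except daemon set pods should now be running on the mynodepool nodes:

kubectl get pods -o wide --all-namespaces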

Step 4: Monitor and Validate

  1. Monitor the status of the node pool resizing operation. You can use the Azure portal or the Azure CLI to check progress (see the example after this list).
  2. Verify that the nodes in the AKS cluster have been resized to the target VM size.
  3. Ensure that your applications and workloads running in the AKS cluster are functioning as expected after the node resizing operation.
  4. Monitor the performance and resource utilisation of your applications to verify that they are running optimally.
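As an example, the validation and clean-up could be done from the CLI as sketched below, using the same assumed names. Once you are satisfied that workloads are healthy on the new pool, the original, now-empty node pool can be removed:

# Check the provisioning state of the new node pool
az aks nodepool show \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name mynodepool \
--query provisioningState

# After validation, delete the original node pool
az aks nodepool delete \
--resource-group myResourceGroup \
--cluster-name myAKSCluster \
--name nodepool1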

Conclusion

This document provided a guide for resizing the node pool in an AKS cluster. By following the outlined steps, you can efficiently resize AKS cluster node pools to meet the needs of your applications. A natural next step is to automate this process.
