Cluster Autoscaler (CA) and Horizontal Pod Autoscaler (HPA) on Kubernetes
This blog has moved from Medium to blogs.tensult.com.
Our Kubernetes cluster and Application Load Balancer are now ready, but we still need to set up autoscaling on the cluster to run our infrastructure successfully on AWS.
Part 3: Horizontal Pod Autoscaler and Cluster Autoscaler
Horizontal Pod Autoscaler
Autoscaling at the pod level is handled by the Horizontal Pod Autoscaler (HPA), which scales the number of pods in a deployment or replica set. It is implemented as a Kubernetes API resource and a controller. The controller manager queries resource utilization against the metrics specified in each HorizontalPodAutoscaler definition, obtaining the metrics from either the resource metrics API (for per-pod resource metrics) or the custom metrics API (for all other metrics).
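As a concrete illustration of that API resource, here is a minimal HorizontalPodAutoscaler manifest (a sketch using the autoscaling/v1 API; the target name php-apache matches the demo deployment used later in this post):

```yaml
# Sketch of a HorizontalPodAutoscaler resource targeting a Deployment.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
```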
Cluster Autoscaler
Autoscaling at the cluster level is handled by the Cluster Autoscaler (CA), which manages scalability by adjusting the number of nodes in your cluster.
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
- there are pods that failed to run in the cluster due to insufficient resources,
- there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
The Cluster Autoscaler on AWS scales worker nodes within any specified Auto Scaling group. It runs as a Deployment in your cluster.
Deploying Metrics Server
Deploy a Metrics Server so that HPA can scale Pods in a deployment based on CPU/memory data provided by an API (as described above). The metrics.k8s.io API is usually provided by the metrics-server (which collects the CPU and memory metrics from the Summary API, as exposed by Kubelet on each node).
helm install stable/metrics-server \
--set rbac.create=true \
--set args[0]="--kubelet-insecure-tls=true" \
--set args[1]="--kubelet-preferred-address-types=InternalIP" \
--set args[2]="--v=2" \
--name metrics-server
Horizontal Pod Autoscaler
The following steps create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods controlled by the php-apache deployment we will create below. Roughly speaking, the HPA increases and decreases the number of replicas (via the deployment) to maintain an average CPU utilization across all Pods of 50%. Since each pod requests 200 millicores via kubectl run, this corresponds to an average CPU usage of 100 millicores.
Confirm the Metrics API is available
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
If all is well, you should see a status similar to the one below in the response, confirming the Metrics API is working.
status:
  conditions:
  - lastTransitionTime: "2019-08-20T09:33:01Z"
    message: all checks passed
    reason: Passed
    status: "True"
    type: Available
Now we will scale a deployed application
Deploy a sample app and Create HPA resources
We will deploy an application and expose it as a service on TCP port 80. The application is a custom-built image based on the php-apache image; its index.php page performs calculations to generate CPU load.
kubectl run php-apache --image=k8s.gcr.io/hpa-example --requests=cpu=200m --expose --port=80
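Note that on newer kubectl versions, kubectl run creates a bare Pod rather than a Deployment, and the --requests flag has been removed. An equivalent declarative alternative (a sketch, not taken from the original post) is to apply a Deployment plus a Service:

```yaml
# Sketch: Deployment with a 200m CPU request, plus a Service on port 80,
# mirroring what the kubectl run command above creates on older kubectl versions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: php-apache
  template:
    metadata:
      labels:
        app: php-apache
    spec:
      containers:
      - name: php-apache
        image: k8s.gcr.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
spec:
  selector:
    app: php-apache
  ports:
  - port: 80
```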
Create an HPA resource
This HPA scales up when CPU exceeds 50% of the allocated container resource.
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
View the HPA using kubectl. You probably will see <unknown>/50%
for 1-2 minutes and then you should be able to see 0%/50%
kubectl get hpa
Increase the load by hitting the App K8S service from several locations.
kubectl run -i --tty load-generator --image=busybox /bin/sh
Execute a while loop to continuously request http://php-apache
while true; do wget -q -O - http://php-apache; done
The HPA should now start to scale the number of Pods in the deployment as the load increases, according to what is specified in the HPA resource. At some point, new Pods may fall into a Pending state while waiting for extra resources.
Within a minute or so, we should see the higher CPU load by executing:
kubectl get hpa -w
Here, CPU consumption has risen above the target relative to the request. As a result, the deployment is resized to more replicas:
kubectl get deployment php-apache
You will see the HPA scale the pods from 1 up to our configured maximum (10) until the average CPU is below our target (50%).
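The scaling behavior observed here follows the HPA's core rule from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). A quick sketch of the arithmetic, with purely illustrative numbers:

```shell
# HPA scaling rule: desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
# Illustrative numbers (not from a real cluster): 2 replicas averaging 150% CPU
# utilization against a 50% target.
current=2
metric=150
target=50
# Integer ceiling division: (a + b - 1) / b
desired=$(( (current * metric + target - 1) / target ))
echo "$desired"   # prints 6
```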
Configure Cluster Autoscaler
Cluster Autoscaler for AWS provides integration with Auto Scaling groups. It enables users to choose from four different deployment options.
In the Add/Edit Auto Scaling Group Tags window, enter the following tags, replacing awsExampleClusterName with the name of your EKS cluster:
- Key: k8s.io/cluster-autoscaler/enabled
- Key: k8s.io/cluster-autoscaler/awsExampleClusterName
The worker node running the Cluster Autoscaler needs access to certain resources and actions. Attach an IAM policy to the node group rather than using AWS credentials directly, unless you have special requirements.
Here we need to create an IAM policy called ClusterAutoScaler based on the following example to give the worker node running the Cluster Autoscaler access to required resources and actions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
After adding the tags and the IAM policy, we will run the Helm chart to create the Cluster Autoscaler:
helm install stable/cluster-autoscaler \
--name <release-name> \
--set awsRegion=<region> \
--set sslCertHostPath=/etc/ssl/certs/ca-bundle.crt \
--set autoDiscovery.clusterName=<cluster-name> \
--set rbac.create=true \
--set extraArgs.scale-down-enabled=true
To check the logs
kubectl logs <pod-name> --tail=50
Here we can see the Cluster Autoscaler working, scaling the worker nodes in the Kubernetes cluster up and down.
Test Cluster Autoscaler
To see the current number of worker nodes, run the following command:
kubectl get nodes
To increase the number of worker nodes, run the following commands:
kubectl create deployment demo --image=nginx
kubectl scale deployment demo --replicas=50
This creates an NGINX deployment on the Kubernetes cluster and then scales it to 50 pods.
When the number of available pods reaches 50, check that the number of worker nodes has increased.
To view nodes
kubectl get nodes -w
Delete NGINX deployment
kubectl delete deployment demo
Conclusion
Here we have successfully deployed autoscaling for Kubernetes on the AWS cloud. Next, we will work on the Kubernetes Dashboard on AWS EKS.