Cluster Autoscaler in Amazon EKS

Cluster Autoscaler automatically adds nodes to a Kubernetes cluster when pods cannot be scheduled because of insufficient capacity, and removes nodes when they are underutilized.

Cluster Autoscaler adjusts the number of nodes by changing the desired capacity of an AWS Auto Scaling group.


The first step is to find the AWS Auto Scaling group that you are using for your worker nodes. If you launched your worker nodes with the CloudFormation template provided by EKS, you will find the Auto Scaling group name in the CloudFormation console.

Click on the Physical ID link to go to the Auto Scaling group console, then add the following tags:

k8s.io/cluster-autoscaler/enabled
k8s.io/cluster-autoscaler/<ClusterName>

With these tags in place, Cluster Autoscaler can use “AutoDiscovery”: it automatically detects the Auto Scaling group and its minimum and maximum size.
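To make the tagging requirement concrete, here is a minimal Python sketch that checks whether a group carries both tag keys listed above. The function names and the sample cluster name `my-cluster` are hypothetical, and the sketch assumes that discovery matches on tag keys only and ignores tag values.

```python
# Minimal sketch: does an Auto Scaling group carry the auto-discovery tags?
# Assumption: only the tag KEYS matter; tag values are ignored here.

def autodiscovery_tag_keys(cluster_name):
    """Return the two tag keys from this guide for the given cluster name."""
    return [
        "k8s.io/cluster-autoscaler/enabled",
        f"k8s.io/cluster-autoscaler/{cluster_name}",
    ]


def is_discoverable(asg_tag_keys, cluster_name):
    """True only if every required tag key is present on the group."""
    return set(autodiscovery_tag_keys(cluster_name)).issubset(set(asg_tag_keys))


# Hypothetical tag keys copied from an Auto Scaling group.
sample_keys = [
    "Name",
    "k8s.io/cluster-autoscaler/enabled",
    "k8s.io/cluster-autoscaler/my-cluster",
]
print(is_discoverable(sample_keys, "my-cluster"))  # True
```

If the check returns False, the usual cause is a cluster name in the second tag key that does not match the name set in values.yaml.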

Second. Download the Cluster Autoscaler chart with Helm and uncompress the package:

helm repo update
helm fetch stable/cluster-autoscaler
tar -zxf cluster-autoscaler-0.6.4.tgz

Third. Edit values.yaml. Make sure you find and edit the following lines:

clusterName: <Your EKS cluster name goes here>
awsRegion: <The AWS region where the AutoScaling group is running>
sslCertPath: /etc/kubernetes/pki/ca.crt
rbac:
 ## If true, create & use RBAC resources
 ##
 create: true

Note: the sslCertPath line is very important! By default, Cluster Autoscaler looks for the EKS certificate in a different location. The AMIs provided by AWS in the default template (as of today) keep the EKS certificate at this path.

Fourth. Add permissions to the instance IAM role so that Cluster Autoscaler can change the desired capacity. You can find the IAM role in the Resources section of the CloudFormation stack (as shown in the screenshot above).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}

https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
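Before attaching the policy, it can help to confirm that none of the six actions was dropped while copying. The following Python sketch parses a policy document and reports which of the actions listed above are not covered by an Allow statement. The names `REQUIRED_ACTIONS` and `missing_actions` are hypothetical, and the check is deliberately simple: it compares exact action names only and ignores wildcards, Deny statements, and Resource scoping.

```python
import json

# The six actions from the policy in this guide.
REQUIRED_ACTIONS = {
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeLaunchConfigurations",
    "autoscaling:DescribeTags",
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
}


def missing_actions(policy_json):
    """Return the required actions that no Allow statement names exactly."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        allowed.update(actions)
    return sorted(REQUIRED_ACTIONS - allowed)


# Example: a policy that forgot the two write actions.
read_only = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "autoscaling:DescribeAutoScalingGroups",
            "autoscaling:DescribeAutoScalingInstances",
            "autoscaling:DescribeLaunchConfigurations",
            "autoscaling:DescribeTags",
        ],
        "Resource": "*",
    }],
})
print(missing_actions(read_only))
```

An empty list means every action from the policy above is named explicitly.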

Fifth. Deploy Cluster Autoscaler!

helm install stable/cluster-autoscaler -f values.yaml --name my-release

Testing!

Scale up: launch 200 Nginx pods.

kubectl run example --image=nginx --port=80 --replicas=200

Check the logs; you should see the cluster scaling up.

kubectl logs -l "app=aws-cluster-autoscaler" --tail=500
kubectl get configmap cluster-autoscaler-status -o yaml

You should also see the desired capacity of the Auto Scaling group increasing.

Scale down:

kubectl delete deployment example

Note: scale down can take a few minutes. For more information about how it works, see the documentation:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md
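As a rough illustration of why scale down is not immediate, the Python sketch below models two of the conditions the FAQ describes: a node's requested CPU and memory must both be below a utilization threshold, and the node must have stayed unneeded for a set time. The 0.5 threshold and the 10 minute wait are assumed defaults taken from that FAQ, and the function name is hypothetical. The real autoscaler also checks whether the node's pods can be moved elsewhere, which is left out here.

```python
# Simplified sketch of the scale-down check; not the autoscaler's actual code.
# Assumed defaults: utilization threshold 0.5, unneeded time 10 minutes.

def is_scale_down_candidate(cpu_requested, cpu_allocatable,
                            mem_requested, mem_allocatable,
                            unneeded_minutes,
                            threshold=0.5, unneeded_time_minutes=10):
    """True if the node is underutilized and has been unneeded long enough."""
    cpu_utilization = cpu_requested / cpu_allocatable
    mem_utilization = mem_requested / mem_allocatable
    # Both CPU and memory requests must be below the threshold.
    underutilized = max(cpu_utilization, mem_utilization) < threshold
    return underutilized and unneeded_minutes >= unneeded_time_minutes


# A node with 0.4 of 2 CPUs and 1 of 8 GiB requested, unneeded for 3 minutes:
print(is_scale_down_candidate(0.4, 2.0, 1.0, 8.0, unneeded_minutes=3))   # False
# The same node after 12 minutes:
print(is_scale_down_candidate(0.4, 2.0, 1.0, 8.0, unneeded_minutes=12))  # True
```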

Considerations:

Remember to make any changes to the MinSize and MaxSize of your Auto Scaling group through CloudFormation updates.
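MinSize and MaxSize matter because Cluster Autoscaler only moves the desired capacity within those bounds. A minimal Python sketch of that constraint follows, with a hypothetical function name and made-up numbers:

```python
# Sketch: the desired capacity stays between MinSize and MaxSize.

def bounded_desired_capacity(requested, min_size, max_size):
    """Clamp a requested node count to the group's MinSize..MaxSize range."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# With MinSize=1 and MaxSize=10, a request for 25 nodes is capped at 10.
print(bounded_desired_capacity(25, min_size=1, max_size=10))  # 10
```

If the test above leaves pods pending after the group reaches MaxSize, raising MaxSize through a CloudFormation update is what allows further scale up.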