AWS compute cost optimization using spot instances within Kubernetes

Tarun Prakash
MiQ Tech and Analytics
4 min read · May 7, 2018

Background

Kubernetes is a widely used open-source system today. It has grown immensely over the last few years and has become the primary platform for managing containerized applications.

Like many other AWS customers, we run our Kubernetes workloads on AWS. According to the Cloud Native Computing Foundation, 63% of Kubernetes workloads run on AWS. While AWS is a popular place to run Kubernetes, customers still have to do a lot of manual configuration to set up and manage their clusters. Even though AWS EKS has been announced, it is still in preview and not ready for production. More importantly, it wasn't available when we started our journey with Kubernetes last year.

Rancher provided an easier path for us. It is an open-source container management platform that makes setting up and managing Kubernetes on AWS a much simpler job.

While explaining Rancher and Kubernetes in depth is out of the scope of this article, we will guide you through using AWS spot instances as Kubernetes worker nodes by integrating Rancher and Spotinst.

Benefits & the challenges with spot instances

Spot instances are a great way to reduce compute cost on AWS; they can be up to 10x cheaper than on-demand instances. Short-lived jobs and workflows are a perfect fit for spot instances given their limitations, and applications that can tolerate node failures can also benefit from them.

Spot instances seem to be a great option for reducing compute cost. However, they are generally not recommended for production because of their limitations: their availability depends on unused compute capacity in AWS, and instances can be reclaimed at short notice when AWS needs that capacity back. Maintaining high availability for applications running on spot instances is therefore a big challenge.

Spotinst's role in the setup

Spotinst claims to address this challenge and save up to 80% of compute cost by leveraging the cloud's excess capacity without compromising availability. We use this capability and let Spotinst manage the spot instances within our Kubernetes cluster.

How we did it @MiQ: Architecture overview

Let's integrate Spotinst with Rancher using Terraform.

Clone the repository, update the required values in values.yaml, and run the following commands.

terraform plan (dry run)
terraform apply (apply the changes)
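The plan and apply runs below prompt for three Terraform input variables. If you prefer to run non-interactively, you can supply them up front, for example via a standard terraform.tfvars file or -var flags. A minimal sketch follows; the file name convention and -var flag are standard Terraform, the variable names match the prompts below, and the values are placeholders:

# terraform.tfvars -- placeholder values, substitute your own
rancher_env      = "kubernetes"              # name of the Rancher environment to register nodes with
spotinst_account = "act-xxxxxxxx"            # Spotinst account ID
spotinst_token   = "xxxxxxxxxxxxxxxxxxxxxxx" # Spotinst API token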

root@ip-12-*-*-46 $ terraform plan
var.rancher_env
  Enter rancher ENV name
  Enter a value: kubernetes

var.spotinst_account
  Enter spotinst account ID
  Enter a value: *******

var.spotinst_token
  Enter spotinst token
  Enter a value: *******************************

If everything looks good in the terraform plan output, we can apply the changes.

root@ip-12-*-*-46 $ terraform apply
var.rancher_env
  Enter rancher ENV name
  Enter a value: kubernetes

var.spotinst_account
  Enter spotinst account ID
  Enter a value: *******

var.spotinst_token
  Enter spotinst token
  Enter a value: *******************************

If everything goes well, you should see spot instances getting registered with the Rancher environment.

Verify the added instance in the Spotinst dashboard as well.

If you see the same instance IP in the Rancher environment as well as in the Spotinst dashboard, the integration is successful.
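As an additional cross-check from the Kubernetes side, you can list the cluster nodes and confirm that the new spot instance shows up there too (assuming you have kubectl access to the cluster, for example via Rancher's kubectl shell):

$ kubectl get nodes -o wide

The new worker node should appear in the list, and the internal/external IPs shown by the -o wide output should match the instance IP reported by Spotinst.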

The same can also be achieved by creating an Elastigroup via the Spotinst dashboard (https://help.spotinst.com/hc/en-us/articles/115002041945-Creating-an-Elastigroup). However, we needed an automated way to create the Elastigroup so that launched spot instances register with our cluster automatically, as sketched below.
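For reference, the Terraform side of that automation looks roughly like the following. This is only a minimal sketch based on the Spotinst Terraform provider's Elastigroup resource; the region, AMI, networking values, instance types, and the registration script name are placeholders, and the actual configuration in our repository differs in its details.

provider "spotinst" {
  token   = var.spotinst_token    # Spotinst API token (prompted for above)
  account = var.spotinst_account  # Spotinst account ID
}

resource "spotinst_elastigroup_aws" "rancher_workers" {
  name    = "rancher-${var.rancher_env}-workers"
  product = "Linux/UNIX"
  region  = "us-east-1"                 # placeholder region

  min_size         = 1
  max_size         = 10
  desired_capacity = 2

  image_id        = "ami-xxxxxxxx"      # placeholder AMI with Docker installed
  security_groups = ["sg-xxxxxxxx"]     # placeholder security group
  subnet_ids      = ["subnet-xxxxxxx"]  # placeholder subnets

  instance_types_ondemand = "m4.large"
  instance_types_spot     = ["m4.large", "m4.xlarge", "m5.large"]

  # Fall back to on-demand capacity when no spot capacity is available,
  # so the cluster never starves for worker nodes.
  fallback_to_ondemand = true
  orientation          = "balanced"

  # user_data runs the Rancher host-registration command so that every
  # launched instance joins the Rancher environment automatically.
  user_data = file("register-rancher-host.sh")  # hypothetical script name
}

The key design choice here is fallback_to_ondemand: when spot capacity is unavailable, the group launches on-demand instances instead, which is what keeps the cluster available despite running mostly on spot.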

Finally, you can see the savings you have made since the launch of your spot instances right in the Spotinst dashboard. Here is what the dashboard looks like.

Conclusion

In this article, we have shown how to optimize compute cost by using AWS spot instances within Kubernetes without compromising availability.

If you are managing a Kubernetes cluster on AWS using Rancher and looking to reduce your infrastructure cost, do try out this solution and share your thoughts.
