Kubernetes Rant: Do Autoscalers Make Business Sense?

Recently, Daniel Polencic posted a lovely article on autoscalers. It triggered some thoughts, so I am writing this short note in response to Daniel’s article.

Need For Autoscalers.

We observe that autoscalers are typically needed in two kinds of environments:

  1. In a highly dynamic environment where there is a large difference between peak and average utilization (mostly driven by traffic), one provisions resources for average utilization and makes the infrastructure elastic to absorb the peak. Note that the peak should occur only once in a while; if it occurred too often, average utilization would be close to peak utilization anyway. In such cases, we tend to use a combination of Pod Autoscaler and Cluster Autoscaler to handle the elasticity.
  2. In a relatively static environment where there is not much difference between peak and average utilization, or where new services, new teams, or transient services get onboarded, there is a strong case for the Cluster Autoscaler but less need for the Pod Autoscaler, unless resources are shared.
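The peak-vs-average distinction above can be sketched as a simple capacity-planning rule. The threshold, names, and numbers below are my own illustrative assumptions, not from the article or any Kubernetes API:

```python
# Sketch: decide whether an environment is "dynamic" enough to justify
# elastic capacity by looking at the peak-to-average utilization ratio.

def capacity_plan(avg_util_cores: float, peak_util_cores: float,
                  elastic_threshold: float = 1.5) -> dict:
    """Provision for the average; rely on autoscaling only when the
    peak is meaningfully above it (hypothetical 1.5x threshold)."""
    ratio = peak_util_cores / avg_util_cores
    if ratio >= elastic_threshold:
        # Dynamic environment: base capacity covers the average,
        # Pod + Cluster Autoscaler absorb the difference.
        return {"base_cores": avg_util_cores,
                "elastic_cores": peak_util_cores - avg_util_cores}
    # Static environment: just provision for the peak outright.
    return {"base_cores": peak_util_cores, "elastic_cores": 0.0}

print(capacity_plan(avg_util_cores=40, peak_util_cores=100))
# dynamic: base covers 40 cores, 60 more come from autoscaling
print(capacity_plan(avg_util_cores=40, peak_util_cores=50))
# static: provision the full 50 cores, no elasticity needed
```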

Decision Factors For Cluster Autoscaler.

In a dynamic environment, one has to answer two key questions before deciding on the Cluster Autoscaler:

  1. How often does the peak occur? Once a day, once a week, or once a month?
  2. What’s the SLO for Availability?

There is a mismatch when businesses promise five 9’s to their customers while the infrastructure relies on the Cluster Autoscaler to handle peaks, even peaks that occur only once a month. The math:

  1. Five 9’s allows only 5.26 minutes of downtime per year. Source: Wikipedia.
  2. Assume the Cluster Autoscaler takes ~7 minutes to bring a new node online and 25% of requests are impacted during that window. With a peak once a month: 7 * 12 * 25 / 100 = 21 minutes of downtime per year. If you see peaks once daily, then 30 * 21 = 630 minutes of downtime per year.
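The arithmetic above can be reproduced in a few lines. The ~7-minute scale-up delay and 25% request impact are the assumptions implied by the article’s numbers; the monthly and daily frequencies map to 12 and 12 × 30 peaks per year:

```python
# Effective downtime per year when peaks are absorbed by the Cluster
# Autoscaler, using the article's assumptions.

SCALE_UP_MINUTES = 7      # time for a new node to become schedulable
IMPACT_FRACTION = 0.25    # share of requests affected during scale-up
FIVE_NINES_BUDGET = 5.26  # allowed minutes of downtime per year

def downtime_minutes_per_year(peaks_per_year: int) -> float:
    return SCALE_UP_MINUTES * peaks_per_year * IMPACT_FRACTION

monthly = downtime_minutes_per_year(12)      # 21.0 minutes per year
daily = downtime_minutes_per_year(12 * 30)   # 630.0 minutes per year

print(monthly > FIVE_NINES_BUDGET)  # True: even monthly peaks blow the budget
```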

IMO, the Cluster Autoscaler makes sense in relatively controlled environments and in environments with no big business impact, such as DEV/TEST.

Decision Factors For Pod Autoscaler.

The Pod Autoscaler assumes that sufficient resources are available for the pod to scale horizontally. The questions to answer are:

  1. When load peaks, do all services scale, or only a subset of services?
  2. If only a subset scales, does the cluster already have sufficient provisioned resources for it?

To answer these questions, one needs to understand the application, its architecture, and the bottlenecks in the system. Therefore, one size (solution) may not fit all.
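The second question, whether the subset that scales already has sufficient provisioned resources, boils down to a headroom check. A minimal sketch under my own simplifying assumptions (CPU only, ignoring memory, affinity, and scheduler details; not a Kubernetes API):

```python
# How many extra replicas of a service fit on the currently provisioned
# nodes without waking the Cluster Autoscaler?

def extra_replicas_that_fit(node_free_cores: list[float],
                            pod_request_cores: float) -> int:
    """Sum how many pods fit in each node's free CPU, per node,
    since a pod cannot span nodes."""
    return sum(int(free // pod_request_cores) for free in node_free_cores)

# Three nodes with 1.5, 0.4, and 2.2 free cores; each pod requests 0.5.
fits = extra_replicas_that_fit([1.5, 0.4, 2.2], 0.5)
print(fits)  # 7: the Pod Autoscaler can add 7 replicas before nodes run out
```

If the expected scale-out exceeds this headroom, the Pod Autoscaler alone is not enough and the Cluster Autoscaler (or pre-provisioning) comes back into the picture.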


Autoscalers have their uses, but autoscaling is a much harder problem than it appears to be. It involves understanding a bit of the business, the architecture, and the cost aspects to arrive at what is best for the organization, rather than jumping on the bandwagon. Thanks to Daniel Polencic for triggering this post.

As always, I would love to hear your thoughts & feedback.


