A Cloud-Native Coda: Why You (Probably) Don’t Need Elastic Scaling

Kyle Gene Brown · Published in The Startup · 4 min read · Jan 6, 2021


Kyle Brown and Kim Clark

One of the most commonly touted features of cloud-native development is elastic scaling. Many companies have told us that they see taking advantage of elastic scaling as a key requirement when their teams evaluate cloud platforms. However, we rarely hear those same teams explain why they need it.

In fact, we might go so far as to say that elastic scaling is one of the hallmarks of “being on the cloud”. Every cloud platform worthy of the name has some form of elastic scaling support, whether that is the Horizontal Pod Autoscaler in Kubernetes (which automatically scales the number of Pods based on observed CPU utilization) or vendor features such as AWS Auto Scaling, which automatically scales EC2 instances, DynamoDB tables, and many other resource types. Elastic scaling is often viewed as the single most desirable feature of the cloud.
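To make the Kubernetes mechanism concrete, a minimal HorizontalPodAutoscaler definition scaling on CPU utilization might look roughly like this (a sketch; the Deployment name and thresholds are illustrative, not taken from any particular system):

```yaml
# Hypothetical HPA: scales the "storefront" Deployment between 2 and 10 Pods,
# adding Pods whenever average CPU utilization across them exceeds 70%.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: storefront-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: storefront
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```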

If you are a brand-new startup building a B2C application, then elastic scaling might be critically important, because you can’t predict when your business will suddenly take off. But enterprises are not startups. They have existing customer bases, and their usage patterns are, for the most part, well known. Instead, in most enterprises we see the full set of workloads that the company runs divide into approximately the following categories:

  • The biggest category is static load: predictable, unchanging load. This may cover as much as 65% of all applications.
  • The next biggest is planned scaling (seasonal peaks, batch runs, or planned campaigns): perhaps as much as 30% of all applications.
  • What remains is unplanned scaling (unpredictable load). That remainder can be very small: perhaps as little as 5% of applications in many enterprises.

We show this split (taken from a real customer, but shown here for illustrative purposes only) below:

(Figure: scaling percentages)

The problem is that many teams want to build their applications today as if they were part of that 5%, when in fact very few applications will ever be hit by unplanned load. This is not a new problem. In more traditional application environments, the common approach was to put far too much infrastructure in place “just in case”. That was obviously wasteful, and it was one of the reasons teams wanted to move to the cloud. The utopian cloud-native answer is to assume you must deploy every function so that it can scale infinitely. For the types of big enterprises we work with, however, the better approach is to identify the 5% of functions that genuinely need elastic scaling, separate them from the monolith, and build them in a cloud-native way (they can be good early candidates for the strangler pattern, for example).

As a result, assuming that all your cloud-native programs must be elastic and infinitely scalable is usually misleading. For 95% of enterprise applications, resilience matters much more than elastic scaling. What we need instead is horizontal stability, which is also a necessary condition for horizontal scaling if you are, in fact, in the 5% of programs that actually need it.

In other words, your program should continue to run in a stable way, without service interruption, if a node or instance is lost, replaced, or restarted, which happens far more often than suddenly needing extra nodes for scaling. It turns out that key cloud-native ingredients such as loose coupling, avoiding shared databases, and communicating between processes only through standard, scalable protocols like HTTP and messaging systems are a good way to achieve this. So writing applications in a cloud-native way is still the right thing to do, but perhaps not for the reasons you might think.
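In Kubernetes terms, horizontal stability comes mostly from running several interchangeable, stateless replicas behind a Service and letting the platform replace failed Pods. A minimal sketch (the names, image, and probe path here are hypothetical, not from the original article) might look like:

```yaml
# Hypothetical Deployment: three interchangeable replicas so that losing
# any single Pod or node does not interrupt service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders
spec:
  replicas: 3
  selector:
    matchLabels:
      app: orders
  template:
    metadata:
      labels:
        app: orders
    spec:
      containers:
        - name: orders
          image: registry.example.com/orders:1.0   # hypothetical image
          ports:
            - containerPort: 8080
          # Readiness gate: traffic is routed only to Pods that answer here,
          # so restarts and replacements do not cause dropped requests.
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
```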

What’s more, elastic scaling turns out to be an anti-pattern for many enterprises. For instance, when we spoke with one bank about elastic software-licensing models, we were told that, because of the way it did its application cost planning, it actually needed up-front, fixed software costs to make the business case for new projects. This is, unfortunately, true in far too many of the enterprises we work with.

One recurring horror story we hear (so often that we wish it were apocryphal, but unfortunately we have witnessed multiple instances of it) goes like this: a team deploys a new cloud-native application with autoscaling turned on, and receives an enormous first-month cloud provider bill. The cause is not high customer use, but an error that was never encountered in testing and that produced anomalously high CPU utilization under normal load, so the autoscaler kept adding capacity. Until costing models and financial planning catch up with the technology, you may be best off building applications in a cloud-native way but deploying them either at fixed capacity or with capacity-limited autoscaling.
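One way to express that capacity limit in Kubernetes (again a sketch under assumed names and numbers, not the authors’ configuration) is simply to keep the autoscaler’s maxReplicas small enough that a runaway scale-up has a known worst-case cost:

```yaml
# Hypothetical capacity-limited autoscaler: even a bug that pins CPU at 100%
# can never grow the Deployment past 4 Pods, so the worst-case bill is bounded.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orders-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orders
  minReplicas: 2
  maxReplicas: 4          # hard cap = fixed upper bound on spend
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Setting minReplicas equal to maxReplicas degenerates this into the fixed-capacity deployment described above, which for many enterprise cost models may be exactly what you want.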

IBM Fellow, CTO for Cloud Architecture for the IBM Garage, author and blogger