Knative v0.3 Autoscaling — A Love Story

Joseph Burnett
Jan 31 · 6 min read

2019–06–19 Update: This post has been moved to knative.dev/blog

Scaling in the Knative v0.3 release includes new options for customizing the autoscaling subsystem. From a batteries-included, scale-to-zero default, to an ability to replace the autoscaling system entirely, and everything in between. PodAutoscaler, the new custom resource in Knative, provides an extension and control point with which to configure your application.

To illustrate these options, let’s walk through the evolution of a Web application from inception to complex autoscaling. We promise, “Knative autoscaling will grow old with you.”

I just wanna keep it simple

knctl deploy -s love -i gcr.io/joe-does-knative/love

This is what your Knative Service looks like

You show it to your BFF and they post it on Hacker News. Voila, you’re on the front page of the hacker’s Internet! HN tries to give you the hug-of-death, but your application and cluster scales up to handle 1000 op/s of traffic seamlessly. After a while, the excitement dies down and your service is getting 1 or 2 requests per hour. Luckily, Knative scales to zero Pods when not in use, so you don’t spend money running an idle process. And you never changed anything after the initial deployment! That was simple.

Original image: https://cheezburger.com/7892518656/gopher-love

Please don’t go away

This is what your Knative Service looks like now:

Things aren’t scaling to zero. But that’s fine because you’re making a little money from the venture.

Things are heating up

But eventually you conclude that you just need to scale on CPU to keep your machines hot. So you choose a different Knative autoscaling class entirely. The class annotation will tell Knative to use a different PodAutoscaler controller implementation, kinda like Kubernetes Ingress.

Here is your Knative Service now:

Running at 60% CPU consistently you’re actually starting to make more money than you’re spending! So you quit your day job to pursue heart-shaped watermarking full time.

Let’s get serious

Things are smooth sailing with Dr. Mark at the helm! As the weeks roll on, New Year’s Eve approaches and you start seeing non-linear growth in your metrics. A quick consultation with Dr. Mark confirms your worst fears. People go crazy sending each other pictures with hearts on New Year’s Eve. And they do it all at the same time like they are coordinating a DDOS attack of love. You’re going to need a plan.

Luckily, Dr. Mark has been doing his reading in the Knative Serving release notes and begins experimenting with editing the PodAutoscaler on pre-existing Knative Revisions. The PodAutoscaler is where Knative keeps its autoscaling state and configuration for a Knative Revision. And unlike the Knative Revision, it’s mutable (on purpose). You make a plan to ramp up capacity slightly ahead of traffic as it builds to each NYE event across the globe (yeah, it happens 24 times!)

Over the course of the evening, New Year’s marches from timezone to timezone. You see a few minutes of errors on the first NYE event because the CPU target of 60% is too high. But after you adjust it down to 40% for the next event, it’s smooth sailing for the rest of the evening. Hooray! 🎉

You’re so special

But you have implemented your CI/CD pipeline to work with Knative. And all your operational experience is in running Knative workloads. It would be a shame to throw all that out just to implement your own autoscaling algorithm. But then you remember something Dr. Mark said way back when he started looking in to Knative v0.3. With the PodAutoscaler custom resource, you can implement your own reconciler and autoscaling system without changing anything else about the Knative Serving system. Well, there you go!

A quick copy of a Kubernetes sample-controller and you’ve implemented a reconciler that operates on your own class of Knative PodAutoscaler. It queries upstream metrics to scale predictively.

This is what you have to change in your Knative Service to wire it up:

Wow. Controllers and autoscalers are hard to write. But it’s a core problem for your business and you’ve got it up and running. And you didn’t have to touch all the other stuff that wasn’t related to this particular autoscaling problem. As you think on how Knative has grown with your business over the last couple years, you just gotta say “I got options, but Knative … you’re the top one!”

What happened?

This is where the PodAutoscaler sits in relation to the other Knative entities:

Knative

Kubernetes-based platform to build, deploy, and manage modern serverless workloads.

Joseph Burnett

Written by

Google Programmer working on App Engine and Knative autoscaling.

Knative

Knative

Kubernetes-based platform to build, deploy, and manage modern serverless workloads.