Knative v0.3 Autoscaling — A Love Story

Joseph Burnett
Jan 31, 2019

2019-06-19 Update: This post has been moved to knative.dev/blog

The Knative v0.3 release adds new options for customizing the autoscaling subsystem, from a batteries-included, scale-to-zero default to the ability to replace the autoscaling system entirely, and everything in between. PodAutoscaler, the new custom resource in Knative, provides an extension and control point for configuring how your application scales.

To illustrate these options, let’s walk through the evolution of a Web application from inception to complex autoscaling. We promise, “Knative autoscaling will grow old with you.”

I just wanna keep it simple

One morning, you sit bolt upright in bed, struck by the realization that what people really want… is love. You write a quick Web application to apply a heart-shaped watermark to any given image in just the right place. Since you’re a savvy, modern application developer, you drop it in a container and spin it up as a Knative Service on GKE with:

knctl deploy -s love -i gcr.io/joe-does-knative/love

This is what your Knative Service looks like:

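Under the hood, that knctl command creates a Knative Service object. A minimal sketch of what it would contain, written against the v1alpha1 API that v0.3 shipped (field names are illustrative rather than the author’s exact manifest):

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: love
spec:
  runLatest:
    configuration:
      revisionTemplate:
        spec:
          container:
            image: gcr.io/joe-does-knative/love
          # No autoscaling annotations: the default autoscaler scales on
          # request concurrency and goes all the way down to zero Pods.

With no annotations at all, you get the batteries-included, scale-to-zero behavior for free.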

You show it to your BFF and they post it on Hacker News. Voilà, you’re on the front page of the hacker’s Internet! HN tries to give you the hug-of-death, but your application and cluster scale up to handle 1000 op/s of traffic seamlessly. After a while, the excitement dies down and your service is getting 1 or 2 requests per hour. Luckily, Knative scales to zero Pods when not in use, so you don’t spend money running an idle process. And you never changed anything after the initial deployment! That was simple.

Original image: https://cheezburger.com/7892518656/gopher-love

Please don’t go away

Several weeks later, lightning strikes again and you realize… “I could make money doing this”. Clearly people enjoyed the heart-shaped watermarks. Maybe you could let people use your service to watermark images on their entire website! Adding a quick in-memory cache in front of your Ruby script (yeah, it’s Ruby), you redeploy and then start advertising your product as a general image-processing service. Things are going well, but you quickly realize that your traffic is unpredictable: it bursts a lot. And when the service scales to zero Pods and traffic later resumes, you spend the first few minutes rebuilding the cache, which makes request latency a little too high. So you decide to add an annotation to your Knative Service’s Revision template to maintain at least 2 Pods at all times.

This is what your Knative Service looks like now:

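The annotation lives on the Revision template, so every new Revision you deploy inherits it. Roughly, as a sketch using the minScale annotation from the Knative autoscaling docs of that release:

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: love
spec:
  runLatest:
    configuration:
      revisionTemplate:
        metadata:
          annotations:
            # Keep at least 2 Pods around so the in-memory cache stays warm.
            autoscaling.knative.dev/minScale: "2"
        spec:
          container:
            image: gcr.io/joe-does-knative/love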

Things aren’t scaling to zero. But that’s fine because you’re making a little money from the venture.

Things are heating up

Whoa! Traffic is starting to ramp up. You’re averaging about 500 op/s and running between 10 and 50 Pods, depending on the time of day. You’ve noticed that this job is mostly CPU-bound and you’re not utilizing your resources as efficiently as you could. So you make some adjustments to the default autoscaling target:

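Concretely, that adjustment is the target annotation on the same Revision template; the value below is illustrative, not taken from the original post:

# Only the revisionTemplate is shown; the rest of the Service is unchanged.
revisionTemplate:
  metadata:
    annotations:
      autoscaling.knative.dev/minScale: "2"
      # Tune the per-Pod concurrency target so the observed load keeps
      # each Pod's CPU busy.
      autoscaling.knative.dev/target: "20"
  spec:
    container:
      image: gcr.io/joe-does-knative/love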

But eventually you conclude that you just need to scale on CPU to keep your machines hot. So you choose a different Knative autoscaling class entirely. The class annotation tells Knative to use a different PodAutoscaler controller implementation, kinda like how an Ingress class selects a different controller in Kubernetes.

Here is your Knative Service now:

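Expressed as annotations, the swap looks something like this; the class and metric values follow the HPA-backed autoscaler class that Knative ships, but treat the exact strings as a sketch:

# Only the revisionTemplate is shown; the rest of the Service is unchanged.
revisionTemplate:
  metadata:
    annotations:
      # Hand this Revision to the HPA-based PodAutoscaler implementation
      # and scale on CPU utilization instead of request concurrency.
      autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
      autoscaling.knative.dev/metric: cpu
      autoscaling.knative.dev/target: "60"
  spec:
    container:
      image: gcr.io/joe-does-knative/love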

Running at 60% CPU consistently, you’re actually starting to make more money than you’re spending! So you quit your day job to pursue heart-shaped watermarking full time.

Let’s get serious

Things are getting complicated. Seeking professional help, you hire Dr. Mark to help you run your service operations. One of the first things he implements is rollout mode for your service. No more leaping before you look!

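One way to express a rollout like this in the v1alpha1 API of that era was to switch the Service from runLatest to release mode, pinning traffic to a known-good Revision and sending only a slice to the candidate. A sketch, with invented Revision names and percentage:

apiVersion: serving.knative.dev/v1alpha1
kind: Service
metadata:
  name: love
spec:
  release:
    # Current Revision first, candidate second; promote by editing the list.
    revisions: ["love-00042", "love-00043"]
    # Send 10% of traffic to the candidate while watching the metrics.
    rolloutPercent: 10
    configuration:
      revisionTemplate:
        metadata:
          annotations:
            autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
            autoscaling.knative.dev/metric: cpu
            autoscaling.knative.dev/target: "60"
        spec:
          container:
            image: gcr.io/joe-does-knative/love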

Things are smooth sailing with Dr. Mark at the helm! As the weeks roll on, New Year’s Eve approaches and you start seeing non-linear growth in your metrics. A quick consultation with Dr. Mark confirms your worst fears. People go crazy sending each other pictures with hearts on New Year’s Eve. And they do it all at the same time, like they are coordinating a DDoS attack of love. You’re going to need a plan.

Luckily, Dr. Mark has been doing his reading in the Knative Serving release notes and begins experimenting with editing the PodAutoscaler on pre-existing Knative Revisions. The PodAutoscaler is where Knative keeps its autoscaling state and configuration for a Knative Revision. And unlike the Knative Revision, it’s mutable (on purpose). You make a plan to ramp up capacity slightly ahead of traffic as it builds toward each NYE event across the globe (yeah, it happens 24 times!).

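Each Revision owns exactly one PodAutoscaler, and its annotations can be edited live without creating a new Revision. A sketch of nudging the current Revision’s PodAutoscaler shortly before an event; the object name is invented, and the spec is omitted since the Revision controller manages it:

# kubectl edit podautoscaler love-00043
apiVersion: autoscaling.internal.knative.dev/v1alpha1
kind: PodAutoscaler
metadata:
  name: love-00043
  annotations:
    autoscaling.knative.dev/class: hpa.autoscaling.knative.dev
    autoscaling.knative.dev/metric: cpu
    autoscaling.knative.dev/target: "60"
    # Raised shortly before each regional midnight, dropped back afterwards.
    autoscaling.knative.dev/minScale: "50"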

Over the course of the evening, New Year’s marches from timezone to timezone. You see a few minutes of errors on the first NYE event because the CPU target of 60% is too high. But after you adjust it down to 40% for the next event, it’s smooth sailing for the rest of the evening. Hooray! 🎉

You’re so special

It’s been a full year and things have been crazy! You’ve done some deep integrations with several major image hosting websites and they are driving like 80% of your traffic and revenue now. With a little time on your hands, you start analyzing your autoscaling statistics. You realize that traffic observed by your upstream referrers almost perfectly predicts your traffic patterns. And they can give you those metrics through your API integration!

But you have implemented your CI/CD pipeline to work with Knative. And all your operational experience is in running Knative workloads. It would be a shame to throw all that out just to implement your own autoscaling algorithm. But then you remember something Dr. Mark said way back when he started looking into Knative v0.3. With the PodAutoscaler custom resource, you can implement your own reconciler and autoscaling system without changing anything else about the Knative Serving system. Well, there you go!

A quick copy of the Kubernetes sample-controller later, and you’ve implemented a reconciler that operates on your own class of Knative PodAutoscaler. It queries upstream metrics to scale predictively.

This is what you have to change in your Knative Service to wire it up:

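The wiring is a single annotation: point the class at your own controller instead of one of the built-in implementations. The class value below is invented for illustration; it only has to match whatever your reconciler watches for:

# Only the revisionTemplate is shown; the rest of the Service is unchanged.
revisionTemplate:
  metadata:
    annotations:
      # Your predictive autoscaler claims PodAutoscalers with this class
      # and scales them using the upstream referrers' traffic metrics.
      autoscaling.knative.dev/class: predictive.autoscaling.love.example.com
  spec:
    container:
      image: gcr.io/joe-does-knative/love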

Wow. Controllers and autoscalers are hard to write. But it’s a core problem for your business and you’ve got it up and running. And you didn’t have to touch all the other stuff that wasn’t related to this particular autoscaling problem. As you think on how Knative has grown with your business over the last couple years, you just gotta say “I got options, but Knative … you’re the top one!”

What happened?

To learn more about how the PodAutoscaler works and the options that Knative autoscaling has, please watch the KubeCon talk Knative: Scaling From 0 to Infinity and check out the code on GitHub. Or play around with the Knative Serving autoscaling sample.

This is where the PodAutoscaler sits in relation to the other Knative entities:

[Diagram: where the PodAutoscaler sits in relation to the other Knative Serving objects]
