EXPEDIA GROUP TECHNOLOGY — OPEN SOURCE

Introducing Container-Startup-Autoscaler for Kubernetes

Modify CPU and/or memory resources of containers depending on whether they’re starting up

Will Tomlin
Expedia Group Technology

--

Photo by Desti Nursinta on Unsplash

The release of Kubernetes 1.27.0 introduced a new, long-awaited alpha feature: In-place Update of Pod Resources. This feature allows pod container resources (requests and limits) to be updated in-place, without the need to restart the pod. Prior to this, any changes made to container resources required a pod restart to apply.
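To make the feature concrete, here is a minimal sketch of a pod that opts in to in-place resizing (pod name and image are illustrative). The `resizePolicy` field, added alongside this feature, declares per-resource whether a change requires a container restart:

```yaml
# Minimal pod illustrating in-place resize (Kubernetes >= 1.27 with the
# InPlacePodVerticalScaling feature gate enabled).
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo          # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25      # illustrative image
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired   # CPU can change without a restart
        - resourceName: memory
          restartPolicy: NotRequired
      resources:
        requests:
          cpu: 500m
          memory: 256Mi
        limits:
          cpu: 500m
          memory: 256Mi
```

While the feature is in alpha, the container's `resources` can then be patched directly on the running pod (for example via `kubectl patch`) and the kubelet applies the new values without recreating the container, where the resize policy allows.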

A long-standing challenge of running workloads within Kubernetes is how to tune container resources for workloads whose resource utilization differs significantly between two core phases: startup and post-startup. Because container resources previously could not be changed in-place, startup-heavy workloads generally faced a trade-off between obtaining good (and consistent) startup times and avoiding resource wastage post-startup:

Employ Burstable Quality of Service (QoS)

Set limits greater than requests in the hope that resources beyond requests are actually scavengeable during startup.

  • Startup time is unpredictable since it’s dependent on cluster node-loading conditions.
  • Post-startup performance may also be unpredictable as additional scavengeable resources are volatile in nature (particularly with cluster consolidation mechanisms).

Employ Guaranteed QoS (1)

Set limits the same as requests, with startup time as the primary factor in determining the value.

  • Startup time and post-startup performance are predictable, but wastage may occur, particularly if the replica count is generally higher than it needs to be.

Employ Guaranteed QoS (2)

Set limits the same as requests, with normal workload-servicing performance as the primary factor in determining the value.

  • Post-startup performance is predictable and acceptable, but startup time is slower — this negatively affects desirable operational characteristics by elongating deployment durations and horizontal-scaling reaction times.
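The three trade-offs above can be sketched as container resource settings (the values are examples only, not recommendations):

```yaml
# Burstable QoS: limits > requests; startup may burst, if node capacity allows.
resources:
  requests: { cpu: 500m, memory: 512Mi }
  limits:   { cpu: "2",  memory: 1Gi }
---
# Guaranteed QoS (1): sized for startup; wasteful post-startup.
resources:
  requests: { cpu: "2", memory: 1Gi }
  limits:   { cpu: "2", memory: 1Gi }
---
# Guaranteed QoS (2): sized for steady state; slower startup.
resources:
  requests: { cpu: 500m, memory: 512Mi }
  limits:   { cpu: 500m, memory: 512Mi }
```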

The Performance and Reliability Engineering team within Expedia Group™ Technology has recently open-sourced container-startup-autoscaler (CSA), a Kubernetes controller built on top of Kubernetes’ In-place Update of Pod Resources feature that modifies the CPU and/or memory resources of containers depending on whether they’re starting up, according to the startup/post-startup settings you supply. CSA works at the pod level and is agnostic to how the pod is managed; it works with Deployments, StatefulSets, DaemonSets and other workload management APIs. CSA supports containers that are starting for the first time and those that are restarted by Kubernetes.

An overview of CSA showing when target containers are scaled

The core motivation of CSA is to provide Kubernetes workload owners with the ability to configure container resources for startup (in a guaranteed QoS fashion) separately from normal post-startup workload resources. In doing so, the trade-offs listed above are eliminated and the foundations are laid for:

  • Reducing resource wastage by facilitating separate settings for two fundamental workload phases.
  • Faster and more predictable workload startup times, promoting desirable operational characteristics such as the ability to horizontally scale faster.
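As a rough sketch, a workload might opt in to CSA through pod annotations that declare the startup and post-startup resource settings, with a startup probe signalling when the startup phase ends. The annotation keys and values below are illustrative only; please consult the project’s README.md for the authoritative names and semantics:

```yaml
# Illustrative only — annotation keys, image and probe are assumptions;
# see the CSA README.md for the actual configuration contract.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: startup-heavy-app
spec:
  replicas: 3
  selector:
    matchLabels: { app: startup-heavy-app }
  template:
    metadata:
      labels: { app: startup-heavy-app }
      annotations:
        csa.expediagroup.com/enabled: "true"
        csa.expediagroup.com/target-container-name: app
        csa.expediagroup.com/cpu-startup: "2"                  # guaranteed during startup
        csa.expediagroup.com/cpu-post-startup-requests: 500m
        csa.expediagroup.com/cpu-post-startup-limits: 500m
    spec:
      containers:
        - name: app
          image: example/startup-heavy-app:1.0   # hypothetical image
          startupProbe:                          # signals when startup completes
            httpGet: { path: /healthz, port: 8080 }
            failureThreshold: 30
            periodSeconds: 5
```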

Kubernetes’ In-place Update of Pod Resources feature is currently in alpha state as of Kubernetes 1.29, and therefore CSA operation requires the InPlacePodVerticalScaling feature gate to be enabled. The feature implementation, along with the corresponding implementation of CSA, is likely to change until it reaches stable status. Owing to this, CSA should currently only be used for preview purposes on local or otherwise non-production Kubernetes clusters.
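For local experimentation, the feature gate can be enabled on a Kind cluster via its cluster configuration, for example:

```yaml
# kind-config.yaml — enables the alpha feature gate cluster-wide.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  InPlacePodVerticalScaling: true
nodes:
  - role: control-plane
```

Passing this file via `kind create cluster --config kind-config.yaml` produces a cluster on which in-place resizing (and therefore CSA) can be previewed.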

A number of scripts are provided that allow you to try out CSA using a local Kind-based setup.

For more information on CSA, including a demo video, please see the project’s README.md.

Many thanks to:

  • Nikos Katirtzis for his diligent support whilst open sourcing CSA.
  • Kaushik Patel and platform reliability engineers for feedback and assistance in testing CSA for multiple architectures.

You may also be interested in Mittens, a tool that can be used to warm up HTTP applications over REST or gRPC.

Find out more about Life at Expedia Group.
