Sticky Revisions for Knative Services

Olivier Tardieu
Nov 8, 2022 · 5 min read


Knative can split traffic among revisions of a service, for example routing only 5% of requests to the newest revision. For each incoming request, Knative rolls the dice. For a user of the service, this can be confusing, as a series of requests may be spread across multiple revisions. In this blog, we examine how to make Knative revisions sticky. We want to make sure that once a revision has been randomly selected for the first request in a series, subsequent requests can be deterministically routed to the same revision.

The Case for Sticky Revisions

Knative Serving facilitates managing the lifecycle of a microservice on a Kubernetes cluster. A reconfiguration of a Knative service, for instance to deploy a new container image, creates a new revision of the service without replacing existing revisions. Because Knative can dynamically scale revisions to match load, keeping older revisions around by default does not typically waste resources.

Knative can split traffic among revisions of a service. Each request gets routed to a given revision with the probability specified in the service configuration. More precisely, Knative randomly chooses a revision for each request using the specified probability distribution. These random choices are independent from one another. Thanks to the law of large numbers, the actual distribution of requests will eventually converge to the desired one.
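For instance, assuming a service named demo with two revisions named demo-rev1 and demo-rev2 (the same names used in the example later in this post), the following kn command would route 95% of requests to the first revision and 5% to the second:

kn service update demo \
--traffic demo-rev1=95 --traffic demo-rev2=5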

While canary testing is great for reducing the stress level of service maintainers, the experience is not always the best for users. If 5% of requests overall are routed to the newest revision of a service, then 5% of the requests of a given user are also routed to this revision. Suppose we are deploying a new UI with some tweaks. With Knative today, this user will mostly experience the old UI but will also see the new one from time to time. Not only will the user’s experience be degraded, but this degraded experience will also negatively impact our ability to assess the reaction to the new UI.

To fix this issue, we need to coarsen the granularity of the routing decisions. In the UI example above, we would like a user to consistently experience either the old or the new UI, but not both at the same time. In other words, while we still want users to be split across revisions with some probability, once a revision has been chosen for a user, we should stick to this choice as much as possible.

The Approach

To make revisions sticky, we need to remember routing decisions somehow. Unfortunately, Knative today does not remember these decisions, so we must remember them on behalf of Knative. Concretely, we have to:

  1. observe Knative’s selected revision for the first request in a series,
  2. remember this revision until the series completes,
  3. ask Knative to route subsequent requests to the same revision.

For 1, we augment the request handler(s) in the service to append a header to the response with the name of the chosen revision. For 2, we augment the client code to extract and remember the header value. For 3, we alter the client code to ask for this specific revision in subsequent requests.

To make Knative accept requests for a particular revision of a service, we enable a Knative extension named tag header based routing.

kubectl patch cm config-features -n knative-serving \
-p '{"data":{"tag-header-based-routing":"Enabled"}}'

Moreover, we tag each revision with its own name. This is awkward but necessary because we cannot easily obtain the tag(s) of a running revision and we can only request a revision by its tag.
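Once the feature is enabled and a revision is tagged, any client can request that specific revision by setting the Knative-Serving-Tag header. For example, with the demo service and tags created in the example below, a request such as the following would be routed to the revision tagged demo-rev1:

curl -H "Knative-Serving-Tag: demo-rev1" http://demo.default.127.0.0.1.sslip.io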

The Example

In this example, we implement in Go a minimal service that supports sticky revisions, as well as a simple client that makes two requests to this service and uses sticky revisions to hit the same revision both times.

The server code sets a header with the revision name on the response.

Server code

The revision name is obtained by querying the environment variable K_REVISION that is set by Knative on every running pod.

For simplicity, we use the header name Knative-Serving-Tag, which is the name already used by Knative’s tag-based routing implementation.
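As a rough sketch of the server (the gist embedded above is the authoritative version and may differ in details), a minimal Go implementation could look like this:

package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // K_REVISION is set by Knative on every running pod
        revision := os.Getenv("K_REVISION")
        // reuse the header name understood by tag-header-based routing
        w.Header().Set("Knative-Serving-Tag", revision)
        fmt.Fprintf(w, "processed by revision %s\n", revision)
    })
    port := os.Getenv("PORT") // Knative tells the container which port to listen on
    if port == "" {
        port = "8080"
    }
    log.Fatal(http.ListenAndServe(":"+port, nil))
}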

The client code extracts the revision name from the response header for the first request and sets the same header on the second request.

Client code

The URL for the service to invoke is provided as a command-line argument.
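As a rough sketch of the client (again, the actual gist may differ in its details), a minimal Go version could look like this:

package main

import (
    "fmt"
    "log"
    "net/http"
    "os"
)

func main() {
    url := os.Args[1] // service URL passed as a command-line argument

    // first request: let Knative pick a revision at random
    resp, err := http.Get(url)
    if err != nil {
        log.Fatal(err)
    }
    resp.Body.Close()

    // extract and remember the revision name set by the server
    revision := resp.Header.Get("Knative-Serving-Tag")
    fmt.Println("request 1 was processed by revision", revision)

    // second request: ask Knative to route to the same revision
    req, err := http.NewRequest("GET", url, nil)
    if err != nil {
        log.Fatal(err)
    }
    req.Header.Set("Knative-Serving-Tag", revision)
    resp, err = http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err)
    }
    resp.Body.Close()
    fmt.Println("request 2 was processed by revision", resp.Header.Get("Knative-Serving-Tag"))
}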

To run this example, deploy two revisions of the service using the prebuilt container image for the server code:

kn service create demo --image quay.io/tardieu/sticky-rev \
--revision-name demo-rev1 --tag demo-rev1=demo-rev1
kn service update demo \
--revision-name demo-rev2 --tag demo-rev2=demo-rev2 \
--traffic demo-rev1=50 --traffic demo-rev2=50

We configure a 50/50 traffic split and make sure revision tags and names are consistent.

Download and run the client code, replacing the service URL http://demo.default.127.0.0.1.sslip.io with the correct one for your Kubernetes cluster.

curl -o client.go  https://gist.githubusercontent.com/tardieu/f63af75710297a2983ebf5889249c061/raw
go run client.go http://demo.default.127.0.0.1.sslip.io

You should see that the two requests have been handled by the same randomly-selected revision.

request 1 was processed by revision demo-rev2
request 2 was processed by revision demo-rev2

Wrap Up

Multiple revisions and traffic splitting are great features of Knative Serving. Making independent random decisions for every incoming request, however, is not always the best way to split traffic. In this blog, we have shown how to coarsen the granularity of the routing decisions to permit, for example, mapping users rather than individual requests to revisions.

On the one hand, the tactic proposed here works with Knative today. On the other, it requires the server code, the client code, and the service configuration to be carefully crafted or altered to support sticky revisions. It also relies on revision tags, possibly precluding other uses for tags.

We believe sticky routing decisions are important. Sticky revisions are one use case. Session affinity, which allows every request in a series to be delivered to the same pod, is another. Together with the community, we are reviewing use cases and working on a strategy to improve Knative support for sticky routing. We encourage you to jump into the discussion.

What do you think?
