Traffic Routing in Cloud Run

Sushil Kumar
Google Cloud - Community
7 min read · Jul 30, 2022
Photo by Denys Nevozhai on Unsplash

Cloud Run is Google’s fully managed compute offering for running container workloads on Google’s elastic compute infrastructure. It lets you run your containers without the overhead of provisioning and managing the underlying infrastructure.

I wrote about Cloud Run when it was first launched at Cloud Next 2019. Since then, a lot of features have been added to make it more robust and production-ready.

In this post we’ll talk about the Traffic Routing features that Cloud Run provides to route traffic between different revisions of a Cloud Run service. Before we dive into traffic routing, however, let’s take a look at some basic Cloud Run terminology.

  1. Service: The construct that represents your deployed container. There is a one-to-one mapping between the service you are deploying and the Cloud Run service.
  2. Revision: An immutable version of your container running under your service. Each deployment of your service creates an immutable revision that is capable of serving your traffic. Each Service can have multiple revisions.
  3. Instance: An instance is one or more containers that run under a revision. This is the unit that serves your incoming requests. If autoscaling is enabled, each Revision can have multiple container instances.

Both Service and Revision are logical constructs, whereas Instance is the unit that actually serves the traffic.

Below is a diagrammatic representation of these concepts.

Service, Revision and Instances in Cloud Run

Now let us look at the example app that we will use to explore the traffic routing concepts.

  1. We will use a dummy Node.js application that responds with “Hello world” followed by a version label.
  2. We’ll deploy the same image as three separate revisions, each with a different VERSION value.
  3. Then we will look at the different constructs available in Traffic Routing.

Demo Node.js Application

This sample application uses the VERSION environment variable to differentiate between the running containers. We will deploy three revisions with different VERSION values and then look at the traffic routing approaches available.
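The original listing for the demo application is not reproduced in this text. Below is a minimal sketch of what such an app might look like, assuming Node's built-in http module. The function names, the "unknown" fallback, and the guard on PORT are illustrative choices rather than the author's code. Cloud Run passes the port to listen on in the PORT environment variable.

```javascript
// Minimal sketch of the demo app (not the author's original listing).
const http = require("http");

// Build the response body from the VERSION environment variable.
// The "unknown" fallback is an assumption for local runs without VERSION.
function greeting(env = process.env) {
  const version = env.VERSION || "unknown";
  return `Hello world from ${version}\n`;
}

// Answer every request with the greeting, whatever the path.
function handler(req, res) {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end(greeting());
}

// Cloud Run injects PORT; start listening only when it is set, so the file
// can also be loaded without opening a socket (an illustrative choice).
if (process.env.PORT) {
  http.createServer(handler).listen(Number(process.env.PORT));
}
```

Saved as, say, index.js, it can be tried locally with `PORT=8080 VERSION=VERSION-1 node index.js` followed by `curl localhost:8080`.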

You need to publish this image to Artifact Registry or Google Container Registry (GCR) to use it with Cloud Run. In this post I’ll push it to GCR.

To build and push the image, use the commands below.

docker build -t gcr.io/<PROJECT-ID>/node-demo:1.0 .
docker push gcr.io/<PROJECT-ID>/node-demo:1.0

Let us deploy the revisions now. First we’ll deploy a single revision with VERSION=VERSION-1. Then we will test that our service is working and serving requests.

We will then go on to deploy two more revisions with VERSION set to VERSION-2 and VERSION-3.

Setup

Deploy the first revision

gcloud run deploy node-demo --image gcr.io/neural-land-324105/node-demo:1.0 --set-env-vars VERSION="VERSION-1" --region europe-west1 --allow-unauthenticated

Once it is deployed, you’ll see the following output with the URL of the deployed service.

Output after deploying first revision

You can also check the status of your service in the Cloud Run dashboard.

node-demo service in Cloud Run dashboard

Now let us check if our service is able to serve requests.

curl <YOUR-SERVICE-URL>

It should return the following response. You can also try this in a browser instead of curl.

Hello world from VERSION-1

Deploy the second revision

Do the same exercise for VERSION-2.

gcloud run deploy node-demo --image gcr.io/neural-land-324105/node-demo:1.0 --set-env-vars VERSION="VERSION-2" --region europe-west1 --allow-unauthenticated

Once you deploy revision 2, you’ll notice the following in your output.

Service [node-demo] revision [node-demo-00002-law] has been deployed and is serving 100 percent of traffic.

This means whenever you deploy a new revision, 100% of the traffic is shifted to the latest revision. This is the default behavior of Cloud Run. You can also go to your service description and check all the deployed revisions.

Revision history after deploying VERSION-2

This also confirms that currently 100% traffic is being served by revision 2. If you send a request to the service now, you’ll get VERSION-2 in the output.

curl <YOUR-SERVICE-URL>
Hello world from VERSION-2

Deploy the third revision

Now, following similar steps, deploy VERSION-3; all of the traffic will shift to it.

gcloud run deploy node-demo --image gcr.io/neural-land-324105/node-demo:1.0 --set-env-vars VERSION="VERSION-3" --region europe-west1 --allow-unauthenticated

Send a request again to confirm that VERSION-3 is serving the traffic.

curl https://node-demo-d4z7wferna-ew.a.run.app
Hello world from VERSION-3

Revision history after deploying VERSION-3

Traffic Routing Approaches

Now that all of our versions are deployed, let us see the different ways in which traffic can be routed to these revisions.

1. Rollback to a previous version

The first scenario is one where we wish to roll back to a previous version because the newer version has a bug. For example, let us roll back to VERSION-2.

gcloud run services update-traffic node-demo --region europe-west1 --to-revisions node-demo-00002-law=100

Make sure to change the revision name according to your deployment. Once the traffic is updated, you can see on the revisions page that 100% of the traffic has been shifted to VERSION-2.

Revision history after rollback to VERSION-2

You can also confirm this by sending a request and checking the response.

curl https://node-demo-d4z7wferna-ew.a.run.app
Hello world from VERSION-2

2. Splitting Traffic amongst multiple versions

In this scenario we route some part of the traffic to each of the versions. For example, let’s serve 50% of the traffic from VERSION-3 and 25% each from VERSION-1 and VERSION-2.

gcloud run services update-traffic node-demo --region europe-west1 --to-revisions node-demo-00001-tov=25,node-demo-00002-law=25,node-demo-00003-koj=50

You can check that the traffic has been updated to reflect the configuration.

Revision History after splitting traffic
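The value passed to --to-revisions is a comma-separated list of revision=percent pairs. As a rough sketch, a small helper (hypothetical, not part of gcloud) can assemble that value and catch arithmetic slips before the command is run. It assumes whole-number percentages that add up to at most 100, which matches the commands used in this post.

```javascript
// Hypothetical helper: turn { revisionName: percent } into a --to-revisions value.
function toRevisionsArg(split) {
  const entries = Object.entries(split);
  if (entries.length === 0) throw new Error("no revisions given");
  let total = 0;
  for (const [name, percent] of entries) {
    // Assumption: percentages are whole numbers between 0 and 100.
    if (!Number.isInteger(percent) || percent < 0 || percent > 100) {
      throw new Error(`invalid percentage for ${name}: ${percent}`);
    }
    total += percent;
  }
  if (total > 100) throw new Error(`percentages add up to ${total}, above 100`);
  return entries.map(([name, percent]) => `${name}=${percent}`).join(",");
}

// Assemble the full command as a string; nothing is executed here.
function updateTrafficCommand(service, region, split) {
  return `gcloud run services update-traffic ${service} --region ${region} --to-revisions ${toRevisionsArg(split)}`;
}

console.log(updateTrafficCommand("node-demo", "europe-west1", {
  "node-demo-00001-tov": 25,
  "node-demo-00002-law": 25,
  "node-demo-00003-koj": 50,
}));
```

The revision names are the ones from this walkthrough; substitute the names from your own deployment.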

One thing to note, however, is that the observed split can deviate noticeably from the configured one over a small number of requests; over a large number of requests it approximately follows the configured distribution.

Let us check the results for 10 requests.

for run in {1..10}; do curl https://node-demo-d4z7wferna-ew.a.run.app; done
Hello world from VERSION-1
Hello world from VERSION-3
Hello world from VERSION-3
Hello world from VERSION-3
Hello world from VERSION-3
Hello world from VERSION-2
Hello world from VERSION-2
Hello world from VERSION-3
Hello world from VERSION-3
Hello world from VERSION-1

Now let’s run it for 100 requests and see the distribution.

# -s silences curl's progress meter; the redirect overwrites output.txt on every run
for run in {1..100}; do curl -s https://node-demo-d4z7wferna-ew.a.run.app; done > output.txt
# grep -c prints the number of matching lines
grep -c VERSION-1 output.txt
grep -c VERSION-2 output.txt
grep -c VERSION-3 output.txt
25
23
52

Now you can see we are almost converging to the distribution of 25%, 25% and 50%.
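Why small samples look uneven can be illustrated with a short simulation. The sketch below assumes each request is assigned to a revision independently at random according to the configured weights. This is a simplified model for intuition, not a description of how Cloud Run actually routes requests. The seeded generator (mulberry32) is only there to make runs repeatable, and all function names are hypothetical.

```javascript
// Small seeded PRNG (mulberry32) so that runs are repeatable.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pick one revision according to its percentage weight.
function pickRevision(weights, random) {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let point = random() * total;
  for (const [name, weight] of Object.entries(weights)) {
    if (point < weight) return name;
    point -= weight;
  }
  // Guard against floating-point edge cases: fall back to the last revision.
  return Object.keys(weights).pop();
}

// Simulate n requests and return the observed share (in percent) per revision.
function simulate(weights, n, seed = 1) {
  const random = mulberry32(seed);
  const counts = Object.fromEntries(Object.keys(weights).map((k) => [k, 0]));
  for (let i = 0; i < n; i++) counts[pickRevision(weights, random)] += 1;
  return Object.fromEntries(
    Object.entries(counts).map(([k, c]) => [k, (100 * c) / n])
  );
}

const weights = { "VERSION-1": 25, "VERSION-2": 25, "VERSION-3": 50 };
for (const n of [10, 100, 10000]) {
  console.log(n, simulate(weights, n));
}
```

Under this model the spread of an observed share shrinks in proportion to 1/sqrt(n), which is why 100 requests tend to land much closer to 25/25/50 than 10 requests do.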

3. Gradual rollout from old revision to new revision

This scenario is a special case of the previous (Splitting Traffic) scenario.

Before starting this example, let us first move all the traffic to the latest revision.

gcloud run services update-traffic node-demo --region europe-west1 --to-latest

Notice the --to-latest flag. This instructs Cloud Run to move 100% of the traffic to the latest revision, which in our case is VERSION-3.

Now let us deploy VERSION-4 without moving any traffic to it on deployment, and then gradually move traffic from the old revision to the new one.

gcloud run deploy node-demo --image gcr.io/neural-land-324105/node-demo:1.0 --set-env-vars VERSION="VERSION-4" --region europe-west1 --allow-unauthenticated --no-traffic

Notice the --no-traffic flag. It instructs Cloud Run to deploy the revision without changing the routing configuration.

You’ll also see the following output.

Service [node-demo] revision [node-demo-00006-roh] has been deployed and is serving 0 percent of traffic

Revision history after deploying VERSION-4

Now move the traffic gradually to VERSION-4. To start, we’ll move 5% of the traffic.

gcloud run services update-traffic node-demo --region europe-west1 --to-revisions node-demo-00006-roh=5

This will move 5% of the traffic to VERSION-4, and the remaining 95% will stay with VERSION-3.

You’ll see the following output.

Traffic:
95% node-demo-00003-koj
5% node-demo-00006-ro

Revision history after gradual shift to the new version

You can move the rest of the traffic in gradual increments using the same command but updating the percentage each time. Once you are satisfied with the new version’s performance you can migrate 100% traffic to the latest version.
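The incremental steps can be written down ahead of time. The sketch below only prints the update-traffic commands for a staged rollout and does not run them. The step percentages (5, 25, 50, 100) and the function name are hypothetical; in practice each step would be run only after checking how the new revision behaves at the previous one.

```javascript
// Hypothetical rollout plan: one update-traffic command per step.
function rolloutCommands(service, region, revision, steps) {
  let previous = 0;
  return steps.map((percent) => {
    // Each step must be a whole number, larger than the last, and at most 100.
    if (!Number.isInteger(percent) || percent <= previous || percent > 100) {
      throw new Error(`steps must be increasing whole numbers up to 100, got ${percent}`);
    }
    previous = percent;
    return `gcloud run services update-traffic ${service} --region ${region} --to-revisions ${revision}=${percent}`;
  });
}

// Revision name taken from the deployment above; the steps are an assumed schedule.
const plan = rolloutCommands("node-demo", "europe-west1", "node-demo-00006-roh", [5, 25, 50, 100]);
plan.forEach((command) => console.log(command));
```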

Those are the different traffic routing strategies that Cloud Run provides. You can use them to implement A/B testing or canary deployment patterns.

If you find any bug in my code or have any questions in general, feel free to drop a comment below.

Till then Happy Coding :)


A polyglot developer with a knack for Distributed systems, Cloud and automation.