3 solutions to mitigate the cold-starts on Cloud Run.

Published in

Google Cloud - Community

7 min read · Nov 9, 2020

The serverless paradigm, in its ultimate design, lets you pay only when you use the service. With Cloud Run, you pay only when a request is being processed. The rest of the time, you pay nothing. It’s the same with other services such as Cloud Functions and App Engine (standard), and on other clouds with Azure Functions or AWS Lambda.

To make this possible and sustainable, cloud providers need to save resources when the service is idle, so they stop the idle instances and scale down to 0.
From 0, when a new request comes in, your app needs to be loaded on an instance and started/initialized before it is ready to serve traffic. This initialization (the cold start) makes the request slower than one served by an already initialized instance (warm start).

How to mitigate or remove this cold start on Cloud Run?

I’m going to present 3 possible solutions with their tradeoffs.

Min instance feature

The first solution is a built-in and brand-new feature: min instances. It lets you define a minimum number of instances kept warm even when the service doesn’t serve any requests, and therefore reduces or avoids cold starts.

Because it’s a built-in feature, it’s easy and simple to use. Deploy your service with the --min-instances parameter of the gcloud CLI (in beta for now) or directly through the UI.

This solution prevents cold starts not only from 0 to 1 but also up to the min instances value. Therefore, when more than 1 instance is required to serve the traffic, scaling up is smoother.
If you set 3, that means that 3 instances are kept warm and idle, and as long as the traffic needs no more than 3 instances, there won’t be any cold starts due to new instance creation.

However, this feature has a cost: idle instances are billed even when they don’t serve traffic.
When they serve traffic, the standard rate is applied.

In short, the cost of idle instances:

~10% of the standard rate for the vCPU.
100% of the standard rate for the memory.
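
To get a feel for these ratios, here is a minimal sketch in Go that applies them to one idle instance. The function name and its parameters are hypothetical, and no real price is hard-coded: the per-second rates are assumed to come from the current Cloud Run pricing for your region.

```go
package main

// idleCostPerSecond estimates what one idle min instance costs per second,
// using the ratios quoted above: ~10% of the vCPU rate and 100% of the
// memory rate. vcpuRate and memRate are the standard per-second rates
// (per vCPU and per GiB); they are inputs, not values taken from the article.
func idleCostPerSecond(vcpus, memGiB, vcpuRate, memRate float64) float64 {
	const idleCPURatio = 0.10 // approximate ratio, see the list above
	return vcpus*vcpuRate*idleCPURatio + memGiB*memRate
}
```

Multiply the result by the number of min instances and by the number of idle seconds in a month to approximate the monthly bill of the warm pool.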

Scheduled polling request

The second solution uses Cloud Scheduler to periodically poll the Cloud Run service and to keep an instance warm.

This solution only prevents cold starts from 0 to 1. In addition, it’s configuration only: no extra code is required.

This solution is free (or at least very affordable) compared to solution 1. However, its main issue is the possible “no-instance” period.
Indeed, even if you set the scheduler to poll every minute (the shortest period with Cloud Scheduler), the current instance can be terminated just after a poll, and for up to 60s users can get a cold start.

This solution is easy to configure:

Go to the Cloud Scheduler page
Create a new job
Name your job, select your frequency, and paste your Cloud Run service URL. You can add the path that you want here.
Then select HTTP as target and configure the HTTP verb.

That’s all!
If you have a private Cloud Run service, extra configuration is required:

Click on SHOW MORE to expand the authentication options
Select Add OIDC Token, then fill in your service account (it must have the roles/run.invoker role granted)
Paste the Cloud Run service root URL (without extra path) in the audience field.

SIGTERM infinite loop

The third solution uses the graceful termination feature. It allows the app to receive the SIGTERM notification, catch it, and perform actions in the next 10s.

Even if the graceful termination period is short, it leaves enough time to perform actions, such as having the service call itself! It works like this:

When an instance is shutting down, a SIGTERM notification is sent.
The notification triggers the graceful termination process, which calls the URL of the current service.
If no other instance is currently running, a new one is created and started. This way, there is always 1 warm instance to serve traffic (except during the cold start of the new instance). In summary, an infinite loop on the SIGTERM notification.

This solution only prevents the cold starts from 0 to 1.

Users wait longer if their requests arrive during the cold start period. The shorter your cold start, the less likely it is that a request arrives during it.
There is no long “no-instance” period as in solution 2, so it is less likely that a user request hits a cold start.
There is no cost, unlike solution 1. The free tier includes this process.
There is extra code to implement (~100 lines in Go).

Indeed, an instance doesn’t know its own service URL, so it can’t call itself directly. We have to retrieve this information. For that, we need:

The region where the service is deployed. We can get it from the metadata server, which has an /instance/region endpoint. Note: this endpoint is specific to Cloud Run; it doesn’t exist on Compute Engine.
The project ID or the project number; both work. Luckily, the metadata server also provides the project number, in the same response as the region value.

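// Assumed imports: "fmt", "strings" and "cloud.google.com/go/compute/metadata"
func getProjectAndRegion() (prNb string, region string, err error) {
   resp, err := metadata.Get("/instance/region")
   if err != nil {
      return
   }
   // response pattern is projects/<projectNumber>/regions/<region>
   r := strings.Split(resp, "/")
   if len(r) < 4 {
      // guard against an unexpected answer instead of panicking on r[3]
      err = fmt.Errorf("unexpected region format: %q", resp)
      return
   }
   prNb = r[1]
   region = r[3]
   return
}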

The service name. It’s provided by the Cloud Run default environment variable K_SERVICE

With these values, it’s possible to perform a call to the Cloud Run REST API namespaces.services.get and to get the URL of the service. For this, the Cloud Run service account must have the roles/run.viewer granted.

// Assumed imports: "context", "encoding/json", "fmt", "io" and "golang.org/x/oauth2/google"
func getCloudRunUrl(region string, projectNumber string, service string) (url string, err error) {
   ctx := context.Background()
   // HTTP client authenticated with the default credentials of the service account
   client, err := google.DefaultClient(ctx)
   if err != nil {
      return "", err
   }

   cloudRunApi := fmt.Sprintf("https://%s-run.googleapis.com/apis/serving.knative.dev/v1/namespaces/%s/services/%s", region, projectNumber, service)
   resp, err := client.Get(cloudRunApi)
   if err != nil {
      return "", err
   }
   defer resp.Body.Close()
   body, err := io.ReadAll(resp.Body)
   if err != nil {
      return "", err
   }
   cloudRunResp := &CloudRunAPIUrlOnly{}
   if err = json.Unmarshal(body, cloudRunResp); err != nil {
      return "", err
   }
   url = cloudRunResp.Status.URL
   return
}

// Minimal type to get only the interesting part in the answer
type CloudRunAPIUrlOnly struct {
   Status struct {
      URL string `json:"url"`
   } `json:"status"`
}

With the Cloud Run service URL, it’s now simple to call it. In my case, I perform a GET on the root path /, but you can customize the path and the HTTP method.
Note: even if you call a nonexistent path and get a 404 HTTP error, the service has been called and the instance created!

For a more generic solution, I implemented a service-to-service call. It’s unnecessary for a public Cloud Run service, but it covers the case of a private one.
For a private Cloud Run service, the Cloud Run service account must have roles/run.invoker granted to be authorized to call itself.

I also loop until I get a 2XX HTTP code. This loop can be infinite, but it doesn’t matter: after 10s, the instance is killed!

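// Assumed imports: "fmt", "net/http", "os" and "cloud.google.com/go/compute/metadata"
func selfCall(url string) {
   // Identity token for the service itself, required when the service is private
   tokenURL := fmt.Sprintf("/instance/service-accounts/default/identity?audience=%s", url)
   idToken, err := metadata.Get(tokenURL)
   if err != nil {
      fmt.Println("unable to get an identity token:", err)
      return
   }

   req, err := http.NewRequest("GET", url, nil)
   if err != nil {
      fmt.Println("unable to build the request:", err)
      return
   }
   req.Header.Add("Authorization", fmt.Sprintf("Bearer %s", idToken))
   // Send the request on every iteration until a successful answer;
   // after 10s the instance is killed anyway
   for {
      resp, err := http.DefaultClient.Do(req)
      if err == nil {
         resp.Body.Close()
         if resp.StatusCode < 300 {
            break
         }
      }
      fmt.Println("self call not successful, retry")
   }
   fmt.Println("Self call success. Goodbye")
   os.Exit(0)
}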

You can find the full code on the GitHub repository
Fun fact: I created this zombie code on Halloween evening!

This code can be adapted to any language. I will be happy to help you with it.

Choose what you need

In the end, you have 3 solutions with very different constraints and expectations.

The built-in min instances solution is the most expensive, but it is also the enterprise-grade solution: it can prevent cold starts for more than 1 instance, with a one-click configuration.
Scheduled polling is very affordable and requires only configuration in Cloud Scheduler. However, this solution covers only the 0-to-1 instance case, and it is also the least effective of the three at preventing cold starts.
The SIGTERM management solution offers the best balance between cost (it is the cheapest solution) and cold start prevention. However, like the previous one, it covers only the 0-to-1 instance case, and it requires a code update (plus maintenance, tests, …).

Depending on your needs, budget, and skills, choose the one that fits best!

Written by guillaume blaquiere