Go Serverless with Google Cloud Run Functions

George Mao
Google Cloud - Community
6 min read · Nov 12, 2024


Serverless development is my favorite way to build modern applications. With Google Cloud Run Functions, you bring the code and Google handles all of the heavy lifting of load balancing, scaling, availability, and infrastructure/OS management. This means developers can focus on building awesome apps. This is the first part of a three-part guide on Cloud Run Functions.

Cloud Run Functions is the only serverless FaaS product on the market that scales up from and down to zero, can attach L4 GPUs on demand, and can deploy to multiple regions with one command.

The Benefits

  • You only pay for the resources you consume. You don’t need to worry about idle VM instances or optimizing your cluster sizes. You’re billed on a combination of invocation count (the first 2M per month are free) plus the resources allocated during the duration of execution, which we refer to as vCPU-seconds and GiB-seconds.
  • There is no maintenance! Developers don’t need to debate instance types or configure VMs, and there is no operating system to patch.
  • You can operate nearly any workload, including deploying LLMs or running graphics-intensive workloads using the new L4 GPU support.

The Basics

Functions respond to Triggers. There are two types of triggers:

  1. Event-based triggers are events generated by a GCP service such as Eventarc, Pub/Sub, Cloud Storage, or Firestore. These services generate an event that results in an asynchronous invocation of your Cloud Run Function.
  2. The second type is a synchronous HTTP trigger. Functions are auto-assigned a function URL that is referred to as the HTTP trigger. You make an HTTP request (such as a POST) to the URL to invoke the function. By default, the URL is publicly reachable but requires authentication. In addition, you can restrict or disable external traffic. It’s always in this format:
https://REGION-PROJECT_ID.cloudfunctions.net/FUNCTION_NAME

Pro Tip: You can invoke your function for testing with a curl command:

curl -m 70 -X POST https://[region]-[project_id].cloudfunctions.net/[func_name] \
  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Hello World"
  }'

Security

Every function must have a Service Account (SA) attached. The attached SA provides the identity for all code running in the function. By default, functions use the default compute service account, which has broad permissions. It looks like this:

PROJECT_NUMBER-compute@developer.gserviceaccount.com

You should always create your own SA and grant it only the scoped-down permissions needed for the specific resources your function accesses.

Function Resources

You configure the vCPU and Memory resources allocated to the function. You should choose settings that match your workload needs, since Google bills you based on resources allocated, not consumed.
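To make the allocation-based billing concrete, here is a small sketch of how the two billing units relate to your configuration. The allocation values and the `billedUnits` helper are hypothetical illustrations, not real pricing or an official API:

```javascript
// Illustrative calculation of the two resource-based billing units.
// The allocation values below are example numbers, not recommendations.
function billedUnits({ vcpu, memoryGiB, durationSeconds }) {
  return {
    vcpuSeconds: vcpu * durationSeconds, // billed on allocated vCPU, not consumed
    gibSeconds: memoryGiB * durationSeconds, // billed on allocated memory
  };
}

// A function allocated 1 vCPU and 0.25 GiB that runs for 0.5 s:
const usage = billedUnits({ vcpu: 1, memoryGiB: 0.25, durationSeconds: 0.5 });
console.log(usage); // { vcpuSeconds: 0.5, gibSeconds: 0.125 }
```

Note that doubling the memory allocation doubles GiB-seconds even if your code never touches the extra memory, which is why right-sizing matters.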

Let’s write an HTTP Function

Here’s the basic required implementation for an HTTP function (Node.js). Other language examples are available in the official docs. At minimum, you need to register with the Functions Framework using the functions.http() method and then return a valid HTTP response. The req parameter will contain all information delivered as part of the HTTP request.

const functions = require('@google-cloud/functions-framework');

// Register an HTTP function with the Functions Framework that will be executed
// when you make an HTTP request to the deployed function's endpoint.
functions.http('helloGET', async (req, res) => {
  // Retrieve information from the request
  const name = req.query.name;

  // Return a valid HTTP response
  res.send('Hello World!');

  // Optionally, specify the status code and response type
  // res.status(200).json(retVal);
});
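Because the handler has an Express-style (req, res) signature, you can exercise its logic locally with stub request and response objects before deploying. This is a sketch under that assumption; `helloHandler` and `makeRes` are hypothetical names, not part of the Functions Framework:

```javascript
// Express-style handler: reads a name from the query string, falling back to
// a JSON body field (the Functions Framework parses JSON bodies into req.body).
const helloHandler = async (req, res) => {
  const name = req.query.name || (req.body && req.body.name) || 'World';
  res.status(200).json({ message: `Hello ${name}!` });
};

// Minimal stub of the response object for local testing.
function makeRes() {
  const res = { statusCode: null, payload: null };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (obj) => { res.payload = obj; return res; };
  return res;
}

const res = makeRes();
helloHandler({ query: {}, body: { name: 'Cloud Run' } }, res);
console.log(res.payload); // { message: 'Hello Cloud Run!' }
```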

Pro Tip: Declare the handler as async so you can use await inside it

Let’s write an Event Driven Function

You still have to register with the Functions Framework, but this time use the functions.cloudEvent() method. Notice that you do not return any values, since event triggers are an asynchronous process and do not expect a response.

The cloudEvent parameter will contain the entire payload for the event. Access the cloudEvent.data field to extract details specific to the event that occurred. The payload format of the data field differs depending on the service that generated the event. For example, some services base64 encode this field, but most services just write a JSON object. See the google-cloudevents repo for all possible events and their formats.

const functions = require('@google-cloud/functions-framework');

// Register a CloudEvent callback with the Functions Framework
functions.cloudEvent('helloEvent', cloudEvent => {
  // Handle the event here. The payload is inside the cloudEvent.data field
  const eventPayload = cloudEvent.data;

  // If Cloud Storage, the payload is a JSON object
  // const {name, bucket} = eventPayload;

  // If Pub/Sub, the data payload is base64 encoded
  // const base64Event = eventPayload.message.data;
  // const pubsubEvent = Buffer.from(base64Event, 'base64').toString();

  console.log(`Hello!`);
});
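The Pub/Sub decoding step above can be exercised in isolation with a hand-built payload. The `decodePubSubMessage` helper and the fake payload are illustrations; a real message arrives inside cloudEvent.data:

```javascript
// Decode the base64-encoded data field of a Pub/Sub-style event payload.
function decodePubSubMessage(eventPayload) {
  const base64Data = eventPayload.message.data;
  return Buffer.from(base64Data, 'base64').toString('utf8');
}

// Simulate what Pub/Sub delivers: the message data is base64 encoded.
const fakePayload = {
  message: { data: Buffer.from('{"orderId": 42}').toString('base64') },
};
console.log(decodePubSubMessage(fakePayload)); // {"orderId": 42}
```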

What is Concurrency?

There are two concepts you should be aware of when you think about function scaling. They are configurable and operate independently of each other.

Concurrent instances per function

Cloud Run Functions automatically scales the number of active containers behind the scenes to handle incoming work. This is the number of concurrent instances of your function that are active at any given time.

Functions transition through 4 main lifecycle stages: Starting → Active → Idle → Shutting down

  • When a trigger invokes a function, Google will attempt to use an instance that is warm and available to serve the request. You can check whether any are available using the Instance count metric and looking at the idle count.
  • If no idle instances exist, Google will spin up a new instance to serve the invocation. This is called a cold start because there is overhead in going from zero to active. The instance will remain active until all processing is complete.
  • After a function completes processing, it transitions to idle and waits for another invocation. Instances remain in the idle state for up to 15 minutes, then transition to Shutting down and are removed.
  • Here is an example of the Instance Count metric. At the highlighted time, there are 35 active instances processing workloads and 5 idle.

You can mitigate cold starts by setting the autoscaling minimum instances configuration. You may also want to set maximum instances to control costs or protect downstream services from being overwhelmed.
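Beyond minimum instances, a common code-level way to soften cold starts is to initialize expensive objects (API clients, database pools) in global scope so warm instances reuse them across invocations. This is a sketch of the pattern; `createExpensiveClient` is a hypothetical stand-in for real client construction:

```javascript
// Track how many times initialization actually runs.
let initCount = 0;
function createExpensiveClient() {
  initCount += 1;
  return { id: initCount };
}

let client; // lives in global scope, survives across warm invocations

function handler(req) {
  client = client || createExpensiveClient(); // only the cold start pays
  return `served by client ${client.id}`;
}

handler({}); // cold start: the client is built
handler({}); // warm invoke: the cached client is reused
console.log(initCount); // 1
```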

Max Concurrent requests per function instance

Functions default to serving a single request at a time. However, you can increase this setting (up to 1000) to allow a single instance to serve multiple concurrent requests. This is referred to as concurrent requests per function instance.

1 concurrent request per instance vs 3 concurrent requests per instance
  • Serving multiple requests concurrently generally provides significant cost savings since the majority of cost comes from instance uptime.
  • With this feature, all requests on an instance share the instance’s global memory and CPU. You need to ensure your code is able to safely operate this way.
  • You can view the Max Concurrent Requests metric to understand how your functions are serving requests.
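The shared-state risk can be demonstrated with plain Node.js. In this sketch (the handler and names are illustrative, not a real workload), a per-request value kept in module-level state is clobbered when two concurrent requests interleave across an await:

```javascript
// Module-level state is shared by all concurrent requests on an instance.
let currentUser = null; // unsafe: shared across concurrent requests

async function unsafeHandler(user) {
  currentUser = user;
  await new Promise((r) => setTimeout(r, 10)); // simulate async I/O
  return currentUser; // may now belong to a different request
}

async function main() {
  const [a, b] = await Promise.all([
    unsafeHandler('alice'),
    unsafeHandler('bob'),
  ]);
  console.log(a, b); // both report 'bob': alice's request was clobbered
}
main();
```

The fix is to keep per-request data in local variables (or pass it explicitly), reserving global scope for immutable or shareable objects like clients and connection pools.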

Pro tip: By default this metric shows a distribution over 1 minute, which isn’t very useful. You can group it by count instead, which is more useful.


Summary & TL;DR

When you transition from VM-based scaling to serverless scaling, concurrency becomes the main scaling factor. Cloud Run has two layers of concurrency that are critical to understand.

There are also other important Cloud Run Functions features that I’ll cover in a future post, but you can read through the official docs now.

Let me know if I can help you in your Serverless journey on Google Cloud!

Published in Google Cloud - Community: A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Written by George Mao: Head of Specialist Architects @ Google Cloud. I lead a team of experts responsible for helping customers solve their toughest challenges and adopt GCP at scale.