Cloud Functions Best Practices (2/4) : Optimize the Cloud Functions
Code, call, structure your function code perfectly
This article is part of a four-article series in which I give advice on Google Cloud Functions development. This work is the result of two years of daily practice, deployment and monitoring. Some of these best practices come directly from the official documentation; others come from my experience of what proved most effective. If you have a different point of view, feel free to comment on this (free) article. Thanks!
<<< Cloud Functions Best Practices (1/4) : Get the environment ready
Cloud Functions Best Practices (3/4) : Secure the Cloud Functions >>>
Cloud Functions Best Practices (4/4): Monitor and log the executions >>>
Optimize the Cloud Functions
Online tutorials are cool, mine are even super cool! ;) With a bunch of articles, anyone can learn how to use Secret Manager in Google Cloud Functions, how to protect a Google Cloud Function, how to rate limit it, how to use Cloud Storage, Cloud Pub/Sub and every possible tool.
But,
They have a major drawback ⇒ Tutorials are made for a single, short, specific topic.
And business projects are everything but short and specific.
Business projects combine multiple Cloud Functions, using various tools that need to be tried, tested, and debugged, hopefully efficiently.
How to do that is not explained in single-feature tutorials.
That’s what I cover in this second article ⇒ Optimize the function code to allow management of multiple Cloud Functions, fast invocations, secure deployment and efficient testing and debugging.
Split and delegate
After a few months of using Google Cloud Functions, some code tends to get copy-pasted between functions: sending messages to Slack, sending emails using Gmail, using Google Sheets, logging to BigQuery and many more use cases.
What if Slack updates its API? What if a new parameter is required by an API used in a bunch of Google Cloud Functions?
It would be hell to update, redeploy and retest all of them.
What I highly recommend is to create generic functions for specific tasks.
The architecture change will be the following:
I agree… seen that way, it looks more complex: more Cloud Functions to create, deploy and monitor.
But the point here is to create a specific generic function per API used, if the API is likely to be reused in future cases.
What about that way? >>>
Imagine… you have to change 1 little parameter of an API on the left configuration… Good Luck! :D
The architecture on the right has many advantages:
- The code is not repeated, which facilitates updates, maintenance and tests (kind of DRY)
- Each function does fewer tasks, so it has fewer reasons to crash and is easier to debug: 1 function, 1 task
- There is no longer any need to import slack, gmail and multiple other packages into every function
- Because those imports are no longer needed, cold starts and executions are faster and CPU usage is reduced
- It’s a piece of cake (3 lines) to link a generic Cloud Function to another Cloud Function via Pub/Sub, and almost no extra testing is required when you add a Slack notifier or Gmail sender…
But, how do I do that?
First, create a generic Cloud Function (a Slack publisher in this example):
from slack import WebClient
import base64
import json
# ...

def slack_connection():
    try:
        client = WebClient(token=SLACK_TOKEN)
    except Exception as e:
        print('End-Slack Error', e)
        return
    return client

def main(event, context):
    event_data = base64.b64decode(event['data']).decode('utf-8')
    req = json.loads(event_data)
    text, channel = req.get('text', ''), req.get('channel', '')
    #
    # Send message to slack...
    #
    return 1
And deploy it as a background function:
gcloud functions deploy generic_slack --region=europe-west2 --entry-point main --runtime python310 --trigger-topic='generic_slack'
Now we have a Cloud Function that takes a text and a channel attribute and publishes to Slack.
Having it as a background function protects it from being accessed by everyone: it has no public HTTP endpoint and is only triggered through its Pub/Sub topic.
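Since the generic function only ever receives a Pub/Sub event, it can be exercised locally by building that event by hand. Here is a minimal sketch of the idea. The helper names `make_pubsub_event` and `parse_event` are my own assumptions (they are not part of any Google library), and the payload simply mirrors the `text` and `channel` attributes used above.

```python
import base64
import json


def make_pubsub_event(payload):
    """Build a dict shaped like the event a Pub/Sub-triggered function receives.

    Hypothetical helper for local testing only: the message body arrives
    base64-encoded under the 'data' key.
    """
    raw = json.dumps(payload).encode("utf-8")
    return {"data": base64.b64encode(raw).decode("utf-8")}


def parse_event(event):
    """Decode the event the same way the generic Slack function does."""
    req = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return req.get("text", ""), req.get("channel", "")


# Round trip: what a caller publishes is what the generic function reads.
event = make_pubsub_event({"text": "Deployment finished", "channel": "#random"})
text, channel = parse_event(event)
print(text, channel)
```

With something like this, the decoding logic can be checked without deploying anything or touching Slack.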
Then, in any Cloud Function, we simply need to add these lines:
from google.cloud import pubsub_v1
import json

TOPIC_SLACK = 'projects/YOUR-PROJECT/topics/generic_slack'
publisher = pubsub_v1.PublisherClient()

def publish_slack(text):
    body = {
        "text": text,
        "channel": "#random"
    }
    publisher.publish(TOPIC_SLACK, json.dumps(body).encode('utf-8'))

#
# ... inside the main function
#
publish_slack("Hello, I'm generic, light, efficient and easy to send!")
At the beginning of this snippet, we import Pub/Sub, create a generic helper function (for a generic Cloud Function, yeah) and then just call it with our custom text.
You want to add a username, or a custom emoji?
Nothing easier: we just have to go back to our Slack Cloud Function, update the code and the input, and we are good to go! :)
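One way to make that kind of change painless is to treat the new fields as optional, so the functions that already publish only `text` and `channel` keep working. The sketch below is only an illustration under my own assumptions: `build_message` and `OPTIONAL_DEFAULTS` are hypothetical names, the default values are arbitrary, and the field names `username` and `icon_emoji` are chosen to match the arguments of Slack's `chat.postMessage`.

```python
import base64
import json

# Optional fields and their fallback values (arbitrary, for illustration).
OPTIONAL_DEFAULTS = {"username": "Cloud Function", "icon_emoji": ":robot_face:"}


def build_message(event):
    """Turn a Pub/Sub event into the keyword arguments for the Slack call."""
    req = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    message = {"text": req.get("text", ""), "channel": req.get("channel", "")}
    # Older publishers do not send these keys, so fall back to the defaults.
    for field, default in OPTIONAL_DEFAULTS.items():
        message[field] = req.get(field, default)
    return message


def encode(payload):
    """Local stand-in for what Pub/Sub delivers to the function."""
    return {"data": base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")}


old_style = build_message(encode({"text": "Hello", "channel": "#random"}))
new_style = build_message(encode({"text": "Hello", "channel": "#random", "icon_emoji": ":tada:"}))
print(old_style)
print(new_style)
```

The callers that want the new emoji add one key to their payload; all the others are left untouched.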
Cache smartly
Cloud Functions are stateless, and that’s why we like them!
It means that nothing is shared between two invocations: global variables, memory, file system, or other state.
…
Nothing?
…
Well, starting a new function loads a lot of things: runtime, packages, your code.
The result is called a function instance.
This loading takes a few seconds and is called a “Cold Start”: the time needed to warm up the function.
This function instance won’t disappear right after the function completes. It will stay “warm” in the cloud for a little while (between 5-15 minutes if you’re lucky) to be reused by future invocations.
This magic allows two things:
- Reducing cold start, as the function is already set (See my previous article Reduce cold start and execution time of Google Cloud Functions)
- Reusing global variables. Global variables can be instantiated in the global scope ⇒ outside any function, at the very beginning of the file. Their values can then be reused in subsequent invocations without having to be recomputed.
Using the global scope caches values from heavy computations and imports. This is particularly useful for database connections, so the function doesn’t have to reconnect at every invocation.
Concretely, it’s good to use caching for the following purposes:
- Accessing Google APIs (like Pub/Sub, Secret Manager, BigQuery…)
from google.cloud import pubsub_v1
import json

# The publisher client is created outside the main function so that warm instances reuse it
publisher = pubsub_v1.PublisherClient()

def main(request):
    # ... Your function code (TOPIC and payload are defined by your own code)
    publisher.publish(TOPIC, json.dumps(payload).encode('utf-8'))
    # ...
- Database connection
- Packages import
A good use of caching can make a difference of several seconds.
Be a lazy importer
Being a lazy importer will reduce cold start time.
The general concept is to import only the packages that will be used during the Cloud Function’s execution.
It means… importing packages inside the main function.
This is not a common pattern, frankly, it’s ugly. A good tech lead would refuse a PR with that kind of imports, but for performance reasons, it’s cool!
Lazy import answers the question : Why should we import packages that might not be used?
Packages and dependencies are the #1 contributor to GCF cold start time.
We should avoid them as much as possible and, more importantly, avoid importing them in the global scope if they are not always used.
First, only the dependencies that are used should be imported ⇒ if only a specific function of a dependency is used, import that specific function.
Secondly, if dependencies are used in some specific paths, they should be imported inside these paths. This isn’t a standard practice but it can save some precious milliseconds of the Google Cloud Function cold start.
I would say that if a package is used in less than 75% of the cases, lazy import makes sense.
For example, suppose a bank wants to notify clients after a withdrawal of more than 1000 euros. This is not common: the Cloud Function might need the notification package in maybe 10% of invocations, so importing it lazily makes a lot of sense.
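A minimal sketch of that bank example could look like the following. Everything in it is an assumption made for illustration: the `withdrawal` dict with its `email` and `amount` keys, the `build_alert` helper, and the use of the standard library `email` package as a stand-in for whatever heavier notification package a real project would use.

```python
ALERT_THRESHOLD_EUR = 1000  # threshold from the example above


def build_alert(client_email, amount):
    # Lazy import: the package is only loaded on the rare alert path.
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["To"] = client_email
    msg["Subject"] = f"Withdrawal of {amount} EUR"
    msg.set_content("A large withdrawal was made from your account.")
    return msg


def main(withdrawal):
    """`withdrawal` is a hypothetical dict with 'email' and 'amount' keys."""
    if withdrawal["amount"] > ALERT_THRESHOLD_EUR:
        return build_alert(withdrawal["email"], withdrawal["amount"])
    return None  # common path: the alert code and its import never run


small = main({"email": "client@example.com", "amount": 50})
large = main({"email": "client@example.com", "amount": 1500})
print(small)
print(large["Subject"])
```

Python caches imported modules, so on a warm instance the import inside `build_alert` is only paid once, the first time the rare path is hit.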
Don’t create files, or delete them
We saw previously that a function instance could be recycled. Global variables and packages can persist between invocations.
And so can a file!
A file created in a Google Cloud Function consumes the memory available to the function. This memory is not unlimited (like the planet’s resources hehe).
Creating many files in a Cloud Function may lead to an out-of-memory error and a subsequent cold start.
Actually, there aren’t many reasons to create a file in a Cloud Function… Except for bizarre cases imposed by an API provider.
If you do create files, delete them right after use or before the end of the function.
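For those bizarre cases, a `try` / `finally` block is a simple way to be sure the file is gone even if the code in between crashes. This is a rough sketch with a hypothetical `write_report` helper; it relies on the standard `tempfile` module to pick the temporary directory, and the part where the file is handed to the API is left as a comment.

```python
import os
import tempfile


def write_report(rows):
    """Write rows to a temporary CSV file, use it, and always delete it."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w") as handle:
            for row in rows:
                handle.write(",".join(str(value) for value in row) + "\n")
        # ... hand `path` to the API that insists on receiving a file ...
        return os.path.getsize(path), path
    finally:
        # Runs on success and on error, so nothing piles up between invocations.
        os.remove(path)


size, path = write_report([(1, "a"), (2, "b")])
print(size, os.path.exists(path))
```

After the call, the file no longer exists, so a recycled instance starts its next invocation with a clean temporary directory.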
Google Cloud Storage is here for file related cases and can handle file creation, file transfer, file transformation. See an example below:
from google.cloud import storage
import pandas as pd

# Create storage client
client = storage.Client()
bucket = client.get_bucket("any_bucket_name")

final_df = pd.DataFrame()  # replace with any pandas DataFrame

# Upload the CSV content straight from memory, no local file needed
blob = storage.Blob("file_name.csv", bucket)
blob.upload_from_string(final_df.to_csv(header=False), content_type='text/csv')
That way, a csv file would be created directly inside a Cloud Storage bucket!
Use minimum and maximum instances
I continue this Best Practices article with a feature introduced in mid-2021.
Min instances and max instances.
Min instances
I guess this feature is straightforward: it keeps X instances warm even if there are no requests for a while.
This way, there is no cold start anymore as long as min-instances is set > 0!
However, I will add a few clarifications:
- Minimum instances keeps X instances warm, X being the number of instances set in the UI or with the --min-instances gcloud flag ⇒ Meaning, if more than X functions are called at the same time, the additional n functions will need to warm up, so there will still be some cold starts.
- There aren’t hard guarantees about the behaviour. In the official documentation they state:
Cloud Functions **attempts** to keep function instances idle for an unspecified amount of time after handling a request.
- There will still be a cold start right after deployments and after crashes
This feature comes at a cost. It’s almost like paying the price of a Cloud Function running 24/7 (100% of the GB-Second price). See pricing. For low memory Cloud Functions, it’s cheap, but if the function needs a lot of memory, there might be cheaper solutions.
I would definitely recommend setting a minimum instance value for user-facing functions.
Max instances
Max instances is different from its brother: it has nothing to do with cold starts.
Max instances is used to control the scaling behaviour of the Google Cloud Functions.
Even though the concept sounds easy, I will add a few more insights:
- The goal of max instances is to protect a Cloud Function. By setting this limit, developers are assured that there won’t be more instances running than the limit (it can be slightly exceeded during traffic spikes).
- It’s essential for every Cloud Function… essential for every Cloud Function! With no limit specified, a Cloud Function can scale to 1,000 instances within a few seconds and cost an enormous amount of pizzas if the function runs for hours. Set a value, even if large.
- It’s even more essential for Cloud Functions accessible from anywhere via HTTPS (protection against DDoS attacks)
- It’s even more than even more essential for Cloud Functions that access APIs, for two reasons. First, an external API key used by the Cloud Function can be blocked if it exceeds its own limits. Second, if the function calls a costly service such as an SMS sender like Twilio… mamma mia, you goofed up the job! Also think about rate limiting your Google Cloud Functions.
With max-instances set, the error “POST 429 […] The request was aborted because there was no available instance” can occur.
This happens because the Cloud Function received more requests than it had available instances.
One instance can process one request at a time, and one request only, so the number of simultaneous requests should not exceed the max-instances value.
If it does, hum, the request will be kept warm in a comfy shoe box for 30 seconds, waiting for an available instance.
If no instance becomes available, the request will fail.
HA!
So what?
The error POST 429 […] The request was aborted because there was no available instance has 3 emergency solutions (inspired by Guillaume Blaquiere’s answer on Stack Overflow):
- Increase the max-instances value
- Enable Cloud Functions retries, or use Pub/Sub subscriptions, which allow finely tuned exponential retries
- Use a product that accepts parallel processing like Cloud Run
Keep in mind that scaling can take time, so this error can also arise while scaling up.
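When the caller is your own code, the retry idea can also be applied on the calling side. The sketch below shows a plain exponential backoff, nothing specific to Google Cloud: `TooManyRequests` is a hypothetical exception standing in for a 429 response, `call_with_backoff` is a name I made up, and the delays are arbitrary values to tune for your case.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def call_with_backoff(call, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Retry `call` on TooManyRequests, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller deal with it
            sleep(base_delay * 2 ** attempt)


# Simulated function that is saturated twice, then answers.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TooManyRequests()
    return "ok"


delays = []  # record the waits instead of really sleeping
result = call_with_backoff(flaky_call, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the example instant to run and makes the waiting times easy to inspect in a test.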
To learn more about this part:
https://cloud.google.com/functions/docs/configuring/min-instances
https://cloud.google.com/functions/docs/configuring/max-instances
Use native Secret Manager implementation
What if we have variables shared by more than one function?
And what if these variables are kind of sensitive like credentials?
What if we have to change a variable/API key/ID that is used in many Cloud Functions?
IF YOU WERE USING ENVIRONMENT VARIABLES TO STORE SECRETS : That’s a very bad practice, KEEP READING
Google Cloud offers a service called Secret Manager.
Google Secret Manager is a fully-managed, central, secure, and convenient storage system for secrets and any kind of keys.
Changing a secret in Secret Manager will change it everywhere. More importantly, access can be controlled and access rights can be managed independently.
For many reasons, using the native implementation of Secret Manager should be preferred, but for many reasons I don’t want to repeat myself here, so for many reasons I am redirecting you to a previous article that compares the options in terms of usability, performance and security:
Bye
I hope this article brings your Cloud Functions development to the next level.