A Journey Into Google Cloud Functions

Adrien Atallah
Policygenius
Jul 10, 2019

On Policygenius' engineering team, we are always asking what the best tool for the job is. That means the team is constantly learning and evaluating cutting-edge tech, which is one of my favorite parts of working at Policygenius.

Of course, the tech must serve a purpose, and for this journey that purpose was to build a system that automates tedious, repetitive tasks currently handled manually by our in-house operations team.

For example, when a life insurance client has a medical exam scheduled, an operations team member manually reaches out to the client through our in-house-built CRM with a friendly reminder about the upcoming exam. There are many daily tasks like this that do not really require manual, human intervention; they take time out of a busy agent's day, and the client may not hear about an update as soon as it happens.

These 'tasks' are almost always data-driven: each one can be clearly defined as a response to one or more data updates in our databases. So the system we need to automate them should be an event-based, data-driven system that can process updates to our database in real time and perform actions in our other systems based on those updates. This is how we started looking at Google Cloud Functions.

[Diagram: our high-level system]

What is a Cloud Function?

A Cloud Function is a 'serverless' function, meaning you can write the function and deploy it to Google Cloud without managing any servers yourself. Cloud Functions are usually single-purpose and are perfect for an event-based system that needs to scale. If you're familiar with AWS Lambda, Cloud Functions are Google's version.

How are we using them for this project at Policygenius?

We have set up hooks on our data models that fire off messages to a publish/subscribe (Pub/Sub) topic that a Cloud Function subscribes to and processes.

The Cloud Function then handles whatever logic is appropriate for that update, which could mean triggering an event in a different service.

This is a simplified version of one way we are using Cloud Functions, but it illustrates why Cloud Functions are a great fit for event-driven features.
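
As an illustrative sketch of the first half of that flow, a data-model hook might publish a small JSON payload describing the update to a Pub/Sub topic. The project, topic, and payload fields below are hypothetical; this shows the pattern, not our actual hook code.

import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Hypothetical project and topic names, used for illustration only.
topic_path = publisher.topic_path("my-gcp-project", "my-pubsub-topic")


def publish_update(model_name, record_id, changes):
    """Called from a data-model hook whenever a record changes."""
    payload = {"model": model_name, "id": record_id, "changes": changes}
    # Pub/Sub messages are raw bytes; the subscribing Cloud Function
    # decodes and parses this JSON.
    future = publisher.publish(topic_path, json.dumps(payload).encode("utf-8"))
    future.result()  # block until Pub/Sub accepts the message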

Let’s look at a simple hello world Cloud Function.

hello_world.py

def hello_world(request):  # HTTP-triggered functions receive a request object
    print("hello world")
    return "hello world"

If we define this function to have an HTTP trigger, Google will assign it an endpoint that can be hit with an HTTP POST request. That endpoint will look something like

https://example-v1.cloudfunctions.net/hello_world

If we make a POST request to this endpoint, we will see hello world in the Cloud Function's logs. These triggers can be defined manually in the GCP dashboard or in a deployment/build configuration.
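
Before moving on to deployment, here is what triggering that endpoint from a Python script might look like, using the hypothetical URL above:

import requests

# Hypothetical endpoint assigned to hello_world after an HTTP-trigger deploy.
url = "https://example-v1.cloudfunctions.net/hello_world"

response = requests.post(url)
print(response.status_code)  # "hello world" shows up in the function's logs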

Deployment:

We are using Google Cloud Build with custom build steps defined in a cloudbuild.yaml file, with builds triggered by git commits.

We have a build step to run tests.


- name: 'gcr.io/google-appengine/python'
  entrypoint: '/bin/sh'
  args: [
    './bin/run_tests.sh'
  ]

Followed by a build step to deploy each function.

- name: 'gcr.io/cloud-builders/gcloud'
  args: [
    'functions',
    'deploy',
    'my_function_http',
    '--project=$PROJECT_ID',
    '--trigger-http',
    '--entry-point=my_function_http',
    '--runtime=python37',
  ]
  dir: './functions'

The function above, my_function_http, has the --trigger-http flag set, which means an HTTP call will trigger it and execute the function my_function_http defined as the entry point.

A function like the one in the diagram above, triggered by a Pub/Sub message, would be defined like:

- name: 'gcr.io/cloud-builders/gcloud'
  args: [
    'functions',
    'deploy',
    'my_function_pubsub',
    '--project=$PROJECT_ID',
    '--trigger-topic=my-pubsub-topic',
    '--entry-point=my_function_pubsub',
    '--runtime=python37',
  ]
  dir: './functions'
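
For context, a Python function triggered by Pub/Sub receives the message as an event dictionary with base64-encoded data. Here is a minimal sketch of what my_function_pubsub's entry point could look like; the payload fields are hypothetical and mirror the publishing sketch earlier.

import base64
import json


def my_function_pubsub(event, context):
    """Entry point for the Pub/Sub-triggered Cloud Function.

    `event` carries the Pub/Sub message; `context` carries metadata such as
    the event ID and timestamp.
    """
    # Pub/Sub message data arrives base64-encoded.
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    print(f"Processing update to {payload.get('model')} #{payload.get('id')}")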

Notice that the dir argument is the same for both deploy steps. Our repo of Cloud Functions is structured like this:

.
├── bin
│   ├── run_tests.sh
│   ├── setup
│   └── test
├── cloudbuild.yaml
├── functions
│   ├── my_function_http.py
│   ├── my_function_pubsub.py
│   ├── common
│   │   ├── cloud_func_logger.py
│   │   ├── __init__.py
│   │   ├── trace_func.py
│   │   └── tracer.py
│   ├── config
│   │   └── index.yaml
│   └── requirements.txt
├── README.md
└── tests
    ├── requirements.txt
    └── unit
        ├── common
        │   └── test_trace_func.py
        ├── test_my_function_http.py
        ├── test_my_function_pubsub.py
        └── utils
            ├── encode_data.py
            └── struct.py

We deploy the whole ./functions directory in each function's deploy step. The primary motivation was to share utilities between functions without duplicating a lot of code. One downside of this structure is that all of the code and files are deployed to every Cloud Function.
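
Because the entire directory ships with every function, a function module can import the shared helpers in common directly. A hypothetical example of the pattern (how we actually use these helpers may differ):

# Inside my_function_pubsub.py: ./functions is the deployed source root,
# so the common package sits right next to the function modules.
from common import cloud_func_logger, trace_func  # shared helpers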

Testing:

Let’s look back at the first step in cloudbuild.yaml

- name: 'gcr.io/google-appengine/python'
  entrypoint: '/bin/sh'
  args: [
    './bin/run_tests.sh'
  ]

This build step pulls down Google's official App Engine Python image from GCR (Google Container Registry) and runs a single bash script, ./bin/run_tests.sh. That script looks something like this:

#!/bin/sh
echo "Running tests script.."
echo "Installing requirements.."
pip3 install -r functions/requirements.txt
pip3 install -r ./tests/requirements.txt
echo "Running pytest.."
pytest

We install the required dependencies and run pytest on everything. That way, if any of our unit tests fail, pytest exits with a failure and our build fails, as we want it to.
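
As an illustration, a unit test for the Pub/Sub-triggered function can call the entry point directly with a hand-built event. The payload below is hypothetical, and the import assumes pytest is configured so the functions directory is on the path:

# tests/unit/test_my_function_pubsub.py (sketch)
import base64
import json

from my_function_pubsub import my_function_pubsub


def test_handles_update(capsys):
    payload = {"model": "MedicalExam", "id": 42, "changes": {"status": "scheduled"}}
    event = {"data": base64.b64encode(json.dumps(payload).encode("utf-8"))}

    my_function_pubsub(event, context=None)

    # The sketched entry point only prints, so assert on captured output.
    assert "MedicalExam" in capsys.readouterr().out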

Logging & Monitoring:

Google's 'cloud computing systems management service' is Stackdriver. The Stackdriver tools we explored were Monitoring, Trace, Logging, and Error Reporting. We experimented with Trace and ultimately didn't really use it. Google Cloud Functions integrate with Logging and Error Reporting by default. Logging holds the standard logs for each Cloud Function, and Error Reporting registers errors from those logs as long as the error fulfills certain criteria, like containing a stack trace or being formatted in a specific way.

Google’s Stackdriver docs:

Log entries in Stackdriver Logging that contain stack traces or exceptions, or that are formatted like ReportedErrorEvent, generate errors in Stackdriver Error Reporting.

We noticed errors in our logs that were not showing up in Error Reporting. For example, an out-of-memory error doesn't have a stack trace, so it never appeared in Error Reporting for us.
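
For errors we do handle ourselves, one way to make sure they still reach Error Reporting is to log them with their stack trace, since entries containing a stack trace satisfy the criteria above. A minimal sketch (process is a hypothetical handler):

import logging


def my_function_pubsub(event, context):
    try:
        process(event)  # hypothetical processing step
    except Exception:
        # logging.exception writes an ERROR-level entry that includes the
        # full stack trace, which lets Error Reporting register it.
        logging.exception("Failed to process Pub/Sub message")
        raise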

Monitoring is a separate dashboard altogether, linked to from the standard Google Cloud dashboard. Our primary use of Monitoring is its alerting functionality, which we've configured with a Slack integration. To set up an alert, you use the dashboard to create a policy.

One important thing to note is that these alerts are only metric-based, so they won't show a stack trace or any details about the log entry; an alert will only show, for example, that there was at least one Cloud Function log entry with severity Error. The more useful policies we have set up are around memory usage and execution time, two critical values for Cloud Functions. Since a Cloud Function's allocated memory is capped at a pre-defined value set as part of its deploy step, it's important to know when a function is running at or near that limit before it kills the function's execution.

Takeaways

Cloud Functions are an excellent tool for processing in an event-based, data-driven system. Google Cloud Platform has a decent suite of tools built around Cloud Functions:

  • Pub/Sub
  • Stackdriver Monitoring, Logging, Error Reporting
  • Cloud Build

A lot of these tools are still very new and can be light on documentation at times, but we found that they all integrate nicely with one another.

At Policygenius, we’re always looking for new technologies that help people get the financial protection they need. If that sounds interesting to you, head over to our careers page.
