The Misadventures of One Cloud Function

Natalie Godec
Google Cloud - Community
Feb 5, 2021

“It WORKS!!”

You know that sweet, sparkling feeling when something you’ve been debugging for days finally works?

As a platform engineer who accidentally became the resident GCP expert, I’ve been working closely with our data engineers throughout the last year. I help with their architecture, define — or help them define — infrastructure-as-code, remind them to not test in production and to consider security. Speaking of which…

The deeper we go into cloud technologies, embrace microservices, and venture into the land of serverless, the more data is floating around, ready to be used, or breached.

For those of us whose workloads are entirely cloud-native, the data analytics landscape is bright and full of wonders: AWS Lake Formation and Redshift, GCP Dataflow and BigQuery, pockets of ML and NLP tools - everything to build a data platform suited just for your company.

And so much to secure.

Google Cloud Platform (GCP) offers an extensive set of services geared towards data engineering, and a broad range of security tools you can use to protect that precious data.

You might want to use Dataflow to consume topics from Kafka, store the data in GCS or BigQuery, run data transformations with Dataproc or Cloud Composer, enable data science and machine learning with AI Notebooks, and build beautiful, insightful dashboards in Data Studio.

Here is how such an architecture might look:

Example of a data pipeline built using GCP services

“But Natalie,” you say, “most of those services are public APIs. How do you make sure an intruder, or a malicious insider, can’t leak sensitive data from your BigQuery datasets?”

VPC Service Controls

Security in the cloud is probably the most heated debate that cloud providers have had to win. Today we have a wide selection of security controls at our disposal, and on the data protection side GCP provides us with something called VPC Service Controls.

VPC Service Controls is a set of tools that you can use to restrict access to Google services within your GCP projects. You can think of it as a firewall on steroids: you define which projects you want to protect by putting them into a Service Perimeter, decide which APIs you want to restrict, and control who can access those APIs, and from where.

VPC Service Controls protect your data by restricting Google APIs within a Service Perimeter

When using a data platform similar to the one above, the list of APIs you want to protect would include:

["bigquery.googleapis.com",
 "storage.googleapis.com",
 "logging.googleapis.com",
 "monitoring.googleapis.com",
 "notebooks.googleapis.com"]
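If you manage the perimeter with Terraform, as we will towards the end of this post, that list can live in a variable. A minimal sketch, matching the var.restricted_apis that the perimeter config below consumes:

variable "restricted_apis" {
  description = "Google APIs to restrict within the service perimeter"
  type        = list(string)
  default = [
    "bigquery.googleapis.com",
    "storage.googleapis.com",
    "logging.googleapis.com",
    "monitoring.googleapis.com",
    "notebooks.googleapis.com",
  ]
}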

Restricting an API means that the resources that you create in your protected project will no longer be accessible from the Internet via the usual DNS address; those requests will instead be redirected to a dedicated domain — restricted.googleapis.com, which, in turn, points to a set of special IPs that Google manages.
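That redirection does not happen by magic inside your VPC: you typically point DNS for googleapis.com at the restricted VIP range (199.36.153.4/30) yourself. Here is a Cloud DNS sketch, where var.vpc_self_link is a placeholder for your network's self link:

# A private zone that overrides googleapis.com inside your VPC so all
# API traffic resolves to the restricted VIP range
resource "google_dns_managed_zone" "restricted_apis" {
  name       = "restricted-googleapis"
  dns_name   = "googleapis.com."
  visibility = "private"

  private_visibility_config {
    networks {
      network_url = var.vpc_self_link # placeholder: self link of your VPC
    }
  }
}

# restricted.googleapis.com resolves to Google's restricted VIPs
resource "google_dns_record_set" "restricted_a" {
  managed_zone = google_dns_managed_zone.restricted_apis.name
  name         = "restricted.googleapis.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"]
}

# Everything else under googleapis.com follows it
resource "google_dns_record_set" "googleapis_cname" {
  managed_zone = google_dns_managed_zone.restricted_apis.name
  name         = "*.googleapis.com."
  type         = "CNAME"
  ttl          = 300
  rrdatas      = ["restricted.googleapis.com."]
}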

You then define access controls, called Access Levels, where you will typically list the internal IPs, corporate device settings, or some special service accounts. Only requests that match at least one of those filters will reach the protected resources.
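In Terraform, a basic Access Level with an IP condition and a service-account condition might look like the sketch below; the policy name, IP range, and account are placeholders:

resource "google_access_context_manager_access_level" "corp" {
  parent = "accessPolicies/${var.access_policy_name}"
  name   = "accessPolicies/${var.access_policy_name}/accessLevels/corp"
  title  = "corp"

  basic {
    # OR: a request matching either condition gets the access level
    combining_function = "OR"

    conditions {
      ip_subnetworks = ["203.0.113.0/24"] # placeholder corporate egress range
    }
    conditions {
      members = ["serviceAccount:deployer@your-project.iam.gserviceaccount.com"]
    }
  }
}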

Can we use Cloud Functions now?

Of course! Serverless is all the rage these days; what’s your use case?

Let’s say we have an external supplier submitting reports to a Storage Bucket. You can create a Cloud Function that will listen to a particular path in the bucket, and once a file is written, will read the data and write some transformed version of it to a BigQuery table.

You will need to run the function as a special service account you create for it, as well as protect the Cloud Functions API by adding it to the list of restricted APIs in the service perimeter.
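In Terraform, the wiring could look roughly like this sketch; the runtime, source archive, and bucket names are placeholders, while the important parts are the dedicated runtime SA and the event trigger:

# The special service account the function runs as
resource "google_service_account" "cf_runtime" {
  account_id   = "cf-runtime"
  display_name = "Cloud Functions runtime service account"
}

resource "google_cloudfunctions_function" "transform_reports" {
  name        = "transform-reports"
  runtime     = "python39"  # placeholder runtime
  entry_point = "transform" # placeholder handler name

  # Placeholder source location for the function code
  source_archive_bucket = "your-functions-source-bucket"
  source_archive_object = "transform-reports.zip"

  # Run as the dedicated SA instead of the Compute Engine default
  service_account_email = google_service_account.cf_runtime.email

  # Fire whenever a file is finalised in the reports bucket
  event_trigger {
    event_type = "google.storage.object.finalize"
    resource   = "your-reports-bucket" # placeholder bucket name
  }
}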

Then, usually, services running in the same GCP project should be able to communicate just fine…

ERROR: VPC Service Controls: Request is prohibited by organization’s policy.

Ah, right, okay — perhaps we need to add the Compute Engine default service account to the Access Level; try again?

ERROR: VPC Service Controls: Request is prohibited by organization’s policy.

Hmm. RTFM! Google uses a special service called Cloud Build that builds your function in a separate, invisible, so-called “shadow” project, then makes it available for execution in your actual project. GCP uses a lot of such shadow projects to do all sorts of magic. That makes many managed services very easy to use, but it doesn’t make debugging any easier.

Anyway, let’s add the Cloud Build service account to the service perimeter’s access level.

ERROR: VPC Service Controls: Request is prohibited by organization’s policy.

Okay okay. Let’s see whether the logs will give us some clarity. Which principal is being passed with the request?

"authenticationInfo": {
  "principalEmail": "special-service-account@your-project.iam.gserviceaccount.com",
  "serviceAccountDelegationInfo": [
    {
      "firstPartyPrincipal": {
        "principalEmail": "service-098765432345@gcf-admin-robot.iam.gserviceaccount.com"
      }
    }
  ]
},

Now THAT is confusing. The function is running as the designated service account, but there is also a firstPartyPrincipal, which is neither the Compute Engine nor the Cloud Build account?!

The solution

Now that we have sufficient data to analyse how Cloud Functions actually call other GCP services, we know what we need to do:

  1. Restrict Cloud Functions API within the service perimeter
  2. Add Cloud Functions, Compute Engine, and Cloud Build robot service accounts to the access levels
  3. Add the runtime service account that we create for the function to the access levels

Having to manage all those service account lists can become cumbersome. While the robot SAs always follow the same naming pattern and can be predicted from project numbers, the arbitrary custom service account that we create per function cannot.

The whole reason for having a separate service account per function is security: the default Compute Engine SA gets the Editor role on the project, which is way too open.

The compromise here is to create a Cloud Functions SA dedicated to the GCP project, as opposed to a single Cloud Function. If you follow the architecture of purpose-dedicated projects, and the risk of a single SA accessing all the data in its project is acceptable, then you can easily automate the whole setup! Your Google Cloud data platform will be secure yet functional, with a better experience for its users.

Here is how it can be achieved with Terraform.

  1. Package all the resources for setting up a project into a module
    Using modules makes your code reusable!
module "your-project" {
source = "../shiny-modules/gcp-project"
name = "your-project"
# we will base certain resources on this list of services
services = ["bigquery", "cloudfunctions"]
}

2. GCP-project module

# Add a label if Cloud Functions are requested - we will use it later
locals {
  cloudfunctions = contains(var.services, "cloudfunctions") ? "enabled" : "disabled"
  labels = merge(var.labels, {
    "cloudfunctions" = local.cloudfunctions
  })
}

resource "google_project" "project" {
  name       = var.name
  project_id = var.name # google_project also requires a project_id
  labels     = local.labels
}

# Create a project-dedicated SA to use in Cloud Functions
# A standard SA name will make the setup automatable
resource "google_service_account" "sa" {
  count        = contains(var.services, "cloudfunctions") ? 1 : 0
  account_id   = "cf-runtime"
  display_name = "Cloud Functions runtime service account"
}
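A small addition you might make to the module is an output exposing the runtime SA email, so other configurations can reference it instead of reconstructing the name; a sketch:

# Expose the runtime SA email (null when Cloud Functions are disabled)
output "cf_runtime_sa_email" {
  value = length(google_service_account.sa) > 0 ? google_service_account.sa[0].email : null
}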

3. VPC Service Controls config — make it auto-deploy after the projects are deployed ;)

# Find out all project numbers where Cloud Functions are enabled
data "google_projects" "cf_enabled_prj" {
  filter = "labels.cloudfunctions=enabled lifecycleState:ACTIVE"
}

data "google_project" "cf_enabled_prj_numbers" {
  count      = length(data.google_projects.cf_enabled_prj.projects)
  project_id = data.google_projects.cf_enabled_prj.projects[count.index].project_id
}
# Compile a list of SAs you need to add to the access levels
locals {
  cf_sa_list = formatlist(
    "serviceAccount:service-%s@gcf-admin-robot.iam.gserviceaccount.com",
    data.google_project.cf_enabled_prj_numbers.*.number,
  )
  cloudbuild_sa_list = formatlist(
    "serviceAccount:%s@cloudbuild.gserviceaccount.com",
    data.google_project.cf_enabled_prj_numbers.*.number,
  )
  cloudbuild_agent_sa_list = formatlist(
    "serviceAccount:service-%s@gcp-sa-cloudbuild.iam.gserviceaccount.com",
    data.google_project.cf_enabled_prj_numbers.*.number,
  )
  runtime_cf_sa_list = formatlist(
    "serviceAccount:cf-runtime@%s.iam.gserviceaccount.com",
    data.google_projects.cf_enabled_prj.projects.*.project_id,
  )
}
# And add them to an access level
resource "google_access_context_manager_access_level" "cloud_functions_access_level" {
  parent = "accessPolicies/${var.access_policy_name}"
  name   = "accessPolicies/${var.access_policy_name}/accessLevels/cloud_functions_access_level"
  title  = "cloud_functions_access_level"

  basic {
    conditions {
      members = concat(
        local.cf_sa_list,
        local.cloudbuild_sa_list,
        local.cloudbuild_agent_sa_list,
        local.runtime_cf_sa_list,
      )
    }
  }
}
# Include the access level in the service perimeter
resource "google_access_context_manager_service_perimeter" "perimeter" {
  parent = "accessPolicies/${var.access_policy_name}"
  name   = "accessPolicies/${var.access_policy_name}/servicePerimeters/perimeter"
  title  = "perimeter"

  status {
    resources           = var.restricted_project_numbers
    restricted_services = var.restricted_apis
    access_levels = [
      google_access_context_manager_access_level.cloud_functions_access_level.name,
    ]
  }
}
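For completeness, the config above assumes a few input variables: restricted_apis was sketched earlier in this post, and the remaining two could be declared like this:

variable "access_policy_name" {
  description = "Numeric name of your Access Context Manager policy"
  type        = string
}

variable "restricted_project_numbers" {
  description = "Projects to protect, in the form projects/{project_number}"
  type        = list(string)
}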

Well, now you can go off and use Cloud Functions securely! That was quite a ride.

Oftentimes, when building something new, we get to discover corners of the technology that are not very well described or documented. Bleeding edge, we like to call it?

But the tools are there for you — use them, work with your teams, adapt to your workflows — and you will find just the right solution.

This article was first written for DevSecCon “SecAdvent Calendar” 2020: https://www.devseccon.com/the-misadventures-of-one-cloud-function-secadvent-day-14-2/

