Towards secure by default Google Cloud: Default service accounts

Jan Masarik
Published in code.kiwi.com · 8 min read · Aug 24, 2020


Have you heard that Cloud XYZ is secure because smart engineers in XYZ have made it that way? Unfortunately, as with many other products, this isn’t exactly true when you rely on the defaults. In this article, I’ll give you one specific example from Google Cloud Platform, along with a recommendation on how you can address it.

TL;DR

  • Google Cloud has a few pretty insecure defaults, even though it is generally secure by default.
  • Be aware that any application running in GCP can reach the magic metadata server to get an access token for Google APIs. This is a cool feature, but only if you’re aware of it. :-)
  • Downgrade the permissions of both default service accounts from Editor. Even Google itself now explicitly recommends it and wrote a blog post with a section about it.
  • For future projects, disable automatic role grants to default service accounts by applying the iam.automaticIamGrantsForDefaultServiceAccounts organization constraint.
I needed a GCP security image for the thumbnail, so I borrowed one from a Santa Clara meetup. :-)

Google IAM primer 🤓

Google Service Account

A service account is a special kind of account used by an application or a virtual machine (VM) instance, not a person. Applications use service accounts to make authorized API calls.

For example, a Compute Engine VM may run as a service account, and that account can be given permissions to access the resources it needs. This way the service account is the identity of the service, and the service account’s permissions control which resources the service can access.

In short, a Google service account identifies a service calling a Google API. Everything running on GCP has its identity defined by the assigned service account, which generally means that each service has its own, unique service account.

Metadata server ✨

This is a special server running in Google Cloud, reachable on the internal IP 169.254.169.254 (the same as on other cloud providers) or via the internal DNS record metadata.google.internal. It is aware of the caller’s identity, which allows your application to access Google Cloud resources without any secret embedded in the application itself.

All Google SDKs are aware of this server, and if you don’t explicitly specify a different set of credentials (e.g. via the GOOGLE_APPLICATION_CREDENTIALS env variable), the SDK will reach out to this internal server for a short-lived access token.
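To see what the SDKs do under the hood, you can query the metadata server yourself from any VM or container in GCP. A minimal sketch with curl (these are the standard v1 metadata paths):

```bash
# Ask the metadata server which service account this workload runs as...
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"

# ...and fetch a short-lived OAuth access token for it, which is exactly what
# the Google SDKs do when GOOGLE_APPLICATION_CREDENTIALS is not set.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
```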

Default Google Service Account 😱

If you don’t explicitly specify the service account that should be assigned to the app, it will receive the default service account. The catch is that Google assigns the legacy Editor role to the default service account… by default.

This effectively means that each app started in GCP is overprivileged by default as the Editor role allows you to access and edit essentially everything in the project.

Currently, there are two default service accounts, both with the Editor role: the Compute Engine default service account (<project-number>-compute@developer.gserviceaccount.com) and the App Engine default service account (<project-id>@appspot.gserviceaccount.com).

Don’t confuse default service accounts and Google-managed service accounts.
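You can verify this in your own environment with a quick check; a sketch with gcloud (PROJECT_ID is a placeholder):

```bash
# List the service accounts in the project; on a typical project you will see the
# "Compute Engine default service account" and the "App Engine default service account".
gcloud iam service-accounts list --project=PROJECT_ID --format="table(displayName, email)"

# Show who holds the legacy Editor role - by default, the default service accounts do.
gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.role=roles/editor" \
  --format="value(bindings.members)"
```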

Why is that a problem?

(In)Secure by default

It is pretty difficult to keep pace with new deployments in any fast-growing company. Because of that, it’s vital to have a platform with secure defaults, thanks to which developers can focus on the business logic - a "paved road", as Netflix likes to call it. A great example of secure defaults is React with its explicitly named dangerous functions (such as dangerouslySetInnerHTML) - as long as you’re not deliberately doing something dangerous, you should be good.

How is this approached in Google Cloud IAM?

As mentioned in the previous section, if you don’t specify a service account, you’ll automatically be assigned one of the default ones. This means that all deployments in AppEngine, Cloud Run, or Cloud Functions have the right to edit or delete nearly everything in the project... by default. 🙃

Apps running on Compute/Kubernetes Engine have slightly better defaults thanks to the legacy mechanism of access scopes, yet they can still access all Cloud Storage buckets in the project and possibly escalate from there. On GKE clusters created before version 1.10, the default scopes also allowed read-write Compute access, which can be deadly in the wrong hands. If you created your cluster more than two years ago and haven’t recreated it since, you’ll still see this scope assigned.

However, access scopes are effective only as long as you authenticate through OAuth and, more importantly, they are a legacy mechanism, not a security mechanism.
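If you want to check which scopes your existing workloads still carry, a rough sketch with gcloud (instance, cluster, and zone names are placeholders; use --region instead for regional clusters):

```bash
# Service account and scopes attached to a Compute Engine instance
gcloud compute instances describe INSTANCE_NAME --zone=ZONE \
  --format="yaml(serviceAccounts)"

# Scopes configured for a GKE cluster's default node pool
gcloud container clusters describe CLUSTER_NAME --zone=ZONE \
  --format="yaml(nodeConfig.oauthScopes)"
```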

Impact

If you follow the best practices and every application has a unique, least privileged service account, this insecure default should not have a direct impact on you.

Are you lucky enough to live in such a world?

However, achieving this state isn’t exactly an easy task if your organization consists of dozens of projects created without this tricky default in mind.

Exploitation

To exploit this, you would need to gain an initial foothold in some instance/application that has a default service account, and this is usually not as difficult as you might think. You don’t need an RCE or a compromised SSH key; a simple SSRF might be enough, all thanks to the metadata server. Many companies, including Capital One, have fallen victim to this simple SSRF + metadata server exploit chain, and your older GCP workloads might still be at risk. Once the initial foothold is established and iam.serviceAccounts.actAs is present on the compromised service (it is by default in Editor), lateral movement even beyond the project becomes possible.

Luckily, to query the metadata server from any new workload on GCP, you now need to include a Metadata-Flavor: Google header, which is considerably more difficult to achieve via SSRF. Kudos to Google Cloud for applying this defense-in-depth mechanism years before AWS, and for now even officially scheduling the old metadata server API for shutdown. You can check whether you still have any legacy metadata endpoints enabled via the LEGACY_METADATA_ENABLED finding in Google Security Command Center.
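The difference is easy to see from inside a VM; a small sketch (the last call only succeeds if legacy endpoints are still enabled on the instance):

```bash
# The v1 API rejects requests without the Metadata-Flavor header - exactly the
# header a typical SSRF cannot set:
curl "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
# -> 403 Forbidden

# With the header, the same request returns a short-lived access token:
curl -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"

# The legacy v1beta1 endpoint required no header at all, which is what the
# LEGACY_METADATA_ENABLED finding warns you about:
curl "http://metadata.google.internal/computeMetadata/v1beta1/instance/service-accounts/default/token"
```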

For more details, check these awesome GCP privilege escalation resources:

How to fix it?

If you want to address this issue in your organization, we recommend splitting your work into the following steps.

Step 1: Restrict creation of new overprivileged default service accounts

Before digging deeper into your existing projects, you should stop the creation of new overprivileged default service accounts. For this use case, Google recently created the iam.automaticIamGrantsForDefaultServiceAccounts organization constraint (which they strongly recommend applying).

Before applying this constraint org-wide, you can optionally grant the minimum required permissions to run a GKE cluster or an AppEngine application; specifically, assigning the monitoring.viewer, monitoring.metricWriter, and logging.logWriter roles should suffice. These roles ensure that 95% of use-cases remain usable even without specifying a dedicated service account, while keeping the default secure, as none of these roles brings a noteworthy risk to your project.
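If you want to do this by hand first, it roughly translates to the following gcloud commands; a sketch (ORGANIZATION_ID and PROJECT_ID are placeholders, and the roles are granted to the Compute Engine default service account as an example; repeat for the App Engine one if you use it):

```bash
# Stop granting Editor to default service accounts in all new projects:
gcloud resource-manager org-policies enable-enforce \
  iam.automaticIamGrantsForDefaultServiceAccounts \
  --organization=ORGANIZATION_ID

# Optionally keep logging/monitoring working out of the box:
PROJECT_NUMBER=$(gcloud projects describe PROJECT_ID --format="value(projectNumber)")
for ROLE in roles/monitoring.viewer roles/monitoring.metricWriter roles/logging.logWriter; do
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
    --role="${ROLE}"
done
```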

As all our projects are terraformed, this step was just a simple addition of google_organization_policy and a few google_project_iam_member resources to our Google project module. :-)

Step 2: Identify overprivileged default service accounts

This is probably the most tedious phase of the job. Here, you’ll need to go through the IAM policies of all your projects and manually review the permissions used by the default service accounts. Google IAM Recommender should give you pretty useful insights here, especially as Google continues to improve the product. However, it sometimes recommends storage-related permissions without any apparent reason, or fails to recommend any change at all.
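Pulling the Recommender’s suggestions per project is scriptable as well; a sketch (PROJECT_ID is a placeholder):

```bash
# List IAM recommendations (e.g. suggested role revocations) based on observed permission usage:
gcloud recommender recommendations list \
  --project=PROJECT_ID \
  --location=global \
  --recommender=google.iam.policy.Recommender \
  --format="table(description, stateInfo.state)"
```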

It’s important to note that you should manually review which app/instance is using these permissions, as you’ll need to assign a dedicated service account to it. The log query for checking actions performed by the service account is as simple as protoPayload.authenticationInfo.principalEmail="<project-number>-compute@developer.gserviceaccount.com", but be aware that Data Access audit logs are disabled by default.
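To get a quick overview of which APIs a default service account actually calls, you can aggregate the audit logs; a sketch (PROJECT_ID and PROJECT_NUMBER are placeholders):

```bash
# Which services/methods has the Compute Engine default service account called recently?
gcloud logging read \
  'protoPayload.authenticationInfo.principalEmail="PROJECT_NUMBER-compute@developer.gserviceaccount.com"' \
  --project=PROJECT_ID --freshness=30d \
  --format="value(protoPayload.serviceName, protoPayload.methodName)" \
  | sort | uniq -c | sort -rn
```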

Also, keep in mind that this review is a perfect time to check whether one of your projects has already been compromised, so keep your eyes open and dig deeper into any suspicious activity in the logs.

Step 3: Downgrade “unused” default service accounts

First, go ahead and downgrade all default service accounts that haven’t used anything beyond the default permissions. You can help yourself with a simple bash script.
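A minimal sketch of what such a script could look like, assuming the only binding you want to drop is the legacy Editor role (the project ID is passed as the first argument):

```bash
#!/usr/bin/env bash
# Sketch: remove the legacy Editor role from both default service accounts of a project.
# Only run this against projects you already reviewed in step 2.
set -euo pipefail

PROJECT_ID="$1"
PROJECT_NUMBER=$(gcloud projects describe "${PROJECT_ID}" --format="value(projectNumber)")

for SA in "${PROJECT_NUMBER}-compute@developer.gserviceaccount.com" \
          "${PROJECT_ID}@appspot.gserviceaccount.com"; do
  echo "Removing roles/editor from ${SA} in ${PROJECT_ID}"
  gcloud projects remove-iam-policy-binding "${PROJECT_ID}" \
    --member="serviceAccount:${SA}" \
    --role="roles/editor" \
    || echo "No Editor binding found for ${SA}"
done
```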

Don’t forget to communicate this step to your devs and make sure there is a clear point of contact for anyone who hits the IAM wall. 🙀

Step 4: Replace and downgrade remaining default service accounts

Pick up the list from step 2 and create a unique, least privileged service account for every service in the project that requires access to Google Cloud resources. Keep in mind that rotating a service account requires an instance rotation (GCE/GKE) or a redeployment (Cloud Functions).
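In gcloud terms, the per-service setup looks roughly like this (my-app, the role, and the instance name are illustrative placeholders):

```bash
# Create a dedicated service account for one service:
gcloud iam service-accounts create my-app --display-name="my-app" --project=PROJECT_ID

# Grant it only the roles it actually needs (storage.objectViewer is just an example):
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:my-app@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"

# Swapping the account on a GCE instance requires stopping it first:
gcloud compute instances stop INSTANCE_NAME --zone=ZONE
gcloud compute instances set-service-account INSTANCE_NAME --zone=ZONE \
  --service-account="my-app@PROJECT_ID.iam.gserviceaccount.com" \
  --scopes=cloud-platform
gcloud compute instances start INSTANCE_NAME --zone=ZONE
```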

Step 5: Monitor and enforce the new policy

What use is the policy if you don’t verify that people are obeying it, right? :-)
The constraint applied in step 1 will ensure a secure default, but anyone with sufficient permissions can still grant additional roles to the default service accounts. Fortunately, it’s fairly easy to monitor GCP, especially if you have Forseti Security deployed (and if you don’t, it takes minutes to do so).

Unfortunately, checking for this specific type of issue is not possible yet, but you can still, at least, denylist the roles that worry you the most (such as Editor and Owner) with a Config Validator constraint from the policy-library repo.

If you’re using Terraform, the best part is that you can use the same policy-library repo to enforce pre-deployment checks using Terraform Validator, and keep Forseti’s Config Validator as post-deployment monitoring of ad-hoc changes done outside of Terraform!
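Even with those in place (or while you’re still setting them up), a periodic ad-hoc sweep with plain gcloud is a useful sanity check; a sketch:

```bash
# Flag any project where a default service account still holds the Editor role.
for PROJECT in $(gcloud projects list --format="value(projectId)"); do
  gcloud projects get-iam-policy "${PROJECT}" \
    --flatten="bindings[].members" \
    --filter="bindings.role=roles/editor" \
    --format="value(bindings.members)" \
    | grep -E "compute@developer\.gserviceaccount\.com|appspot\.gserviceaccount\.com" \
    | sed "s/^/${PROJECT}: roles\/editor -> /"
done
```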

Before you act

For some reason, Google Cloud uses the App Engine default service account for managed Datastore import/export. If you use this functionality, you’ll need to assign the required roles to it accordingly.

Thanks

  • @init_string for consulting, suggesting the iam.automaticIamGrantsForDefaultServiceAccounts org policy, and for writing the most comprehensive post-exploitation GCP guide I’m aware of. :-)
  • To Dylan Ayrey & Allison Donovan for digging into this topic much more deeply than me! (unfortunately, in parallel :))
  • GCP support for pointing out gotchas such as the default service account being used by managed Datastore/Firestore imports/exports.
  • Our amazing infra team that was truly helpful and is always patiently helping even with the most tedious tasks like this.

We’re hiring!

Thanks for reading all the way through! If you’ve enjoyed it and you would be interested in working on similar problems at scale, please check our job listings or drop me a message on Twitter (@s14ve)!
