Securely accessing Google Cloud through GitHub with Workload Identity Federation

Ivo Patty
Appsbroker CTS Google Cloud Tech Blog
7 min read · Jan 13, 2023

Setting up CI/CD pipelines is vital to any good software project. However, setting them up correctly is not always easy. Access to your cloud resources from a pipeline is often granted through a Service Account (SA), and the easiest way to do that is to store a long-lived access key in the repository’s secrets.

Using these long-lived Google Cloud Service Account Keys (SA Keys) brings inherent risks. Once a key is obtained, there’s no further validation of authority or identity. If a highly privileged SA Key is obtained, a bad actor basically has free rein over your Google Cloud project.

In this blog post, I’ll talk you through a better way of enabling headless access to your Google Cloud environment. One that does not require you to generate SA keys that can be stolen. Instead, we will be using Workload Identity Federation to associate our CI/CD environment with Google Cloud to generate precise and short-lived access tokens. This allows us to grant access to the necessary APIs to execute tasks only when necessary.

What is Workload Identity Federation?

We’ve briefly talked about Service Accounts and how their keys are used as the traditional way of enabling external access to Google Cloud. Workload Identity Federation (WIF) builds on top of Service Accounts and the Identity and Access Management (IAM) APIs. It allows you to grant external identities IAM roles, including the ability to impersonate service accounts, eliminating the maintenance and security burden associated with service account keys.

Authentication flow when using Google Workload Identity Federation (source: Google Blog)

WIF uses OpenID Connect and the OAuth 2.0 token exchange to trade a credential from an external identity provider for a Google Cloud token. In this example we will be using GitHub as the Identity Provider. By looking at the attributes (claims) provided by the Identity Provider, we can then decide whether to issue a token and what kind of permissions to grant.
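
To make the exchange a little more concrete, here is a minimal Python sketch of the form body a client would send to Google's Security Token Service. The field names follow the OAuth 2.0 token exchange standard; the function name, project number, pool ID and provider ID are placeholders invented for illustration, and the sketch only builds the request, it does not send anything.

```python
from urllib.parse import urlencode

# Google's Security Token Service endpoint used for the exchange.
STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_body(project_number, pool_id, provider_id, oidc_token):
    """Return the form fields for exchanging an external OIDC token.

    All argument values are supplied by the caller; nothing here is
    specific to the setup in this post.
    """
    audience = (
        f"//iam.googleapis.com/projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/providers/{provider_id}"
    )
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": oidc_token,
    }


# Hypothetical values, for illustration only.
body = build_sts_exchange_body(
    "123456789", "github-pool", "github-provider", "dummy.jwt.token"
)
encoded_body = urlencode(body)  # what would be POSTed to STS_TOKEN_URL
```

The federated token returned by this exchange is then used to request a short-lived access token for the Service Account. In the rest of this post the GitHub auth action handles both steps for us.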

Let’s get to building

In this example, we will build a repository containing some code we want to run, a GitHub pipeline and our infrastructure definitions in Terraform. To get started we’ll assume you have the following:

  • a GitHub account and Git installed
  • a Google Cloud project
  • Terraform
  • the gcloud CLI
  • Python

Bootstrapping the Terraform project

To get started we need to set up the right Terraform modules to access GitHub and Google Cloud. We’ll also need to enable the correct APIs and declare which variables we are going to use. If you’re familiar with the process, feel free to skip ahead.

Enable the short-lived credentials API

In our project, we’ll declare three variables, one for the Google Cloud project we’re using and two for the repository name and workspace where our code will reside. This will allow Terraform to configure both environments accordingly.

Creating the pool

A basic project is in place, so we can get started with setting up our Identity Federation. Within Google Cloud, WIF uses Identity Pools to verify credentials and exchange them. Let’s create one:

Once an Identity Pool exists we can start configuring it. Our first step is configuring GitHub as an Identity Provider for our pool. This allows GitHub credentials and tokens to be exchanged for credentials to the Service Account we will create in a minute.

In this provider configuration you can see that we are configuring the URL where Google Cloud can check the validity of our token. We then take some of the claims of our GitHub token and map them to values in Google Cloud. Using these attributes of the token allows us to make intelligent decisions at runtime on which ‘users’ to allow. In this case we’re using the repository name to limit which repositories to allow, essentially preventing access from forked repos.
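
The mapping and the repository check can be pictured with a small Python simulation. This is not how Google Cloud implements it, just a sketch of the idea: the mapping keys mirror the style of WIF attribute names, while the helper names, the repository names and the actor are made up for illustration.

```python
# Which token claim feeds which Google Cloud attribute (illustrative mapping).
ATTRIBUTE_MAPPING = {
    "google.subject": "sub",
    "attribute.actor": "actor",
    "attribute.repository": "repository",
}


def map_claims(claims, mapping=ATTRIBUTE_MAPPING):
    """Translate identity provider claims into mapped attributes."""
    return {
        target: claims[claim]
        for target, claim in mapping.items()
        if claim in claims
    }


def is_allowed(attributes, allowed_repository):
    """Only accept tokens whose mapped repository matches exactly."""
    return attributes.get("attribute.repository") == allowed_repository


# Hypothetical claims from a workflow run in our own repository.
own_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:example-org/example-repo:ref:refs/heads/main",
    "repository": "example-org/example-repo",
    "actor": "octocat",
}

# The same workflow running in a fork reports a different repository.
fork_claims = dict(
    own_claims,
    sub="repo:someone-else/example-repo:ref:refs/heads/main",
    repository="someone-else/example-repo",
)

own_ok = is_allowed(map_claims(own_claims), "example-org/example-repo")
fork_ok = is_allowed(map_claims(fork_claims), "example-org/example-repo")
```

Because the fork's token carries its own repository name, it fails the check even though it was issued by the same Identity Provider.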

Accessing the Service Account

With an Identity Pool in place, our next step is to add identities to it. This can be done by creating a Service Account and allowing specific federated users from the pool to impersonate it.

With a Service Account created we can start linking it to the Workload Identity Pool. When WIF allows you to impersonate a Service Account this is done based on the assertions we previously assigned to attributes. These will act as a stand-in for the usual IAM roles or users that are defined in Google Cloud. In our case we want to allow our repository to access Google Cloud:
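
As a rough illustration of what such a binding contains, the Python sketch below builds the member string for every identity in the pool whose mapped repository attribute matches ours, and pairs it with the Workload Identity User role. The helper names, project number, pool ID and repository are hypothetical, and the sketch assumes the provider maps the repository claim to attribute.repository.

```python
WORKLOAD_IDENTITY_USER_ROLE = "roles/iam.workloadIdentityUser"


def repository_principal_set(project_number, pool_id, repository):
    """Member string matching all pool identities from one repository.

    Assumes the provider maps the repository claim to attribute.repository.
    """
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}/"
        f"locations/global/workloadIdentityPools/{pool_id}/"
        f"attribute.repository/{repository}"
    )


def service_account_binding(project_number, pool_id, repository):
    """IAM binding to place on the Service Account being impersonated."""
    return {
        "role": WORKLOAD_IDENTITY_USER_ROLE,
        "members": [repository_principal_set(project_number, pool_id, repository)],
    }


# Hypothetical values, for illustration only.
binding = service_account_binding(
    "123456789", "github-pool", "example-org/example-repo"
)
```

The principalSet member takes the place of the user or group you would normally see in an IAM binding.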

Linking to GitHub

Now that we’ve set up our Google Cloud project to exchange GitHub tokens for our associated Service Account, we’ll need to configure our GitHub Actions workflow to request access. Fortunately for us, Google provides an abstraction task that requests a Google Cloud token for us and makes it available from within the SDK. We simply need to add the auth task to our GitHub Actions YAML file and configure it.

The job itself depends on the presence of two variables: WIF_POOL, which is the full resource name of the Identity Pool we created, and SA_EMAIL, which is the email address of the Service Account we want to impersonate. As with all of our resources, we’ll use Terraform to provision them in GitHub.
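
A malformed value in either variable tends to surface only as an authentication failure inside the pipeline, so a quick shape check can save some debugging. The Python sketch below is one loose way to do that. It assumes WIF_POOL holds the provider-level resource name (the pool name followed by /providers/ and the provider ID), which is what the workload_identity_provider input of Google's auth action expects. The regular expressions are deliberately permissive approximations, not official validation rules.

```python
import re

# Loose shape checks; these patterns are assumptions, not official rules.
PROVIDER_NAME_RE = re.compile(
    r"^projects/[^/]+/locations/global/workloadIdentityPools/[^/]+/providers/[^/]+$"
)
SA_EMAIL_RE = re.compile(
    r"^[a-z][a-z0-9-]*[a-z0-9]@[a-z0-9.-]+\.iam\.gserviceaccount\.com$"
)


def check_secrets(values):
    """Return a list of human-readable problems found in the two values."""
    problems = []
    if not PROVIDER_NAME_RE.match(values.get("WIF_POOL", "")):
        problems.append("WIF_POOL does not look like a full provider resource name")
    if not SA_EMAIL_RE.match(values.get("SA_EMAIL", "")):
        problems.append("SA_EMAIL does not look like a Service Account email")
    return problems


# Hypothetical values, for illustration only.
example = {
    "WIF_POOL": (
        "projects/123456789/locations/global/"
        "workloadIdentityPools/github-pool/providers/github-provider"
    ),
    "SA_EMAIL": "github-actions@my-project.iam.gserviceaccount.com",
}
problems = check_secrets(example)
```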

To get access to our repository from Terraform we’ll use a Personal Access Token. You can generate either a classic token or a new fine-grained token. The token needs repo access (classic) or read/write access to Secrets (fine-grained).

Make sure to export it to your environment before running terraform apply.

export GITHUB_TOKEN=ghp_MORETOKENHERE

Let’s now inject our secret values into GitHub Actions:

With the secrets in place that’s basically all you need to get access to your Google Cloud project Service Account. We can now start adding tasks to our GitHub workflow and access our Google Cloud resources.

To keep things simple, let’s create a Python script that generates a random set of numbers and places it in a GCS bucket in our project.
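
A minimal sketch of what such a script could look like is shown here. It is not the author's script: the function names and the numbers.txt file name are made up, and the upload step assumes the google-cloud-storage client library is installed and that credentials are already available in the environment, as they are after the auth step in the workflow.

```python
import random


def generate_numbers(count=20, low=1, high=1000):
    """Return `count` distinct random integers between low and high inclusive."""
    return random.sample(range(low, high + 1), count)


def upload_numbers(project_id, numbers, blob_name="numbers.txt"):
    """Write the numbers to a bucket named after the project.

    The bucket is created if it does not exist yet.
    """
    # Imported lazily so generate_numbers works without the client library.
    from google.cloud import storage

    client = storage.Client(project=project_id)
    bucket = client.lookup_bucket(project_id) or client.create_bucket(project_id)
    blob = bucket.blob(blob_name)
    blob.upload_from_string(
        "\n".join(str(n) for n in numbers), content_type="text/plain"
    )
    return f"gs://{bucket.name}/{blob_name}"


# Example use inside the pipeline (the project ID is a placeholder):
#   upload_numbers("my-project-id", generate_numbers())
```

Using random.sample rather than repeated random draws guarantees that the 20 numbers are all different.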

You can see the exact Python code for it here. Once we have the file in our repository, we can use our authenticated GitHub actions to execute the code.

When we now commit our file and push it to GitHub we can see that our job triggers and creates a GCS bucket with the same name as our project and a file with 20 different numbers.

The pipeline creates a file when run

Cleaning Up

To prevent further charges to your project, we can clean up by running terraform destroy, which removes all resources created by our code and disables the APIs we enabled in our project.

terraform destroy

Note: WIF pools are maintained in a deleted state in GCP after terraform destroy. Recreating the environment will throw an error that the WIF pool already exists. You can change the pool name to overcome this.

Conclusion

With Workload Identity Federation in place, you can start expanding your operations in Google Cloud. You can use GitHub Actions to run tests directly on your environment or even run Terraform or other deployment tools directly within GitHub Actions.

WIF is a lot more flexible than the example we’ve just demonstrated as well. Currently, any branch, PR or push can request a Google Cloud access token. Within the IAM role binding that we’ve created above we have great flexibility to further restrict access, as Google describes here.

All of this together allows you to keep the principles of least privilege, even outside the boundaries of Google Cloud.

Special thanks to Lee Doolan for reviewing and testing the code in this blog.

Further Reading

As you’ve just seen, it’s relatively easy to set up access to your Google Cloud project without ever needing to use a pre-shared key. While this example applies to GitHub, WIF can be used with any system compatible with OAuth 2.0 and OpenID Connect. This allows you to federate VM identities in AWS or Azure to Google Cloud, or to use GitLab or Bitbucket instead.

Of course, this is only part of improving your security posture in Google Cloud. My colleague Alistair Grew has written a great article on how to handle perimeter-less security through BeyondCorp Enterprise.

About CTS

CTS is the largest dedicated Google Cloud practice in Europe and one of the world’s leading Google Cloud experts, winning 2020 Google Partner of the Year Awards for both Workspace and GCP.

We offer a unique full stack Google Cloud solution for businesses, encompassing cloud migration and infrastructure modernisation. Our data practice focuses on analysis and visualisation, providing industry-specific solutions for Retail, Financial Services, Media and Entertainment.

We’re building talented teams ready to change the world using Google technologies. So if you’re passionate, curious and keen to get stuck in — take a look at our Careers Page and join us for the ride!

Ivo Patty

Senior Data Engineer @CTS | Futurist | Data & MLOps | Photography | Thoughts are my own and not of my employer