GCP Workload Identity Federation with Federated Tokens

Arend Dittmer
Google Cloud - Community
Jul 13, 2023

When you develop applications that access any type of cloud-based service, you have to think about application authentication and authorization. In this article I will discuss application authentication in GCP and introduce GCP Workload Identity Federation. Then we will look at an example of context-based authorization in GCP with Workload Identity Federation through Okta with Federated Tokens.

Authenticating Applications with GCP

Applications that need to access GCP services authenticate with GCP through service accounts. A Service Account is a type of identity that requires no interactive authentication. A Service Account is identified by a pseudo email address in the format

<Service Account Name>@<Project ID>.iam.gserviceaccount.com

GCP IAM roles and privileges can be assigned to Service Accounts through this email reference just as they are assigned to regular user identities. Service Accounts are created within a project but can assume roles and privileges in other projects. When an application authenticates with a Service Account’s identity it has access to all resources that the Service Account has permission to access.

Applications that run on Google Compute Engine and use GCP’s Cloud Client Libraries automatically authenticate through the Service Account that is linked to the GCE instance they are running on. The link between GCE instance and Service Account is created when the instance is created and can be modified after instance creation.

Applications that run outside of GCP can use Service Account key files to authenticate to GCP services. Service Account key files can be generated and exported for each Service Account. Application Default Credentials (ADC) is a framework that client libraries use to find GCP credentials. Client libraries can, for example, find a key file containing Service Account credentials through the environment variable GOOGLE_APPLICATION_CREDENTIALS, which contains the full path to the key file. For a hands-on walkthrough of different authentication methods from systems outside of GCP, have a look at this blog post.
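To illustrate the environment variable lookup, here is a minimal sketch that reads the file GOOGLE_APPLICATION_CREDENTIALS points to and reports its "type" field. The helper name describe_adc_file is made up for this article and is not part of any Google library. A Service Account key file carries the type "service_account", while the Workload Identity Federation configuration discussed later carries "external_account". The demo uses a stand-in file that contains only the type field.

```python
import json
import os
import tempfile


def describe_adc_file(path=None):
    """Return the "type" field of the credentials file ADC would be pointed at.

    Hypothetical helper for illustration only.
    """
    path = path or os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        # Without the variable, ADC falls back to other credential sources.
        return None
    with open(path, encoding="utf-8") as f:
        info = json.load(f)
    return info.get("type")


# Demo with a stand-in file; a real key file has many more fields.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as demo_file:
    json.dump({"type": "service_account"}, demo_file)

print(describe_adc_file(demo_file.name))
os.unlink(demo_file.name)
```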

While it is relatively easy to use Service Accounts and key files to enable access to GCP services for applications that are running outside of GCP, there are a number of problems with this approach:

  • Key files have an indefinite lifetime and are valid until they are explicitly disabled or the corresponding Service Account is deleted
  • There is a significant risk of key exfiltration and frequent key rotation is required to limit the impact of a potential security breach
  • It is not possible to enable access control based on application context — an application has a fixed set of roles associated with the Service Account

Introducing Workload Identity Federation

GCP Workload Identity Federation is a new feature in GCP that enables application authentication through an outside identity provider (IdP) instead of a key file. Workload identity federation follows the OAuth 2.0 token exchange specification. It enables the exchange of an external identity in the form of a SAML or OIDC token for a Federated GCP Token. That Federated GCP token can either (1) directly represent a GCP Federated Identity or (2) be used to impersonate a Service Account through another token exchange.

The identity mapping from an external identity to a GCP Federated Identity is configured through a GCP construct called Workload Identity Pool Provider. Workload Identity Pool Providers are logically grouped in Workload Identity Pools. We will explore these constructs in more detail as we go through the example. Below is an overview of the flow for exchanging an OIDC identity token for a GCP access token.

From https://blog.salrashid.dev/articles/2021/understanding_workload_identity_federation/

The application requests (1) and receives (2) an identity token from the external IdP. Request and receipt handling needs to be implemented in the application. The URL and metadata for exchanging the external identity token for a GCP Federated Identity token through GCP’s Security Token Service (STS) are captured in a configuration for credential generation. This configuration can be explicitly or implicitly passed to GCP client libraries. From this point on all following steps are transparently handled by the GCP client libraries.

The client library forwards the IdP token received in (2) to GCP’s STS (3). The STS evaluates the token (4a) to verify the audience of the external identity token and the token issuer. The external identity token is signed with the external IdP’s private key. The corresponding public keys used for validation are accessible because each OIDC provider publishes the URI for these keys (jwks_uri) in its /.well-known/openid-configuration document. With these public keys the STS verifies the token signature (4b). The STS then returns a short-lived (1 hr) federated GCP token (5). If you use Federated Identities directly in GCP IAM, the token exchange process ends here.
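To make steps (3) to (5) more concrete, here is a minimal sketch of the request parameters a client library sends to the STS on our behalf. The helper name build_sts_exchange_params and the example values are my own, chosen for illustration. The parameter names follow the OAuth 2.0 token exchange specification. The sketch only builds the request body and does not send anything.

```python
from urllib.parse import urlencode

STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_params(
    id_token,
    audience,
    scope="https://www.googleapis.com/auth/cloud-platform",
):
    """Build the token exchange parameters for GCP's STS (illustrative helper)."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        # Resource name of the Workload Identity Pool Provider.
        "audience": audience,
        "scope": scope,
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        # The OIDC identity token issued by the external IdP.
        "subject_token": id_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
    }


# Placeholder values; project number, pool and provider IDs are assumptions.
params = build_sts_exchange_params(
    id_token="header.payload.signature",
    audience=(
        "//iam.googleapis.com/projects/123456789/locations/global"
        "/workloadIdentityPools/example-pool/providers/example-provider"
    ),
)
form_body = urlencode(params)
print("POST", STS_TOKEN_URL)
print(form_body)
```

In practice you rarely build this request yourself, since the GCP client libraries perform the exchange once they are given the configuration for credential generation.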

If Service Account impersonation is configured, the federated token is exchanged for a short-lived access token (1 hr) that represents the Service Account credentials. The mapping of Federated Identity to a Service Account is configured in GCP IAM. With this token the GCP resource is accessed (8).
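The impersonation step can be sketched as a call to the IAM Service Account Credentials API, authenticated with the federated token. The helper name build_impersonation_request, the Service Account email and the token value below are placeholders for illustration. The sketch constructs the HTTP request but does not send it.

```python
import json
import urllib.request


def build_impersonation_request(federated_token, service_account_email, scopes):
    """Build a generateAccessToken request (illustrative helper, not sent)."""
    url = (
        "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/"
        f"{service_account_email}:generateAccessToken"
    )
    body = json.dumps({"scope": list(scopes)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The federated token from the STS authorizes the impersonation.
            "Authorization": f"Bearer {federated_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for the demo.
request = build_impersonation_request(
    federated_token="federated-token-from-sts",
    service_account_email="demo-sa@example-project.iam.gserviceaccount.com",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
print(request.get_method(), request.full_url)
```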

While it is helpful to understand how Workload Identity Federation works under the hood, GCP constructs hide most of the complexity. When setting up Workload Identity Federation, most of the effort goes into configuring the identity mapping through Workload Identity Pool Providers. The application itself needs only relatively minor code changes.

Third-party identities are also supported with Application Default Credentials, which makes it easy to use these identities in your code. Once you have created the config file for credential generation, you can reference it through the environment variable GOOGLE_APPLICATION_CREDENTIALS and the GCP client libraries will automatically retrieve and exchange the third-party identity tokens through the flow we discussed. If you want to explicitly pass credentials to a client library, you can instantiate a Credentials object from the config file for credential generation and pass that object as a credential to a client library. More on this in the example below.
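As a rough sketch of the implicit route, the snippet below writes a config file for credential generation and points GOOGLE_APPLICATION_CREDENTIALS at it. The helper name write_external_account_config, the file names and the project, pool and provider values are assumptions for illustration. The field names match the configuration dictionary used in the example later in this article.

```python
import json
import os
import tempfile


def write_external_account_config(config_path, audience, token_file):
    """Write a credential configuration file (illustrative helper)."""
    config = {
        "type": "external_account",
        "audience": audience,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "token_url": "https://sts.googleapis.com/v1/token",
        # File from which the client library reads the external identity token.
        "credential_source": {"file": token_file},
    }
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config


# Placeholder locations and resource names for the demo.
demo_dir = tempfile.mkdtemp()
demo_config_path = os.path.join(demo_dir, "wif-config.json")
write_external_account_config(
    demo_config_path,
    audience=(
        "//iam.googleapis.com/projects/123456789/locations/global"
        "/workloadIdentityPools/example-pool/providers/example-provider"
    ),
    token_file=os.path.join(demo_dir, "okta_id_token"),
)

# Client libraries that rely on ADC find the config through this variable.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = demo_config_path
print(demo_config_path)
```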

Benefits of Workload Identity Federation

In contrast to Service Account authentication through key files, Workload Identity Federation has the following benefits:

  • No need to manage Service Account key files — you can disable Service Account key generation through org policy constraints
  • Access tokens are short lived
  • Authentication can be based on context through attributes in the identity tokens issued by the external IdP

Flask + Okta Hosted Login + GCP Workload Identity Federation

To give a sense of how Workload Identity Federation works and illustrate its benefits we will go through an example that builds on sample code provided by Okta. I forked the Okta repo and added code that integrates the original ‘Okta-Hosted Login’ sample with GCP Workload Identity Federation. The forked repo can be found here and the walkthrough below is based on the code in this repo. The setup of this example should be very straightforward and is described in the README. In this section we will focus on code changes and the configuration of Workload Identity Federation.

The original Okta sample code shows how to build a Flask-based web application that users log into through an Okta-Hosted Login page. The login process implements the OAuth 2.0 authorization code flow. After acknowledging the intent to log in, the user is redirected to an Okta-Hosted Login page. After the user authenticates, they are redirected back to the application with an authorization code that is then exchanged for an access token and an OIDC identity token. The original sample code then uses the access token to retrieve user information (unique_id, email and given name) from Okta’s OIDC userinfo endpoint. The user information is then displayed.

The original sample was extended to integrate with Workload Identity Federation to support GCS access authorization based on a user’s Okta group membership. The sample code assumes that three user groups have been created in Okta: ‘admin’, ‘user’ and ‘any’. The setup for the sample code generates three buckets, one for each group with the respective suffix. Each user group is only authorized to list objects in buckets that correspond to their group membership. After a user logs in, all objects the user is authorized to see are listed.

We first create a Workload Identity Pool (in setup.sh). The WORKLOAD_IDENTITY_POOL_ID can be freely chosen.

gcloud iam workload-identity-pools create "$WORKLOAD_IDENTITY_POOL_ID" \
--location="global" --description "Workload pool for Okta demo" \
--display-name "$WORKLOAD_IDENTITY_POOL_ID"

Next we create the Workload Identity Pool Provider that defines …

  • … the issuer URL: Okta’s endpoint for access and id token generation, e.g. https://dev-12345678.okta.com/oauth2/default
  • … the intended audience of the external IdP token: in our case the application’s Okta client ID (Okta always sets the audience for an OIDC identity token to the client ID)
  • … the mapping of OIDC identity token attributes to Google STS token attributes: in this example we are only mapping the subject claim, which is always required, and the groups claim

gcloud iam workload-identity-pools providers create-oidc "$WORKLOAD_IDENTITY_POOL_PROVIDER_ID" \
--location="global" \
--workload-identity-pool "$WORKLOAD_IDENTITY_POOL_ID" \
--issuer-uri "$ISSUER_URL" \
--allowed-audiences "$AUDIENCE" \
--attribute-mapping "google.subject=assertion.sub,google.groups=assertion.Groups"
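When choosing the issuer URL, the allowed audience and the attribute mapping, it can help to look at the claims the Okta identity token actually carries. The following sketch decodes the payload of a JWT without verifying its signature, which is fine for inspection but never for authorization decisions. The function name decode_jwt_claims and the sample token are made up for this article. The claim name Groups is taken from the attribute mapping above.

```python
import base64
import json


def decode_jwt_claims(token):
    """Return the claims of a JWT WITHOUT verifying its signature.

    For inspection only; the STS performs the real validation.
    """
    payload_b64 = token.split(".")[1]
    # base64url segments in JWTs are unpadded, so restore the padding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Made-up, unsigned token with the claims relevant to the provider setup.
sample_claims = {
    "iss": "https://dev-12345678.okta.com/oauth2/default",
    "aud": "example-okta-client-id",
    "sub": "user@example.com",
    "Groups": ["admin"],
}
sample_token = ".".join(
    [_b64url({"alg": "none"}), _b64url(sample_claims), "signature"]
)

claims = decode_jwt_claims(sample_token)
print("issuer:  ", claims["iss"])     # compare with --issuer-uri
print("audience:", claims["aud"])     # compare with --allowed-audiences
print("subject: ", claims["sub"])     # mapped to google.subject
print("groups:  ", claims["Groups"])  # mapped to google.groups
```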

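Because the groups claim is mapped to google.groups, each Okta group can be referenced in GCP IAM as a federated principal set. In the format below, GROUP_ID stands for the Okta user group name that is retrieved from the identity token at the time of token exchange.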

principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/group/GROUP_ID

Now we can define access policies for GCS based on Okta group memberships. Any user that is in one of the three Okta groups (‘admin’, ‘user’, ‘any’) will be limited to listing objects in the corresponding bucket. There is no need to configure individual users in GCP IAM.

gcloud storage buckets add-iam-policy-binding gs://$BUCKET_PREFIX-any \
--member=principalSet://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$WORKLOAD_IDENTITY_POOL_ID/group/any \
--role=roles/storage.objectViewer

gcloud storage buckets add-iam-policy-binding gs://$BUCKET_PREFIX-user \
--member=principalSet://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$WORKLOAD_IDENTITY_POOL_ID/group/user \
--role=roles/storage.objectViewer

gcloud storage buckets add-iam-policy-binding gs://$BUCKET_PREFIX-admin \
--member=principalSet://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$WORKLOAD_IDENTITY_POOL_ID/group/admin \
--role=roles/storage.objectViewer

Now, let’s have a look at the code changes. The comments in the code were removed. After the application receives the identity token from the Okta URI the token is saved with a random file name in the temp directory (main.py):

okta_oidc_credential_file_name = "/tmp/" + random_id + "_okta_oidc_cred"
with open(okta_oidc_credential_file_name, 'w', encoding='utf-8') as f:
    f.write(id_token)
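As a possible variation (this is not what the sample repo does), the token file could be created through Python's tempfile module, which picks a unique name and creates the file readable and writable only by the current user. The helper name write_token_file and the file name prefix are assumptions for this sketch.

```python
import os
import tempfile


def write_token_file(id_token, directory=None):
    """Write the identity token to a uniquely named temp file and return its path.

    Hypothetical alternative to building the file name by hand.
    """
    # mkstemp returns an open file descriptor and the path of the new file.
    fd, path = tempfile.mkstemp(prefix="okta_oidc_cred_", dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(id_token)
    return path


# Demo with a placeholder token value.
token_path = write_token_file("header.payload.signature")
print(token_path)
os.unlink(token_path)  # remove the file once it is no longer needed
```

Whichever approach is used, the file should be deleted once the credentials are no longer needed, since it contains a valid identity token.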

We create a configuration dictionary for the exchange of the external identity token against a GCP Federated Identity token. It contains the full path of the external identity token file and the STS endpoint. Note that the audience needs to be set to the workload identity pool provider resource name (main.py):

audience_str = (
    f"//iam.googleapis.com/projects/{project_number}/locations/global"
    f"/workloadIdentityPools/{workload_identity_pool_id}"
    f"/providers/{workload_identity_pool_provider_id}"
)

gcp_credential_dict = {
    "type": "external_account",
    "audience": audience_str,
    "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
    "token_url": "https://sts.googleapis.com/v1/token",
    "credential_source": {
        "file": okta_oidc_credential_file_name
    },
}

Next we set the token scope, create a Credentials object from this dictionary and call the list_objects function, a wrapper around the GCS API calls. Note that the credentials that define the token exchange can now be passed to GCP APIs (main.py):

from google.auth import identity_pool

scopes = ['https://www.googleapis.com/auth/devstorage.full_control']
credentials = identity_pool.Credentials.from_info(gcp_credential_dict)
scoped_credentials = credentials.with_scopes(scopes)
project = os.getenv('PROJECT_ID')
allblobs = list_objects(project, scoped_credentials)

The list_objects function tries to read from all three buckets and catches exceptions so that a ‘403 Permission Denied’ (or any other error) on one bucket does not abort the listing (helpers.py):

import os

from google.cloud import storage


def list_objects(project, scoped_credentials):
    storage_client = storage.Client(project=project, credentials=scoped_credentials)

    bucket_prefix = os.getenv('BUCKET_PREFIX')
    bucket_name_list = [bucket_prefix+"-any", bucket_prefix+"-user", bucket_prefix+"-admin"]
    blob_name_list = []

    # Iterate over the three buckets, ignoring 403s and other errors
    for bucket_name in bucket_name_list:
        bucket = storage_client.bucket(bucket_name)
        try:
            for blob in bucket.list_blobs():
                blob_name_list.append(blob.name)
        except Exception:
            # Skip buckets the federated identity is not allowed to list
            continue

    return blob_name_list

This sample shows how to leverage a user’s group identity as context for authorization. Beyond user group identities it is possible to map custom attributes and even combine attributes through CEL to address more complex requirements for context. It is also possible to use attribute conditions to check assertion attributes and target attributes. Federated principals with custom attributes can be referenced following the convention

principalSet://iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/attribute.ATTRIBUTE_NAME/ATTRIBUTE_VALUE
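Since these principal identifiers are plain strings, they are easy to assemble in code, for example when generating IAM bindings for many groups. The two helper functions below are hypothetical and only format the identifiers according to the two conventions shown in this article. The project number, pool ID and attribute values in the demo are placeholders.

```python
def _pool_prefix(project_number, pool_id):
    """Common prefix of all principal sets in a Workload Identity Pool."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
    )


def group_principal_set(project_number, pool_id, group_id):
    """Principal set for all identities in a mapped group."""
    return f"{_pool_prefix(project_number, pool_id)}/group/{group_id}"


def attribute_principal_set(project_number, pool_id, name, value):
    """Principal set for all identities with a given custom attribute value."""
    return f"{_pool_prefix(project_number, pool_id)}/attribute.{name}/{value}"


# Placeholder values for the demo.
print(group_principal_set("123456789", "example-pool", "admin"))
print(attribute_principal_set("123456789", "example-pool", "department", "finance"))
```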

Support for Federated Identities in GCP

Many GCP offerings now support Federated Identities. Examples are BigQuery, Google Compute Engine, Dataflow, Dataproc and Pub/Sub. Some of these offerings have limitations when used with Federated Identities. We are currently updating the documentation on this topic and will have a complete list shortly.

Summary

In this article we first discussed application authentication in GCP. Then we covered the drawbacks of Service Account key file based authentication for applications running outside of GCP. We looked into Workload Identity Federation as an alternative to using Service Accounts with key files. We talked about the token exchange flow for Federated Identities as well as Service Account impersonation. Finally, we illustrated the power of Federated Identities through an example that uses the Okta group id in an OIDC identity token as context, mapping this group id to a GCP Federated Identity. The code sample shows how you can ensure that GCS bucket access is automatically limited to users in the corresponding Okta group without requiring any IAM configuration changes when users are added in Okta.
