Hey Google (Cloud API), trust my AWS Application

Daniel Strebel
Google Cloud - Community
8 min read · Sep 4, 2024

In today’s application landscape, where seamless and secure communication between services is crucial, establishing authentication and trust between them is a common challenge. This is especially true in multi-cloud environments, where organizations use cloud-hosted solutions like Vertex AI or BigQuery in Google Cloud and access them from applications that are deployed in a variety of locations including Google Cloud, on-prem or other cloud providers.

In this article we explore Workload Identity Federation as a best practice for achieving authentication and trust in a multi-cloud scenario. In our example we have an application that runs on AWS and wants to consume Google Cloud services. We start off with a high-level introduction of Workload Identity Federation, and continue with the happy path of running Workload Identity Federation on EC2 virtual machines. Lastly we also discuss the pitfalls and introduce workarounds for implementing Workload Identity Federation for applications in AWS’s Elastic Container Service (ECS).

If you’re instead interested in how Workload Identity Federation works in GKE or GKE Fleets, check out the previous blog posts on those topics.

Promise you no longer download service account keys!

Historically, administrators had to inject credentials like API keys, private certificates, or service account keys into workloads to provide them an identity that is known and trusted by the target service. For accessing Google Cloud APIs specifically this meant that you exported service account keys (basically a JSON file that contains a private key) and provided them to the application at runtime. This practice is explicitly discouraged in the service account documentation.

With Workload Identity Federation this process has become a lot more secure and simpler at the same time. It allows administrators to create trusted external credential issuers in a workload identity pool. The workload identity pool allows clients to exchange credentials that they obtained from these external issuers via the Security Token Service (STS) for access tokens that can be used when interacting with Google Cloud APIs.
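To make the exchange step more concrete, here is a minimal sketch of the form fields that a client sends to the Security Token Service, following the OAuth 2.0 token exchange format. The helper name build_token_exchange_request is hypothetical and only illustrates the shape of the request; in practice the Google Cloud client libraries assemble and send it for you, and the subject token is whatever the external issuer provides.

```python
# Sketch only: builds the token exchange fields, it does not send them.
STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_token_exchange_request(audience, subject_token, subject_token_type):
    """Return the form fields for an OAuth 2.0 token exchange request.

    audience: full resource name of the workload identity pool provider.
    subject_token: credential obtained from the external issuer.
    subject_token_type: token type URN describing that credential.
    """
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": subject_token_type,
        "subject_token": subject_token,
    }


if __name__ == "__main__":
    # Placeholder values, purely for illustration.
    fields = build_token_exchange_request(
        "//iam.googleapis.com/projects/123/locations/global/"
        "workloadIdentityPools/my-workload-pool/providers/my-provider",
        "<external credential>",
        "urn:ietf:params:aws:token-type:aws4_request",
    )
    for key, value in fields.items():
        print(f"{key}={value}")
```

The response to such a request contains a short-lived access token, which is why nothing long-lived has to be stored with the workload.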

Security is improved because many modern platforms, such as GKE or CI/CD platforms like GitLab CI and GitHub Actions, automatically mount and rotate credentials, so no sensitive credentials or API keys need to be stored.

Workload Identity Federation on AWS is easy: EC2 with attached roles

Let’s dive deeper into Workload Identity Federation in a multi-cloud scenario between AWS and Google Cloud. Specifically we will use the automatically issued AWS Security Credentials to authenticate and securely access Google Cloud APIs including the Gemini APIs in Vertex AI.

Let’s start with the happy path and look at how Workload Identity Federation works in an AWS EC2 virtual machine:

For this we first need to create a workload identity pool and provider in Google Cloud:

# change these
export GCP_PROJECT_ID=<Your Project ID>
export AWS_ACCOUNT_ID=<Your AWS Account ID>

# change optionally
export POOL_ID=my-workload-pool
export PROVIDER_ID=my-provider

gcloud iam workload-identity-pools create "$POOL_ID" \
--location="global" --project="$GCP_PROJECT_ID"

gcloud iam workload-identity-pools providers create-aws "$PROVIDER_ID" \
--location="global" \
--workload-identity-pool="$POOL_ID" \
--account-id="$AWS_ACCOUNT_ID" \
--project="$GCP_PROJECT_ID"

With the pool and provider configured, we can now generate a configuration file for workloads running in AWS:

GCP_PROJECT_NUMBER="$(gcloud projects describe $GCP_PROJECT_ID --format="value(projectNumber)")"

gcloud iam workload-identity-pools create-cred-config \
"projects/$GCP_PROJECT_NUMBER/locations/global/workloadIdentityPools/$POOL_ID/providers/$PROVIDER_ID" \
--aws \
--enable-imdsv2 \
--output-file=gcp-credentials-ec2.json

The generated configuration looks like this:

{
"universe_domain": "googleapis.com",
"type": "external_account",
"audience": "//iam.googleapis.com/projects/<GCP_PROJECT_NUMBER>/locations/global/workloadIdentityPools/my-workload-pool/providers/my-provider",
"subject_token_type": "urn:ietf:params:aws:token-type:aws4_request",
"token_url": "https://sts.googleapis.com/v1/token",
"credential_source": {
"environment_id": "aws1",
"region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
"regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15",
"imdsv2_session_token_url": "http://169.254.169.254/latest/api/token"
},
"token_info_url": "https://sts.googleapis.com/v1/introspect"
}

Note that this file does not contain any sensitive credentials. Instead it describes a token endpoint in the form of the EC2 metadata server. It can be reached via a link-local IP address when the application runs within an EC2 environment.
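As a quick illustration of what the file describes, the following sketch inspects the credential_source section with the Python standard library only. The helper names are hypothetical, and the region derivation assumes a standard availability zone name that consists of the region plus a single trailing letter (for example us-east-1a); other zone naming schemes would need different handling.

```python
from urllib.parse import urlparse

# The credential_source section of the generated file, copied from above.
EXAMPLE_CONFIG = {
    "credential_source": {
        "environment_id": "aws1",
        "region_url": "http://169.254.169.254/latest/meta-data/placement/availability-zone",
        "url": "http://169.254.169.254/latest/meta-data/iam/security-credentials",
        "regional_cred_verification_url": "https://sts.{region}.amazonaws.com?Action=GetCallerIdentity&Version=2011-06-15",
        "imdsv2_session_token_url": "http://169.254.169.254/latest/api/token",
    }
}


def metadata_hosts(config):
    """Collect the hosts that the metadata URLs in credential_source point at."""
    source = config["credential_source"]
    keys = ("region_url", "url", "imdsv2_session_token_url")
    return {urlparse(source[key]).hostname for key in keys if key in source}


def region_from_availability_zone(zone):
    """Assumption: the zone is '<region><letter>', e.g. 'us-east-1a'."""
    return zone[:-1]


def verification_url(config, zone):
    """Fill the {region} placeholder of the regional verification URL."""
    template = config["credential_source"]["regional_cred_verification_url"]
    return template.replace("{region}", region_from_availability_zone(zone))


if __name__ == "__main__":
    print(metadata_hosts(EXAMPLE_CONFIG))
    print(verification_url(EXAMPLE_CONFIG, "us-east-1a"))
```

All metadata URLs resolve to the same link-local host, which matters later when we move the workload to an environment where that host is not available.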

That’s it for the setup on the Google Cloud side for now. On the EC2 instance we need to:

  1. Ensure the EC2 instance has an IAM role attached.
  2. Upload the config file that we created before to an arbitrary path like $HOME/gcp-credentials-ec2.json
  3. Install the gcloud CLI (or use one of the Google Cloud SDKs for your programming language)

Now, we’re ready to verify our workload identity pool works as expected:

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/gcp-credentials-ec2.json"
gcloud auth login --cred-file="$HOME/gcp-credentials-ec2.json"

The output of the login command should show the mapped principal that the workload identity pool created for our instance:

Authenticated with external account credentials for: 
[principal://iam.googleapis.com/projects/<GCP_PROJECT_NUMBER>/locations/global/workloadIdentityPools/my-workload-pool/subject/arn:aws:sts::<AWS_ACCOUNT_ID>:assumed-role/<INSTANCE_ROLE>/<INSTANCE_ID>].

We could already use this principal to assign it roles in Google Cloud IAM. However, the Instance ID is not the best semantic representation of the service that runs on AWS and changes if instances are deleted and re-created. Instead we probably want to use a mapped attribute that extracts the role used on AWS. In the attribute mapping of the workload identity provider for AWS you can see the following mapping:

[Figure: Attribute mapping in the GCP workload identity provider for AWS]

This maps google.subject to the AWS ARN and derives attribute.aws_role via the following CEL expression:

assertion.arn.contains('assumed-role') ? 
assertion.arn.extract('{account_arn}assumed-role/') + 'assumed-role/' +
assertion.arn.extract('assumed-role/{role_name}/') : assertion.arn

This mapping transforms the AWS ARN of the instance into a principal set that we can use in Google Cloud IAM. In our concrete example the ARN of:

arn:aws:sts::<AWS_ACCOUNT_ID>:assumed-role/<INSTANCE_ROLE>/<INSTANCE_ID>

will be mapped into a principalSet of:

principalSet://iam.googleapis.com/projects/<GCP_PROJECT_NUMBER>/locations/global/workloadIdentityPools/my-workload-pool/attribute.aws_role/arn:aws:sts::<AWS_ACCOUNT_ID>:assumed-role/<INSTANCE_ROLE>
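To see what the mapping does to a concrete ARN, here is a rough Python stand-in for the CEL expression. The extract helper only approximates the CEL extract() function for the two templates used here, and its edge-case behavior may differ from the real implementation. The account ID, role name and instance ID in the example are made up.

```python
def extract(value, template):
    """Approximate CEL extract(): return the text matched by the {placeholder}.

    The text before '{' is treated as a prefix and the text after '}' as a
    suffix. Simplification: if the suffix is missing, the rest is returned.
    """
    start = template.index("{")
    end = template.index("}")
    prefix, suffix = template[:start], template[end + 1:]
    begin = value.find(prefix)
    if begin == -1:
        return ""
    begin += len(prefix)
    if not suffix:
        return value[begin:]
    stop = value.find(suffix, begin)
    return value[begin:] if stop == -1 else value[begin:stop]


def aws_role(arn):
    """Mirror the attribute.aws_role mapping: drop the session name."""
    if "assumed-role" in arn:
        return (
            extract(arn, "{account_arn}assumed-role/")
            + "assumed-role/"
            + extract(arn, "assumed-role/{role_name}/")
        )
    return arn


def principal_set(project_number, pool_id, arn):
    """Build the principalSet identifier for the mapped role."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/attribute.aws_role/{aws_role(arn)}"
    )


if __name__ == "__main__":
    arn = "arn:aws:sts::123456789012:assumed-role/my-instance-role/i-0123456789abcdef0"
    print(aws_role(arn))
    print(principal_set("1234567890", "my-workload-pool", arn))
```

Because the instance ID is stripped, every instance that assumes the same role maps to the same principal set, so the IAM binding survives instance replacement.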

When we assign the required roles to that principal set, we can allow our EC2 workload to perform the necessary request against the authorized APIs via the federated credentials.

For example, if we give the principal set the Cloud Run Viewer role, our EC2 instance is able to perform the following request via the gcloud CLI:

gcloud run services list --project $GCP_PROJECT_ID

Of course the credentials can also be used via the Google Cloud SDKs to build applications that directly interact with Google Cloud Services.

Workload Identity Federation in ECS on Fargate

Based on what we did in EC2, we want to replicate the same behavior in a serverless environment and configure an ECS service to run on Fargate. Just like in our EC2 example we get the configuration JSON file for the Workload Identity Federation provider and configure an environment variable called GOOGLE_APPLICATION_CREDENTIALS that points to it. In theory the Google Cloud SDK should automatically pick up this configuration and use it to obtain a federated access token.

Note: If this worked you’re in luck and it probably means that support for ECS has been implemented since the writing of this article. You should proceed with the transparent configuration of the authentication provider and skip implementing the workaround described in this section.

At the time of writing, using the default configuration to obtain federated credentials does not work on ECS. Digging into the error logs gives some indication of what went wrong. Here’s what the error for the Python SDK reveals:

google.api_core.exceptions.ServiceUnavailable: 503 Getting metadata from 
plugin failed with error: HTTPConnectionPool(host='169.254.169.254', port=80):
Max retries exceeded with url: /latest/meta-data/iam/security-credentials

Apparently the credential_source that is specified in our generated configuration isn’t applicable in an ECS environment. Specifically the metadata on ECS is provided on a different link-local IP address 169.254.170.2 instead of the 169.254.169.254 that is used in EC2.
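For illustration, this is roughly how the ECS credentials endpoint is located. The sketch assumes the task exposes the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable, which ECS sets for tasks with a task role; the helper name ecs_credentials_url is hypothetical. AWS SDKs resolve this endpoint on their own, so the snippet only shows why the EC2 URLs from the generated configuration do not apply here.

```python
import os

# Link-local address of the ECS credentials endpoint (differs from EC2's IMDS).
ECS_CREDENTIALS_HOST = "http://169.254.170.2"


def ecs_credentials_url(env=os.environ):
    """Return the task credentials URL, or None when not running with a task role."""
    relative_uri = env.get("AWS_CONTAINER_CREDENTIALS_RELATIVE_URI")
    if not relative_uri:
        return None
    return ECS_CREDENTIALS_HOST + relative_uri


if __name__ == "__main__":
    print(ecs_credentials_url() or "no ECS task credentials endpoint found")
```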

Does this mean we can’t use Workload Identity Federation on ECS? No, but it means that we need to customize how the SDK obtains credentials.

In a Python environment you could rely on AWS’s boto3 SDK to obtain the security credentials for the ECS service and implement a custom AWS security credentials supplier:

import os

import boto3
from google.auth import aws
from google.auth import exceptions


class CustomAwsSecurityCredentialsSupplier(aws.AwsSecurityCredentialsSupplier):
    """Supplies AWS credentials that boto3 resolves, including on ECS."""

    def get_aws_security_credentials(self, context, request):
        try:
            # boto3 walks its default provider chain, which covers the ECS container endpoint.
            aws_credentials = boto3.Session().get_credentials().get_frozen_credentials()
            return aws.AwsSecurityCredentials(
                aws_credentials.access_key,
                aws_credentials.secret_key,
                aws_credentials.token,
            )
        except Exception as e:
            raise exceptions.RefreshError(e, retryable=True)

    def get_aws_region(self, context, request):
        # Hard-coded for this example; adjust to the region your task runs in.
        return "us-east-1"


# The audience identifies the workload identity pool provider configured earlier.
credentials = aws.Credentials(
    f"//iam.googleapis.com/projects/{os.getenv('GCP_PROJECT_NUMBER')}/locations/global/workloadIdentityPools/{os.getenv('WORKLOAD_IDENTITY_POOL_ID')}/providers/{os.getenv('WORKLOAD_IDENTITY_PROVIDER_ID')}",
    "urn:ietf:params:aws:token-type:aws4_request",
    aws_security_credentials_supplier=CustomAwsSecurityCredentialsSupplier(),
    scopes=['https://www.googleapis.com/auth/cloud-platform'],
)

This credential can then be embedded into the existing application code. Note that in the example below the custom credential supplier is guarded by an ECS-specific environment variable check, so that we can use the same implementation in environments that are supported by the default configuration behavior:

...
gcp_credentials = None
# Use custom AWS Security Credentials Supplier
if os.getenv("AWS_EXECUTION_ENV") == "AWS_ECS_FARGATE":
    from gcp_aws_credentials import credentials
    gcp_credentials = credentials

vertexai.init(project=os.getenv("GCP_PROJECT_ID"), location="us-east1", credentials=gcp_credentials)
...
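One detail of the supplier module is worth a closer look: the audience string is assembled with os.getenv, so an unset variable silently ends up as the literal text None inside the audience. The following sketch shows one possible way to fail early instead. The helper name build_audience is hypothetical; the environment variable names are the ones used in the supplier module.

```python
import os

# Environment variables read by the supplier module in this article.
REQUIRED_VARS = (
    "GCP_PROJECT_NUMBER",
    "WORKLOAD_IDENTITY_POOL_ID",
    "WORKLOAD_IDENTITY_PROVIDER_ID",
)


def build_audience(env=os.environ):
    """Build the provider audience, raising if a required variable is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    project_number, pool_id, provider_id = (env[name] for name in REQUIRED_VARS)
    return (
        f"//iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/providers/{provider_id}"
    )
```

The returned string could then be passed as the first argument to aws.Credentials in place of the inline f-string.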

The full ECS example for Python can be found in this GitHub repository.

The approach of customizing the credentials provider isn’t limited to the Python SDK. The same customization can also be achieved in Go, as shown in this folder of the same repository. The custom token supplier here is based on the AWS SDK for Go, as you can see below:

package ecs

import (
	"context"
	"fmt"
	"os"

	"github.com/aws/aws-sdk-go-v2/config"
	"golang.org/x/oauth2"
	"golang.org/x/oauth2/google/externalaccount"
)

// customAwsSecurityCredentialsSupplier implements the
// externalaccount.AwsSecurityCredentialsSupplier interface.
type customAwsSecurityCredentialsSupplier struct{}

// AwsRegion retrieves the AWS region from the environment
func (s customAwsSecurityCredentialsSupplier) AwsRegion(ctx context.Context, options externalaccount.SupplierOptions) (string, error) {
	region := os.Getenv("AWS_REGION")
	if region == "" {
		return "", fmt.Errorf("AWS_REGION environment variable is not set")
	}
	return region, nil
}

// AwsSecurityCredentials retrieves AWS credentials from the default config
func (s customAwsSecurityCredentialsSupplier) AwsSecurityCredentials(ctx context.Context, options externalaccount.SupplierOptions) (*externalaccount.AwsSecurityCredentials, error) {
	conf, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		return nil, fmt.Errorf("error loading AWS config: %w", err)
	}

	credentials, err := conf.Credentials.Retrieve(ctx)
	if err != nil {
		return nil, fmt.Errorf("error retrieving AWS credentials: %w", err)
	}

	return &externalaccount.AwsSecurityCredentials{
		AccessKeyID:     credentials.AccessKeyID,
		SecretAccessKey: credentials.SecretAccessKey,
		SessionToken:    credentials.SessionToken,
	}, nil
}

// GetECSTokenSource returns a token source that exchanges AWS credentials
// for Google Cloud access tokens via Workload Identity Federation.
func GetECSTokenSource(ctx context.Context) (oauth2.TokenSource, error) {
	// Read GCP Workload identity config from env variables
	projectNumber := os.Getenv("GCP_PROJECT_NUMBER")
	if projectNumber == "" {
		return nil, fmt.Errorf("GCP_PROJECT_NUMBER environment variable is not set")
	}

	workloadPoolId := os.Getenv("WORKLOAD_IDENTITY_POOL_ID")
	if workloadPoolId == "" {
		return nil, fmt.Errorf("WORKLOAD_IDENTITY_POOL_ID environment variable is not set")
	}
	providerId := os.Getenv("WORKLOAD_IDENTITY_PROVIDER_ID")
	if providerId == "" {
		return nil, fmt.Errorf("WORKLOAD_IDENTITY_PROVIDER_ID environment variable is not set")
	}

	// Create an instance of your AWS Security Credentials Supplier
	awsSupplier := customAwsSecurityCredentialsSupplier{}

	// Create a GCP token source using the AWS credentials
	// (assumes you have the necessary GCP permissions)
	tokenSource, err := externalaccount.NewTokenSource(ctx, externalaccount.Config{
		SubjectTokenType:               "urn:ietf:params:aws:token-type:aws4_request",
		AwsSecurityCredentialsSupplier: awsSupplier,
		// Audience is built from the environment variables read above.
		Audience: fmt.Sprintf("//iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s/providers/%s", projectNumber, workloadPoolId, providerId),
		Scopes:   []string{"https://www.googleapis.com/auth/cloud-platform"},
	})
	if err != nil {
		return nil, err
	}

	return tokenSource, nil
}

The result is a TokenSource that can be used to customize the client option in the application and obtain the credentials in ECS. Just like in the Python example we also apply an environment variable check to apply the customization only when needed:

ops := []option.ClientOption{}

// Use custom AWS Security Credentials Supplier
if os.Getenv("AWS_EXECUTION_ENV") == "AWS_ECS_FARGATE" {
	ecsTokenSource, err := ecs.GetECSTokenSource(ctx)
	if err != nil {
		fmt.Printf("Error getting ECS token source: %v\n", err)
		http.Error(w, "Internal server error", http.StatusInternalServerError)
		return
	}
	ops = append(ops, option.WithTokenSource(ecsTokenSource))
}

Conclusions

This article showcased the elegance and simplicity of Workload Identity Federation, demonstrating how it streamlines workload authentication via platform-native identity for seamless Google Cloud API access. We also demonstrated how flexible the implementation of Workload Identity Federation is by using the language-specific SDKs to provide a custom credential provider for ECS. Equipped with this knowledge, we are hopefully two steps closer to finally eliminating the need for exported service account keys once and for all.
