Accessing AWS Resources from Google Kubernetes Engine

Steven Aldinger
TeamSnap Engineering
Feb 15, 2024 · 7 min read

My team had a requirement to integrate with a project in AWS, but the primary infrastructure we were working with was all in Google Cloud already. It only takes a few steps to set up a service account user in AWS and generate an API key to use for authenticating from GCP, but using a long-lived key isn’t best practice and felt like too much of a security risk to use in a production solution.

With AWS support for OpenID Connect federation, and Google workload identity federation in GKE, setting up a secure solution with short-lived credentials is pretty easy. It’s more tedious than just putting an API key in an environment variable and calling it done, but still reasonable to set up in short order if you already know the path to take.

There’s some decent documentation around this from both AWS and GCP, but it’s easy to stray down complicated tangents, and figuring out a proper implementation felt more involved than it needed to be. The purpose of this article is to show a direct path with no extraneous detail to sidetrack you. The example here shows how to provide secure AWS admin access with short-lived credentials to an Atlantis server, but this is a general solution that will work for any application running in GKE.

Strategy Overview

There’s a handful of distinct pieces involved here. The combination of things might seem overwhelming at first, but no individual step is very complicated, and each step has its own section further down that elaborates where it seemed helpful.

  1. [GCP] Create a service account to use for identity.
  2. [GKE] Create a K8s service account in the namespace of the application that needs AWS access.
  3. [GCP] Create an iam.workloadIdentityUser role binding that allows the K8s service account to act as the Google service account.
  4. [AWS] Configure an IAM role that allows authentication from the Google service account.
  5. [AWS] Configure an IAM policy binding that allows the service account’s role to access whichever resources the application needs.
  6. [Application] Configure an AWS profile with a credentials script in the application’s container that can fetch fresh short-lived credentials as necessary.
  7. [Application] Refer to the AWS profile in the application.

Create the GCP Service Account

The first step is to create a service account in Google Cloud IAM. This service account doesn’t need any roles granted, since it’s being used purely for identity.

module "atlantis_svc_account" {
  source  = "terraform-google-modules/service-accounts/google"
  version = "4.2.2"

  project_id   = "gcp-project-id"
  names        = ["atlantis"]
  display_name = "atlantis"
  description  = "Atlantis Service Account"

  project_roles = []
}

There’s nothing special about the service account creation, so if you want to use an existing service account or create it in the Google Cloud UI, or with Google’s Terraform provider resources directly, that’ll work fine. I use terraform-google-modules/service-accounts/google for other things so I’m using it here too. The simplest version might look like the following snippet instead.

resource "google_service_account" "atlantis" {
  account_id   = "atlantis"
  display_name = "Atlantis Service Account"
}
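If you’d rather not manage the service account in Terraform at all, a one-off equivalent with the gcloud CLI might look like the following sketch (the project ID and names are the same placeholders used throughout this example).

```shell
gcloud iam service-accounts create atlantis \
  --project gcp-project-id \
  --display-name "Atlantis Service Account" \
  --description "Atlantis Service Account"
```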

Create a Kubernetes Service Account

Create a K8s service account with the special annotation iam.gke.io/gcp-service-account, whose value needs to match the email address of the Google Cloud service account. You can follow along with Google’s documentation here, but just looking at the declarative version of what’s being accomplished is always clearer to me personally, so here’s what that looks like.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: atlantis
  namespace: atlantis
  annotations:
    iam.gke.io/gcp-service-account: atlantis@gcp-project-id.iam.gserviceaccount.com
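The annotation alone isn’t enough; the workload also has to actually run as that K8s service account. In a Deployment, that’s the serviceAccountName field in the pod spec. A minimal sketch (the Deployment name and container image here are assumptions, not part of the original setup):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: atlantis
  namespace: atlantis
spec:
  selector:
    matchLabels:
      app: atlantis
  template:
    metadata:
      labels:
        app: atlantis
    spec:
      # run the pod as the annotated K8s service account
      serviceAccountName: atlantis
      containers:
        - name: atlantis
          image: ghcr.io/runatlantis/atlantis # hypothetical image reference
```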

Create a workloadIdentityUser Role Binding

This step gives the K8s service account permission to assume the identity of the Google Cloud service account. Prior to workload identity federation in GKE, this may have been accomplished in a less-secure way by mounting a JSON file with GCP service account credentials inside the application container. One advantage of using workload identity for this is that there isn’t a long-lived credentials file exposed anywhere in the process, so if there’s a security breach, any credentials stolen will be useless soon after they’re taken.

The service account reference for workload identity federation in GKE looks a little interesting compared to the email address format that’s more typical in Google Cloud. The format is serviceAccount:GCP_PROJECT_ID.svc.id.goog[KUBE_NAMESPACE/KUBE_SVC_ACCOUNT_NAME], where KUBE_NAMESPACE and KUBE_SVC_ACCOUNT_NAME depend on the K8s service account creation in the previous step. In this example the K8s namespace and K8s service account are both named atlantis, so the service account reference looks like serviceAccount:gcp-project-id.svc.id.goog[atlantis/atlantis].

resource "google_service_account_iam_binding" "atlantis_k8s_workload_identity" {
  service_account_id = module.atlantis_svc_account.service_account.name
  role               = "roles/iam.workloadIdentityUser"

  # "serviceAccount:GCP_PROJECT_ID.svc.id.goog[KUBE_NAMESPACE/KUBE_SVC_ACCOUNT_NAME]"
  members = ["serviceAccount:gcp-project-id.svc.id.goog[atlantis/atlantis]"]
}
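If you’re following Google’s documentation with the CLI instead of Terraform, the equivalent binding is a sketch like this:

```shell
gcloud iam service-accounts add-iam-policy-binding \
  atlantis@gcp-project-id.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:gcp-project-id.svc.id.goog[atlantis/atlantis]"
```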

Halfway There!

At this point, all the setup on the GCP side of things is finished, and the project is ready to use the GCP service account with OIDC federation in AWS, but we still need to configure the AWS project to recognize the service account and allow the specific access we want to give it.

Configure an AWS Role

The role demonstrated here is based on this AWS documentation, where 123456789012345678900 is the unique ID of the Google Cloud service account. This ID can be found in the GCP console UI under the Unique ID section of the service account’s details page.
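If you’d rather grab the unique ID from the command line than dig through the console, gcloud can print it directly (a sketch using this example’s service account email):

```shell
gcloud iam service-accounts describe \
  atlantis@gcp-project-id.iam.gserviceaccount.com \
  --format "value(uniqueId)"
```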

The accounts.google.com:oaud field in the role definition is arbitrary, but whatever value you decide to use needs to match the credentials script in an upcoming step. In this example, I’m using AtlantisAccess as the oaud so it’s obvious what its purpose is supposed to be at a glance.

The principal value {"Federated": "accounts.google.com"} is boilerplate, but is also the real magic here, immediately integrating with Google’s IAM solution without explicitly configuring identity pools or anything complicated between the two cloud providers.

locals {
  # hardcode if you want
  svc_acct_id = "123456789012345678900"

  # use the module output instead if that makes sense with your setup
  #
  # svc_acct_id = module.atlantis_svc_account.service_account.unique_id

  # use the resource output if you created the service account that way
  #
  # svc_acct_id = google_service_account.atlantis.unique_id
}

resource "aws_iam_role" "atlantis_google_workload_identity_role" {
  name = "atlantis_google_workload_identity_role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Federated": "accounts.google.com"},
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:aud": "${local.svc_acct_id}",
          "accounts.google.com:oaud": "AtlantisAccess",
          "accounts.google.com:sub": "${local.svc_acct_id}"
        }
      }
    }
  ]
}
EOF
}


output "atlantis_role_arn" {
  value = aws_iam_role.atlantis_google_workload_identity_role.arn
}

Configure an AWS Policy Binding

For the Atlantis use case, you might want to give blanket admin permissions using the special arn:aws:iam::aws:policy/AdministratorAccess ARN that AWS provides by default.

resource "aws_iam_role_policy_attachment" "atlantis_admin" {
  role       = aws_iam_role.atlantis_google_workload_identity_role.name
  policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}

If you’re looking to lock things down more explicitly, you can attach a custom policy instead. The following snippet shows an example of how you might allow listing secrets throughout the project, and only allow reading secrets hosted in a specified aws_region.

data "aws_iam_policy_document" "get_secrets" {
  statement {
    actions = [
      "secretsmanager:GetResourcePolicy",
      "secretsmanager:GetSecretValue",
      "secretsmanager:DescribeSecret",
      "secretsmanager:ListSecretVersionIds",
    ]
    resources = ["arn:aws:secretsmanager:${var.aws_region}:${var.aws_account_id}:secret:*"]
    effect    = "Allow"
  }

  statement {
    actions   = ["secretsmanager:ListSecrets"]
    resources = ["*"]
    effect    = "Allow"
  }
}

resource "aws_iam_policy" "secrets_manager_policy" {
  name        = "example-secrets-manager-policy"
  description = "Read-only access to secrets manager"
  path        = "/example/"

  policy = data.aws_iam_policy_document.get_secrets.json
}

resource "aws_iam_role_policy_attachment" "secret_manager_read_only" {
  role       = aws_iam_role.atlantis_google_workload_identity_role.name
  policy_arn = aws_iam_policy.secrets_manager_policy.arn
}

Configure AWS Credentials Script and Profile

This script is nearly copy-paste from AWS documentation, but it’s worth highlighting a few things going on here.

The AUDIENCE variable needs to match the accounts.google.com:oaud field in the AWS role definition from the role configuration step, and the ROLE_ARN variable needs to match the atlantis_role_arn output.

One tricky thing to be aware of is that this script prints the credentials to stdout so they can be captured. That means if you add debug statements to the script, you need to make sure nothing else ends up in stdout, or the debug statements themselves will become a bug.
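A safe pattern is to route any debugging to stderr, keeping stdout reserved for the credentials JSON. A minimal sketch, using a hypothetical debug helper that isn’t part of the original script:

```shell
# Hypothetical helper: debug messages go to stderr, never stdout
debug() { echo "debug: $*" >&2; }

debug "about to emit credentials"
echo '{"Version":1}' # only the credentials JSON lands on stdout
```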

Another potential “gotcha” is that the script assumes curl and jq are both installed in the container. Be sure to apk add --no-cache curl jq or apt install -y curl jq in your Dockerfile before trying to use this with an application.

#!/bin/bash
set -euo pipefail

AUDIENCE="AtlantisAccess"
ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/atlantis_google_workload_identity_role"

# fetch an identity token for the GCP service account from the metadata server
jwt_token=$(curl -sH "Metadata-Flavor: Google" "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity?audience=${AUDIENCE}&format=full&licenses=FALSE")

# decode the JWT payload to get the service account's unique ID
jwt_decoded=$(jq -R 'split(".") | .[1] | @base64d | fromjson' <<< "$jwt_token")
jwt_sub=$(jq -r '.sub' <<< "$jwt_decoded")

# exchange the token for short-lived AWS credentials
credentials=$(aws sts assume-role-with-web-identity \
  --role-arn "$ROLE_ARN" \
  --role-session-name "$jwt_sub" \
  --web-identity-token "$jwt_token" \
  | jq '.Credentials | .Version = 1')

echo "$credentials"
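The jq decode is the least obvious line in the script, so here’s the same technique run against a toy, JWT-shaped token with a known payload (no real token involved; the header and signature segments are throwaway strings):

```shell
# Build a fake header.payload.signature token whose payload we control
payload=$(printf '{"sub":"123456789012345678900"}' | base64 | tr -d '=\n')
jwt_token="header.${payload}.signature"

# Same decode as the credentials script: take the second dot-separated
# segment, base64-decode it, and parse the resulting JSON
jwt_decoded=$(jq -R 'split(".") | .[1] | @base64d | fromjson' <<< "$jwt_token")
jwt_sub=$(jq -r '.sub' <<< "$jwt_decoded")
echo "$jwt_sub"
```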

The file containing the script can be placed anywhere on the same machine it’s being used, but to actually reference the script, we need to refer to it in an AWS profile config.

The documentation explains that the default AWS config directory is in the user’s home directory, so the appropriate place for this config file might be /root/.aws/config or something like /home/atlantis/.aws/config depending on the container image being used. In this example profile, the credentials script was placed at /opt/bin/credentials.sh along with a chmod +x to make sure the script is executable.

[profile AtlantisAccess]
credential_process = /opt/bin/credentials.sh
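Wiring all of that into a container image might look like the following Dockerfile fragment. This is a sketch for an Alpine-based image; the base image, file names, and home directory are assumptions, not taken from the original setup.

```dockerfile
FROM alpine:3.19

# dependencies the credentials script needs
RUN apk add --no-cache bash curl jq aws-cli

# install the credentials script and the profile config that points at it
COPY credentials.sh /opt/bin/credentials.sh
RUN chmod +x /opt/bin/credentials.sh
COPY config /root/.aws/config
```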

Using the AWS Credentials Profile

This is the fun part. AWS clients all seem to have a way to refer to a credentials profile by name to use for authentication. For the AWS Terraform provider, you can easily set up auth by matching the profile field to whatever you put for the [profile AtlantisAccess] piece of the ~/.aws/config file.

provider "aws" {
  region  = "us-east-1"
  profile = "AtlantisAccess"
}
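Before wiring the profile into Terraform or application code, you can sanity-check the whole chain from inside the pod with the AWS CLI; if everything is configured correctly, this prints the assumed role’s identity rather than an access-denied error:

```shell
aws sts get-caller-identity --profile AtlantisAccess
```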

Referring to the profile in application code is nearly as easy. This snippet demonstrates what that looks like in a C# application.

using Amazon.Runtime.CredentialManagement;

// ...

var chain = new CredentialProfileStoreChain();
chain.TryGetAWSCredentials("AtlantisAccess", out var credentials);

// ... use the `credentials` variable in some other AWS code

Conclusion

Authenticating to AWS from GKE in a secure way takes a few steps to set up, but Terraform can help make the process painless. The general OIDC federation strategy used here can be applied across GCP products with nearly the same steps: swap the Kubernetes service account reference in the workloadIdentityUser role binding step for a different principal, or remove that step entirely if it makes sense to use the Google service account credentials directly.
