Solving the Workload Identity sameness problem with IAM Conditions
Update (28 Feb 2024): This solution was more popular than I anticipated. I received a lot of requests, questions, and reports telling me it doesn’t work. As of today, Feb 28, 2024, I can confirm it does: the IAM condition based on request.auth.claims.google.providerId works. BUT
It’s not how this is meant to work. There is no guarantee that request.auth.claims.google.providerId will keep working in the future. The supported solution is to organize clusters into separate projects instead of creating all the clusters in the same project. In other words, if you want to avoid Workload Identity sameness, use GCP projects as identity boundaries.
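To illustrate that recommended approach, here is a minimal sketch of using one project per cluster as the identity boundary; the project and cluster names are hypothetical.
# Each cluster lives in its own project, so each gets its own
# Workload Identity pool (<project>.svc.id.goog), and IAM bindings
# in one project cannot be satisfied from the other.
gcloud container clusters create cluster-a \
  --project project-a \
  --region europe-west4 \
  --workload-pool project-a.svc.id.goog

gcloud container clusters create cluster-b \
  --project project-b \
  --region europe-west6 \
  --workload-pool project-b.svc.id.goog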
Context
GKE offers a unique feature called Workload Identity. It allows you to configure a Kubernetes Service Account (we’ll call it KSA for the remainder of the article) to use a Google Service Account (GSA from here on) to access a Google API, without having to manually download and inject Service Account keys into Kubernetes Secrets or, worse, hard-code them in your repo. This is done in six steps (condensed into a sketch after the list):
- Creating a cluster with Workload Identity enabled
- Creating a GSA
- Creating a Kubernetes namespace and a KSA
- Annotating the KSA with the GSA it should use
- Creating an IAM policy binding to allow the KSA to use the GSA, via the roles/iam.workloadIdentityUser role
- Granting the GSA access to the GCP resource (a bucket, for example)
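As a rough sketch, the six steps map to commands like the following; my-cluster, my-gsa, my-ns, my-ksa, and my-bucket are placeholders, and the demo below uses concrete values.
# 1. Cluster with Workload Identity enabled
gcloud container clusters create my-cluster --workload-pool ${PROJECT_ID}.svc.id.goog
# 2. Google Service Account
gcloud iam service-accounts create my-gsa
# 3. Kubernetes namespace and KSA
kubectl create namespace my-ns
kubectl create serviceaccount my-ksa --namespace my-ns
# 4. Annotate the KSA with the GSA it should use
kubectl annotate serviceaccount my-ksa --namespace my-ns \
  iam.gke.io/gcp-service-account=my-gsa@${PROJECT_ID}.iam.gserviceaccount.com
# 5. Allow the KSA to use the GSA
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[my-ns/my-ksa]" \
  my-gsa@${PROJECT_ID}.iam.gserviceaccount.com
# 6. Grant the GSA access to the resource
gsutil iam ch serviceAccount:my-gsa@${PROJECT_ID}.iam.gserviceaccount.com:roles/storage.admin gs://my-bucket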
Workload Identity, however, has a single identity pool per project, which means two identical KSAs in two identically named namespaces across two GKE clusters in the same project will inherit the same IAM policy binding and therefore the same permissions. This is called “identity sameness”. It’s best explained in this doc.
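The root cause is visible in the IAM member format Workload Identity uses: it identifies the identity pool, the namespace, and the KSA, but says nothing about the cluster.
# The member string carries the project-wide pool, the namespace, and the
# KSA name, but no cluster identifier, so any cluster in the project with
# a matching namespace/KSA pair satisfies the binding.
serviceAccount:${PROJECT_ID}.svc.id.goog[NAMESPACE/KSA_NAME]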
Depending on how you colocate your clusters and GCP projects, this may or may not be a problem for you. Again, the doc has a good theoretical scenario in which an actor with good or bad intentions could inherit a legit workload’s existing permissions: either by creating a new cluster in an existing project where Workload Identity is already configured, or by creating a namespace and KSA matching those of an existing cluster.
I would argue that if someone in your organization was able to exploit this, then you have a much bigger problem to solve around access, audit, and control. But nevertheless, for the cautious people out there, there is a solution. Actually, two.
Solutions
1. The most obvious one (which is also stated in our public doc) is to separate the GKE clusters into their own GCP projects. This can be an option, but it does increase your maintenance overhead.
2. Use IAM Conditions to enforce which cluster can consume which IAM policy binding. The rest of this article demonstrates how this can be implemented in practice.
Demo IAM conditions
In this Demo we will:
- Create two clusters gke-eu-west4 and gke-eu-west6
- Create a Cloud Storage Bucket gs://workload-identity-${PROJECT_ID}
- Create a GSA and grant it an admin role on the bucket
- Configure the IAM policy binding and the annotations needed by Workload Identity.
- Demonstrate that without the IAM condition, the default KSA in the default namespace in both clusters can access the bucket.
- Add an IAM condition to the policy binding to only allow the gke-eu-west4 cluster to use the binding.
- Verify that we are able to access the bucket from the gke-eu-west4 cluster and NOT from the gke-eu-west6 one.
NB: This demo will incur costs associated with the creation of the clusters; make sure to clean up your project after you are done testing.
For this demo I highly recommend you install kubectx. We will be working with two clusters, and kubectx makes it easier to switch the kubectl context. For the remainder of the article I will assume you have kubectx installed.
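If you don’t have it yet, kubectx is available through common package managers (two of several install options; see the kubectx GitHub page for the full list):
# macOS (Homebrew)
brew install kubectx
# Debian/Ubuntu
sudo apt install kubectx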
Export your GCP project ID to a variable
export PROJECT_ID=my-project-id
Create a cluster in europe-west4
gcloud beta container clusters create "gke-eu-west4" \
  --project ${PROJECT_ID} \
  --region "europe-west4" \
  --cluster-version "1.19.8-gke.1600" \
  --num-nodes "3" \
  --enable-private-nodes \
  --enable-ip-alias \
  --master-ipv4-cidr "172.16.1.0/28" \
  --workload-pool "${PROJECT_ID}.svc.id.goog" \
  --no-enable-master-authorized-networks \
  --async
Create a cluster in europe-west6
gcloud beta container clusters create "gke-eu-west6" \
  --project ${PROJECT_ID} \
  --region "europe-west6" \
  --cluster-version "1.19.8-gke.1600" \
  --num-nodes "3" \
  --enable-private-nodes \
  --enable-ip-alias \
  --master-ipv4-cidr "172.16.0.0/28" \
  --workload-pool "${PROJECT_ID}.svc.id.goog" \
  --no-enable-master-authorized-networks \
  --async
Get the access token for both clusters
gcloud container clusters get-credentials gke-eu-west4 --region europe-west4
gcloud container clusters get-credentials gke-eu-west6 --region europe-west6
Rename the kubectl contexts to make switching between them easy in the future
kubectx gke-eu-w6=gke_${PROJECT_ID}_europe-west6_gke-eu-west6
kubectx gke-eu-w4=gke_${PROJECT_ID}_europe-west4_gke-eu-west4
Create a test GCS bucket
gsutil mb -l eu -p ${PROJECT_ID} gs://workload-identity-${PROJECT_ID}
NB: In the previous command we appended the project ID to the bucket name, since Cloud Storage bucket names are global and have to be globally unique.
Create a GSA to be used by both clusters (to demonstrate the Workload Identity namespace sameness)
gcloud iam service-accounts create gsa-wi-sameness
Grant the GSA access to the Bucket
gsutil iam ch serviceAccount:gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com:roles/storage.admin gs://workload-identity-${PROJECT_ID}
Allow the default Kubernetes Service Account in the default namespace to use the GSA
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[default/default]" \
  gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
Annotate the default Kubernetes Service account in the gke-eu-west4 cluster.
kubectx gke-eu-w4
kubectl annotate serviceaccount \
  --namespace default default \
  iam.gke.io/gcp-service-account=gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
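If you prefer managing Kubernetes resources declaratively, the same annotation can be applied with a manifest; this is a sketch equivalent to the command above.
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
EOF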
Verify you can access the bucket from the default namespace in the gke-eu-west4 cluster
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount default \
--namespace default \
workload-identity-test
Run gcloud auth list; you should be logged in as the GSA
root@workload-identity-test:/# gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
Create an empty file, upload it to the bucket and verify it
root@workload-identity-test:/# touch file-from-west4
root@workload-identity-test:/# gsutil cp file-from-west4 gs://workload-identity-${PROJECT_ID}
Copying file://file-from-west4 [Content-Type=application/octet-stream]...
/ [1 files][ 0.0 B/ 0.0 B]
Operation completed over 1 objects.
root@workload-identity-test:/# gsutil ls gs://workload-identity-${PROJECT_ID}
gs://workload-identity-${PROJECT_ID}/file-from-west4
So far we have demonstrated how Workload Identity works. Now let’s demo namespace sameness.
Cleanup
exit
kubectl delete pod workload-identity-test
Annotate the default Kubernetes Service Account in the gke-eu-west6 cluster
kubectx gke-eu-w6
kubectl annotate serviceaccount \
  --namespace default default \
  iam.gke.io/gcp-service-account=gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
Verify you can access the bucket from the default namespace in the gke-eu-west6 cluster
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount default \
--namespace default \
workload-identity-test
Run gcloud auth list; you should be logged in as the GSA
root@workload-identity-test:/# gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
To set the active account, run:
$ gcloud config set account `ACCOUNT`
Create an empty file, upload it to the bucket and verify it
root@workload-identity-test:/# touch file-from-west6
root@workload-identity-test:/# gsutil cp file-from-west6 gs://workload-identity-${PROJECT_ID}
Copying file://file-from-west6 [Content-Type=application/octet-stream]...
/ [1 files][ 0.0 B/ 0.0 B]
Operation completed over 1 objects.
root@workload-identity-test:/# gsutil ls gs://workload-identity-${PROJECT_ID}
gs://workload-identity-${PROJECT_ID}/file-from-west4
gs://workload-identity-${PROJECT_ID}/file-from-west6
So we have demonstrated the identity sameness problem. Now let’s see how we can solve it with IAM Conditions.
Cleanup
exit
kubectl delete pod workload-identity-test
For the purposes of this demo, we will edit the IAM policy binding we configured earlier and introduce a condition saying that only the gke-eu-west4 cluster can use the Google Service Account that has permissions on the bucket. We will demonstrate that even though gke-eu-west6 has the annotation in place for the default Service Account in the default namespace, it will not be able to access the bucket because of the IAM condition.
Remove the existing IAM policy binding
gcloud iam service-accounts remove-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[default/default]" \
  gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
Add it again with the condition
gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[default/default]" \
  --condition="expression=request.auth.claims.google.providerId=='https://container.googleapis.com/v1/projects/${PROJECT_ID}/locations/europe-west4/clusters/gke-eu-west4',description=single-cluster-acl,title=single-cluster-acl" \
  gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com
This recreates the policy binding between the KSA and the GSA; the condition indicates that the binding can only be used if the member requesting to use the role is in the gke-eu-west4 cluster.
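You can inspect the resulting policy to confirm the condition was attached:
# The binding should now carry the single-cluster-acl condition
# with the providerId expression pointing at gke-eu-west4.
gcloud iam service-accounts get-iam-policy \
  gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com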
Now let’s verify that we can access the bucket from the gke-eu-west4 cluster but not from the gke-eu-west6 one.
kubectx gke-eu-w4
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount default \
--namespace default \
workload-identity-test
Try to list objects in the bucket
root@workload-identity-test:/# gsutil ls gs://workload-identity-${PROJECT_ID}
gs://workload-identity-${PROJECT_ID}/file-from-west4
gs://workload-identity-${PROJECT_ID}/file-from-west6
We are able to see the files, so the IAM condition and the binding are working as intended.
Now let’s switch context to the gke-eu-west6 cluster and try to access the bucket.
exit
kubectx gke-eu-w6
kubectl run -it \
--image google/cloud-sdk:slim \
--serviceaccount default \
--namespace default \
workload-identity-test
Try to list objects in the bucket
root@workload-identity-test:/# gsutil ls gs://workload-identity-${PROJECT_ID}
Traceback (most recent call last):
  File "/usr/lib/google-cloud-sdk/platform/gsutil/third_party/apitools/apitools/base/py/credentials_lib.py", line 227, in _GceMetadataRequest
    response = opener.open(request)
  File "/usr/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/usr/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
We are getting a 403 Forbidden error, which means the IAM condition is working as intended.
Cleanup
gcloud container clusters delete gke-eu-west4 --region europe-west4 --async --quiet
gcloud container clusters delete gke-eu-west6 --region europe-west6 --async --quiet
gsutil rm -r gs://workload-identity-${PROJECT_ID}
gcloud iam service-accounts delete gsa-wi-sameness@${PROJECT_ID}.iam.gserviceaccount.com --quiet