Kubernetes Security Tools: Seccomp & AppArmor

Noah
6 min read · Mar 6, 2024

This is the first in a series of articles giving an overview of tools used for Kubernetes hardening and security more generally. In this article we explore secure computing mode (seccomp) and AppArmor, two popular tools for controlling application access to the kernel.

Seccomp.

Seccomp is a Linux kernel feature used to provide fine-grained control over the system calls of an application — a sort of firewall for system calls. By restricting and auditing these, seccomp can help limit the access of (Kubernetes) workloads to the underlying nodes.

Syscalls can expose sensitive kernel features and, at times, exploitable kernel vulnerabilities. Container runtimes such as Docker ship default seccomp profiles that block calls such as mount or ioperm, which could otherwise be used for container escapes. Historically, vulnerabilities have also been uncovered behind specific syscalls, such as CVE-2016-0728 in keyctl and CVE-2022-0185, exploitable via unshare.

Profiles. Given the above, Kubernetes supports associating a seccomp profile with a pod through its security context. Profiles operate as an allowlist: any syscall not explicitly listed is subject to the defaultAction.

In the profile below, the defaultAction blocks any syscall that is not listed, uname is allowed, and accept is allowed but logged to syslog.

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": [
    "SCMP_ARCH_X86_64",
    "SCMP_ARCH_X86"
  ],
  "syscalls": [
    {
      "name": "uname",
      "action": "SCMP_ACT_ALLOW",
      "args": []
    },
    {
      "name": "accept",
      "action": "SCMP_ACT_LOG",
      "args": []
    }
  ]
}
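The allowlist behaviour can be sketched in a few lines of Python. This is only a simplified model of the lookup, not how the kernel filter is actually implemented: the resolve_action helper is a hypothetical name introduced here, and argument filters (args) are ignored.

```python
# Simplified model of how a seccomp profile picks an action (illustration only).
# resolve_action is a hypothetical helper; "args" filters are ignored.
PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {"name": "uname", "action": "SCMP_ACT_ALLOW", "args": []},
        {"name": "accept", "action": "SCMP_ACT_LOG", "args": []},
    ],
}


def resolve_action(profile: dict, syscall: str) -> str:
    """Return the action for a syscall, falling back to defaultAction."""
    for rule in profile.get("syscalls", []):
        if rule["name"] == syscall:
            return rule["action"]
    # Anything not listed is handled by the default action.
    return profile["defaultAction"]


if __name__ == "__main__":
    for name in ("uname", "accept", "mount"):
        print(name, "->", resolve_action(PROFILE, name))
```

With this profile, mount is not listed and so falls through to SCMP_ACT_ERRNO, which is the point of an allowlist.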

Within the securityContext of the pod, a seccompProfile takes a number of arguments. The type specifies whether the pod uses the container runtime default (RuntimeDefault), a profile from a local file on the worker node (Localhost) or no profile (Unconfined). For Localhost, localhostProfile gives the profile path relative to the kubelet’s seccomp directory (/var/lib/kubelet/seccomp by default). The kubelet’s --seccomp-default flag can enable the container runtime’s default profile for all workloads.

apiVersion: v1
kind: Pod
metadata:
  name: local-seccomp-profile
spec:
  securityContext:
    seccompProfile:
      # Profile from local node
      type: Localhost
      localhostProfile: profiles/violation.json
  containers:
  - name: container
    image: nginx
---
apiVersion: v1
kind: Pod
metadata:
  name: runtime-default-profile
spec:
  securityContext:
    # Container runtime default profile
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: test-container
    image: nginx
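Because the type values are case-sensitive and localhostProfile is only meaningful for Localhost, it can help to build the stanza programmatically. The following is a minimal sketch, assuming a hypothetical seccomp_profile helper that is not part of any Kubernetes client library.

```python
# Builds the seccompProfile stanza of a pod securityContext (illustration only).
# seccomp_profile is a hypothetical helper, not a Kubernetes client API.
VALID_TYPES = {"RuntimeDefault", "Localhost", "Unconfined"}


def seccomp_profile(profile_type, localhost_profile=None):
    """Return a dict suitable for spec.securityContext.seccompProfile."""
    if profile_type not in VALID_TYPES:
        raise ValueError(f"unknown seccomp profile type: {profile_type!r}")
    if profile_type == "Localhost":
        if not localhost_profile:
            raise ValueError("Localhost requires localhostProfile")
        return {"type": profile_type, "localhostProfile": localhost_profile}
    if localhost_profile is not None:
        raise ValueError("localhostProfile is only valid with type Localhost")
    return {"type": profile_type}


if __name__ == "__main__":
    print(seccomp_profile("Localhost", "profiles/violation.json"))
    print(seccomp_profile("RuntimeDefault"))
```

A misspelling such as RunTimeDefault is rejected here before the manifest ever reaches the API server.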

Security Profiles Operator. Kubernetes currently lacks native support for distributing profiles to worker nodes and for generating fine-grained profiles. However, the Security Profiles Operator implements various custom resources to address these problems.

Most obviously the SeccompProfile and ProfileBinding CRDs allow users to create and associate seccomp profiles with workloads through the Kubernetes API. The following example defines a profile to log all syscalls and associates this with all containers using the nginx image.

# SeccompProfile is served as v1beta1 in recent operator releases
# (older releases used v1alpha1)
apiVersion: security-profiles-operator.x-k8s.io/v1beta1
kind: SeccompProfile
metadata:
  namespace: default
  name: logging-profile
spec:
  defaultAction: SCMP_ACT_LOG
---
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileBinding
metadata:
  name: logging-profile-binding
spec:
  profileRef:
    kind: SeccompProfile
    name: logging-profile
  image: nginx

Taking this a step further, the ProfileRecording resource can be used to automatically create tailored seccomp profiles. The following creates a profile recorder for selected pods using node syslogs and audit files. Support for eBPF-based recorders also exists.

apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: ProfileRecording
metadata:
  name: webapp-profile-generator
spec:
  kind: SeccompProfile
  recorder: logs
  podSelector:
    matchLabels:
      app: webapp
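The podSelector decides which pods are recorded: a pod is selected when every key/value pair under matchLabels is present in its labels. A minimal sketch of that matching rule follows; the matches helper and the sample labels are hypothetical.

```python
# Sketch of matchLabels semantics: every selector pair must appear on the pod.
# matches is a hypothetical helper; the label sets below are made-up examples.
def matches(match_labels: dict, pod_labels: dict) -> bool:
    """True if the pod carries every label required by the selector."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


if __name__ == "__main__":
    selector = {"app": "webapp"}
    print(matches(selector, {"app": "webapp", "tier": "frontend"}))  # selected
    print(matches(selector, {"app": "database"}))                    # not selected
```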

Having created the recorder, we then run our workload to completion and can retrieve the generated profile from the cluster SeccompProfiles.

$ kubectl create -f recorder.yaml
$ kubectl create -f pod.yaml
$ kubectl delete -f pod.yaml
$ kubectl get seccompprofile -o wide

AppArmor.

AppArmor is a popular Linux Security Module, available by default on many distributions, that implements Mandatory Access Control. It extends access control beyond traditional (file-based) permissions, enabling finer-grained restrictions and greater defence in depth.

As with seccomp, profiles are associated with particular applications and commonly defined in /etc/apparmor.d. These define access rules for resources that may be used by the application — such as files, capabilities, networking and hardware resources. Possible access modes include read, write, locking and appending.

#include <tunables/global>

profile k8s-apparmor-profile flags=(attach_disconnected) {
  #include <abstractions/base>

  # Enable setgid
  capability setgid,

  # Disable IPv4 TCP
  deny network inet tcp,

  # Deny file writes in /tmp.
  deny /tmp/** w,
}
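To make the file rules more concrete, here is a rough Python model of how path globs and deny rules interact. It is a simplification and not the AppArmor parser's exact semantics: * is treated as matching within one path component, ** as matching across components, and a matching deny rule always overrides an allow rule. The is_allowed helper and the blanket allow rule in RULES are assumptions added for illustration.

```python
import re


def glob_to_regex(pattern: str):
    """Translate a simplified AppArmor-style glob into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # ** may cross directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # * stays within one path component
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_allowed(rules, path: str, mode: str) -> bool:
    """rules: list of (kind, pattern, modes), kind being 'allow' or 'deny'."""
    allowed = False
    for kind, pattern, modes in rules:
        if mode in modes and glob_to_regex(pattern).fullmatch(path):
            if kind == "deny":
                return False        # a matching deny rule always wins
            allowed = True
    return allowed


# Hypothetical rule set: broad read/write access, minus writes under /tmp.
RULES = [("allow", "/**", "rw"), ("deny", "/tmp/**", "w")]

if __name__ == "__main__":
    print(is_allowed(RULES, "/tmp/a/b.txt", "w"))
    print(is_allowed(RULES, "/tmp/a/b.txt", "r"))
    print(is_allowed(RULES, "/var/log/app.log", "w"))
```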

AppArmor supports both an enforce mode and a complain mode; in the latter, violations are only logged. Profiles are loaded into the kernel using apparmor_parser [-C] <profile> (the -C flag loads the profile in complain mode) and associated with workloads by applying an annotation of the format container.apparmor.security.beta.kubernetes.io/<container-name>: localhost/<profile-name>.

apiVersion: v1
kind: Pod
metadata:
  name: apparmor-example
  annotations:
    container.apparmor.security.beta.kubernetes.io/test: localhost/k8s-apparmor-profile
    # To apply runtime default:
    # container.apparmor.security.beta.kubernetes.io/test: runtime/default
spec:
  containers:
  - name: test
    image: nginx
Distributing Profiles. As with seccomp, Kubernetes does not provide a native way to ensure profiles exist on all worker nodes. The Kubernetes documentation suggests doing this manually, or relying on a DaemonSet such as the AppArmor Loader to sync profiles.

The Security Profiles Operator discussed previously implements some support for this, allowing profiles to be deployed onto nodes by simply creating an AppArmorProfile object. However, support for binding profiles to workloads and generating them automatically is not yet available.

# Security Profiles Operator AppArmorProfile custom resource
apiVersion: security-profiles-operator.x-k8s.io/v1alpha1
kind: AppArmorProfile
metadata:
  name: test-profile
  annotations:
    description: Block writing to any files in the disk.
spec:
  policy: |
    #include <tunables/global>

    profile test-profile flags=(attach_disconnected) {
      #include <abstractions/base>
      # Allow file access
      file,
      # Deny all file writes.
      deny /** w,
    }

Alternatively, another custom operator, kube-apparmor-manager, also implements an AppArmorProfile CRD. It offers similar support to the Security Profiles Operator, but syncs profiles to nodes on demand over SSH using ./kube-apparmor-manager sync.

# Kube AppArmor Manager AppArmorProfile custom resource
apiVersion: crd.security.sysdig.com/v1alpha1
kind: AppArmorProfile
metadata:
  name: k8s-apparmor-example-deny-write
spec:
  enforced: true
  rules: |
    # read only file paths
    file,
    deny /** w,

AppArmor Profile Generation. To generate baseline profiles for applications, AppArmor provides the aa-genprof utility. To start, we run aa-genprof with the path of the executable to profile and run the workload in another shell. Once the workload has finished, aa-genprof scans the system logs for complain-mode entries and lets us choose which permissions to add to the profile.

$ sudo aa-genprof /usr/bin/kubectl
[...]
Profiling: /usr/bin/kubectl
[...]
[(S)can system log for AppArmor events] / (F)inish
> S

Reading log entries from /var/log/syslog.
Updating AppArmor profiles in /etc/apparmor.d.
Complain-mode changes:

Profile: /usr/bin/kubectl
Path: /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
New Mode: r
Severity: 4

[1 - #include <abstractions/lxc/container-base>]
2 - #include <abstractions/lxc/start-container>
3 - /sys/kernel/mm/transparent_hugepage/hpage_pmd_size r,
(A)llow / [(D)eny] / (I)gnore / (G)lob / Glob with (E)xtension / (N)ew / Audi(t) / Abo(r)t / (F)inish
> AIn

For more complex environments or programs with longer run times, it’s common to load initial profiles in complain mode and then update these profiles periodically. The tool aa-logprof parses the system logs for complain events and updates the appropriate profile. Once we are satisfied, profiles can then be put into enforce mode.
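To give a feel for what this log parsing involves, the sketch below extracts complain-mode events from kernel audit lines and turns them into candidate rules. It is not how aa-logprof is implemented. The SAMPLE_LOG line is a made-up example in the usual key="value" audit style, and complain_events and to_rule are hypothetical helpers.

```python
import re

# Made-up sample line in the kernel audit style; complain-mode events are
# reported with apparmor="ALLOWED", enforce-mode denials with "DENIED".
SAMPLE_LOG = (
    'audit: type=1400 audit(1709700000.000:42): apparmor="ALLOWED" '
    'operation="open" profile="/usr/bin/kubectl" '
    'name="/sys/kernel/mm/transparent_hugepage/hpage_pmd_size" '
    'pid=1234 comm="kubectl" requested_mask="r" denied_mask="r"'
)

FIELD = re.compile(r'(\w+)="([^"]*)"')


def complain_events(lines):
    """Return (profile, path, mask) for each complain-mode log entry."""
    events = []
    for line in lines:
        fields = dict(FIELD.findall(line))
        if fields.get("apparmor") == "ALLOWED":
            events.append(
                (fields.get("profile"), fields.get("name"), fields.get("requested_mask"))
            )
    return events


def to_rule(path: str, mask: str) -> str:
    """Format a candidate file rule for review before adding it to a profile."""
    return f"{path} {mask},"


if __name__ == "__main__":
    for profile, path, mask in complain_events([SAMPLE_LOG]):
        print(profile, "=>", to_rule(path, mask))
```

Each candidate still needs a human decision (allow, deny, or glob), which is the interactive step the AppArmor tools provide.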

Whilst not quite as granular or automatic as the native AppArmor utilities, Bane is a simpler, open-source tool used to build profiles for Docker containers specifically. It takes a TOML configuration file and generates a profile.

Name = "nginx-example"

[Filesystem]
ReadOnlyPaths = [ "/root/**" ]
LogOnWritePaths = [ "/**" ]
WritablePaths = [ "/**" ]
AllowExec = [ "/usr/sbin/nginx" ]
DenyExec = [ "/bin/dash" ]

[Capabilities]
Allow = [ "setuid", "net_bind_service" ]

[Network]
Raw = false
Packet = false
Protocols = [ "tcp" ]

Bane then generates a profile at /etc/apparmor.d/containers/ and loads it automatically.

$ bane profile.toml
$ cat /etc/apparmor.d/containers/docker-nginx-example
#include <tunables/global>

profile docker-nginx-example flags=(attach_disconnected,mediate_deleted) {
#include <abstractions/base>

network inet tcp,
deny network raw,
deny network packet,

[...]
$ sudo apparmor_status | grep -i nginx
docker-nginx-example
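The relationship between the [Network] section of the TOML file and the network rules in the generated profile can be sketched as follows. This mapping is inferred only from the example output above and is not Bane's actual implementation; network_rules is a hypothetical helper, and the configuration is passed as a plain dict rather than parsed from TOML.

```python
# One plausible mapping from the [Network] settings to profile rules,
# inferred from the example output only (not Bane's real code).
def network_rules(network: dict) -> list:
    """Render AppArmor network rules for the given settings."""
    rules = [f"network inet {proto}," for proto in network.get("Protocols", [])]
    if not network.get("Raw", False):
        rules.append("deny network raw,")
    if not network.get("Packet", False):
        rules.append("deny network packet,")
    return rules


if __name__ == "__main__":
    config = {"Raw": False, "Packet": False, "Protocols": ["tcp"]}
    print("\n".join(network_rules(config)))
```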

To test, Docker’s --security-opt flag can then be used to specify non-default (AppArmor) profiles.

$ sudo docker run --rm -it --security-opt apparmor=docker-nginx-example nginx
