‘ENOSYS’ Error in the Kubernetes pods

geekidea
Mar 26, 2020


A couple of months ago, our Docker application started hitting errors right after its pods came up on the production Kubernetes nodes. We attached ‘strace’ to the application process on the node host to collect its syscall activity: strace -f -s 1024 -ttT -p $(pidof app) -o /tmp/app.st , then detached (CTRL+C in the ‘strace’ session) after a short while (10+ seconds). The log file /tmp/app.st showed a large number of ‘ENOSYS’ errors:

[Image: ENOSYS strace snippet]

As you can see, most syscalls (e.g. readv, open, nanosleep) failed with the error ‘ENOSYS’, and the application could not connect to the Kafka server because it failed to query the DNS A record.
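To see at a glance which syscalls fail and with which errno, a log such as /tmp/app.st can be tallied with a few lines of Python. This is only a sketch and was not part of the original investigation: the helper name count_failed_syscalls and the regular expression are assumptions, and the pattern covers only the common line shapes produced with -f and -tt (optional PID prefix, optional timestamp, and "<... resumed>" lines).

```python
import re
from collections import Counter

# Hypothetical pattern for failed calls in an strace log, e.g.
#   <pid> <timestamp> name(args) = -1 ERRNO (description) <duration>
_FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"            # optional PID prefix (-f)
    r"(?:\d{2}:\d{2}:\d{2}(?:\.\d+)?\s+)?"      # optional timestamp (-t / -tt)
    r"(?:<\.\.\.\s(?P<resumed>\w+)\sresumed>"   # "<... name resumed>" lines
    r"|(?P<name>\w+)\()"                        # or a plain "name(" call
    r".*\s=\s-1\s(?P<errno>E[A-Z0-9]+)\b"       # return value -1 plus errno name
)


def count_failed_syscalls(lines):
    """Count failed syscalls per (syscall name, errno) pair."""
    counts = Counter()
    for line in lines:
        match = _FAILED_CALL.match(line)
        if match:
            syscall = match.group("name") or match.group("resumed")
            counts[(syscall, match.group("errno"))] += 1
    return counts
```

Passing an open file object works because the function only iterates over lines, for example count_failed_syscalls(open("/tmp/app.st")).most_common(10).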

At first we suspected that the dependencies (the libc shared libraries) had changed because of a mismatch between the build environment and the runtime environment. We checked with the ops team and the developers and ruled that out.

Then we thought it was related to the sandboxing security settings. Docker and Kubernetes rely on three main security mechanisms: Linux capabilities, seccomp, and SELinux. With ordinary syscalls failing with ENOSYS, seccomp was the likely candidate: before the application starts, it can set up a sandbox that permits only a limited set of syscalls, to reduce the attack surface.
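A quick way to check whether a seccomp filter is attached to a process at all is the Seccomp field in /proc/<pid>/status, where 0 means disabled, 1 strict mode, and 2 filter mode. The sketch below parses that field from the file's text; the helper name seccomp_mode is hypothetical and this check was not part of the original investigation.

```python
# Values of the "Seccomp:" field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Return the seccomp mode named in the text of /proc/<pid>/status.

    Returns None when the field is absent (e.g. a kernel without seccomp).
    """
    for line in status_text.splitlines():
        # Match "Seccomp:" exactly so "Seccomp_filters:" is not picked up.
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value, "unknown")
    return None
```

For a containerized process, the status file has to be read on the node host (or inside the container) with the matching PID, for example seccomp_mode(open("/proc/self/status").read()).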

Docker implements a seccomp wrapper: it takes a JSON file containing the seccomp rules (e.g. the CPU architectures in scope, and the allow or disallow action for each syscall). The container runtime underneath Docker (runc, launched via ‘containerd’) uses the library libseccomp-golang (a Go wrapper around the libseccomp C library) to compile these rules into BPF bytecode and apply it to the application.

In this case the application pod had been given the wrong seccomp JSON file, one intended for penetration testing in the dev Kubernetes cluster. The file looked like this:

{
    "defaultAction": "SCMP_ACT_ERRNO",
    ...
    "syscalls": [
        {
            "names": [
                "socket",
                "bind",
                "connect",
                "sendto",
                "read"
            ],
            "action": "SCMP_ACT_ALLOW",
            "args": [],
            "comment": "",
            "includes": {},
            "excludes": {}
        },
        {
            "names": [
                "nanosleep",
                "close",
                "readv"
            ],
            "action": "SCMP_ACT_TRACE",
            "args": [],
            "comment": "",
            "includes": {},
            "excludes": {}
        },
        ...
    ]
}

The sample seccomp JSON file above shows that only a few common syscalls (e.g. socket, sendto, read) were allowed. Other syscalls such as nanosleep, close, and readv were marked to be handled by a tracer (‘SCMP_ACT_TRACE’); when no such tracer is present, these syscalls fail with the error ENOSYS. The default action ‘SCMP_ACT_ERRNO’ means that any syscall NOT whitelisted in the file fails with the error EPERM.
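The reasoning above can be written down as a small lookup. This is a simplified sketch, not how libseccomp evaluates a filter: it ignores the args, includes, and excludes fields and any architecture settings, takes the first rule that names the syscall, and assumes that no tracer is attached and that SCMP_ACT_ERRNO returns EPERM. The function name expected_result is hypothetical, and the parts of the sample profile elided with "..." are simply left out.

```python
# The sample profile from above, without the elided ("...") parts.
PROFILE = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [
        {
            "names": ["socket", "bind", "connect", "sendto", "read"],
            "action": "SCMP_ACT_ALLOW",
        },
        {
            "names": ["nanosleep", "close", "readv"],
            "action": "SCMP_ACT_TRACE",
        },
    ],
}


def expected_result(profile, syscall):
    """Predict what happens to `syscall` under `profile` (simplified)."""
    action = profile["defaultAction"]
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            action = rule["action"]
            break
    if action == "SCMP_ACT_ALLOW":
        return "allowed"
    if action == "SCMP_ACT_TRACE":
        return "ENOSYS"  # assumption: no tracer is handling seccomp events
    if action == "SCMP_ACT_ERRNO":
        return "EPERM"   # assumption: the errno used for this action is EPERM
    return action        # any other action is returned unchanged
```

With this sketch, expected_result(PROFILE, "read") gives "allowed", "readv" gives "ENOSYS", and a syscall that is not listed, such as "mkdir", gives "EPERM".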

So the fix for this case was to pass a slightly modified seccomp JSON file (one that whitelists nanosleep, close, and readv as allowed) to the Kubernetes deployment.
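One way to produce such a modified file is to rewrite every traced rule as an allow rule and leave the rest of the profile untouched. The following is a minimal sketch under that assumption; the function name allow_traced_syscalls is hypothetical, and whether every traced syscall should really be allowed is a judgment call for the profile's owner.

```python
import copy


def allow_traced_syscalls(profile):
    """Return a copy of `profile` with SCMP_ACT_TRACE rules set to SCMP_ACT_ALLOW.

    The input dictionary is not modified.
    """
    fixed = copy.deepcopy(profile)
    for rule in fixed.get("syscalls", []):
        if rule.get("action") == "SCMP_ACT_TRACE":
            rule["action"] = "SCMP_ACT_ALLOW"
    return fixed
```

The profile can be loaded with json.load and the result written back with json.dump before the new file is distributed to the nodes.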
