ubuntu:21.10 and fedora:35 do not work on the latest Docker (20.10.9)
If you try to run ubuntu:21.10 on the latest Docker (20.10.9), apt-get fails with errors like this:
$ docker run -it --rm ubuntu:21.10
root@862f014171b5:/# apt-get update
Get:1 http://security.ubuntu.com/ubuntu impish-security InRelease [90.7 kB]
...
Reading package lists... Done
E: Problem executing scripts APT::Update::Post-Invoke 'rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true'
E: Sub-process returned an error code
And you can’t run fedora:35, either:
$ docker run -it --rm fedora:35
[root@849f3703c4b5 /]# dnf install -y hello
Fedora 35 - x86_64 0.0 B/s | 0 B 00:00
Errors during downloading metadata for repository 'fedora':
- Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-35&arch=x86_64 [getaddrinfo() thread failed to start]
Error: Failed to download metadata for repo 'fedora': Cannot prepare internal mirrorlist: Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-35&arch=x86_64 [getaddrinfo() thread failed to start]
Old versions of containerd/CRI (before v1.5.6/v1.4.10), Podman, and CRI-O were affected by the same issue, too.
Why?
This is because the default seccomp profile of Docker 20.10.9 is not adjusted to support the clone() syscall wrapper of glibc 2.34 adopted in Ubuntu 21.10 and Fedora 35.
The new clone() syscall wrapper introduced in glibc 2.34 tries to call the clone3() syscall before calling the real clone() syscall. If clone3() returns ENOSYS (“Function not implemented”) error, glibc falls back to the legacy behavior that calls the real clone() syscall. However, if clone3() returns other errors, glibc fails immediately without calling the real clone() syscall.
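The decision described above can be sketched as a toy shell function. This is only a model of the behavior, not glibc's actual code; the function name glibc_clone_wrapper and the "OK" token are made up for illustration.

```shell
# Toy model of the glibc 2.34 clone() wrapper decision (not real glibc code).
# $1 is the hypothetical result of the clone3() attempt: "OK" or an errno name.
glibc_clone_wrapper() {
  case "$1" in
    OK)     echo "created via clone3()" ;;   # clone3() worked, nothing else to do
    ENOSYS) echo "fallback to clone()" ;;    # kernel reports clone3() is not implemented
    *)      echo "failure: $1" ;;            # any other error is fatal, no fallback
  esac
}

glibc_clone_wrapper ENOSYS   # what an old kernel without clone3() triggers
glibc_clone_wrapper EPERM    # what a seccomp rule returning EPERM triggers
```

Only the ENOSYS branch reaches the legacy clone() path, which is why the exact errno returned for clone3() matters.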
The problem is that Docker 20.10.9 is not aware of clone3(), and Docker injects the SCMP_ACT_ERRNO(EPERM) rule for all syscalls that are unknown to Docker. So, when glibc attempts to call clone3(), the kernel returns an EPERM (“Operation not permitted”) error according to Docker’s seccomp profile. glibc treats this as a real failure and never falls back to clone().
The right solution
The right solution is to upgrade Docker to 20.10.10 or later.
Docker 20.10.10 is NOT released yet as of the time of writing (October 18, 2021), but it will probably be released in just a couple of days.
The fix has already been committed to the master branch and the 20.10 branch of the upstream github.com/moby/moby repo, so you can compile it yourself if you can’t wait for the 20.10.10 release.
Also, some distribution vendors have already cherry-picked the fix into their packages ahead of the 20.10.10 release.
For example, the Ubuntu package docker.io/20.10.7 has already been patched to fix the issue.
Update (Oct 26, 2021)
Docker 20.10.10 is now available ( https://get.docker.com/ )
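To check whether an installed Docker already has the upstream fix, a small helper like the following may be handy. It is a minimal sketch: the function name has_clone3_fix is made up, it assumes a sort that supports -V (GNU coreutils or BusyBox), and it only compares version numbers, so a distribution package that cherry-picked the fix into an older version (such as the Ubuntu docker.io/20.10.7 package) will still be reported as lacking it.

```shell
# Succeeds if the given Docker version is 20.10.10 or later.
# Assumption: "sort -V" (version sort) is available.
has_clone3_fix() {
  min="20.10.10"
  # Drop packaging suffixes such as "-0ubuntu1" or "+dfsg1" before comparing.
  ver="${1%%[-+~]*}"
  # If the minimum sorts first (or is equal), the given version is new enough.
  [ "$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)" = "$min" ]
}

# Example usage against the running daemon:
#   has_clone3_fix "$(docker version --format '{{.Server.Version}}')" \
#     && echo "fixed upstream" || echo "upgrade or check your distro's patches"
```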
Workaround 1: `--security-opt seccomp=unconfined`
A workaround without updating Docker is to disable seccomp:
docker run --security-opt seccomp=unconfined -it --rm ubuntu:21.10
However, this workaround has several drawbacks:
- Insecure
- Does not work when you are not allowed to modify the --security-opt flags
- Does not work for docker build
Workaround 2: `SHELL ["/clone3-workaround", …]`
I wrote https://github.com/AkihiroSuda/clone3-workaround to provide a workaround that is free from the drawbacks of Workaround 1.
This program loads an additional seccomp profile that hides the existence of the clone3() syscall from glibc by injecting an SCMP_ACT_ERRNO(ENOSYS) rule, so that the clone() wrapper of glibc works in the legacy-compatible mode.
Usage is simple: download (or compile) the binary, mount it into the container, and run /clone3-workaround COMMAND [ARGUMENTS...].
$ docker run -it --rm -v $(pwd)/clone3-workaround:/clone3-workaround ubuntu:21.10 /clone3-workaround bash
root@490fd2f29a88:/# apt-get update
Get:1 http://security.ubuntu.com/ubuntu impish-security InRelease [90.7 kB]
...
Fetched 19.4 MB in 6s (2996 kB/s)
Reading package lists... Done
root@490fd2f29a88:/# apt-get install -y hello
Reading package lists... Done
...
Unpacking hello (2.10-2ubuntu3) ...
Setting up hello (2.10-2ubuntu3) ...
To use it with docker build, set SHELL ["/clone3-workaround","/bin/sh","-c"] in your Dockerfile, and just run docker build.
Dockerfile (Ubuntu 21.10)
FROM ubuntu:21.10
ADD https://github.com/AkihiroSuda/clone3-workaround/releases/download/v1.0.0/clone3-workaround.x86_64 /clone3-workaround
RUN chmod 755 /clone3-workaround
SHELL ["/clone3-workaround", "/bin/sh", "-c"]
RUN apt-get update && apt-get install -y hello
Dockerfile (Fedora 35)
FROM fedora:35
ADD https://github.com/AkihiroSuda/clone3-workaround/releases/download/v1.0.0/clone3-workaround.x86_64 /clone3-workaround
RUN chmod 755 /clone3-workaround
SHELL ["/clone3-workaround", "/bin/sh", "-c"]
RUN dnf install -y hello
How can we prevent this from happening again?
If we could change the default rule of the seccomp profile from SCMP_ACT_ERRNO(EPERM) to SCMP_ACT_ERRNO(ENOSYS) , we could avoid these kinds of issues.
Several folks including Aleksa Sarai of SUSE have been proposing this change to the Docker/Moby community, but it may take some time to land: https://github.com/moby/moby/issues/42871
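To illustrate why the default errno matters, here is a toy shell model of an allowlist-style seccomp profile combined with the glibc 2.34 behavior described earlier. The function names and the three-syscall allowlist are invented for illustration and do not reflect Docker's real profile.

```shell
# Toy model of an allowlist seccomp profile (NOT Docker's real profile).
# $1 = errno name returned for syscalls not on the allowlist, $2 = syscall name.
seccomp_result() {
  case "$2" in
    clone|read|write) echo "OK" ;;   # tiny illustrative allowlist
    *)                echo "$1" ;;   # unknown syscall: return the default errno
  esac
}

# What glibc 2.34 would do with that profile when it needs a new thread/process.
# $1 = the profile's default errno name.
outcome() {
  case "$(seccomp_result "$1" clone3)" in
    OK)     echo "works via clone3()" ;;
    ENOSYS) echo "works via fallback to clone()" ;;
    *)      echo "broken" ;;
  esac
}

outcome EPERM    # the current default: the new syscall breaks the container
outcome ENOSYS   # the proposed default: glibc quietly falls back
```

With an ENOSYS default, a future syscall that the profile does not know yet looks like "not implemented" to userspace, which is the case that fallback code paths are usually written for.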
NTT is hiring!
We at NTT are looking for engineers who work in Open Source communities like Docker/Moby, containerd, Kubernetes, and their relevant projects. Visit https://www.rd.ntt/e/sic/recruit/ (English) or https://www.rd.ntt/sic/recruit/ (Japanese) to see how to join us.







