Stop running an application inside a Docker container as the root user
A Docker blog post states:
“Docker containers are, by default, quite secure; especially if you take care of running your processes inside the containers as non-privileged users (i.e. non-root).”
When you run as root, you can access a broader range of kernel services. For instance, you can:
- read, write, modify, and delete system files and resources;
- snoop on what programs are doing internally;
- manipulate network interfaces, routing tables, and netfilter rules;
- mount, unmount, and remount filesystems;
- shut down or remove the machine;
- change file ownership, permissions, and extended attributes, overriding the regular permission checks;
- and much more.
The main point here is that root can exercise more kernel code: if that code has a vulnerability, root can trigger it where a regular user cannot. Additionally, if someone finds a way to break out of a container, they break out as whatever user the container runtime itself runs as on the host OS, regardless of which user they were inside the container. (That runtime was originally LXC; Docker no longer uses LXC as its default execution driver, see the official blog post at https://blog.docker.com/2014/03/docker-0-9-introducing-execution-drivers-and-libcontainer/.) According to the official repo, the Docker daemon binds to a Unix socket instead of a TCP port. By default that Unix socket is owned by the user root, and other users can access it only with sudo. In other words, the Docker daemon always runs as the root user, so if you break out of the container, you break out as root on the host.
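A quick way to see which side of this line a process is on is to check its effective UID, and to look at who owns the daemon socket. The snippet below is a minimal sketch: `privilege_level` is a hypothetical helper name, and `/var/run/docker.sock` is assumed to be the socket location, which may differ on your system.

```shell
#!/bin/sh
# Classify a numeric UID: 0 is root, anything else is unprivileged.
# privilege_level is a hypothetical helper, not a Docker command.
privilege_level() {
    if [ "$1" -eq 0 ]; then
        echo "root"
    else
        echo "non-root"
    fi
}

# id -u prints the effective UID of the current process.
echo "this process runs as: $(privilege_level "$(id -u)")"

# Assumed default socket path; adjust if your installation differs.
SOCKET=/var/run/docker.sock
if [ -S "$SOCKET" ]; then
    # The owner and group columns show who may talk to the daemon.
    ls -l "$SOCKET"
fi
```

Run inside a container, the first line of output tells you whether the application was started as root; run on the host, the socket listing shows who is allowed to talk to the daemon.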
Of course, from the perspective of the application running inside the container, Docker tries to address this and other permission-related concerns with kernel namespaces, even when the application runs as root. (Docker inherited this approach from LXC: behind the scenes, early versions used lxc-start to execute the Docker container.) Anyone familiar with chroot already has a basic idea of what Linux namespaces can do and how they are generally used.
Namespaces provide the first and most straightforward form of isolation. Linux namespaces make it possible to have multiple “nested” process trees, each with an entirely isolated set of processes. This ensures that processes belonging to one process tree cannot inspect or kill (in fact, cannot even know of the existence of) processes in sibling or parent process trees.
Beyond the process tree, Linux namespaces allow other aspects of the operating system to be isolated independently as well, including network interfaces, mount points, inter-process communication resources, and more.
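On a Linux system you can see which namespaces a process belongs to by reading the symlinks under /proc. The following is a small sketch, assuming a Linux host with /proc mounted; `ns_id` is a hypothetical helper name introduced here for illustration.

```shell
#!/bin/sh
# Print the namespace identifier for a process (Linux only).
# $1: a PID, or "self" for the process doing the lookup
# $2: namespace type, for example pid, net, mnt, ipc, uts
# ns_id is a hypothetical helper, not a standard command.
ns_id() {
    readlink "/proc/$1/ns/$2"
}

# Each link resolves to a string made of the namespace type and an
# inode number. Two processes are in the same namespace exactly when
# these strings match.
for ns in pid net mnt ipc uts; do
    echo "$ns -> $(ns_id self "$ns")"
done
```

Comparing the output on the host with the output inside a container typically shows different identifiers for the same namespace type, which is the isolation described above.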
Still, the basic rule of security applies: avoid granting unnecessary permissions. So, first things first: do not give the container's user root access, and always run Docker containers with the -u flag so that they run as an ordinary user. Grant that ordinary user only the permissions it actually needs. For example, if your Dockerfile contains something like these lines,
echo "tomcat ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers