Mount volumes into a running container
This post uses an adaptation of jpetazzo’s technique from his 2015 article, “Attach a volume to a container while it is running”.
If you’ve used Docker before, you probably know that you can only mount volumes when a container is first created. After that, the container’s namespaces are isolated from the host.
When would it be useful to mount into a running container? For example, when you’re debugging a container, you might need to copy over your debugging tools. If you mount a volume that contains your tools, you don’t actually have to copy any files. It works for any situation where you need some files in a container but would prefer not to copy them.
Trying to Bypass Containers’ Isolation
After the container is created, its namespaces are isolated from the host. However, there are still some hacks that let the host interfere with the container. One of them involves the container’s filesystem layers:
Docker containers use a layered filesystem. When you build on a base image, you create a new layer on top of the base image’s layers. Instead of re-encoding the entire contents of the filesystem, this top layer is just a diff. It shows only the changes you make — what was added, deleted, overwritten. When the container runs, its filesystem merges these layers into a single coherent view.
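To make the layering idea concrete, here is a toy model that uses plain directories instead of a real layered filesystem, so it needs neither root nor overlayfs. It is only a sketch: the resolve function and the directory names are hypothetical, and the ".wh." deletion marker follows the aufs convention (overlayfs marks deletions differently).

```shell
#!/bin/sh
# Toy model of a layered filesystem: "lower" plays the base image layer,
# "upper" plays the diff layer on top. All names here are hypothetical.

# Print the path a merged view would serve for name $3,
# given lower dir $1 and upper dir $2. Returns 1 if the name is absent.
resolve() {
    lower=$1 upper=$2 name=$3
    if [ -e "$upper/.wh.$name" ]; then
        return 1                      # deleted by the upper layer
    elif [ -e "$upper/$name" ]; then
        printf '%s\n' "$upper/$name"  # added or overwritten by the upper layer
    elif [ -e "$lower/$name" ]; then
        printf '%s\n' "$lower/$name"  # untouched, served from the base layer
    else
        return 1
    fi
}

demo=$(mktemp -d)
mkdir "$demo/lower" "$demo/upper"
echo base  > "$demo/lower/motd"
echo base  > "$demo/lower/hosts"
echo base  > "$demo/lower/old.conf"
echo patch > "$demo/upper/hosts"      # overwritten
echo new   > "$demo/upper/doot"       # added
: > "$demo/upper/.wh.old.conf"        # deleted

for f in motd hosts doot old.conf; do
    if p=$(resolve "$demo/lower" "$demo/upper" "$f"); then
        echo "$f: $(cat "$p")"
    else
        echo "$f: (absent)"
    fi
done
```

The upper directory holds only the differences, yet the loop sees one coherent set of files, which is the same idea the real layered filesystem applies to whole directory trees.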
You can actually find some of a container’s layers in the host filesystem. Let’s have a look:
# on the host, list the containers' layers:
$ ls /var/lib/docker/overlay2 # or aufs (another layered fs)
180cea7d0759cc309980741d3e97ed5f32e0a1a0bf4cb26c37e81edb5cbcac4a
96e12f3ab10e386340fe4856abc44a7fed32825d533b5e113931d0e02cd59248
96e12f3ab10e386340fe4856abc44a7fed32825d533b5e113931d0e…cd59248-init
bd88758109d813229c2af778fa4e1cf2e7461a4b0019b25f3b2073417a04ea74
d1aa0d3f900d4054ef7057dca51df127c82113297c96aedc5708654739add755
d80890c702bdcea018f3a7d1a4daeeae639d6986675928e6253bbf4f86a508cd
f3bf1b56c71f918b89a1a399014f260d2d48b41407bd63a8ccb0b49d0f688c91
l

# look inside one of the layers:
$ ls /var/lib/docker/overlay2/180cea7d…cbcac4a/diff/
bin boot dev etc home lib lib64 media mnt opt …
This is /var/lib/docker/overlay2 on the host filesystem, but each of those layers is part of a container’s filesystem. The directories listed above (bin boot dev etc home …) belong to a running container. For example:
# on the host, create a file 'doot' in a layer:
$ touch /var/lib/docker/overlay2/180cea7d…cbcac4a/diff/doot

# in the container, 'doot' appears:
$ ls /
bin boot dev doot etc home lib lib64 media mnt opt …
Note: This isn’t something you should do. When you docker exec and make changes inside a container, you only modify the top layer of the filesystem. Here, we’re modifying lower layers, which is equivalent to directly modifying the image, an anti-pattern.
Nevertheless, let’s see how we can use this hole in the container’s isolation. Can we give the container a link to a directory on the host?
# on the host, make a link to the directory we want:
$ ln -s /home/ubuntu/test /var/lib/docker/overlay2/180c…4a/diff/doot

# in the container:
$ ls -l /doot
doot -> /home/ubuntu/test
That link won’t work in the container. It points to /home/ubuntu/test, which doesn’t exist in the container’s filesystem. In retrospect, it’s no surprise: a symlink only stores a path, and that path is resolved inside the container’s own filesystem, not the host’s. We need something more powerful.
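A small experiment shows why the link is useless on the other side. A symlink stores nothing but a path string, and the lookup happens wherever the link is read. The sketch below creates a link to a path that is assumed not to exist on the machine running it (the path and the linkdemo variable are made up for illustration), which mimics what the container sees.

```shell
#!/bin/sh
# A symlink is just stored text; whether it resolves depends on the
# filesystem view of whoever follows it.
linkdemo=$(mktemp -d)

# Hypothetical target, assumed absent here, like /home/ubuntu/test in the container.
ln -s /no/such/dir/on/this/side "$linkdemo/doot"

readlink "$linkdemo/doot"   # the stored path text, unchanged

if [ -e "$linkdemo/doot" ]; then
    echo "target reachable"
else
    echo "dangling: the path does not exist in this filesystem view"
fi
```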
We’ll use a bind-mount. Typically, you mount a device at a directory. Bind-mounting is when a directory is mounted at another directory:
# on the host, mount the target directory at `doot`:
$ mount -o bind /home/ubuntu/test \
    /var/lib/docker/overlay2/180c…4a/diff/doot

# on the host:
$ ls /var/lib/docker/overlay2/180c…4a/diff/doot
< contents of /home/ubuntu/test >

# in the container:
$ ls /doot
# it's empty!
Mounts are namespaced. We created our bind-mount on the host (the first command), so it’s only available in the host’s mount namespace. As long as we’re on the host, we’ll see what we mounted at doot (the second command). When we enter the container, it’s the same doot directory, but we use the container’s mount namespace, which doesn’t have the mount. Hence, /doot is empty (the third command).
A bind-mount created outside the container is not visible inside the container. This is a property of mount namespaces.
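One way to see which mount namespace a process lives in is to read its /proc/<pid>/ns/mnt link on Linux. The helper names below are my own, and this is only a sketch: inspecting a process owned by another user (such as a container’s main process) generally requires root.

```shell
#!/bin/sh
# Print the mount-namespace identifier of a process, e.g. "mnt:[4026531840]".
mnt_ns() {
    readlink "/proc/$1/ns/mnt"
}

# Succeed if both processes are in the same mount namespace.
same_mnt_ns() {
    a=$(mnt_ns "$1") && b=$(mnt_ns "$2") && [ "$a" = "$b" ]
}

# This shell compared with itself: trivially the same namespace.
if same_mnt_ns $$ $$; then
    echo "same mount namespace"
else
    echo "different mount namespaces"
fi
```

Run as root on the host, comparing PID 1 with a container’s PID should report different namespaces, which is exactly why the host’s bind-mount never shows up inside the container.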
Here’s the conundrum:
- The host only creates mounts in its own namespace.
- The container can’t access the host’s filesystem to create a bind-mount from it.
Each side (host vs container) is missing a piece we need. The key insight of jpetazzo’s approach is this:
The container has everything it needs (except permissions) to create a block device file and mount the entire host filesystem from it.
If we can mount the entire host filesystem in the container, we’ve solved the second point above. That is, if the host filesystem is mounted at /hostfs in the container, we can bind-mount /hostfs/home/ubuntu/test at /doot!
Mounting the Host Filesystem
The first step is to figure out which filesystem contains /home/ubuntu/test. Use df for that:
# on the host:
$ df /home/ubuntu/test
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/xvda1 8065444 1247724 6801336 16% /
It’s the filesystem mounted at /. We check /proc/self/mountinfo to see which subdirectory of that filesystem is mounted at /:
# on the host, look for '/' in the MOUNT (5th) column:
$ less /proc/self/mountinfo
…
23 0 202:1 / / rw,relatime shared:1 - ext4 /dev/xvda1 \
    rw,discard,data=ordered
…
# _ _ MAJOR:MINOR SUBROOT MOUNT ...
# the third column is the device number: try 'stat /dev/xvda1'
# the fourth column is the subdirectory mounted at '/'
# some of this info is also available in /proc/self/mounts
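Instead of paging through the file, the same lookup can be scripted. The function below is a sketch of mine, not part of the original technique: it reads mountinfo-formatted lines on stdin and prints the device number and subroot for a given mount point. It assumes the mount point contains no spaces (mountinfo escapes those), and if several mounts are stacked on the same path it prints all of them.

```shell
#!/bin/sh
# Print "MAJOR:MINOR SUBROOT" for every mount whose mount point
# (5th field of mountinfo) equals $1. Reads mountinfo lines on stdin.
mount_source() {
    awk -v mp="$1" '$5 == mp { print $3, $4 }'
}

# Fed with the sample line from the listing above.
# On a real host: mount_source / < /proc/self/mountinfo
printf '%s\n' \
    '23 0 202:1 / / rw,relatime shared:1 - ext4 /dev/xvda1 rw,discard,data=ordered' \
    | mount_source /
```

For the sample line this prints 202:1 followed by /, matching the MAJOR:MINOR and SUBROOT columns.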
We also found out which device we need to create: 202:1, a.k.a. /dev/xvda1.
# in the container, create the device if it doesn't already exist:
$ [ -b /dev/xvda1 ] || mknod --mode 0600 /dev/xvda1 b 202 1
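To double-check the numbers passed to mknod, you can read them back from an existing device node on the host. GNU stat reports the major and minor numbers in hexadecimal (%t and %T), so the hypothetical helper below converts them to the decimal MAJOR:MINOR form that mountinfo uses. It assumes a GNU-compatible stat; /dev/null is used in the example only because it exists on every Linux system.

```shell
#!/bin/sh
# Print a device node's numbers as decimal MAJOR:MINOR.
dev_numbers() {
    hex=$(stat -c '%t %T' "$1") || return 1    # e.g. "ca 1" for 202:1
    printf '%d:%d\n' "0x${hex% *}" "0x${hex#* }"
}

dev_numbers /dev/null   # the null device is character device 1:3 on Linux
```

On the host from this post, dev_numbers /dev/xvda1 should agree with the 202:1 found in mountinfo.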
Now we just have to mount this new device inside the container:
# in the container:
$ mkdir -p /tmpmount
$ mount /dev/xvda1 /tmpmount
mount: permission denied
We want to create the mount inside the container’s namespace, but we need permissions from the host user. Enter nsenter, which allows us to enter the container’s namespaces as the host user:
# on the host, get the container's PID:
$ docker inspect --format '{{.State.Pid}}' <container_name_or_ID>
4417

# from the host, mount the volume inside the container's namespaces:
$ nsenter --target 4417 --mount --uts --ipc --net --pid -- \
mount /dev/xvda1 /tmpmount
# in the container:
$ ls /tmpmount
bin boot dev etc home initrd.img lib lib64 …
Note: --mount --uts --ipc --net --pid are all namespaces we want to enter. We don’t use --user because we don’t want to become the container’s user; we need the host user’s permissions.
Bind-Mounting the Subdirectory
Now, the container has access to the whole filesystem that contains /home/ubuntu/test, so a plain bind-mount is all we need:
# in the container:
# (if this is denied like the earlier mount, run it from the host through nsenter)
$ mount -o bind /tmpmount/home/ubuntu/test /doot
Note: Earlier, we found that /home/ubuntu/test was in the / subdirectory of its filesystem (the SUBROOT column of mountinfo). If we’d found that it was in /somedir, this command would look like:
mount -o bind /tmpmount/somedir/home/ubuntu/test /doot
To clean up after ourselves, we unmount the whole host filesystem. The bind-mount is unaffected:
# on the host:
$ nsenter --target 4417 --mount --uts --ipc --net --pid -- \
    umount /tmpmount

# in the container:
$ ls /doot
< contents of host's /home/ubuntu/test >
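Putting the steps together, here is a dry-run sketch that only prints the commands for the whole procedure rather than executing them. Everything is passed in by hand; nothing is detected automatically. The plan_mount name is mine, running every step through nsenter is a simplification, and creating the target directory with mkdir is my addition. It also assumes that sh, mknod, mkdir, mount and umount exist inside the container and that none of the paths contain spaces.

```shell
#!/bin/sh
# Dry run: print (do not execute) the commands that attach a host directory
# to a running container.
#   $1 container PID on the host      $2 device node
#   $3 MAJOR:MINOR of that device     $4 SUBROOT-relative path of the host dir
#   $5 target directory in the container
plan_mount() {
    pid=$1 dev=$2 nums=$3 src=$4 dst=$5
    major=${nums%%:*} minor=${nums##*:}
    ns="nsenter --target $pid --mount --uts --ipc --net --pid --"
    echo "$ns sh -c '[ -b $dev ] || mknod --mode 0600 $dev b $major $minor'"
    echo "$ns mkdir -p /tmpmount $dst"
    echo "$ns mount $dev /tmpmount"
    echo "$ns mount -o bind /tmpmount$src $dst"
    echo "$ns umount /tmpmount"
}

# Values taken from the walkthrough above.
plan_mount 4417 /dev/xvda1 202:1 /home/ubuntu/test /doot
```

Review the printed commands before running any of them as root on the host; the PID changes every time the container restarts.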
That’s it! Hopefully that provides some insight into how containers and namespaces work in practice.
Please check out jpetazzo’s original article. It has a few tidbits I didn’t include here. In response to his last comment: the technique does seem to work in the cloud. I ran the steps in this post on an AWS EC2 instance.