The size of my node_modules scales as E = mc², the mass usually being a folder of black-hole density.

How to inspect volume sizes in Docker

MrManafon · Published in Homullus · Aug 8, 2020

A short story today. I have this huge project, 10-ish containers and 40+ volumes (in local dev, of course). Sometimes, for some users, some of the volumes get huge. Like, really huge. Today's example is a Symfony Web Profiler datastore, a non-bound volume that took up 60GB instead of 7.

Find what is taking up the space

First off, we need to figure out the culprit. What if it isn't the volume at all? People who used older versions of Docker for Mac might remember a time when old container layer caches were not cleaned up and would slowly fill up the machine, so we all had to add something like this to our ./deploy.sh scripts:

# Register shutdown handler
function gracefulShutdown() {
    # Stop the project's containers, then clean up dangling containers and images
    docker-compose $COMPOSE_LIST down
    docker system prune -f && docker image prune -f
    exit 2
}
trap "gracefulShutdown" 2

Using the built-in docker system df will display the general state of the Docker VM. We can look into the details later, but this gives us a very good overview of the current state of the VM.

$ docker system df
TYPE                TOTAL               ACTIVE              SIZE
Images              11                  3                   2.697GB
Containers          3                   2                   240.9kB
Local Volumes       65                  0                   58.18GB
Build Cache         0                   0                   0B

From this, at least in our case, it is very obvious that the volumes are eating most of our 64GB limit. (The default Docker for Mac VM disk limit is 64GB; yours may vary.)

Figure out which volume is taking up space in your Docker VM

If your project is not started, you will have to do it manually. It's not hard by any means, but people with more than 100 volumes might have trouble scrolling through the list.

$ docker system df -v
VOLUME NAME                      SIZE
docker_macola-frontend-node      130.4MB
ebd9b8f25f1ba921eaf123a50e2      0KB
website_ag-website-php-src       52.7GB
...
...

Of course, my list contains 65 volumes, most of which are a sea of hashes, so without any kind of sorting mechanism it hurts the eyes to figure out which one is the largest.
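One way to bring order to that list is to cut the volume table out of the output and sort it by size. This is just a convenience one-liner of mine, not part of the original workflow; it assumes a sort that understands -h (GNU coreutils; the BSD sort bundled with macOS does not), and the size column index can differ between Docker versions (use -k3 if your output also shows a LINKS column):

# Print the "Local Volumes" table of `docker system df -v`, sorted by size.
# Assumes the table starts at the "VOLUME NAME" header and ends at the first blank line.
$ docker system df -v \
    | awk '/^VOLUME NAME/ {in_vols=1; next} in_vols && !NF {exit} in_vols' \
    | sort -h -k2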

Bonus points: If your project is started.

If the project you suspect is already running, you can use a short script I wrote to display a more detailed breakdown that may pinpoint the faulty volume immediately:
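The script itself was embedded as a gist, so what follows is only a rough sketch of the idea rather than the exact code: loop over the running containers, find their named volumes, and ask du inside each container how big the mountpoint is. The jq dependency and the output format here are my own assumptions.

# Sketch: report the size of every named volume mounted into running containers.
# Requires jq on the host; the containers need `du` available inside.
for container in $(docker ps --format '{{.Names}}'); do
    echo "== $container"
    docker inspect "$container" \
        | jq -r '.[0].Mounts[] | select(.Type == "volume") | "\(.Name) \(.Destination)"' \
        | while read -r name dest; do
            size=$(docker exec "$container" du -sh "$dest" 2>/dev/null | cut -f1)
            echo "  $name mounted at $dest: ${size:-n/a}"
        done
done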

Inspect the Docker Volume content manually

OK, we know the name of the problematic volume, and we could simply delete it right away.

$ docker volume rm website_ag-website-php-src

But in our specific case this was a recurring issue, albeit a rare one, so I really wanted to peek into the file structure and see what exactly was taking up so much storage space.

$ docker volume inspect website_ag-website-php-src
[
    {
        "CreatedAt": "2020-08-03T16:09:40Z",
        "Driver": "local",
        "Labels": {
            "com.docker.compose.project": "website",
            "com.docker.compose.version": "1.25.5",
            "com.docker.compose.volume": "ag-website-php-src"
        },
        "Mountpoint": "/var/lib/docker/volumes/website_ag-website-php-src/_data",
        "Name": "website_ag-website-php-src",
        "Options": null,
        "Scope": "local"
    }
]

The command above uses the built-in volume inspector to pull the mount information out of the VM. Note the "Mountpoint" property; we will need it for the next step.
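If you only need that one field, docker can print it directly instead of dumping the whole JSON document; the --format flag takes a standard Go template:

$ docker volume inspect --format '{{ .Mountpoint }}' website_ag-website-php-src
/var/lib/docker/volumes/website_ag-website-php-src/_data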

All the volumes our containers use are bound between the Docker VM and its containers. In the case of D4M's xHyve or VirtualBox, volumes can at the same time also be bound to the host filesystem, synchronizing your files into the containers. Since this one is a datastore, we obviously didn't do that here, so we have no access to the volume data from the host.
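To make the distinction concrete, here is what the two kinds of mounts look like in a compose file. The service and volume names below are made up for illustration, not taken from this project:

version: "3.7"
services:
  php:
    volumes:
      - ./src:/app/src            # bind mount: files stay visible on the host
      - php-var-data:/app/var     # named volume: data lives only inside the Docker VM
volumes:
  php-var-data: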

We need to get a shell inside xHyve and inspect the filesystem from the inside. If you are on Linux or Windows, your setup will vary and the command will definitely differ.

sudo screen ~/Library/Containers/com.docker.docker/Data/vms/0/tty
[Image: usage of a pretty inspector like ncdu]

Now, as of writing this in August 2020, the screen method simply does not work. Why? Don't ask me; it just says that permissions on the tty are bad. My best guess is that it is related to SIP being enabled on my machine.

That's a pity, because I really wanted to install ncdu inside and have a pretty view for the readers, but I didn't care enough at this point. Instead, we can spin up a new Debian-based container that shares the filesystem with the VM via nsenter:

$ docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i bash

It will spin up and run an interactive shell inside the VM/container. We now have full access to the Docker VM filesystem and can use standard Debian tools. If you would like to learn more about nsenter or the Docker image I mentioned above, jump here:

Back to topic: navigating the volumes inside Docker VM

First, get friendly with using the du command:

$ du -d 1 -h /

## Let's break it down:
# du
##   the name of the command
# -d 1
##   defines how deep the report should go:
##   set to 0, it displays only the size of the target directory itself;
##   set to 1, it also displays the size of each of its direct children
# -h
##   tells du to use human-readable size formats
# /
##   the path to the directory we are inspecting

OK, now that we know that, let's take the full path we got from the previous step and look it up.

$ du -d 0 -h /var/lib/docker/volumes/website_ag-website-php-src/_data
52.6G    /var/lib/docker/volumes/website_ag-website-php-src/_data

$ du -d 1 -h /var/lib/docker/volumes/website_ag-website-php-src/_data
4.0K     /.../_data/translations
6.0M     /.../_data/uploads
13.4M    /.../_data/public
656.0K   /.../_data/tests
4.0K     /.../_data/vendor
120.0K   /.../_data/templates
52.4G    /.../_data/var
13.5M    /.../_data/app
136.1M   /.../_data/src
284.0K   /.../_data/assets
20.0K    /.../_data/features
16.0K    /.../_data/bin
648.0K   /.../_data/config
28.0K    /.../_data/web
52.6G    /.../_data

$ du -d 2 -h /var/lib/docker/volumes/website_ag-website-php-src/_data/var/cache
6.5M     /.../_data/var/cache/dev/ContainerO42sqde
28.0K    /.../_data/var/cache/dev/settings
2.0M     /.../_data/var/cache/dev/translations
388.0K   /.../_data/var/cache/dev/jms_security
54.5M    /.../_data/var/cache/dev/pools
6.6M     /.../_data/var/cache/dev/ContainerZod8lmp
16.0K    /.../_data/var/cache/dev/complaintTotalAmountReturned
2.6M     /.../_data/var/cache/dev/doctrine
4.0K     /.../_data/var/cache/dev/doctrine_fs_cache
84.0K    /.../_data/var/cache/dev/jms_aop
8.0K     /.../_data/var/cache/dev/forum
52.6G    /.../_data/var/cache/dev/profiler
4.0K     /.../_data/var/cache/dev/jms_serializer
4.5M     /.../_data/var/cache/dev/twig
52.6G    /.../_data/var/cache

Interesting. Since this is a containerized Symfony project, we would expect the cache:clear command to clean up the whole of var/cache, but it seems that some segments of the cache are not cleaned up. One of them is the Symfony Web Profiler component, which appears to ignore cache:clear's cries for cleanup.

There is not much we can do about it; that seems to be the expected behaviour of the component, for reasons that elude me. I also couldn't find any output settings, which is odd, since most projects will never need the profiler to write this much to disk.

Conclusion: parametrise the Symfony Web Profiler for more free space and better performance.

The only_exceptions option seems to prevent the profiler from storing profiles for requests that didn't throw an exception. For some use cases this will be fine; yours might be one of them.
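If that trade-off suits you, the stricter variant is a one-line change in the framework configuration. The file path below is an assumption (a Flex-style project layout); the option itself is standard Symfony:

# config/packages/dev/web_profiler.yaml (path assumed)
framework:
    profiler:
        only_exceptions: true   # keep profiles only for requests that threw an exception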

On the other hand, I want my developers to be able to use the profiler on a whim, without any restrictions, while keeping it off for most requests. This means parametrising its values and mapping them to an ENV variable:

parameters:
    profiler_enabled: '%env(bool:SYMFONY_PROFILER)%'

web_profiler:
    toolbar: '%profiler_enabled%'
    intercept_redirects: false

framework:
    profiler:
        enabled: '%profiler_enabled%'
        only_exceptions: false
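Each developer can then flip the profiler on or off per environment. A minimal sketch of how the variable could be wired up in compose; the service name "php" and the default value are assumptions, not taken from this project:

# docker-compose.override.yml (sketch)
version: "3.7"
services:
  php:
    environment:
      # set to "1" or "true" when you actually want profiling for a session
      SYMFONY_PROFILER: "false"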

With the profiler parametrised like this, it is no longer on at all times, slowing down dev and writing stuff to disk. Still, what about cleanup once developers actually use the profiler?

$ rm -rf var/cache/dev/profiler

Well, it might be primitive, but I just added this command to the cache:clear segment of our deploy process (on bootup, before composer install).
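For context, a sketch of where that line could sit in the kind of ./deploy.sh mentioned earlier; the surrounding steps are assumed, not copied from the actual script:

# deploy.sh excerpt (sketch)
rm -rf var/cache/dev/profiler   # drop stale profiler data before warming the cache
bin/console cache:clear
composer install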

After doing all of this, the issue at hand was solved without any of my devs noticing (kind of), but more importantly, along the way I learned how to inspect Docker volumes, which will help me in the following days figure out why WiredTiger, MongoDB's storage engine, uses so much swap space on its datastore volume.
