Docker Volumes

Ridwan Shariffdeen
Published in Docker Captain
Jan 8, 2017

Of all the terminology Docker introduces, the volume is perhaps the most complicated concept to understand. Traditionally, a volume is a logical drive: a storage area accessible through a file system, typically residing in a partition of a hard disk. In this article, I will explain Docker's concept of volumes and how it helps to persist data in your containers.

First, let's take a look at how Docker works. In my previous articles I explained the high-level technology of Docker and the primary concepts that define the rest of the terminology. Docker images are the base of everything. An image is a stack of read-only layers, where each layer records a Docker command (an instruction from the Dockerfile). When we start a container from such an image, the Docker engine takes the read-only stack of layers and adds a read-write layer on top of it, where changes are applied. It also makes sure that a changed file in the read-write layer hides the underlying original file in the read-only layer. As usual, Docker has a name for this technology too: the Union File System, a combination of read-only layers with a read-write layer on top.
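The lookup rule described above can be imitated with plain directories. This is only a toy sketch, not how Docker's storage drivers are actually implemented: the directory names (layer1, layer2, rw) and the union_read helper are made up for illustration. It searches the read-write layer first and then the read-only layers from newest to oldest, so a modified copy hides the original.

```shell
# Toy model of a union file system using plain directories.
# All names here (layer1, layer2, rw, union_read) are hypothetical.
root=$(mktemp -d)
mkdir -p "$root/layer1" "$root/layer2" "$root/rw"

# Two read-only image layers, each contributing one file.
echo "from layer1" > "$root/layer1/app.conf"
echo "from layer2" > "$root/layer2/motd"

# Print a file as a container would see it: the read-write layer wins,
# then the read-only layers are searched from newest to oldest.
union_read() {
  for layer in rw layer2 layer1; do
    if [ -f "$root/$layer/$1" ]; then
      cat "$root/$layer/$1"
      return 0
    fi
  done
  echo "$1: not found" >&2
  return 1
}

before=$(union_read app.conf)   # still the version from the image

# A change made inside the container lands in the read-write layer only.
echo "changed in container" > "$root/rw/app.conf"
after=$(union_read app.conf)    # the copy in rw now hides the original

echo "before: $before"
echo "after:  $after"
```

The file in layer1 is never touched, which is why a fresh container started from the same image sees the original content again.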

Alongside the Union File System, Docker came up with the concept of volumes, which enables us to persist data and share it between containers. Volumes are foreign objects that live outside the Union File System and can be mounted with either read-only or read-write permissions.

Docker volumes can be looked at in many ways. The primary way to categorize them is:

  • Data Volume
  • Data Volume Container

There are three main use cases for Docker data volumes:

  1. To keep data around when a container is removed
  2. To share data between the host filesystem and the Docker container
  3. To share data with other Docker containers

Let's take a look at how we can make use of volumes to persist data. The easiest way to achieve this is to mount a directory from the host machine as a volume. To demonstrate the idea, I will be using the ubuntu:14.04 image. Let's spin up a container and list the contents of /var:

docker run ubuntu:14.04 ls /var
Content inside /var of the container

Now let's mount a directory from our host system at /var and run the same command as above. I created a sample directory with a hello.txt document inside it to demonstrate the example.

$ docker run -v ~/sample:/var ubuntu:14.04 ls /var
hello.txt

As you can see, the content of the /var directory in the container we ran is the same as the content of the directory we created. This way, anything the container writes to /var is persisted for as long as it exists in the ~/sample directory we mounted. Additionally, we can specify whether the volume is mounted in read-only or read-write mode (by appending :ro or :rw to the -v argument) to prevent unwanted modifications.
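As a minimal sketch of the read-only mode, the two helpers below mount a host directory at /var with the :ro suffix. The helper names list_readonly and try_write_readonly are hypothetical, chosen only for this example, and the host directory must be given as an absolute path.

```shell
# Hypothetical helper: list a host directory through a read-only mount.
# ":ro" makes the mount read-only; ":rw" (the default) allows writes.
# $1 is the absolute path of the host directory to mount at /var.
list_readonly() {
  docker run --rm -v "$1:/var:ro" ubuntu:14.04 ls /var
}

# Hypothetical helper: attempt a write through the same read-only mount.
# The container is expected to refuse it.
try_write_readonly() {
  docker run --rm -v "$1:/var:ro" ubuntu:14.04 touch /var/new.txt
}
```

Calling list_readonly ~/sample should list hello.txt just as before, while try_write_readonly ~/sample should fail with a read-only file system error and leave ~/sample untouched.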

This is very useful for configuration files, data files and installation files that need to be updated instantly and loaded dynamically into the container. As we know, the contents of a container are otherwise exactly what was instantiated from the image. This way we can freeze the parts we don't want the user to modify later and allow only certain files to be altered at run-time.

Now, let's have a look at the more complex idea of Docker volumes that use another container. Yes, that's right: we can use a container as a shared medium between multiple containers without running any program on top of it. As I explained in my previous articles, a container is live only as long as its primary/initial program (PID 1) is running. Since a data container doesn't require any program, it will not be live, but it will get the job done. How to build a good data container is a question that is asked repeatedly in the Docker community. The main objective of a data container is to store data and only that. It shouldn't run any application, and it should be small and simple so that it doesn't cause problems in production that would be hard to troubleshoot.

Let's have a look at how we can make use of a data container in a Docker ecosystem. For this example, I will be using the BusyBox image (BusyBox is a very good ingredient for crafting space-efficient distributions), which is one of the most lightweight images available. First, we spin up a data container with a volume at /foo:

docker run -v /foo --name data-container busybox
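To confirm that the data container is not live but still holds its volume, something like the following could be used. The helper names container_status and container_mounts are hypothetical; the docker ps and docker inspect options they wrap are standard ones.

```shell
# Hypothetical helper: show name and status of a container, running or not.
# "-a" includes stopped containers, which is where a data container ends up.
container_status() {
  docker ps -a --filter "name=$1" --format '{{.Names}}: {{.Status}}'
}

# Hypothetical helper: print the volumes mounted in a container as JSON.
container_mounts() {
  docker inspect --format '{{ json .Mounts }}' "$1"
}
```

Running container_status data-container should report an exited status, and container_mounts data-container should show an entry whose destination is /foo.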

Now let's spin up an Ubuntu container and create a file named bar.txt inside the /foo directory, which is mounted from the data-container volume.

docker run --volumes-from data-container ubuntu:14.04 touch /foo/bar.txt

Finally, let's spin up another container with the data-container volume so we can list the contents of the /foo directory. Remember that the original Ubuntu image doesn't have a directory named /foo, and the BusyBox image didn't have a bar.txt file in it.

docker run --rm --volumes-from data-container ubuntu:14.04 ls /foo

You will see the contents of the /foo directory, including the file named bar.txt that we created earlier using the Ubuntu container. So the idea is very simple: we can run multiple containers but use one container to share data between them. This comes in pretty handy when you need to back up data, as it removes the tedious task of running multiple commands in all your containers. Note that docker export does not include the contents of volumes, so to back up the data-container and migrate it to a different server we mount its volumes into a temporary container with --volumes-from and archive the directory from there. More on the application later. For now, my intention was to give you an idea of what a Docker volume is and how we can benefit from it.
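One common way to back up the data held by a data container, sketched here under assumptions, is to mount its volumes into a throwaway BusyBox container and archive the directory with tar. The helper names backup_foo and restore_foo, the /backup mount point and the foo-backup.tar file name are all hypothetical.

```shell
# Hypothetical helper: archive /foo from a data container into a tar file.
# $1 is the data container's name, $2 an absolute host directory that
# receives foo-backup.tar.
backup_foo() {
  docker run --rm --volumes-from "$1" -v "$2:/backup" busybox \
    tar cvf /backup/foo-backup.tar /foo
}

# Hypothetical helper: unpack that archive into the volumes of another
# data container, for example a freshly created one on a different server.
restore_foo() {
  docker run --rm --volumes-from "$1" -v "$2:/backup" busybox \
    tar xvf /backup/foo-backup.tar -C /
}
```

For example, backup_foo data-container "$(pwd)" would leave foo-backup.tar in the current directory. After copying it to another host and creating a new data container there with a /foo volume, restore_foo with that container's name and the directory holding the archive would put bar.txt back in place.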
