Drooling Over Docker — #2. Understanding Union File Systems

To understand the composition of Container images better, let us take a detour and learn about Union File Systems first.

File Systems and Mounting in Unix/GNU Linux

In Unix (and GNU/Linux), everything is a file: apart from regular data files, even system devices are exposed through the file system namespace. For example, a hard disk appears as a file named sda in /dev, the directory for device files, and can be accessed through its absolute path /dev/sda.

Even disk partitions are exposed as device files under /dev, e.g. /dev/sda1 is the first primary partition of the disk represented by /dev/sda. Such a partition is then formatted with a file system (ext3, ext4, etc.) so that it can store files and directories.

Now suppose we have a formatted partition /dev/sda1 (that is, one that carries a file system) with some data stored in files organized within a few sub-directories. To read from or write to those files, we have to attach this file system to some existing directory in the logical file system tree. This process is known as mounting a file system, and the directory it is attached to is called the mount point.

# mount -t ext4 /dev/sda1 /mnt
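Once a file system is mounted, Linux lists it in /proc/mounts, one line per mount, with the device, the mount point and the file system type as the first three fields. Here is a minimal sketch in Python that reads this layout. The names parse_mount_table and MountEntry are my own, and the sample text is only illustrative, not output from a real machine.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str


def parse_mount_table(text: str) -> List[MountEntry]:
    """Parse lines laid out like /proc/mounts: device, mount point, type, options, ..."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:  # skip blank or malformed lines
            entries.append(MountEntry(fields[0], fields[1], fields[2]))
    return entries


# Illustrative sample only; on a Linux machine you could pass
# open("/proc/mounts").read() instead.
sample = "/dev/sda1 /mnt ext4 rw,relatime 0 0\nproc /proc proc rw 0 0\n"

for entry in parse_mount_table(sample):
    print(entry.device, "is mounted on", entry.mount_point, "as", entry.fs_type)
```

The first line of the sample corresponds to the mount command shown above: /dev/sda1 attached to /mnt as ext4.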

Union File System — what is that!?

The keyword here is UNION as in SET THEORY.

If you experiment and mount two separate file systems at the same mount point, one after the other, you’ll only get to see the files from the file system that was mounted last.

Try this out in your Linux terminal and see for yourself.

Unlike a simple mount, a union mount provides a UNION of both the file systems mounted at the same mount point.
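The difference between the two behaviours can be modelled in a few lines of Python. This is only a toy sketch: each file system is a plain dictionary of file name to contents, and plain_mount and union_mount are names I made up for illustration. It also assumes that, on a name clash, the file system mounted last wins; real union file systems decide this by the order of their branches or layers.

```python
def plain_mount(mounted_in_order):
    """A plain mount: the file system mounted last hides all earlier ones."""
    return dict(mounted_in_order[-1])


def union_mount(mounted_in_order):
    """A union mount: merge everything; on a name clash the later one wins."""
    merged = {}
    for fs in mounted_in_order:
        merged.update(fs)
    return merged


# Two hypothetical file systems with one clashing file name.
fs_a = {"a.txt": "from A", "common.txt": "A's copy"}
fs_b = {"b.txt": "from B", "common.txt": "B's copy"}

print(sorted(plain_mount([fs_a, fs_b])))        # ['b.txt', 'common.txt']
print(sorted(union_mount([fs_a, fs_b])))        # ['a.txt', 'b.txt', 'common.txt']
print(union_mount([fs_a, fs_b])["common.txt"])  # B's copy
```

With the plain mount, a.txt disappears from view as long as the second file system stays mounted. With the union, both sets of files are visible and only the clashing name is resolved in favour of one side.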

Now consider that you have a read-only file system and you want to modify a certain file in it so that you can go ahead with your computing needs. A Union File System can help us here. We can create another read-write file system, either on disk or in RAM as the case may be, and mount both file systems at another mount point using a Union File System. This mount point gives you access to all the files in both the ro and rw file systems. When you modify a file residing on the ro file system, the Union File System driver finds that file and performs a CoW (Copy on Write): it copies the file to the rw file system, where the copy overrides the one on the ro file system, and then updates this new copy with the new contents. Any new files, e.g. from a software installation, also go into the rw file system.
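To make the CoW step concrete, here is a small sketch that imitates it with two ordinary directories in a temporary folder. It is not how a real Union File System driver is implemented. The class ToyUnion, the function demo and the file names are all hypothetical, and "read-only" here simply means that the class never writes to the lower directory.

```python
import shutil
import tempfile
from pathlib import Path


class ToyUnion:
    """Toy union of a read-only lower directory and a read-write upper one."""

    def __init__(self, lower, upper):
        self.lower = lower  # treated as read-only: never written by this class
        self.upper = upper  # every change lands here

    def listdir(self):
        names = {p.name for d in (self.lower, self.upper) for p in d.iterdir()}
        return sorted(names)

    def read(self, name):
        # The rw copy, if present, hides the ro copy of the same name.
        for branch in (self.upper, self.lower):
            if (branch / name).is_file():
                return (branch / name).read_text()
        raise FileNotFoundError(name)

    def append(self, name, text):
        target = self.upper / name
        if not target.exists() and (self.lower / name).is_file():
            shutil.copy2(self.lower / name, target)  # the copy in Copy on Write
        with target.open("a") as fh:
            fh.write(text)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        lower, upper = Path(tmp, "ro"), Path(tmp, "rw")
        lower.mkdir()
        upper.mkdir()
        (lower / "motd").write_text("hello\n")

        union = ToyUnion(lower, upper)
        union.append("motd", "world\n")     # modify a file that lives in ro
        union.append("new.txt", "fresh\n")  # create a brand-new file

        return {
            "union_view": union.listdir(),
            "motd_via_union": union.read("motd"),
            "motd_in_ro": (lower / "motd").read_text(),
            "rw_contents": sorted(p.name for p in upper.iterdir()),
        }


print(demo())
```

After the run, the ro copy of motd still says only "hello", while the union view shows "hello" followed by "world", because the modified copy in rw hides the original. The new file exists only in rw.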

Please check Link1 and Link2 for some very good examples of the points discussed so far. I have also given these links in the resources section at the end of the article.

A use case for Union File System — Knoppix

Again, what if the file system we are attempting to mount is read-only (e.g. from a CD-ROM) and we intend to change its contents by editing or removing existing files, or by adding new files? Is it possible?

Let us look at another example, from Knoppix, a Live CD version of Linux that could boot your machine and allow you to work on your system. You could change system settings or download additional software while Knoppix was running in memory, and even save these changes for subsequent runs.

The possibility of making Knoppix settings persistent allowed Knoppix to be a good portable desktop OS: all it required was a Live Knoppix CD and a USB drive. You could boot a system with Knoppix, load the changed settings from the USB drive, and get the same desktop environment on any machine.

With support from Union File Systems (Knoppix 3.8 brought in UnionFS and its later versions used aufs), Knoppix mounted multiple file systems on top of each other in the logical file system tree: here, a rw RAM disk mounted on top of the ro CD file system, forming a UNION of both these file systems.

If you write to a file that is in the ro area, the aufs driver copies it to the rw area and performs the write operation there. The next time you access the file, you get the modified version from the rw area (which hides the file of the same name in the ro area). All of this happens transparently: aufs does the work for you, and you keep working as if it were a writable system.

How Union File Systems help Docker Containers

Docker Containers bring you immutable (unchanging over time) software in the form of layers. At build time, you stack multiple such immutable layers of software to get the desired application with its dependencies. Often, an upper layer overrides files provided by the lower layers in the stack. A Union File System makes this possible. Also, at run time, you may download a newer version of some software inside the container; this triggers CoW, and the updated copy is written to the rw layer at the top of the stack. Any newly installed software also settles in this rw top layer. I’ll discuss more about Docker Layers in the next chapter…
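The stacking described above can be sketched with Python's collections.ChainMap, which looks a key up in a list of mappings from first to last and sends every write to the first mapping only. That is close to the union-plus-rw-top-layer idea, though it is just an analogy and not how Docker stores layers. The layer contents, the paths and the function start_container are made up for illustration.

```python
from collections import ChainMap

# Hypothetical image layers, from the bottom of the stack upwards.
base_layer = {"/bin/sh": "shell v1", "/etc/app.conf": "defaults"}
app_layer = {"/usr/bin/app": "app v1", "/etc/app.conf": "tuned for app"}


def start_container(*image_layers):
    """Return a union view with a fresh writable layer on top of the image."""
    writable = {}
    # ChainMap searches its maps left to right, so the top layer goes first.
    return ChainMap(writable, *reversed(image_layers))


container = start_container(base_layer, app_layer)
print(container["/etc/app.conf"])        # tuned for app (upper layer overrides base)

container["/usr/bin/app"] = "app v2"     # newer version installed at run time
container["/usr/bin/tool"] = "tool v1"   # brand-new software at run time

print(container["/usr/bin/app"])         # app v2
print(app_layer["/usr/bin/app"])         # app v1 (the image layer is untouched)
print(sorted(container.maps[0]))         # ['/usr/bin/app', '/usr/bin/tool']
```

Starting a second container from the same two layers would give it its own empty writable layer, while both containers keep sharing the unchanged image layers underneath.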

Contd…

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Resources:

  1. DON’T MISS THIS VIDEO FROM VMware about Containers
  2. Knoppix 3.8 and UnionFS. Wow. Just Wow. by Kyle Rankin
  3. Knoppix Hacks: Tips and Tools for Hacking, Repairing, and Enjoying Your PC — Hack#25
  4. Docker Storage: An Introduction
  5. Union file systems: Implementations, part I
  6. Digging into Docker layers
  7. Docker Container’s Filesystem Demystified
  8. Lightweight Virtualization LXC containers & AUFS
  9. Why to use AuFS instead of UnionFS
  10. Union Filesystem — FreeBSD
  11. Linux AuFS Examples: Another Union File System Tutorial
  12. Why does Docker need a Union File System
  13. Manage data in Docker
  14. Select a storage driver
  15. Use the AUFS storage driver
  16. Use the BTRFS storage driver
  17. Use the Device Mapper storage driver
  18. Use the OverlayFS storage driver
  19. Filesystems in LiveCD by Junjiro R. Okajima
  20. AuFS2 — ReadMe
  21. AuFS4 — ReadMe
  22. AuFS — Ubuntu Man Page
  23. AUFS: How to create a read/write branch of only part of a directory tree?
  24. Unionfs: User- and Community-Oriented Development of a Unification File System
  25. UnionMount and Union-type Filesystem (Google Translated from Japanese)