Docker’s Strange Love or: How I Learned to Stop Worrying and Love Containers

Dan Borden
Published in Yello Offline
Sep 11, 2017
Every article about containers must include a picture like this

Recently, when I was renovating my living room, the handyman with whom I was working offered some sage advice: “Home repair is mostly just about knowing the right tool for the job.” With that in mind, I’d like to discuss a technology that has recently become a buzzword among the ultra-hip DevOps cognoscenti: containerization. In this article, I’ll explore virtualization vs. containerization in the context of providing environments or servers for development and testing. I won’t be discussing the use of VMs or containers in production or in the cloud.

As a quick overview, containers are a way of dividing up computational resources within a system, similar to traditional virtual machines. But while virtual machines isolate an entire system (virtualizing both the software and hardware of a computer), containers merely divide the kernel into separate and (almost) completely isolated user spaces. The result is roughly analogous to an FTP server with chroot enabled. Since a container does not use its own kernel, OS, or hardware, it requires far less storage and far fewer computational resources than a virtual machine. As a result, a host can typically run 2–3x as many containers as it could virtual machines, assuming similar workloads. A container will “feel” like a virtual machine, but with full access to the host’s storage, memory, and processor. There are benefits and drawbacks to both approaches, and deciding between the two is a matter of knowing the right tool for the job.
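
To make the shared-kernel idea concrete, here is a minimal sketch, assuming Docker is installed on a Linux host and using the stock alpine image from Docker Hub:

```bash
# On the host: note the kernel version.
uname -r

# Inside a throwaway Alpine container: the very same kernel...
docker run --rm alpine uname -r

# ...but an isolated user space: ps sees only the container's own processes.
docker run --rm alpine ps
```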

First, a quick caveat. While Docker is readily available for Mac, Windows, and Linux, it runs natively only on the latter. This is not a trivial difference. On Mac/Windows, you are running your containers on top of a virtual machine, while on Linux you are running them directly on the OS installed on “bare metal”. As a result, containers running on Mac/Windows will be heavier and slower than those running on Linux. There has been some effort to decrease the footprint of Docker on Mac/Windows, but for the foreseeable future, Linux is and will be the best place to run containers.
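
You can check where your engine really lives with a one-liner, assuming a Docker client recent enough to support --format on docker info:

```bash
# Ask the Docker engine what it is actually running on.
# On a Linux host this reports your distribution and its kernel;
# on Docker for Mac/Windows it reports the hidden Linux VM instead.
docker info --format 'OS: {{ .OperatingSystem }}  Kernel: {{ .KernelVersion }}'
```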

To accurately determine the best tool for a given context, we must first probe the properties of each tool carefully. Given the fundamental difference described above, we can quickly deduce some basic traits of both. Because containers have full access to the host’s resources, many of them can run simultaneously on a given host, but performance becomes variable when too many containers compete for those resources at once. Virtual machines, on the other hand, pre-allocate a specific, guaranteed amount of RAM, so their performance stays stable even when other virtual machines share the host.
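
That gap can be narrowed, though: if a container needs more predictable behaviour, Docker can cap its resources at run time. A minimal sketch, with arbitrary limit values and a made-up container name:

```bash
# Give a container VM-like guarantees by capping what it may consume:
#   --memory : hard memory limit
#   --cpus   : share of host CPU time it may use
docker run -d --name constrained-worker \
  --memory 512m \
  --cpus 1.5 \
  alpine sleep 3600
```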

In my work as a systems administrator, I’ve had opportunities to explore both approaches in practice. When allocating finite resources, I tend to classify servers into two categories: those that must provide persistent, consistent services and data, and those that don’t. These two categories map naturally onto the characteristics of virtual machines and containers, respectively.

For large, heavily used services such as code repositories (GitLab) and automation servers (Jenkins), as well as traditional server roles like email, DNS, and file servers, I tend to stick with virtual machines. These servers need to be running constantly and must provide stable performance. With some exceptions, the load on them is also consistent and predictable. Running these virtual machines on a stable hypervisor yields reliable performance and consistent behavior.

Specialized, heavily used services are great targets for virtualization.

Conversely, services which are lightweight or repetitive are ripe for containerization. Running Jenkins nodes, GitLab runners, and other such workers in containers allows us to increase a team’s throughput and quickly scale worker count across different physical servers. Indeed, a primary advantage of containers over virtual machines is the ease and speed with which they can be created and destroyed. It becomes extremely convenient to instantiate a clean environment in which to build and test code, and the cost of adding additional workers is minimal. Even so, it is crucial to monitor the overall load on a host, as overcrowding can cause severe performance issues.
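
As a rough sketch of how cheap extra workers are (the image name example/ci-worker below is a placeholder, not a real image), spinning a handful of identical build agents up and back down takes only a few lines:

```bash
# Spin up five identical, disposable workers.
# 'example/ci-worker' is a hypothetical image standing in for a real
# Jenkins agent or GitLab runner image.
for i in $(seq 1 5); do
  docker run -d --name "worker-$i" example/ci-worker
done

# Tear them all down just as quickly.
docker rm -f $(docker ps -aq --filter 'name=worker-')
```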

Services that need to be quickly and easily replicated can readily benefit from containerization.

Another opportunity to leverage the benefits of containers is when testing or building code locally on your laptop. Here, using Docker or LXC provides speed and simplicity compared to running Vagrant or hand-built virtual machines. In addition, since containers are merely user-space-segregated instances of an existing kernel, a container’s filesystem is just a directory on the host’s filesystem. As a result, you can directly access and manipulate files in your container without having to worry about sharing folders or using tools like scp or rsync. Using Atom to directly modify files in a container running a web server is, in a word, sublime. Finally, running applications like web browsers in a container provides an additional level of isolation that can protect your computer from malicious entities.
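
For instance, on a Linux host using Docker’s overlay2 storage driver (an assumption; the path differs with other drivers), you can locate a running container’s filesystem and point your editor straight at it:

```bash
# Start a web server container (nginx is used here purely as an example).
docker run -d --name web -p 8080:80 nginx

# Find where its filesystem lives on the host (overlay2 storage driver assumed).
docker inspect --format '{{ .GraphDriver.Data.MergedDir }}' web

# Open files under that path in Atom (or any editor) and refresh the browser;
# no scp, rsync, or shared folders required. Note that on Mac/Windows this
# path lives inside the hidden VM, so this trick really shines on Linux.
```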

While the above examples make it seem simple to determine the appropriate tool for a given task, reality tends to blur these lines and present cases which are not so easily categorized. For example, while you can run containers based on entirely different Linux distributions on the same host, they all share the host’s kernel, so any workload that requires a different kernel (or a different operating system altogether) still carries the cost of virtualization. Understanding the tools at your disposal will allow you to make better decisions about how to approach problems, as well as increase the usefulness and efficiency of your computational resources.
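
A quick illustration of that constraint, assuming the stock ubuntu and alpine images from Docker Hub:

```bash
# Two very different userlands on one host...
docker run --rm ubuntu grep PRETTY_NAME /etc/os-release
docker run --rm alpine grep PRETTY_NAME /etc/os-release

# ...yet both report the host's kernel, exactly as in the earlier example;
# a workload that needs a different kernel still needs a virtual machine.
docker run --rm ubuntu uname -r
```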
