Some of my family members and friends tuned in to watch my virtual talk at DockerCon 2020, and I was highly appreciative of it. The discussion I had with most of them afterwards, however, wasn’t about how I oversimplified certain concepts or conveyed a wrong idea about something specific in my talk. For the most part, they had absolutely no idea what I was talking about, but they did walk away with at least one word they clung to: ‘containers’.
I must say, I certainly enjoyed trying to break down the concept of containerisation and Docker for each of them. I personally agree with the idea that the test of your understanding of something is often (not always) revealed in how well you can put it in layman’s terms for others. Explaining containerisation to my wife topped them all, and just so it’s clear, she’s a lot smarter than I am.
I have written a post on Wrapping Your Head Around Docker, but thought I would dedicate a new post specifically talking about containerisation and how virtualisation works in relation to containers, whilst contrasting them with Virtual Machines (VMs) because of the common misconceptions around the two.
What Are Containers?
I think the first step in the journey is to think of the physical containers that are typically used to ship goods from one location to another. As a technology, containerisation aims to do the same thing, except in the realm of software we’re concerned with shipping applications. Why do we need containers for our applications? We need them to solve a consistency and portability problem when it comes to delivering our applications.
Our application is meant to end up in a place where it can be accessed by end users. So what options do we have for deploying our application?
- Physical Machines (Physical Servers)
- VMs (Virtual Machines)
- Cloud VM Instances
The above environments present us with a portability and compatibility challenge: will our applications run as expected? Ever heard the old “…but it works on my machine” phrase? Containerisation is, so to speak, a way to ship your machine. That way we just need to be concerned with making sure it really does work on your machine 😁.
How do you make your application portable enough to run anywhere without compatibility issues? Can it be deployed with a predictable, working outcome on a Cloud instance, a VM or a physical machine, and on computers running different Operating Systems? The answer is containers.
A container packages your entire application along with everything it needs (all of its dependencies and configurations) to produce a predictable and consistent working outcome regardless of which environment it is deployed on, provided that environment is configured to run containers.
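To make this concrete, here is a minimal sketch of how that packaging is described with Docker. It assumes a hypothetical Node.js app with a `server.js` entry point and a `package.json` manifest; the base image and port are illustrative choices, not prescriptions:

```Dockerfile
# Hypothetical example: package a Node.js app and all of its dependencies
FROM node:18-alpine

WORKDIR /app

# Copy the dependency manifests and install dependencies first,
# so Docker can cache this layer between builds
COPY package*.json ./
RUN npm install

# Copy the rest of the application source
COPY . .

EXPOSE 3000
CMD ["node", "server.js"]
```

Everything the application needs (runtime, dependencies, source, configuration) is declared in one place, which is exactly what gives the container its predictable behaviour across environments.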
The libcontainer README defines containers as follows: “A container is a self-contained execution environment that shares the kernel of the host system and which is (optionally) isolated from other containers in the system.”
Where Does Virtualisation Come In?
In this context, virtualisation is the process of creating a software entity that functions the same way a computer system would. This process requires a layer of virtualisation software known as a hypervisor (also called a Virtual Machine Monitor). A hypervisor runs on your physical machines and creates and runs VMs. Hypervisors simulate hardware functionality and create Virtual Machines. So you essentially end up with a computer running on your computer. These virtual computers have their own dedicated OS and kernel (I’ll touch on this further down).
So why do people bother with this approach? Well, why buy another physical machine when you can have multiple virtual ones running on a single physical machine or server? You can save on the cost of buying additional IT infrastructure and instead run multiple Operating Systems and applications on several VMs hosted on a single physical server (or a small number of them). Virtualisation helps solve a problem of infrastructure costs, and furthermore it allows us to set up environments for our applications to run successfully regardless of the host machine’s OS and configuration. So we can run multiple applications on a single physical server. We can virtualise the hardware, but each VM needs its own OS. This is known as virtualisation at the hardware level.
Is there a downside to this? Yes. As you can imagine, this can be resource heavy on your physical host machine(s) so you will need to consider the following:
- Hardware that can handle the load
- Certain Operating Systems may need licensing
- The cost of setting up enterprise virtualisation technology (e.g. VMware)
- The cost of maintaining the VMs and their specialised software
VMs virtualise at the hardware level, whereas containers virtualise at the Operating System level. Containers don’t require a separate OS to run, which makes them much lighter than VMs; they all share the single underlying Host OS. As a result, you can run many containers on a single server and use fewer resources on your host machine, which in turn brings down computation costs. Also, since they don’t need to boot an OS, their startup time is much quicker.
Host OS & The Kernel
The Hardware component in the diagram above depicts the physical machine that will have an OS running on it. The machine’s Host OS (Operating System) will have a core component known as a kernel. The kernel is a program that acts as an intermediary layer to govern or control access between all the programs running on the host machine and its physical components.
Following a container approach to deploying applications means that you can only run containers that are compatible with the underlying OS kernel, so containers are OS-specific. For example, unlike the hardware virtualisation we touched on above, a Windows application cannot run inside a container on a Linux host. Windows applications can, however, run inside Windows containers on a Windows host.
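You can see this kernel sharing for yourself. In the illustrative session below (assuming Docker is installed on a Linux host; the version strings are just example output), the container reports the host’s kernel version rather than one of its own:

```shell
# Kernel version on the host
$ uname -r
5.15.0-76-generic

# Kernel version seen from inside an Alpine container:
# the very same kernel, because containers don't bring their own
$ docker run --rm alpine uname -r
5.15.0-76-generic
```

Contrast this with a VM, where `uname -r` would report whatever kernel the guest OS booted with.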
Namespaces & Control Groups
The Linux kernel has two distinct features known as namespaces and control groups. Why are these features important? Namespacing isolates resources (file systems, networking, hostnames, users, process IDs, etc.) on a machine for a particular process. Control Groups (cgroups) limit the amount of resources (such as CPU, memory and disk I/O) that a process can use. Put together, these two features allow us to isolate a single process and limit the resources it can see and use.
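You can poke at these kernel features directly, without Docker. The illustrative session below (Linux only, requires root; the hostname and cgroup name are made up for the example) uses `unshare` to create a new hostname (UTS) namespace, and the cgroups v2 filesystem to cap a group’s memory:

```shell
# New UTS namespace: the hostname change is visible only to the
# namespaced shell, not to the rest of the host
$ sudo unshare --uts sh -c 'hostname inside-container; hostname'
inside-container
$ hostname
my-laptop

# cgroups v2: create a group and cap its memory at 100 MiB
# (processes are added by writing their PIDs to demo/cgroup.procs)
$ sudo mkdir /sys/fs/cgroup/demo
$ echo $((100 * 1024 * 1024)) | sudo tee /sys/fs/cgroup/demo/memory.max
```

A container runtime like Docker is essentially automating this kind of setup (plus much more) for every container it starts.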
The entire component of a running process and the segment of resources it can talk to is what we refer to as a container. So a container is not a physical construct. It is actually a process that has a set of resources specifically assigned to it.
These namespacing and control group features are specific to the Linux kernel. So how does the magic happen on a Mac or Windows machine if these are Linux features?
When you install a Container Runtime or Engine like Docker Desktop for Windows or Mac, you install a Linux virtual machine, making use of the machine’s native virtualisation systems such as Hyper-V or HyperKit. The containers are created and hosted inside that virtual machine. The Linux VM has a Linux kernel, which is responsible for limiting or constraining access to the different hardware resources on your machine.
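You can confirm this from a Mac or Windows terminal. In the illustrative session below (assuming Docker Desktop is installed; the architecture shown is just example output from an Apple Silicon Mac), the Docker daemon reports that its “server” side is Linux, even though the host OS is not:

```shell
$ docker version --format '{{.Server.Os}}/{{.Server.Arch}}'
linux/arm64
```

The client runs natively on your host, but every container is actually running inside that hidden Linux VM.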
Not At Odds
I should mention that even though I drew up a contrast between containers and VMs when touching on virtualisation, the two don’t necessarily have to be at odds with each other. Depending on what you need, you can run containers on top of VMs, both on your local machine and when you deploy them to a VM or a Cloud VM instance.
This might be a lot to digest depending on who you’re explaining it to. I’m sure you can make it more fun by using more relatable analogies that the person you’re explaining it to would be familiar with.
Did my wife understand all of this? Let’s just say she’s still processing it 😅.
If you enjoyed the post, feel free to buy me a coffee here ☕️ 😃 .