What exactly is Docker?
Discover the container world and evolve your software architecture, reducing costs and improving quality!
This article explains why we need containers and how Docker can help us develop reliable applications while reducing costs and improving quality.
Times change, and in technology… times change fast. I have followed the evolution of containers and Docker since the beginning, and now, honestly speaking, it is time to board the train.
The next chapter is a historical overview that explains why we need to move to containers. If you are already convinced, you can jump straight to the section “What is Docker?”
The journey from VM to containers
A VM is a virtualization of a physical server; it works by emulating the hardware. Without going too deep into the internals, it is just an abstraction of physical resources: your 8 physical CPUs can be used to emulate many virtual CPUs, which are then used by the virtual machines.
This is quite easy to understand and commonplace nowadays. It may sound strange, but it was revolutionary many years ago.
I’m not speaking about the Paleolithic, just twenty years ago! OK, maybe twenty years ago does sound like the Paleolithic, considering how much the IT world has changed 😅. When I started working as a developer, many “old-style” customers were diffident about the new virtualization technology. That’s the same feeling I get now when speaking about containers, but let me go on with this story.
Coming from a physical world where a server was made of devices, cables, screws, and so on, virtualizing the hardware was the most intuitive way to solve most of the problems related to the iron, and it was a big leap forward.
Virtualized hardware was a small revolution and allowed resource optimization, perhaps with a little resource overbooking. Within a few years, with a small investment, anybody could own their own datacenter by buying a server and installing a hypervisor on it.
The benefit for software development was high. From this step on, creating a new environment was easier and cheaper, so we could avoid the situation where multiple services were installed on a single server and move to a single-responsibility policy.
We moved from environments composed of one or two servers (all-in-one, or database + web server) to complex architectures. Today, it’s easy to reach a relevant number of services in a web application: database, API backend, SPA frontend, caching server, full-text index, log collection… having ten services is not so unusual. Multiply this by each environment (quality, test, integration…) and you can imagine the impact on costs. Of course, you can simplify by aggregating services on the same server in the test environment, but what about repeatability?
So, the virtual machine is a good solution, but it has some limits for today’s requirements. Now we need more flexibility and a more granular distribution of services!
We’d like to find a smarter solution, more flexible, that grants a logical separation of roles (one “server” for each service) while keeping costs contained. By “cost” I don’t mean only the hardware or hosting cost, but also the cost of the people who install, configure, and maintain the system.
We will see in detail what a container is in the next part of this article, but no hurry: for now, think of a container as a small virtual machine that hosts only one thing and can be easily created and destroyed. Imagine also that this “virtual machine” consumes resources only when it needs them. Cool?
I think so; at the very least, it opens up many very useful scenarios.
From what we have read so far, it’s clear that Docker containers do the same things as virtual machines, but more smartly.
What we need to discover now is what containers do more!
How is that possible?
Thanks to the virtualization of the OS (and not the hardware), a container consumes only the resources it actually uses. Moreover, disk management is optimized: a container consumes only the space of the files that differ from the original image. For example, if you run ten MySQL databases, you pay the space for the database engine only once, so the disk cost of each container is only its own data.
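As a minimal sketch of this sharing (container names and the password are just examples; the commands need a local Docker installation), three databases can be started from a single image download:

```shell
# Pull the MySQL image once; its layers are stored only once on disk
docker pull mysql:8

# Start three independent databases from that same image.
# Each container adds only a thin writable layer for its own data.
docker run -d --name db1 -e MYSQL_ROOT_PASSWORD=secret mysql:8
docker run -d --name db2 -e MYSQL_ROOT_PASSWORD=secret mysql:8
docker run -d --name db3 -e MYSQL_ROOT_PASSWORD=secret mysql:8

# Show disk usage: the image space is counted once, not per container
docker system df
```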
So, the main advantages of containers are:
- Same production configuration
- Optimized resource usage
It sounds interesting, ready to discover Docker?
What is Docker?
Docker is a solution that makes the containerization of applications possible. It works on Linux, Windows, and Mac, and it can be installed for free, even on your local machine. Where native containers are not available, it relies on a hypervisor to provide the virtualization: on Windows, for example, Docker uses an internal Linux VM through Hyper-V to provide Linux containers. This means every part of your source code will be portable, and many technical issues in the software development process will be solved.
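Once installed, you can verify that everything works with a throwaway container (a sketch; it requires the Docker daemon to be running):

```shell
# Run the official test image; --rm removes the container when it exits
docker run --rm hello-world

# List the images now present on your machine
docker images
```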
To make Docker easier to understand, let’s come back to the old virtual machine and compare how the old things work in the new way.
The virtual machines have:
- File system
- The OS (which is, of course, basically files inside the file system 😅)
- Resources (memory, CPU, disk)
- Some network interfaces
If a real SysAdmin guy reads this, he will probably throw his hands up in horror. Of course, there is a lot more than this in a virtual machine, but please forgive me and focus only on these parts.
In my honest and humble opinion, what Docker does with the file system is simply clever. The principle is this: we use layers, and each layer contains only the increment over the previous one. Layers with the same data are stored only once, which helps to save space. For example, if you want to deploy a WordPress web server, you can take the WordPress image from Docker Hub and add a layer with your customizations. This means the cost of the OS and the WordPress core is paid once; only the customizations need an additional layer and extra space. So, considering 100 MB of WordPress customization over a 10 GB VM footprint, ten instances would consume 101 GB as VMs versus 11 GB as containers, saving about 90% of disk space! Another good thing is that the base WordPress image is linked, not copied, into the container. So, if you upgrade the base image, all child containers can easily be rebuilt and inherit the changes.
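A customization layer on top of the official WordPress image could look like this minimal Dockerfile (the theme folder is a hypothetical example):

```dockerfile
# Start from the official WordPress image on Docker Hub:
# all its layers are shared with every other container built from it.
FROM wordpress:latest

# Add only our customizations as a new, small layer (~100 MB in the example above)
COPY my-theme/ /var/www/html/wp-content/themes/my-theme/
```

Rebuilding this image after the base `wordpress` image is updated is enough to inherit the upstream changes.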
Many parts of the disk can be linked to your host machine. This allows you to map a single file from your PC into the container, or to share a full folder. Think of it like the network folders that make it possible to share content between containers inside Docker, or between Docker and the host machine.
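A bind mount is a quick way to try this (the paths and the container name are hypothetical):

```shell
# Share the host folder /home/me/site with the container's web root
docker run -d --name web \
  -v /home/me/site:/var/www/html \
  wordpress:latest

# Any file edited in /home/me/site on the host
# is immediately visible inside the container, and vice versa
```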
Now I will say another thing that could turn a SysAdmin’s nose up.
The OS is barely a big pile of files on a disk, nothing more.
So, everything I said so far about the file system can be extended to the OS too. The consequence is that it is easy to change Linux distribution, or to optimize it for a specific purpose. For example, you can start with a Debian image, which comes with more tools, and later migrate your container to an Alpine distribution to reduce the container footprint, without splitting hairs.
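Switching distribution can be as simple as changing the `FROM` line of the Dockerfile (a sketch; the installed tool is just an example):

```dockerfile
# Development: Debian-based, batteries included
FROM debian:stable-slim
RUN apt-get update && apt-get install -y curl

# Later, the same container can be rebuilt on Alpine to shrink the footprint:
# FROM alpine:latest
# RUN apk add --no-cache curl
```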
I have written so many times that resources are shared and that this optimizes usage that my keyboard refuses to type it one more time 😄 That was the last time, I promise.
The network interface, like the disk, is virtual. This means each container gets a virtual IP address that is valid inside the virtual network managed by Docker. Of course, you can have multiple virtual networks to isolate groups of containers. You can also do complex things, just like in real networks, but all you really need to start is this: containers communicate with each other by DNS (inside the network, the name of a container resolves to its virtual IP), and if you want to reach a container from outside, Docker can be set to NAT a port of the container to a port of the host. For example, in the old WordPress case, you can expose port 80 of the web server container on port 8080 of the host machine. Then you can access your WordPress website by hitting http://localhost:8080: the traffic to 8080 will be forwarded to port 80 of the container, and you will consume the website as if it were on your local server. Meanwhile, the web server will be able to contact the MySQL server at its internal address, for example by connecting to mysql:3306, assuming the Docker container for MySQL is named “mysql”.
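The whole WordPress scenario can be sketched in three commands (network name, container names, and password are examples; it needs a running Docker daemon):

```shell
# Create an isolated virtual network for the two containers
docker network create wp-net

# Start MySQL: inside wp-net it is reachable by DNS as "mysql"
docker run -d --name mysql --network wp-net \
  -e MYSQL_ROOT_PASSWORD=secret \
  -e MYSQL_DATABASE=wordpress \
  mysql:8

# Start WordPress: port 80 in the container is NATted to 8080 on the host
docker run -d --name wordpress --network wp-net \
  -p 8080:80 \
  -e WORDPRESS_DB_HOST=mysql:3306 \
  -e WORDPRESS_DB_USER=root \
  -e WORDPRESS_DB_PASSWORD=secret \
  wordpress:latest

# The site is now available at http://localhost:8080
```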
What more do we have?
Everything so far is great, but not enough. Until now, we have spoken only about the features that replace or enhance what we already did with regular VMs. But there is more. The big acceleration that comes from using Docker is in the features added on top of this.
Here is my shortlist:
- Docker Hub: a repository of container images. Here you can find a lot of pre-built environments maintained by their producers. This speeds things up a lot because you don’t need to install or configure anything, just download and use. Using and uploading images to Docker Hub is free, and you can also have a private repository. Besides Docker Hub there are many other services and options (one of them is the Microsoft Azure Container Registry used in my tutorial on Kubernetes).
- Environment variables: containers help make images reusable, but this requires keeping everything parametric. The environment option allows passing settings to the container from outside: every parameter sets a variable in the environment inside the container. For example, still in the WordPress case, you can specify the database settings this way.
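Both points can be sketched with a couple of commands (the account name and image tag are hypothetical):

```shell
# Docker Hub: download a ready-made image, no installation needed
docker pull wordpress:latest

# ...or publish your own image to your (free) account
docker tag my-wordpress:1.0 myaccount/my-wordpress:1.0
docker push myaccount/my-wordpress:1.0

# Environment variables: -e sets a variable inside the container,
# which the WordPress image reads to configure its database connection
docker run -d -e WORDPRESS_DB_HOST=mysql:3306 wordpress:latest
```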
What to take home
Containers are not a replacement for VMs, but in most scenarios they can provide a smarter solution, improving quality and reducing time.
From a SysAdmin’s point of view, almost everything we can do with containers can also be done with regular VMs, but with more complexity and higher skill requirements.
The benefits of containers are:
- Faster time to market
- Increased productivity
- Reduced IT infrastructure complexity and costs
- Faster issue resolution
On the other side, containers bring such improvements only if the whole development process is structured in the right way. This calls for greater DevOps competence rather than pure SysAdmin skills, and for developers who are more conscious of the infrastructure. This new way of building software asks developers to be a little bit more like architects, but it produces all the results described above.
For legacy applications, it’s hard to say whether moving them to containers brings value. It has to be evaluated case by case: containerizing an application may be convenient or not, it depends.
In any case, for new applications, containers must be an option.
What else? Containers are cool, but they are just the first step to the cloud. They open up new challenges, like container orchestration and infrastructure deployment. But this is part of the game; there is no end to learning.
Found this article useful? Follow me (Daniele Fontani) on Medium and check out my most popular articles below! Please 👏 this article to share it!
- What is exactly Kubernetes?
- How to deploy a web application with Kubernetes
- TDD (Test driven development) explained by examples