Beginning with Docker

basic concepts & terminology

Mehul Gupta
Data Science in your pocket
6 min read · Jun 1, 2021


Drifting away from the world of models & algorithms for a while, I will be deep diving into Docker this time in a series of blogs.

But before moving on to Docker, let’s assume a real-world scenario:

You are a developer who just coded a new application/service in your local environment. The app looks great & works fine locally, & the next phase is to take this code to the Staging environment & finally to Production.

Once you move your code to Staging,

The code breaks down.

You must have heard this line a million times from devs:

‘But this worked on my system, don’t know why it’s failing here’

Do you know why it breaks down on Staging but not in your local environment? Mostly, the dependencies required for your application were different in the two places. Say you had Python 3.8 on your system & hence the application was built against version 3.8, but Staging runs 3.6, so a few features are missing, leading to the mishap.

Can this be avoided? Like: I create an app/service, test it just once (in any environment) & deploy it anywhere regardless of the environment. One of the major advantages we would get is that once a feature is tested anywhere, it can be deployed on production servers without worries, saving a big chunk of bandwidth otherwise spent testing in different environments. Apart from testing, such a tool can be helpful in many other ways too.

Docker does exactly what we wished for in the above discussion.

Going with the dictionary meaning of Dock, it says:

an area of water in a port where goods are put onto and taken off ships, or ships are repaired

A few points to note from the definition & the Cover photo/Docker’s logo:

  • What type of goods? Any!
  • How? Using huge containers (as in the cover photo/Docker’s logo)
  • Where to deliver? Any other dock, irrespective of the state or country

Our Docker also takes inspiration from the original Dock’s concept where:

  • Goods → any type of service/application using any language be it Python or Java or Go
  • Metallic cuboidal containers with goods → standalone package encapsulating service/applications and required dependencies to be transported together as a single, deployable unit
  • Docks → any environment like Staging, Production, your friend's PC, etc. where the service/application can be deployed

So, in short, Docker helps us to containerize our apps/services & then these containers can be shipped to other environments & deployed on the go. Now, these containers can contain Java services, Python web services, databases, or what not.

So, as we now know what Docker is, let’s walk through a few key concepts very quickly:

  • Images: They are the building blocks of Docker and incorporate the recipe to create a Docker Container. So, if we wish to develop a web app using Flask, an Image will contain the set of instructions to incorporate Python, Flask, environment variables, the commands required to start the app, the app code, etc.

Also, an image can be built using another image as a base. For example, for the above app, a pre-existing image with Python can be used as the base, over which instructions to add Flask, the app code & the other remaining requisites can be layered, creating a new, custom image. Docker Hub has many pre-existing images that can be downloaded & used easily.
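
A rough sketch of such an image recipe is below (the file names app.py & requirements.txt, the port, and the Python version are assumptions for illustration, not anything fixed by Docker):

    # Start from a pre-existing Python image as the base (pulled from Docker Hub)
    FROM python:3.8-slim

    # Copy the dependency list & app code into the image, installing Flask & friends
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install -r requirements.txt
    COPY . .

    # Environment variables, the exposed port & the command used to start the app
    ENV FLASK_APP=app.py
    EXPOSE 5000
    CMD ["flask", "run", "--host=0.0.0.0"]

This recipe lives in a file called a Dockerfile, which we will come to in a moment.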

  • Containers: A running instance of an Image in any environment is called a Container. So, if you created an Image for a Flask-based web app, then whenever that Image is deployed in some environment, the running instance is called a Container. We can run multiple Containers from the same Image in the same or different environments (see the command sketch after this list).
  • Docker Server & Client: The Docker Server is a daemon (a process that keeps running in the background once you install Docker) that takes commands from the Docker Client. The Docker Client, through the CLI or the Docker app, is how developers pass commands to the Docker Server to run a container, build an image, remove containers, etc.
  • Dockerfile: This is the file doing all the magic. It incorporates the instructions required to build an image. The name of the file stays the same irrespective of the service/app you wish to build. Notice that no extension is used.
  • Registries: You may wish to store the images you build somewhere. Registries are that place, and they can be either public or private (Docker Hub being a popular public one).
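
As a quick, hand-wavy sketch of how the Client & Server interact (assuming an Image named flask-app has already been built), every docker command below is the Client passing an instruction to the Server, which then does the actual work:

    # Start two Containers from the same Image, mapped to different host ports
    docker run -d --name app1 -p 5000:5000 flask-app
    docker run -d --name app2 -p 5001:5000 flask-app

    # List running Containers: both instances come from the one Image
    docker ps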

So the entire process may look something like this:

You code your app (as you usually do in VS Code, PyCharm, etc.)

Write a Dockerfile in the root folder of the app you coded.

The Docker Client gives the Docker Server a command to build an Image using the instructions in the Dockerfile.

Once the Image is built, it is ready to be deployed & the Docker Client gives the Docker Server a command to run a Container so you can see the live version of your Image.

If you wish, push it to a public/private registry.
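
Put together as commands, that flow might look roughly like this (the image name, tag & Docker Hub username are placeholders, not anything prescribed):

    # Build the Image from the Dockerfile in the current folder
    docker build -t flask-app:1.0 .

    # Run a Container from the freshly built Image
    docker run -d -p 5000:5000 flask-app:1.0

    # Optionally tag & push it to a registry (Docker Hub here)
    docker tag flask-app:1.0 your-username/flask-app:1.0
    docker push your-username/flask-app:1.0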

And voila !!

You are now running a containerized service/app on your system which can be deployed with ease on, say, your friend's PC, the Staging environment, or Production, behaving the same everywhere.

This is all good but

Why should one use Docker?

  • It's lightweight in nature, a tracer bullet of sorts. To your surprise, an app from your local machine may take just seconds to launch in any environment.
  • Devs can focus more on logical problems rather than environment issues for their app. So no more “But this worked on my system, don’t know why it’s failing here” kind of situations.
  • Testing of any application becomes very fast.
  • As Docker recommends (but doesn't enforce) running a single process per container, it helps promote a microservices architecture, which can be great when it comes to debugging or scaling up. So if you need to run a backend connecting to some DB, Docker suggests keeping the two services in separate containers (though they can be kept together as well), making them easier to manage; see the sketch after this list.
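
A loose sketch of that backend + DB suggestion (the names, image tags & password here are made up purely for illustration):

    # A user-defined network so the two containers can reach each other by name
    docker network create app-net

    # The DB in its own container
    docker run -d --name db --network app-net -e POSTGRES_PASSWORD=secret postgres:13

    # The backend in a separate container, reaching the DB at the hostname "db"
    docker run -d --name backend --network app-net -p 5000:5000 flask-app:1.0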

Now, one question I would specifically love to answer before wrapping up

How is Docker different from Virtual Machine(s)?

To answer this, we need to touch a bit on a computer's system memory, which is divided into 2 major sections:

  • User space: Space where normal programs/processes run.
  • Kernel space: Space where the OS's kernel runs, managing the processes running in User Space.

Docker uses container-based technology: a deployed Container, whatever its specifications, acts more like a process (or set of processes) running on the system, just like any other process, and takes up space only in User Space. VMs don't use any containerization tech; they take up space in both User & Kernel Space and are hence bulkier, though with a lot more features. So, in short, each VM instance runs a separate OS alongside the parent OS (the OS that launched the VM instance), whereas Docker Containers don't & are hence lightweight.
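
A small way to see this in practice (assuming a Linux host where Docker runs natively, and using the public nginx image purely as an example): the process inside a container shows up as an ordinary process in the host's process list, sitting in User Space like everything else.

    # Start a container in the background
    docker run -d --name web nginx

    # On the host, the container's nginx processes appear like any other process
    ps -ef | grep nginx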

For hands-on practice, I found this repo to be spot on: https://github.com/docker/labs/tree/master/beginner

Thanks to Nancy Gupta for the assistance.

For my previous Blog series:
