Docker in Machine Learning

Today we are going to look at a very interesting topic. In the modern world, AI plays a vital role in every domain. Take retail, for instance, where machine learning has a huge impact: giants like Walmart, Amazon, and Flipkart rely on it to make their products more discoverable to customers.

How does such a feature actually reach the customers? Here is an example. Say person 'X' buys a new shirt from brand 'XYZ'. A month later, new shirts with fresh designs are launched, and the portal recommends them to 'X' based on past purchases. This is a classic machine learning use case. To make such a model accessible to everyone, it needs to be deployed in a centralized place.

Is Cloud deployment alone necessary?

From a deployment perspective, we typically need a cloud service such as AWS, Azure, or GCP. But what does the deployed application look like? Do we deploy straight from a Jupyter notebook? These questions all matter for deployment. We need to build a pipeline for the product, automating every task in the ML project, right from loading the dataset to saving the prediction results. The end result is a product model that can be delivered to many customers via any cloud service.

Consider a family moving house. Their first thought will be packing the household items. The move can be done in two ways:

Packing individually: Here, items are packed separately, and we have to make sure every package reaches the right movers. Once the family has moved, things like this can happen: 'Hey, I forgot where I put my personal things!', 'I left my work stuff in the old house!', 'I mixed up the kitchen items with something else!'. The second scenario shows how to avoid these situations.

Container: Here, everything is packed into a single container, so it is immediately clear whether anything is missing. At the end of the day, the family moves happily to the new location and can easily put each item in its place.

Machine learning deployment is similar: the product needs to be deployed at another customer's location. If we move files over to the new cloud location individually, certain files might be missed. The customer then reports, 'Why does the application work in one place but not in the new one?', and a lot of tickets get raised from the customer side. To avoid this, we containerize the application. This is where Docker comes into the picture.

What is Docker?

Docker is a tool designed to make it easier to create, deploy, and run applications using containers. A container packages an application together with all of its dependencies in one place.


  • The infrastructure is the physical server that is used to host multiple virtual machines.
  • The Host OS is the base machine such as Linux or Windows. So this layer remains the same.
  • Now comes the new generation: the Docker Engine. It replaces the guest operating systems that virtual machines used to provide, running workloads as Docker containers instead.
  • All of the apps now run as Docker containers.

Docker Installation

The installation steps to follow depend on the operating system.

If the OS is Microsoft Windows 10 Professional or Enterprise 64-bit, or Windows 10 Home 64-bit, visit the link below and download the installer.

Once the installation is done, you will see the screen below, with Containers, Images, and Dev Environments sections.

Significance of Docker File

If an application needs to be dockerized, we have to create a Dockerfile for the project. Let's take a deeper look at what a Dockerfile is actually composed of.


FROM is used to define the base image to start the build process. Every Dockerfile must start with the FROM instruction. The idea behind this is that you need a starting point to build your image.

This means our project uses Ubuntu as the parent image.
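A minimal sketch of the instruction (the specific tag is an assumption; any Ubuntu tag works the same way):

```dockerfile
# Start the build from an official Ubuntu base image
FROM ubuntu:20.04
```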


This command is used to set the environment variables that are required to run the project.

ENV sets the environment variables, which can be used in the Dockerfile and any scripts that it calls. These are persistent with the container too and they can be referenced at any time.

We provided HTTP_PORT as an environment variable.
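A sketch of how this might look in the Dockerfile (the variable name follows the article; the value is an assumption):

```dockerfile
# Expose the application's listen port as a configurable variable
ENV HTTP_PORT=8080
```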


WORKDIR tells Docker that the rest of the commands will be run in the context of the /app folder inside the image.

If the /app directory does not already exist, WORKDIR will create it inside the container.
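A sketch of the instruction:

```dockerfile
# All subsequent instructions run relative to /app inside the image
WORKDIR /app
```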


RUN has 2 forms:

  • RUN <command> (shell form, the command is run in a shell, which by default is /bin/sh -c on Linux or cmd /S /C on Windows)
  • RUN ["executable", "param1", "param2"] (exec form)

The RUN instruction will execute any commands in a new layer on top of the current image and commit the results. The resulting committed image will be used for the next step in the Dockerfile.

The RUN command runs within the container at build time.
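For example, installing dependencies at build time might look like this (the package names are assumptions for illustration):

```dockerfile
# Shell form: run through /bin/sh -c, so && chaining works
RUN apt-get update && apt-get install -y python3 python3-pip

# Exec form: the arguments get no shell processing
RUN ["pip3", "install", "flask"]
```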


ENTRYPOINT has two forms:

  • ENTRYPOINT ["executable", "param1", "param2"] (exec form, preferred)
  • ENTRYPOINT command param1 param2 (shell form)

An ENTRYPOINT allows you to configure a container that will run as an executable.

ENTRYPOINT sets the command and parameters that will be executed first when a container is run. Any command-line arguments passed to docker run <image> will be appended to the ENTRYPOINT command, and will override all elements specified using CMD. For example, docker run <image> bash adds the argument bash to the end of the ENTRYPOINT command.

You can override the ENTRYPOINT instruction using the docker run --entrypoint flag.

If the ENTRYPOINT isn’t specified, Docker will use /bin/sh -c as the default executor.
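As a sketch, the exec form for a Python application might look like this (the script name is an assumption):

```dockerfile
# Exec form (preferred): the container runs this executable on start
ENTRYPOINT ["python3", "app.py"]
```

With this in place, 'docker run <image> --help' appends --help to the command rather than replacing it.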


The CMD instruction has three forms:

  • CMD ["executable","param1","param2"] (exec form, this is the preferred form)
  • CMD ["param1","param2"] (as default parameters to ENTRYPOINT)
  • CMD command param1 param2 (shell form)

The main purpose of a CMD is to provide defaults when executing a container. When an ENTRYPOINT is present, the CMD values are appended to it as default arguments.

In Dockerfiles, you can define CMD defaults that include an executable.
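A common pattern, assuming the same hypothetical app.py, is to let ENTRYPOINT fix the executable while CMD supplies default arguments that docker run can override:

```dockerfile
# The executable is fixed by ENTRYPOINT...
ENTRYPOINT ["python3", "app.py"]
# ...while CMD provides default arguments, replaceable at run time
CMD ["--port", "8080"]
```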


COPY has two forms:

  • COPY <src>... <dest>
  • COPY ["<src>",... "<dest>"] (this form is required for paths containing whitespace)

The COPY instruction copies one or more local files or folders from the source and adds them to the filesystem of the container at the destination path.

Docker builds up the image in layers, starting with the parent image defined using FROM. The WORKDIR instruction defines the working directory for the COPY instructions that follow it.

The <dest> is an absolute path, or a path relative to WORKDIR, into which the source will be copied inside the destination container.
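As a sketch (the file names are assumptions), copying the dependency list before the rest of the source lets Docker cache the dependency layer between builds:

```dockerfile
# Copy only the dependency list first so this layer is cached
COPY requirements.txt /app/
# Then copy the rest of the project into the image
COPY . /app
```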


ADD has two forms:

  • ADD <src>... <dest>
  • ADD ["<src>",... "<dest>"]

The ADD instruction adds one or more local files or folders from the source to the filesystem of the container at the destination path.

It is similar to the COPY instruction, but has some additional features:

  • If the source is a local tar archive in a recognized compression format, then it is automatically unpacked as a directory into the Docker image.
  • If the source is a URL, then it will download and copy the file into the destination within the Docker image. However, Docker discourages using ADD for this purpose.
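For instance, assuming a hypothetical model archive in the project:

```dockerfile
# A recognized tar archive is unpacked automatically into the image
ADD model.tar.gz /app/model/
```

For plain file copies, COPY is the recommended choice; ADD is best reserved for this auto-extraction behavior.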


The EXPOSE instruction informs Docker that the container listens on the specified network ports at runtime. You can specify whether the port listens on TCP or UDP; the default is TCP if the protocol is not specified.

But EXPOSE will not allow communication via the defined ports to containers outside of the same network or to the host machine. To allow this to happen you need to publish the ports.

The EXPOSE instruction does not actually publish the port. To actually publish the port when running the container, use the -p flag on docker run to publish and map one or more ports, or the -P flag to publish all exposed ports and map them to high-order ports.
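As a sketch (the port number is an assumption):

```dockerfile
# Document that the application listens on TCP port 8080 at runtime
EXPOSE 8080
```

To actually reach the service from the host, the port must still be published, e.g. 'docker run -p 8080:8080 <image>'.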

Zomato price prediction

Let's see an example: predicting restaurant prices from the Zomato dataset, deployed using Docker.

The Dockerfile below was created for the implementation.
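The original Dockerfile appeared as a screenshot; a plausible sketch, assuming a Flask-style Python app with an app.py and a requirements.txt (all names here are assumptions, not the article's exact file), would be:

```dockerfile
FROM python:3.8-slim

WORKDIR /app

# Install dependencies first so the layer is cached between builds
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the trained model and application code
COPY . .

EXPOSE 5000

ENTRYPOINT ["python", "app.py"]
```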

Docker Confirmation

To verify the Docker installation, run 'docker --version' at the command prompt.

The next step is to build the image from the Dockerfile as shown below.

Build using Docker file

Open the command prompt and change into the project workspace as below.

Run the command 'docker build -t <name> .'. Once it completes, Docker generates the image as described in the Dockerfile.

Open the Docker Desktop application and navigate to Images. Verify that the 'zomato-price-predict' image has been created.

Run the Docker

Try the command below; the --name flag gives the container a reference name for the application, which will show up in Docker's Containers section.
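The original command was shown as a screenshot; a sketch of what it likely looked like (the container name and port mapping are assumptions based on the image built above):

```shell
docker run --name zomato-price-predict -p 5000:5000 zomato-price-predict
```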

After executing the command, the app is running.

Verify it in the Containers section as below.

Finally, we can see that the application is running.

We can see the application logs too.

Hope this article gives you a better idea of using Docker in machine learning.

Hope you enjoyed my article !!



Antony Christopher

Data Science and Machine Learning enthusiast | Software Architect | Full stack developer