How To Run Docker Images and Make Your Own! Docker for Data Science/Python

Aaron Abrahamson
Published in Analytics Vidhya
7 min read · Dec 19, 2019

This is a follow-up to my previous post, What is Docker, and why is it useful for data science? In this post I am going to show you how to set up Docker on your machine, create a Docker Image from a Dockerfile, and get a Docker Container running PostgreSQL.

Installing Docker

Here is a link to download Docker Desktop. You will need to register to freely download the file. Once downloaded, double click on the Docker.dmg file in your downloads folder, and drag the cute whale icon into your Applications folder.

You’ll know Docker is running when you see the whale icon appear in your Menu Bar (I have added the red box around the icon for visibility).

Now it’s time to see if everything is working properly. Open up your terminal and enter the following command:

# The syntax for running images is: docker run <image-name>
docker run hello-world

You should see a message pop up — congrats, you just ran your first Docker Image! To summarize the text you see displayed:

  • Docker attempted to look for a local copy of this image (and failed)
  • It pulled an image of ‘hello-world’ from Docker Hub (a repository of all kinds of Docker Images)
  • It created a container from that image, which included an executable command that produced the text that appeared on your terminal

Remember the distinction between images and containers: images are built from Dockerfiles or downloaded from Docker Hub, but they are not ‘running’ on your machine. Docker Containers are instances of images that are actively running (or that run briefly to execute a command and then exit). Here are some other useful commands:

docker help             #brings up the full help menu. There's A LOT
docker <command> --help #this displays help for a certain command
docker images #shows all images stored on your machine
docker rmi <image name> #removes an image (add -f to force it)
docker ps #shows all running containers
docker run <image name> #create a container based upon an image

Creating Your Own Dockerfile

The way to package your work/code into a Docker Image that you can share is with a Dockerfile. The Dockerfile is a text document containing the instructions that Docker reads to assemble the image. I am going to show you how to make a very basic image and then explore more complicated features down the line. But if you’d like to read ahead: Docker has extensive documentation on their Dockerfile commands. Here are the steps:

  1. Make a directory to house your work. I made one called docker right in my Home directory.
  2. Use your favorite IDE to write a short Python script and save it in your docker directory as hello.py (the name the Dockerfile below expects). I did something like:

print(f'5 to the 100th power is {5**100}')

  3. Open up another text document and save it as ‘Dockerfile’ (just Dockerfile, with no extension). I use VSCode and it has a nifty Docker plugin. Enter the following commands:

# Use the latest image of Python
FROM python
# Make a directory in the image
RUN mkdir -p /docker
# Make that directory the working directory
WORKDIR /docker
# Copy local files to that directory
COPY ./ /docker
# When the image is run, run this command
CMD python hello.py

  • FROM: This initializes building the image by setting a “Base Image.” You can use any valid image here, but let’s use the latest Python for this image. You HAVE to supply this command to make a new image. You can specify a version by adding a bit on the end. FROM python:3.7 will use the Python 3.7 image.
  • RUN: This issues a command to make a directory in the image/container, where we will then work from.
  • WORKDIR: This sets the working directory for any subsequent RUN/CMD/ENTRYPOINT/COPY/ADD commands. We are setting the working directory to the directory we just made in the RUN command.
  • COPY: There are two arguments here. The first, ./, means it is copying all files in the build context, which here is the directory that the Dockerfile is located in. These files are being copied into the /docker directory we created. Since I’m copying just one file, I could list that file instead of ./.
  • CMD: This issues a command when you run the image. For this, we are telling it to run the Python script ‘hello.py.’ Since it is in our current working directory, we can just list the filename.
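
Steps 1 to 3 can also be scripted. The sketch below is one way to do it, assuming the file names used in this post (hello.py and Dockerfile). The helper name write_build_context and the throwaway temporary directory are my own illustrative choices, not anything Docker requires. The comments in the Dockerfile text sit on their own lines because Docker only treats a line as a comment when it starts with #.

```python
from pathlib import Path
import tempfile

# The one-line script from step 2.
HELLO_PY = "print(f'5 to the 100th power is {5**100}')\n"

# The Dockerfile from step 3, with each comment on its own line.
DOCKERFILE = """\
# Use the latest official Python image as the base
FROM python
# Make a directory in the image
RUN mkdir -p /docker
# Make that directory the working directory
WORKDIR /docker
# Copy local files into that directory
COPY ./ /docker
# Run this command when a container starts
CMD python hello.py
"""


def write_build_context(directory):
    """Write hello.py and Dockerfile into `directory` (hypothetical helper)."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    (target / "hello.py").write_text(HELLO_PY)
    (target / "Dockerfile").write_text(DOCKERFILE)
    return target


# Example: create the files in a throwaway directory and list what was written.
with tempfile.TemporaryDirectory() as tmp:
    context = write_build_context(tmp)
    written = sorted(p.name for p in context.iterdir())
    print(written)
```

To keep the files around, pass a real path such as a docker folder in your home directory instead of the temporary one.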

Building a Docker Image from your Dockerfile

Navigate to the directory with your Dockerfile and Python script

Now to build a Docker Image! To quickly recap, you should have a Python script and a Dockerfile in the same directory, and in your terminal you should be in that directory. Now for the good stuff! Type the following into your terminal to turn this all into an image (known as building a Docker Image).

docker build . -t docker-test   # Build an image using the current
# directory (.) as the build context,
# which is where Docker looks for the
# Dockerfile, and tag it (-t) with the
# name that follows

Hit enter! It should spit out a bunch of information back at you. If you look at each step, it’s running through all of the commands that we placed in the Dockerfile. First it pulls Python from Docker Hub, then it creates a directory inside the image, designates that directory as the working directory, and copies all files from the current directory (on your local machine) to that directory in the image. Finally, it stores the CMD line, which is only run when a container starts.

Output from building a Docker Image

Congrats — you just made your first Docker image! You can type docker images to see it listed.

Now run it by typing the command below into your terminal:

docker run <image-name>  # This is how you run any image in Docker
docker run docker-test # I named my image docker-test earlier

It will now run the CMD we put in the Dockerfile (python hello.py), which runs our script:

Wow, that’s a big number!

A little bit on the CMD Instruction

There is another instruction that you could swap in for CMD without noticing any difference in what we have done so far: ENTRYPOINT. CMD sets the default command that runs when the container starts, but if you add another command after the image name in docker run (more on this in a moment), that command runs instead and the CMD is skipped. ENTRYPOINT makes the container behave like an executable: anything you add after the image name does not replace it, so the command given to ENTRYPOINT runs regardless. Try running docker run -it docker-test bash, making sure the -it goes before the image name and bash goes after it. This just means we are going to access the container’s terminal directly. You’ll see that it ignores the CMD instruction and does not run hello.py.

The CMD instruction is ignored, and we enter the container’s terminal directly
The files we copied into the Docker Image

Try typing ls to see what is inside the container. You will see we are in the /docker directory we specified in the Dockerfile, and it contains the files we copied into it.

Neat, right?
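
To make the CMD versus ENTRYPOINT difference concrete, here is a small Python model of how Docker picks the command a container starts with. It is a simplified sketch, not Docker's actual code: the function name resolve_command is hypothetical, and it only models the exec (list) form of the instructions. The Dockerfile in this post uses the shell form, which wraps the command in /bin/sh -c, but the outcome for our experiment is the same.

```python
def resolve_command(entrypoint, cmd, run_args):
    """Return the argv a container would start with (exec-form model only).

    entrypoint: list from ENTRYPOINT, or [] if the Dockerfile has none
    cmd:        list from CMD, or [] if the Dockerfile has none
    run_args:   anything typed after the image name in `docker run`
    """
    # Arguments after the image name replace CMD entirely...
    tail = list(run_args) if run_args else list(cmd)
    # ...but they are only appended to ENTRYPOINT; they never replace it
    # (short of passing the --entrypoint flag to docker run).
    return list(entrypoint) + tail


script = ["python", "hello.py"]

# Image built with CMD: `docker run -it docker-test bash` starts bash.
with_cmd = resolve_command([], script, ["bash"])

# Image built with ENTRYPOINT: the same command still starts the script,
# and "bash" is merely handed to it as an extra argument.
with_entrypoint = resolve_command(script, [], ["bash"])

print(with_cmd)
print(with_entrypoint)
```

With no extra arguments, both variants start python hello.py, which is why you can't tell them apart until you add something after the image name.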

PostgreSQL

This will be a very quick crash course in getting a PostgreSQL container running and accessing it (Note: I’m still in the process of learning this all myself). In later posts I will explore this aspect more thoroughly.

Here is a link to the official PostgreSQL Docker Hub page.

Starting an instance is quite simple. Just try the following command in your command line:

docker run --name posg -e POSTGRES_PASSWORD=bestpw -d postgres

You should now have a running Docker Container with PostgreSQL on it (in our previous example, the container ran once and then exited). Try docker ps in your command line to see running containers. Yours should be there now.
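
If you would rather start the container from a Python script, one option is to assemble the same docker run command as a list of arguments. The sketch below only builds and prints the command. The helper name docker_run_argv is hypothetical, and it covers just the three flags used above (--name, -e, and -d).

```python
import shlex


def docker_run_argv(image, name=None, env=None, detach=False):
    """Assemble the argument list for `docker run` (hypothetical helper)."""
    argv = ["docker", "run"]
    if name:
        # --name gives the container a fixed name to refer to later
        argv += ["--name", name]
    for key, value in (env or {}).items():
        # -e sets an environment variable inside the container
        argv += ["-e", f"{key}={value}"]
    if detach:
        # -d runs the container in the background
        argv.append("-d")
    # The image name comes after all of the options
    argv.append(image)
    return argv


argv = docker_run_argv(
    "postgres",
    name="posg",
    env={"POSTGRES_PASSWORD": "bestpw"},
    detach=True,
)

# Show the command as it would be typed in a terminal.
print(shlex.join(argv))
```

To actually start the container, you could pass argv to subprocess.run(argv, check=True), which needs Docker installed and running on your machine.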

To access the command line of the container itself, type in the following:

docker exec -it posg bash # 'posg' is what we named the container in
# the above command. We are using
# 'exec' specifically to access a running
# container's terminal

Now you are inside this container! To open up PostgreSQL, type the following:

psql -U postgres          # access psql as the user postgres

You’re in! There’s no data in here, but I will be covering that going forward. My goal is to describe how to set up PostgreSQL databases to host your data, and how to get a Jupyter Notebook up and running with the right dependencies for a given project.

Next Steps

  • Try editing your original Dockerfile, replacing the CMD instruction with ENTRYPOINT, then rebuild the image and follow the above steps. You’ll see that it won’t let you access the bash terminal of the container, and instead executes the command you have listed.
  • Try creating a table in your container’s PostgreSQL instance. Then exit the PostgreSQL container, stop it (docker stop posg), start it back up (docker start posg), and look for your table. It’s still there!
  • Sorry this post is so monotone!
