A Guide to Docker for Machine Learning Written for Clever Beginners

Franziska Lippoldt
8 min read · Jul 10, 2019


Setting up and running a neural network on remote machines in an optimal manner: ensuring stable runs, enabling easy use of the tools, hyperparameter optimization and testing of various settings, deployable on any machine.

This is for anyone interested in running Python code efficiently on external machines. I always prefer using a secondary machine for training neural networks, for the following reasons:

1) the current memory usage of your local machine does not interfere with the computations

2) the remote machine can be used on demand by several people

3) you can easily switch between different library and Python versions without crashing the main system configuration

For anyone who has not used Docker before: Docker offers shippable solutions that run independently of the host system, yet guarantee a proper development environment and libraries that would otherwise be impossible to maintain together on one system.

(In the following, I will explain docker commands, but everything applies similarly to nvidia-docker; just make sure the base image is GPU compatible.)

Recommended setup for a smooth procedure:

  • An image to run the code on, including the necessary requirements and compatible Python libraries
  • Container start commands that enable good usability

For a more advanced setup, I also recommend looking at Docker Compose and YAML files. If you understand the topics discussed in this article, you will easily be able to switch to Docker Compose.

Usually, a Docker container is used either to deploy a Jupyter notebook server for live coding and interaction, or to remotely train a neural network.

Docker files and images

A generic setup for a TensorFlow Dockerfile, for example, is:

# base image from Docker Hub
FROM tensorflow/tensorflow:latest

# system update and minimal tools, -y to confirm automatically
RUN apt-get update && apt-get install -y python3-pip git nano

# working directory inside the image (created if it does not exist)
WORKDIR /home/code

# copy local files from the build context into the image
ADD localfolder dockerfolder

# further setup steps, e.g. cloning your code
RUN git clone http…

Start a Dockerfile, i.e. the image definition, by creating a file named “Dockerfile”, for example by running

touch Dockerfile

Docker instructions in that file are written in all capital letters. The most frequently used instructions are the ones mentioned above. You start by choosing a base image; you can look up existing images on Docker Hub and can technically choose anything from raw Ubuntu versions to TensorFlow (running on Ubuntu) or Alpine.

You will need to install the minimal set of packages you want in your image. On an Ubuntu-based system, run a system update first and then specify the packages you want to install with the yes flag (-y), since you cannot confirm prompts interactively during the build.
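The same idea applies to Python packages: installing them at build time keeps the image reproducible. A minimal sketch, where the package list and versions are placeholders to be adjusted to your project:

# install Python libraries at build time; pin versions for reproducibility
RUN pip3 install --no-cache-dir numpy==1.16.4 pandas==0.24.2 jupyter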

You can change the working directory with the WORKDIR instruction. If the location does not exist, it will be created automatically.

You can add local files to the image through the ADD instruction. I recommend putting only the necessary files into the same directory as the Dockerfile, as all files in that folder are added to the build context. I will also discuss how to mount volumes later on, since this saves space and makes access easier; I only recommend adding files that you do not intend to change and that are not private.
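If you want to keep large data sets or private files out of the build context without moving them away, a standard Docker option (not used in the example above) is a .dockerignore file next to the Dockerfile. A small sketch with hypothetical entries:

# .dockerignore — these paths are excluded from the build context
datasets/
*.ckpt
.git
secrets.env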

In the following lines, you add the remaining setup steps for the code you are working on, such as cloning from GitHub etc. Nearly any command can be run with RUN. (The only non-trivial example is setting up CUDA drivers yourself, which is not recommended for beginners; NVIDIA also provides CUDA image repositories that are typically well built.)
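As a sketch of such a setup step, the repository URL and requirements file below are hypothetical:

# clone your project and install its dependencies in one layer
RUN git clone https://github.com/your-user/your-project.git && \
    pip3 install -r your-project/requirements.txt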

The following commands do not work in Docker images:

  • commands with sudo: you are already root inside the image, so just run them without “sudo”
  • some C++ compilations that require heavy system modifications only take effect after the image has been completed and may throw “not found” errors during the build even though the files exist

To create the image, go to the local directory containing the Dockerfile and run

docker image build -t image-name .

Docker will process the instructions inside the file line by line. In case of an error, the build is aborted at the corresponding step. If you fix the erroneous line but keep the rest the same, the build will continue using the cached steps from before.

In other words: Docker saves the last working step as an intermediate image. Make sure to check these, as they can pile up and take a lot of space. The commands

docker images

and

docker images -a

will reveal the image sizes. The latter also shows any partially saved images. It makes a lot of sense to clean up by forcing image removal and rebuilding from the final working Dockerfile if you had several interruptions in between (which also happens when the internet connection is not stable).
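A minimal clean-up sketch (the image ID is a placeholder):

# remove dangling, partially built images
docker image prune
# force-remove a specific image by ID or name
docker rmi -f image-id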

External images, pulling and pushing

You can download existing images you found on Docker Hub with

docker pull repository/image-name:version-tag

Similarly, you can also consider getting yourself an account and uploading (i.e. pushing) your own images, which you can then pull to any device later on.
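A typical push workflow looks like this, where the Docker Hub user name and version tag are placeholders:

docker login
docker tag image-name your-dockerhub-user/image-name:v1
docker push your-dockerhub-user/image-name:v1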

The layers you see being downloaded are the same as the steps you see when building a Dockerfile. There are a bunch of tricks for optimizing Docker images in size and format, but I recommend looking at other public Dockerfiles first. Well implemented GitHub projects typically include one, so you do not need to start from scratch.

Running a container

In order to use an image to run a container, there are a few tools that simplify your workflow. First of all, note that the straightforward docker run command always creates a new container. Typically you will also want to reuse the same container, which is then managed through the docker start and docker exec commands.

docker run -it image-name bash

This will run your image and open a command line inside the container. You can now enter further commands that are executed solely inside the container. The -it flags ensure that the container stays online as long as you do not stop it with the command “exit”.

The other very useful setting is to run commands directly with the start of the container, for example:

docker run -it image-name bash -c "cd /home/code; jupyter notebook"

This will run the image and start the Jupyter notebook. The -i and -t flags enable an interactive mode; in that case you will see the Jupyter notebook starting and receive further details. This allows you to use the container like a virtual machine for as long as the bash command is running.

Jupyter notebook can easily be run inside Docker and connected to your local browser by adding a port mapping between the Docker container and the host machine, with the command

docker run -it -p 8888:8888 image-name jupyter notebook

This will connect your local port 8888 to the Docker container port 8888. This is the standard port used for Jupyter, but if you change the Jupyter settings and the ports in that command you can use basically any unoccupied port. (Using common ports is only recommendable in safe environments.)
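Depending on the Jupyter version inside the image, the notebook server may only listen on the container's localhost, in which case the port mapping alone is not enough. A sketch with the usual extra flags:

docker run -it -p 8888:8888 image-name jupyter notebook --ip 0.0.0.0 --no-browser --allow-root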

Other than that, adding volumes to the container is very useful for further editing of files. While you can technically copy files from host to container and vice versa, volumes allow both sides to use and modify the same files at the same time. Typically I mount data sets, code I modify a lot, and non-public files:

docker run -v location/on/host:/location/in/container image-name

Make sure to enter the correct host path, as the volume option will also create new folders if they do not exist. You can add as many volumes as you like; however, a volume mounted at a container location will hide existing files at that location.

You can find a complete list of attributes for docker run in the official documentation.
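Putting the options from this section together, a typical start command for an ML container could look like this (the container name and paths are placeholders):

docker run -it --name ml-dev -p 8888:8888 -v /home/me/datasets:/home/code/datasets image-name bash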

Administrating containers

By now, you already know a lot of tools and how to run a container. When not in use, you can also stop the container. If it has not crashed, you can restart it anytime and will find it in the same condition as before.

Get used to using the commands "docker stop container-id", "docker start container-id" and "docker exec container-id command". You can start and stop containers at any time, and you can execute specific commands in running containers with exec. Note that this requires that you left sufficient computational resources for whatever task is already running. You can adjust cache and memory settings; please refer to other sources for details on how to change general Docker settings.
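For example, assuming a container named ml-dev and a hypothetical training script:

docker stop ml-dev
docker start ml-dev
docker exec -it ml-dev bash
docker exec ml-dev python3 /home/code/train.py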

You can find running containers through the command

docker ps

and a summary of all containers with

docker ps -a

Remove containers that crashed or that you do not want to use anymore with

docker rm container-name

and similarly remove unused images with

docker rmi image-name

You cannot remove an image that is used in a running container, unless forced with the -f flag.

If you run an image with volumes and commands, restarting the corresponding container will use the same settings. Likewise, you cannot change the volumes of an existing container later on; if you want different settings, you will need to start a new container.

Docker Compose files offer a great variety of start-up settings and crash handling; however, I advise checking for the options you require, as there are various Compose file versions with different capabilities. This also concerns container-based memory settings.
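As a rough idea of what such a file looks like, here is a minimal docker-compose.yml sketch; the image name, ports and paths are placeholders, and the available options depend on the Compose file version you pick:

version: "3"
services:
  notebook:
    image: image-name
    ports:
      - "8888:8888"
    volumes:
      - ./datasets:/home/code/datasets
    command: jupyter notebook --ip 0.0.0.0 --no-browser --allow-root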

Docker and CUDA

Specifically, if you are required to use CUDA in combination with Docker, i.e. nvidia-docker, I recommend using Ubuntu and images that already contain the proper CUDA setup. Installing CUDA and cuDNN on images yourself requires dedication. Even though you can technically put any combination of CUDA, cuDNN and libraries on your Docker image, make sure that the versions are compatible with your hardware and local drivers.
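As a sketch, you could pull one of the official CUDA base images and check GPU access like this; the tag is only an example (check Docker Hub for tags matching your driver), and on newer Docker versions the --gpus flag replaces the nvidia-docker wrapper:

docker pull nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04
docker run --gpus all -it nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04 nvidia-smi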

If you are already using NVIDIA-docker for network training, you can also check my previous post on how to run optimization efficiently and automatically.

Docker images and containers for public usage

If you intend to use the same Docker image or containers for a team of people, or even share them online, there are a few additional features you should consider using:

  • Creation of a non-admin user with a password in the image (more info); a variant that also sets the password is sketched after this list

RUN useradd -ms /bin/bash new-user-name

USER new-user-name

  • Removal of data inside container / access through volumes
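For the first point, a small sketch that also sets the password (user name and password are placeholders; do not ship real credentials inside an image):

# create the user and set a password, then switch to it
RUN useradd -ms /bin/bash new-user-name && echo "new-user-name:choose-a-password" | chpasswd
USER new-user-name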

Useful links and resources

  1. Existing images on docker hub
  2. Docker official documentation
  3. Docker projects on GitHub
  4. Stackoverflow for any docker related question
