GPU-Deep Learning with Docker for noobs

The point of this small tutorial is to provide a simple, readable notebook of useful tips and commands for using Docker with an NVIDIA GPU for deep learning. It aims to help new developers who want a controlled environment for testing and iterating fast.

First things first: install your environment on your Ubuntu VM. I won’t cover anything other than Docker-related stuff.

1- Installing CUDA 8.0 for Ubuntu

# From NVIDIA website
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda
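
Once the install finishes, it is worth checking that the driver and toolkit are actually reachable before moving on to Docker. The following is a minimal sketch: `check_tool` is a hypothetical helper name, and `/usr/local/cuda/bin` is assumed to be where the CUDA packages place `nvcc` on your machine.

```shell
# Hypothetical helper: report whether a command is available on PATH
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumption: the CUDA packages put nvcc under /usr/local/cuda/bin
export PATH="$PATH:/usr/local/cuda/bin"

check_tool nvidia-smi || echo "-> driver tools not on PATH yet (a reboot may be needed)"
check_tool nvcc || echo "-> CUDA compiler not on PATH (check the install location)"
```

If both tools are reported as found, `nvidia-smi` and `nvcc --version` should print the driver and toolkit versions.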

2- Installing Docker & NVIDIA-Docker

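# Install docker (assumes Docker's apt repository is already configured, since docker-ce is not in Ubuntu's default repositories)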
sudo apt-get install docker-ce
# Install nvidia-docker & nvidia-docker-plugin
wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb
# Get rid of sudo by adding the current user to the docker group (log out and back in to apply)
sudo usermod -aG docker $USER
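
A quick way to confirm that containers can see the GPU is to run `nvidia-smi` inside a throwaway container. This is only a sketch: `gpu_smoke_test` is a hypothetical helper name, and the `nvidia/cuda` image is the one the nvidia-docker 1.x README used for this check, so treat its availability as an assumption.

```shell
# Hypothetical helper: run nvidia-smi inside a disposable CUDA container
gpu_smoke_test() {
    if ! command -v nvidia-docker >/dev/null 2>&1; then
        echo "nvidia-docker not found on PATH"
        return 1
    fi
    # --rm removes the throwaway container once nvidia-smi exits
    # Assumption: the nvidia/cuda image can be pulled from Docker Hub
    nvidia-docker run --rm nvidia/cuda nvidia-smi
}

gpu_smoke_test || echo "GPU smoke test did not pass"
```

If everything is wired up, the container should print the same GPU table you get from running `nvidia-smi` on the host.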

3- Basics of Docker

Now that your environment is all set up, you need to learn the basics of how to build, run and manage your Docker images. Here are a few basic commands:

# List of all Docker images
nvidia-docker images
# List of all running Docker containers
nvidia-docker ps
# Copy a file from your machine to a running container
nvidia-docker cp path/to/your/file container-name:/destination
# Monitor your GPU with nvidia-smi
nvidia-smi

[Image: expected output of nvidia-smi]
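
If you would rather have a one-line summary than the full `nvidia-smi` table, its CSV query mode is easy to post-process. Here is a rough sketch: `mem_percent` is a hypothetical helper that expects one `used, total` pair per line (in MiB), which is what the query below should emit for each GPU.

```shell
# Hypothetical helper: turn "used, total" CSV lines into a memory usage percentage per GPU
mem_percent() {
    awk -F', *' '{ printf "GPU %d: %d%% memory used\n", NR - 1, 100 * $1 / $2 }'
}

# Only query the driver when nvidia-smi is actually installed
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits | mem_percent
fi
```

The percentage is truncated to an integer, which is usually enough to spot a container that is hogging GPU memory.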

4- Running your Docker Image

Now I will cover how to build an image from a Dockerfile, run it, and execute scripts inside the container. I might add that giving your Dockerfile a clearer name is a good thing… Once your Dockerfile is all set, there are still a few steps before you can use the container. For the sake of this tutorial we will use the official GPU image from TensorFlow.

# Build the image from a Dockerfile (-f: path to the Dockerfile; last argument: build context directory)
nvidia-docker build -f Dockerfile -t tag-name path/to/dockerfile
# Run your docker image
# (-d to detach & print container ID; -p to publish ports: 8888 for the IPython notebook, 6006 for TensorBoard)
nvidia-docker run -d --name container_name -p 8888:8888 -p 6006:6006 gcr.io/tensorflow/tensorflow:latest-gpu
# Connect to your container
nvidia-docker exec -it container_name bash
# Stop or start a container
nvidia-docker stop container_name
nvidia-docker start container_name
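
Since `exec` only works on a running container, it can help to check the container state first and start it if needed. The sketch below assumes the `nvidia-docker` wrapper passes `inspect` and `start` straight through to `docker`; `ensure_running` and the `DOCKER` variable are hypothetical names introduced for illustration.

```shell
# Assumption: set DOCKER=docker if you are not using the nvidia-docker wrapper
DOCKER="${DOCKER:-nvidia-docker}"

# Hypothetical helper: start the named container unless it is already running
ensure_running() {
    name="$1"
    # Prints "true" for a running container; empty if the container does not exist
    state="$("$DOCKER" inspect -f '{{.State.Running}}' "$name" 2>/dev/null)"
    if [ "$state" = "true" ]; then
        echo "$name is already running"
    else
        "$DOCKER" start "$name" >/dev/null && echo "$name started"
    fi
}
```

For example, `ensure_running container_name && nvidia-docker exec -it container_name bash` drops you into a shell whether or not the container was stopped beforehand.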

If you want to access your Tensorboard or Notebook, you might want to use port forwarding on your local machine:

# Launch TensorBoard inside your container
tensorboard --logdir=path/to/log-directory
# Port forwarding (run this on your local machine)
ssh -L 8888:localhost:8888 -f -N user@yourserver

[Image: expected page at localhost:8888]
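
The container publishes two ports (8888 for the notebook and 6006 for TensorBoard), and `ssh` accepts several `-L` options, so both can be forwarded in one go. A small sketch, where `forward_args` is a hypothetical helper and `user@yourserver` is the same placeholder as above; the last line only prints the command so you can inspect it before running it.

```shell
# Hypothetical helper: build one "-L port:localhost:port" option per port given
forward_args() {
    args=""
    for port in "$@"; do
        args="$args -L $port:localhost:$port"
    done
    # Strip the leading space left by the first iteration
    echo "${args# }"
}

# Print the full command rather than running it
echo ssh -f -N $(forward_args 8888 6006) user@yourserver
```

Running the printed command on your local machine should make the notebook reachable at localhost:8888 and TensorBoard at localhost:6006.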

This is the end of part 1 on Docker for GPU deep learning. The next post will be about the Dockerfile and how to set up a deep-learning-ready environment with TensorFlow.

Cheers

If you have any questions or suggestions, feel free to ask @fguilloc or use the comment section.