CUDA + Docker = ❤️ for Deep Learning

Aditya Thiruvengadam
3 min read · May 7, 2019


This post assumes you are familiar with Docker and have CUDA on your host system (personal computer/cloud instance)

Setting up the necessary packages, frameworks, and drivers for your deep learning research work can be time-consuming and tedious. Helping your colleague replicate your complete installation set-up can also be extremely taxing and is generally very prone to errors.

Thus, setting up Docker for your deep learning project is the easiest way to create reproducible research.

For deep learning especially, you’d want to leverage the costly GPU you purchased for your cutting-edge research. But how do you access that GPU from inside a Docker container?

Checking the Prerequisites

This post assumes you have docker, docker-compose, and NVIDIA CUDA installed on your host system. It also assumes you are familiar with Docker best practices and are already using Docker in your research work.

To check if you have a working CUDA installation:

Terminal: $ nvidia-smi
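If you also want to confirm that the CUDA toolkit itself is present (nvidia-smi only proves the driver works), a quick check, assuming the toolkit is installed under the default /usr/local/cuda, is:

Terminal: $ /usr/local/cuda/bin/nvcc --version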

Install nvidia-docker:

nvidia-docker is the NVIDIA Container Runtime for Docker. It lets you build and run Docker containers that use the NVIDIA GPUs attached to the host by exposing the host’s NVIDIA driver and CUDA stack to the container.

Follow the official nvidia-docker installation instructions for your OS to set up nvidia-docker2 on your host system.
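For reference, on Ubuntu the final steps boil down to installing the package and restarting the Docker daemon (this sketch assumes you have already added NVIDIA’s apt repository as described in those instructions):

Terminal: $ sudo apt-get install -y nvidia-docker2
Terminal: $ sudo systemctl restart docker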

To check if you have nvidia-docker2 running on an Ubuntu 16.04 system:

Terminal: $ nvidia-docker run docker.io/nvidia/cuda:9.0-base-ubuntu16.04 nvidia-smi

Running this should give you the same result as you got above when running nvidia-smi on your host.
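Equivalently, if you prefer the plain docker CLI: the nvidia-docker wrapper essentially selects the NVIDIA runtime for you, so the following should produce the same output:

Terminal: $ docker run --runtime=nvidia --rm docker.io/nvidia/cuda:9.0-base-ubuntu16.04 nvidia-smi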

Using CUDA in your Docker Container

I’m hoping you are using a docker-compose file to build and run your containers. It is remarkably cleaner than running docker commands with flags all over the place.

Setting NVIDIA runtime:

version: '2.3'
services:
  object_detection:
    build:
      context: object_detection/
    ports:
      - "8000:8000"
    runtime: nvidia
    volumes:
      - ./object_detection:/object_detection
    links:
      - fluent-bit
      - redis
      - nginx
    restart: on-failure

My docker-compose.yml file looks like the above. If your docker-compose.yml uses compose file syntax version 2.3 or later, you can add the runtime: nvidia key (shown above) to the relevant service section (here: object_detection) of your docker-compose.yml file.

Now, when bringing up the containers, Docker will see the runtime: nvidia line and use the NVIDIA runtime provided by nvidia-docker2.
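To confirm that Docker actually knows about the NVIDIA runtime before bringing the stack up, a quick check (the exact output format varies by Docker version) is:

Terminal: $ docker info | grep -i runtime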

NOTE:

  • If you are using a docker-compose.yml syntax older than 2.3, please update it to version ≥ 2.3 (it's as simple as changing the version field at the top of the docker-compose.yml file and following its syntax).
  • If for some reason you aren’t able to switch to a version ≥ 2.3 in docker-compose, you can set the default runtime for all Docker containers in /etc/docker/daemon.json by adding the "default-runtime" line shown below, and then restarting the Docker daemon as shown after this list:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
  • This way, all containers will choose the NVIDIA runtime provided by nvidia-docker2 by default.
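After editing /etc/docker/daemon.json, restart the Docker daemon so the new default runtime takes effect; on a systemd-based system (e.g. Ubuntu 16.04) that is:

Terminal: $ sudo systemctl restart docker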

Mounting your host CUDA path:

In the volumes section of your service in the docker-compose.yml file, add the /usr/local/cuda line shown below:

version: '2.3'
services:
  object_detection:
    build:
      context: object_detection/
    ports:
      - "8000:8000"
    runtime: nvidia
    volumes:
      - ./object_detection:/object_detection
      - /usr/local/cuda:/usr/local/cuda
    links:
      - fluent-bit
      - redis
      - nginx
    restart: on-failure

This mounts your host CUDA installation into the container as a volume at the mount point /usr/local/cuda.
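Once the container is up, you can sanity-check that the mount is visible inside it (container_name here is a placeholder; docker ps will show the actual name docker-compose generated):

Terminal: $ docker exec -it container_name ls /usr/local/cuda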

Updating your Dockerfile:

In the Dockerfile corresponding to your service above, add the following lines right after the FROM line:

FROM adityathiru/object_detection_base_image:1.0

ENV PATH /usr/local/cuda/bin/:$PATH
ENV LD_LIBRARY_PATH /usr/local/cuda/lib:/usr/local/cuda/lib64
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
LABEL com.nvidia.volumes.needed="nvidia_driver"


... (your other lines in the Dockerfile)

In the ENV LD_LIBRARY_PATH line, replace /usr/local/cuda/lib with whatever $ echo $LD_LIBRARY_PATH prints on your host system. For example, if it turns out to be /usr/local/cuda/lib64, that line becomes:

ENV LD_LIBRARY_PATH /usr/local/cuda/lib64:/usr/local/cuda/lib64

These lines initialize the necessary environment variables and other configurations required by the NVIDIA runtime inside the Docker container.
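A quick way to confirm these variables actually made it into the running container (again, container_name is a placeholder for your container’s name):

Terminal: $ docker exec -it container_name printenv | grep -iE 'cuda|nvidia'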

Testing if everything works

Now, all you have to do is build and bring up your Docker containers, get into them using docker exec -it container_name bash, and run:

Terminal: $ nvidia-smi
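As a concrete sketch of that sequence (the container name is whatever docker ps reports for your object_detection service):

Terminal: $ docker-compose up --build -d
Terminal: $ docker exec -it container_name nvidia-smi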

If you get the same result as on your host, you are set! :)
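If your image ships a deep learning framework, a framework-level check is a nice final confirmation. For example, assuming PyTorch happens to be installed in the container, this should print True:

Terminal: $ docker exec -it container_name python -c "import torch; print(torch.cuda.is_available())"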
