Unlocking the Power of GPUs in Docker: A Comprehensive Guide

Marco Franzon · Published in Geek Culture · Apr 9, 2023

Docker is the most widely used engine to run our containers, but what happens if our application requires a GPU? What is the fastest way to get a fully GPU-compatible environment? Let’s see!

Why will configuring Docker to use GPUs become more and more important?

Docker is the de facto tool to create, test, and deploy to production most of today’s software and web applications. Due to the exponential improvement of machine and deep learning technologies, GPUs play a central role not only in the development phase but also in deployment.

During the design and training of a model, a large number of GPUs is certainly required to parallelize and speed up this initial phase. However, in the production phase it is fundamental to guarantee maximum speed during the inference step. This means that a correct configuration of Docker with GPUs can be beneficial for the initial phases, but even more so for your real business.

First step: instance configuration

To run a container with a GPU requirement, you have to make sure the NVIDIA Container Toolkit is installed.
Set up the package repository and the GPG key:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

Install the nvidia-docker2 package (and dependencies) after updating the package listing:

sudo apt update && sudo apt install -y nvidia-docker2

Restart Docker to apply changes:

sudo systemctl restart docker

Now, to be sure that all works fine, test with a minimal image:

sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

This should return the output of the nvidia-smi command.
For more information, see the official documentation.

Second step: learn the most important configuration options

In the docker run command above, we used the --gpus option, passing all as the argument. This means that Docker can use all the GPUs available. Sometimes you don’t want to use all of them, for example because of an unbalanced configuration: mixing different GPUs, say one with 6 GB of VRAM and one with 12 GB, could lead to unexpected behavior. You can be more specific using different arguments, like the ones below (command-line examples follow the list):

  • device=<GPU-ID> forces the container to use exactly that GPU.
  • "device=0,2" forces the container to use only GPUs 0 and 2, excluding GPU 1.
  • 'all,capabilities=utility' gives the container access to all the GPUs available, but limits the driver capabilities to utility, which exposes monitoring tools such as nvidia-smi inside the container.
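For instance, here is a minimal sketch of how these arguments look on the command line, reusing the nvidia/cuda test image from above (the GPU indices are just an example):

# Use only GPU 0
sudo docker run --rm --gpus '"device=0"' nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

# Use GPUs 0 and 2, skipping GPU 1
sudo docker run --rm --gpus '"device=0,2"' nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

# All GPUs, utility capability only (enough for nvidia-smi, not for CUDA compute)
sudo docker run --rm --gpus 'all,capabilities=utility' nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

Note that the comma in "device=0,2" requires the extra quoting shown above, so that the shell passes the whole value to Docker unchanged.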

Third step: deploy a real machine learning project

Let’s test our configuration with a real application, using TensorFlow Serving. It is a quick way to wrap your model in a web server, ready to be used in production.
We want to expose a REST API endpoint on port 8501. The TensorFlow Serving image requires a mounted volume containing the model, under the /models directory.

docker run --gpus all -p 8501:8501 \
--mount type=bind,\
source=/path/to/my_model/,target=/models/my_model \
-e MODEL_NAME=my_model -t tensorflow/serving:latest-gpu
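
Once the container is running, you can do a quick sanity check by calling the REST endpoint with curl. This is just a sketch: the "instances" payload is a placeholder and must match your model’s actual input signature.

# Hypothetical request body; adapt "instances" to your model's expected input
curl -X POST http://localhost:8501/v1/models/my_model:predict \
-d '{"instances": [[1.0, 2.0, 3.0]]}'

The response is a JSON document containing a "predictions" field with the model’s output.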

That’s all folks! I hope this quick walk-through will be useful for your next machine learning projects!
