Docker with GPU Support in WSL2

Darren Gibbard
5 min read · Aug 4, 2020


This is just a quick-ramp guide to getting up and running in WSL2 with GPU (machine-learning, rather than GUI stuff) support.

Tip: if you want GUI support, maybe take a look here: https://www.youtube.com/watch?v=IL7Jd9rjgrM

Once you’ve run through this guide, you’ll have Docker running inside WSL2 with the ability to run CUDA workloads, but without needing Docker Desktop (which currently doesn’t support the nvidia-docker runtime!).

Prerequisites:
* A CUDA compatible graphics card! (I’m using an RTX2070 for this example)
* Windows 10 — Build 20145 or newer (check with Run> “winver”)
* CUDA driver installed from: https://developer.nvidia.com/cuda/wsl
* WSL Kernel version “4.19.121-microsoft-standard” or newer (to update, check Settings> Windows Update)
* Ubuntu 20.04 WSL2 instance (Grab it from the Microsoft Store)
* Ideally, uninstall Docker Desktop and any existing docker packages, if you have them installed already.
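A quick sketch for the kernel-version prerequisite: `sort -V` does a version-aware comparison, so you can check the running WSL kernel against the minimum. The sample value below stands in for the real output of `uname -r | cut -d- -f1`.

```shell
# Minimum required WSL kernel version for CUDA support:
min="4.19.121"
# Sample value; in WSL2, use instead: have="$(uname -r | cut -d- -f1)"
have="4.19.128"
# sort -V orders version strings; if the minimum sorts first (or equal), we pass.
lowest="$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n1)"
if [ "$lowest" = "$min" ]; then
    echo "kernel ok"
else
    echo "kernel too old - check Windows Update"
fi
```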

TIP: Don’t run through this guide as root. Run commands as your own WSL2 user :)

Firstly, install docker directly in WSL2:

curl https://get.docker.com | sudo sh

Add the Apt repos for the NVIDIA docker runtime + components:

# Capture your current distribution version in a variable:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
# Add the NVIDIA repo GPG key to apt
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
# Add the NVIDIA repo to apt repo list
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Add the experimental libnvidia-container repo to apt too
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
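The `distribution` variable above just concatenates the `ID` and `VERSION_ID` fields from `/etc/os-release`. A self-contained sketch of what it expands to on Ubuntu 20.04, using a sample file rather than the real `/etc/os-release`:

```shell
# Simulate /etc/os-release for Ubuntu 20.04 (sample file, not the real one):
cat <<'EOF' > /tmp/os-release-sample
ID=ubuntu
VERSION_ID="20.04"
EOF
# Same expansion as the real command, sourced in a subshell:
distribution=$(. /tmp/os-release-sample; echo $ID$VERSION_ID)
echo "$distribution"
```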

Amend the Apt repo configs (I found that even though we’re setting the distro version, it would create these hardcoded as ’18.04’ instead of ’20.04’):

# Replace occurrences of '18' with '20' so we end up with '20.04' instead:
sudo sed -i 's/18/20/g' /etc/apt/sources.list.d/nvidia-docker.list
sudo sed -i 's/18/20/g' /etc/apt/sources.list.d/libnvidia-container-experimental.list
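Note that `s/18/20/g` rewrites every '18' in the file, not just the version. A more targeted substitution (a sketch, matching only the version token; demonstrated on a sample line rather than the real file) would be:

```shell
# Rewrite only the distro version token, leaving other digits untouched.
# To apply for real, use sed -i on the files under /etc/apt/sources.list.d/:
echo "deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/amd64 /" \
    | sed 's/18\.04/20.04/g'
```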

Now we can install the NVIDIA docker runtime:

sudo apt-get update
sudo apt-get install -y nvidia-docker2

Add your user to the docker group so we can use docker without needing to be root/sudo:

# This assumes you're running as the user you want to add to the docker group :)
sudo usermod -aG docker $USER
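The new group only takes effect in a new login session. A small sketch for checking membership afterwards (`id -nG` prints your group names; we look for an exact `docker` match):

```shell
# One group name per line, then look for an exact 'docker' match:
if id -nG | tr ' ' '\n' | grep -qx docker; then
    echo "in docker group"
else
    echo "not yet - close and reopen your terminal"
fi
```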

Add your user to the sudoers file, so that your user is allowed to start the docker service without needing to provide your password:

# Allow managing the service without password
echo "$USER ALL=NOPASSWD:/usr/sbin/service docker *" | (sudo su -c 'EDITOR="tee" visudo -f /etc/sudoers.d/docker-service')
# Allow creating the cgroup/systemd dir without password
echo "$USER ALL=NOPASSWD:/usr/bin/mkdir /sys/fs/cgroup/systemd" | (sudo su -c 'EDITOR="tee" visudo -f /etc/sudoers.d/docker-mkdir')
# Allow mounting the cgroup/systemd mount without password
echo "$USER ALL=NOPASSWD:/usr/bin/mount -t cgroup -o none\,name=systemd cgroup /sys/fs/cgroup/systemd" | (sudo su -c 'EDITOR="tee" visudo -f /etc/sudoers.d/docker-mount')

Add a startup script to your ~/.bashrc file, so that docker gets started automatically when opening WSL2:

cat << EOF >> ~/.bashrc
## Start docker if not already running
if [ "x\$(pgrep dockerd)" == "x" ]; then
    echo "Starting Docker..."
    sudo service docker start >/dev/null
    # Fix/mount systemd/cgroups too
    sudo mkdir /sys/fs/cgroup/systemd
    sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
fi
EOF

Now is a good time to restart your terminal or Windows (better yet, you can restart just WSL by restarting the service called ‘LxssManager’).
Then, when you open a new terminal you should:
* Not see a sudo password prompt!
* Have a working docker install :)

Test your new GPU-accelerated Docker deployment with:

$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
$ docker run hello-world
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/

OK, so basic docker stuff is working, what about GPU-based workloads?

$ docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
...
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 7.5 is undefined. Default to use 64 Cores/SM
GPU Device 0: "GeForce RTX 2070" with compute capability 7.5
> Compute 7.5 CUDA device: [GeForce RTX 2070]
36864 bodies, total time for 10 iterations: 58.759 ms
= 231.275 billion interactions per second
= 4625.506 single-precision GFLOP/s at 20 flops per interaction

Looks good!
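As a sanity check, the GFLOP/s line follows directly from the interactions-per-second figure times 20 flops per interaction (the small difference vs the printed 4625.506 is just rounding in the displayed numbers):

```shell
# 231.275 billion interactions/s * 20 flops per interaction, in GFLOP/s:
awk 'BEGIN { printf "%.3f\n", 231.275 * 20 }'
```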

How about a Jupyter Notebook instance with GPU support?

docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter
# Access it with the '127.0.0.1' URL provided by the output - but replace 127.0.0.1 with localhost!
# eg: http://localhost:8888/?token=01234567890abdef
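The localhost substitution can also be scripted with a quick sed one-liner; a sketch (the token below is a dummy placeholder, not a real one):

```shell
# Rewrite the host part of the printed URL (dummy token for illustration):
echo "http://127.0.0.1:8888/?token=01234567890abdef" \
    | sed 's/127\.0\.0\.1/localhost/'
```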

Try running the following code in a Jupyter Notebook to test it!

import sys
import numpy as np
import tensorflow as tf
from datetime import datetime

device_name = "gpu"  # Choose device - Options: gpu or cpu
shape = (int(100), int(100))
if device_name == "gpu":
    device_name = "/gpu:0"
else:
    device_name = "/cpu:0"

tf.compat.v1.disable_eager_execution()

with tf.device(device_name):
    random_matrix = tf.random.uniform(shape=shape, minval=0, maxval=1)
    dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
    sum_operation = tf.reduce_sum(dot_operation)

startTime = datetime.now()
with tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True)) as session:
    result = session.run(sum_operation)
    print(result)

# Print the results
print("Shape:", shape, "Device:", device_name)
print("Time taken:", datetime.now() - startTime)

Not working for you? Make sure ‘/etc/docker/daemon.json’ looks like this, and restart the docker service/LxssManager:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

Still not working? Make sure you read over the Prerequisites section at the start of this article :)

For more reading/examples, check out NVIDIA’s documentation here: https://docs.nvidia.com/cuda/wsl-user-guide/index.html
