mlapi_cudnn — GPU and TPU passthrough to Unprivileged LXC (Proxmox 7.0+)

baudneo
5 min read · Dec 11, 2021


GPU/TPU passthrough to LXC

This example uses Proxmox as the LXC host and 470.86 as the NVIDIA driver version. Make sure the nesting and keyctl options are enabled in the LXC options if you plan on using Docker inside of the LXC.
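
For reference, with those options enabled the container's /etc/pve/lxc/<ID>.conf contains a line like this (they can also be set in the GUI under Options > Features):

features: keyctl=1,nesting=1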

First, get drivers that are compatible with CUDA 11.4+. It is important to note that you MUST use the same driver version inside the LXC as on the Proxmox host. The NVIDIA .run installer is used here.

You will need the pve-headers for the kernel in use. uname -r prints your kernel version, so apt install pve-headers-$(uname -r) installs the headers required for Proxmox.
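
A minimal sketch of those two steps, assuming the 470.86 driver and NVIDIA's usual download layout (verify the URL before use):

apt update
apt install -y pve-headers-$(uname -r)
# URL assumed from NVIDIA's standard driver download layout
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.86/NVIDIA-Linux-x86_64-470.86.run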

Install the drivers by executing the .run file. If the file is not executable, chmod +x NVIDIA-Linux-x86_64-470.86.run will make it executable.

/path/to/NVIDIA-Linux-x86_64-470.86.run

Follow the instructions on the screen; the default answers are fine for all the questions. Once the install is finished, locate your GPU using lspci. In this example, it finds a GTX 1660 Ti.

lspci | grep -i nvidia

06:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
06:00.1 Audio device: NVIDIA Corporation TU116 High Definition Audio Controller (rev a1)
06:00.2 USB controller: NVIDIA Corporation TU116 USB 3.1 Host Controller (rev a1)
06:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller (rev a1)

Open the /etc/modules-load.d/nvidia.conf file and confirm it has these lines; if not, add them.

nvidia-drm
nvidia
nvidia_uvm
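
If the file does not exist yet, a one-shot way to create it with exactly that content:

cat <<'EOF' > /etc/modules-load.d/nvidia.conf
nvidia-drm
nvidia
nvidia_uvm
EOF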

Create a udev rule /etc/udev/rules.d/70-nvidia.rules

# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
# Create the CUDA node when the nvidia_uvm CUDA module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
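
To apply the new rule without waiting for the reboot below, reloading udev should work:

udevadm control --reload-rules && udevadm trigger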

Reboot the host and check the output of ls -al /dev/nvidia* and ls -al /dev/dri/*. The output should be similar to this:

> ls -al /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Dec 2 13:51 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Dec 2 13:51 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Dec 2 13:51 /dev/nvidia-modeset
crw-rw-rw- 1 root root 510, 0 Dec 2 13:51 /dev/nvidia-uvm
crw-rw-rw- 1 root root 510, 1 Dec 2 13:51 /dev/nvidia-uvm-tools
/dev/nvidia-caps:
total 0
drw-rw-rw- 2 root video 80 Dec 2 13:51 .
drwxr-xr-x 24 root root 5320 Dec 8 18:49 ..
cr-------- 1 root root 236, 1 Dec 2 13:51 nvidia-cap1
cr--r--r-- 1 root root 236, 2 Dec 2 13:51 nvidia-cap2
> ls -al /dev/dri/*
crw-rw---- 1 root video 226, 0 Dec 8 00:54 /dev/dri/card0
crw-rw---- 1 root video 226, 1 Dec 8 00:54 /dev/dri/card1
crw-rw---- 1 root render 226, 128 Dec 8 00:54 /dev/dri/renderD128
/dev/dri/by-path:
total 0
drwxr-xr-x 2 root root 100 Dec 2 13:51 .
drwxr-xr-x 3 root root 120 Dec 2 13:51 ..
lrwxrwxrwx 1 root root 8 Dec 8 00:54 pci-0000:06:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Dec 8 00:54 pci-0000:06:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root 8 Dec 8 00:54 pci-0000:08:03.0-card -> ../card0

The cgroup2 device major numbers needed for passing the GPU through are in the fifth column: 195, 510, 236, and 226.
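
If you want to pull those major numbers out programmatically, a one-liner along these lines should work (it keeps only the character-device entries and strips the trailing comma):

ls -al /dev/nvidia* /dev/dri/* | grep '^c' | awk '{print $5}' | tr -d ',' | sort -un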

Also, test access to the GPU using the nvidia-smi command. It should output something similar to this.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |                  N/A |
| 38%   45C    P2    24W / 130W |   2389MiB /  5944MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2746      G   /usr/lib/xorg/Xorg                  6MiB |
|    0   N/A  N/A    326039      C   python3                          1121MiB |
|    0   N/A  N/A   3327033      C   /usr/bin/zmc                      651MiB |
|    0   N/A  N/A   3327041      C   /usr/bin/zmc                      243MiB |
|    0   N/A  N/A   3327049      C   /usr/bin/zmc                      243MiB |
|    0   N/A  N/A   3327055      C   /usr/bin/zmc                      118MiB |
+-----------------------------------------------------------------------------+

Proxmox 7.0 uses cgroup2 instead of cgroup (v1). Create a new Ubuntu 20.04 unprivileged LXC and, before you start it up, edit the .conf file for it.
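
If you have not created the container yet, here is a sketch of the create step (the template name, storage, and hostname are assumptions - adjust for your setup):

pct create 106 local:vztmpl/ubuntu-20.04-standard_20.04-1_amd64.tar.gz \
  --unprivileged 1 --features nesting=1,keyctl=1 --hostname mlapi

In this example, the LXC container ID is 106. Edit its config with nano /etc/pve/lxc/106.conf and add: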

# Add these lines to pass the GPU through - you may have more or fewer devices.
# cgroup permissions are set here. Use the major numbers from the
# ls -al /dev/nvidia* and ls -al /dev/dri/* output above.
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.cgroup2.devices.allow: c 236:* rwm
lxc.cgroup2.devices.allow: c 510:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=dir

Start the LXC with pct start 106 and then enter it with pct enter 106. You should be dropped into a shell. Get the NVIDIA driver .run file inside the LXC (scp, a mounted shared folder, etc.) and install the drivers, BUT use the --no-kernel-module flag.

/path/to/NVIDIA-Linux-x86_64-470.86.run --no-kernel-module

Execute nvidia-smi inside of the LXC to test access to the GPU. There will be some differences: you will not see the process list, but you will see the available memory, GPU utilization, fan speed, etc.
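
You can also run the same checks from the Proxmox host; pct exec runs a command inside the container (bash -c is used so the glob expands inside the container):

pct exec 106 -- bash -c 'ls -al /dev/nvidia*'
pct exec 106 -- nvidia-smi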

For TPU access inside a Docker environment, you need to create a udev rule that changes the TPU device node permissions to 0666. In my testing, it's easier to have the 'other' permission bits set to rw when passing the TPU through from the LXC into the Docker container.

nano /etc/udev/rules.d/99-edgetpu-accelerator.rules

SUBSYSTEM=="usb", ATTRS{idVendor}=="18d1", ATTRS{idProduct}=="9302", SYMLINK+="tpu", MODE="0666"

The rule also creates a symlink at /dev/tpu. The symlink can't be used for the passthrough into the LXC; it's just a convenience. After that, edit the .conf file to allow access to the TPU's USB device node. nano /etc/pve/lxc/106.conf

# Passthrough the whole USB subsystem, or you can specify bus and/or device.
# Remember that the bus and device address can change across reboots.
lxc.mount.entry: /dev/bus/usb dev/bus/usb none bind,optional,create=dir 0 0
# These 2 lines seem to be needed in Proxmox 7.0+
lxc.mount.auto: cgroup:rw
lxc.cgroup2.devices.allow: a
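
If you would rather pass through a single device node instead of the whole bus tree, the entry would look something like this (the 003/004 bus/device numbers are placeholders and, as noted above, can change across reboots):

lxc.mount.entry: /dev/bus/usb/003/004 dev/bus/usb/003/004 none bind,optional,create=file 0 0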

Stop and start the container for the edits to take effect. You can now install the TPU library (and PyCoral), as sketched below.
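
A sketch of that install inside the LXC, following Google's documented Coral apt repository (commands as of the Coral docs around this article's writing - verify against the current docs):

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
apt update
apt install libedgetpu1-std   # or libedgetpu1-max for higher clocks (runs hotter)
apt install python3-pycoral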
