mlapi_cudnn — GPU and TPU passthrough to Unprivileged LXC (Proxmox 7.0+)

baudneo
5 min read · Dec 11, 2021


GPU/TPU passthrough to LXC

This example uses Proxmox as the LXC host and 470.86 as the NVIDIA driver version. Make sure the nesting and keyctl options are enabled in the LXC options if you plan on using Docker inside of the LXC.
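
For reference, with those options enabled the container's /etc/pve/lxc/<ID>.conf contains a line like this (they can also be set in the GUI under Options > Features):

features: keyctl=1,nesting=1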

First, get drivers that are compatible with CUDA 11.4+. It is important to note that you MUST use the same driver version inside the LXC as on the Proxmox host. The NVIDIA .run installer is used here.

You will need the pve-headers for the kernel in use. uname -r prints your kernel version, so apt install pve-headers-$(uname -r) installs the headers required for Proxmox.
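
A minimal sketch of those two steps, assuming the 470.86 driver and NVIDIA's usual download layout (verify the URL before use):

apt update
apt install -y pve-headers-$(uname -r)
# URL assumed from NVIDIA's standard driver download layout
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/470.86/NVIDIA-Linux-x86_64-470.86.run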

Install the drivers by executing the .run file. If the file is not executable, chmod +x NVIDIA-Linux-x86_64-470.86.run will make it executable.

/path/to/NVIDIA-Linux-x86_64-470.86.run

Follow the instructions on the screen; the default answers are fine for all the questions. Once the install is finished, locate your GPU using lspci. In this example, it finds a GTX 1660 Ti.

lspci | grep -i nvidia

06:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660 Ti] (rev a1)
06:00.1 Audio device: NVIDIA Corporation TU116 High Definition Audio Controller (rev a1)
06:00.2 USB controller: NVIDIA Corporation TU116 USB 3.1 Host Controller (rev a1)
06:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller (rev a1)

Open the /etc/modules-load.d/nvidia.conf file and confirm it has these lines; if not, add them.

nvidia-drm
nvidia
nvidia_uvm
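
If the file does not exist yet, a one-shot way to create it with exactly that content:

cat <<'EOF' > /etc/modules-load.d/nvidia.conf
nvidia-drm
nvidia
nvidia_uvm
EOF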

Create a udev rule /etc/udev/rules.d/70-nvidia.rules

# Create /dev/nvidia0, /dev/nvidia1 … and /dev/nvidiactl when the nvidia module is loaded
KERNEL=="nvidia", RUN+="/bin/bash -c '/usr/bin/nvidia-smi -L && /bin/chmod 666 /dev/nvidia*'"
# Create the CUDA node when the nvidia_uvm CUDA module is loaded
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/usr/bin/nvidia-modprobe -c0 -u && /bin/chmod 0666 /dev/nvidia-uvm*'"
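
To apply the new rule without waiting for the reboot below, reloading udev should work:

udevadm control --reload-rules && udevadm trigger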

Reboot the host and check the output of ls -al /dev/nvidia* and ls -al /dev/dri/*. The output should be similar to this:

> ls -al /dev/nvidia*
crw-rw-rw- 1 root root 195, 0 Dec 2 13:51 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Dec 2 13:51 /dev/nvidiactl
crw-rw-rw- 1 root root 195, 254 Dec 2 13:51 /dev/nvidia-modeset
crw-rw-rw- 1 root root 510, 0 Dec 2 13:51 /dev/nvidia-uvm
crw-rw-rw- 1 root root 510, 1 Dec 2 13:51 /dev/nvidia-uvm-tools
/dev/nvidia-caps:
total 0
drw-rw-rw- 2 root video 80 Dec 2 13:51 .
drwxr-xr-x 24 root root 5320 Dec 8 18:49 ..
cr-------- 1 root root 236, 1 Dec 2 13:51 nvidia-cap1
cr--r--r-- 1 root root 236, 2 Dec 2 13:51 nvidia-cap2
> ls -al /dev/dri/*
crw-rw---- 1 root video 226, 0 Dec 8 00:54 /dev/dri/card0
crw-rw---- 1 root video 226, 1 Dec 8 00:54 /dev/dri/card1
crw-rw---- 1 root render 226, 128 Dec 8 00:54 /dev/dri/renderD128
/dev/dri/by-path:
total 0
drwxr-xr-x 2 root root 100 Dec 2 13:51 .
drwxr-xr-x 3 root root 120 Dec 2 13:51 ..
lrwxrwxrwx 1 root root 8 Dec 8 00:54 pci-0000:06:00.0-card -> ../card1
lrwxrwxrwx 1 root root 13 Dec 8 00:54 pci-0000:06:00.0-render -> ../renderD128
lrwxrwxrwx 1 root root 8 Dec 8 00:54 pci-0000:08:03.0-card -> ../card0

The cgroup2 device major numbers needed for passing the GPU through are in the fifth column: 195, 510, 236, and 226.
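
If you want to pull those major numbers out programmatically, a one-liner along these lines should work (it keeps only the character-device entries and strips the trailing comma):

ls -al /dev/nvidia* /dev/dri/* | grep '^c' | awk '{print $5}' | tr -d ',' | sort -un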

Also, test access to the GPU using the nvidia-smi command. It should output something similar to this.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.86       Driver Version: 470.86       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |                  N/A |
| 38%   45C    P2    24W / 130W |   2389MiB /  5944MiB |      6%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2746      G   /usr/lib/xorg/Xorg                  6MiB |
|    0   N/A  N/A    326039      C   python3                          1121MiB |
|    0   N/A  N/A   3327033      C   /usr/bin/zmc                      651MiB |
|    0   N/A  N/A   3327041      C   /usr/bin/zmc                      243MiB |
|    0   N/A  N/A   3327049      C   /usr/bin/zmc                      243MiB |
|    0   N/A  N/A   3327055      C   /usr/bin/zmc                      118MiB |
+-----------------------------------------------------------------------------+

Proxmox 7.0 uses cgroup2 instead of cgroup (v1). Create a new Ubuntu 20.04 unprivileged LXC and, before you start it up, edit the .conf file for it.
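
If you have not created the container yet, here is a sketch of the create step (the template name, storage, and hostname are assumptions - adjust for your setup):

pct create 106 local:vztmpl/ubuntu-20.04-standard_20.04-1_amd64.tar.gz \
  --unprivileged 1 --features nesting=1,keyctl=1 --hostname mlapi

In this example, the LXC container ID is 106. Edit its config with nano /etc/pve/lxc/106.conf and add: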

# Add these lines to pass the GPU through - you may have more or fewer devices.
# cgroup permissions are set here. Use the major numbers from the
# ls -al /dev/nvidia* and ls -al /dev/dri/* output above.
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.cgroup2.devices.allow: c 236:* rwm
lxc.cgroup2.devices.allow: c 510:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/nvidia-caps dev/nvidia-caps none bind,optional,create=dir

Start the LXC with pct start 106 and then enter it with pct enter 106. You should be dropped into a shell. Get the NVIDIA driver .run file inside the LXC (scp, a mounted shared folder, etc.) and install the drivers, BUT use the --no-kernel-module flag.

/path/to/NVIDIA-Linux-x86_64-470.86.run --no-kernel-module

Execute nvidia-smi inside of the LXC to test access to the GPU. There will be some differences: you will not see the process list, but you will see the available memory, GPU utilization, fan speed, etc.
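
You can also run the same checks from the Proxmox host; pct exec runs a command inside the container (bash -c is used so the glob expands inside the container):

pct exec 106 -- bash -c 'ls -al /dev/nvidia*'
pct exec 106 -- nvidia-smi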

For TPU access inside a Docker environment, you need to create a udev rule that changes the TPU device node permissions to 0666. In my testing, it's easier to have the 'other' permission bits set to rw when passing the TPU through from the LXC into the Docker container.

nano /etc/udev/rules.d/99-edgetpu-accelerator.rules

SUBSYSTEM=="usb", ATTRS{idVendor}=="18d1", ATTRS{idProduct}=="9302", SYMLINK+="tpu", MODE="0666"

The rule also creates a symlink at /dev/tpu. The symlink can't be used for the passthrough into the LXC; it's just a convenience. After that, edit the .conf file to allow access to the TPU's USB device node. nano /etc/pve/lxc/106.conf

# Passthrough the whole USB subsystem, or you can specify bus and/or device.
# Remember that the bus and device address can change across reboots.
lxc.mount.entry: /dev/bus/usb dev/bus/usb none bind,optional,create=dir 0 0
# These 2 lines seem to be needed in Proxmox 7.0+
lxc.mount.auto: cgroup:rw
lxc.cgroup2.devices.allow: a
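
If you would rather pass through a single device node instead of the whole bus tree, the entry would look something like this (the 003/004 bus/device numbers are placeholders and, as noted above, can change across reboots):

lxc.mount.entry: /dev/bus/usb/003/004 dev/bus/usb/003/004 none bind,optional,create=file 0 0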

Stop and start the container for the edits to take effect. You can now install the TPU library (and PyCoral), as sketched below.
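
A sketch of that install inside the LXC, following Google's documented Coral apt repository (commands as of the Coral docs around this article's writing - verify against the current docs):

echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
apt update
apt install libedgetpu1-std   # or libedgetpu1-max for higher clocks (runs hotter)
apt install python3-pycoral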
