NVIDIA Drivers, CUDA and TPU Libs

Before adding the face recognition libs, the Nvidia drivers, CUDA and cuDNN need to be installed (strictly speaking just CUDA, but you might as well do the cuDNN libs now). You can use any combo of drivers and CUDA that are compatible, BUT REMEMBER: when we build OpenCV 4.5.4, DLib and the ALPR system, they all need to be built against the same version of CUDA to avoid a bunch of issues (ask me how I know).
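
A quick sanity check you can run later to keep the versions straight (just a sketch; nvcc and the Python cv2 module only exist once the CUDA toolkit and OpenCV are installed/built further on):

# Which CUDA toolkit does nvcc report?
nvcc -V | grep release
# Once OpenCV 4.5.4 is built, was it compiled against that same CUDA?
python3 -c "import cv2; print(cv2.getBuildInformation())" | grep -i cuda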

I recommend using CUDA 10.2 with your choice of drivers. PyTorch is compatible (useful if a future 'full featured' version adds PyTorch model support), and the CUDA 10.x libs are well tested and work with most ML frameworks.

In this example the system is headless (no monitor). I will be using the Nvidia 460.67 drivers and CUDA 10.2 with the matching cuDNN libs (8.2.x for Cuda 10.2).

In my production environment, I use an unprivileged LXC for my ZM install and have mlapi in a separate privileged LXC. mlapi uses a GPU and USB TPU; ZM uses the CPU as a local fallback. I cannot for the life of me figure out how to get a GPU passed through into an unprivileged LXC for ML use; I can see and access the GPU, but when running detections it throws a cryptic 'Illegal Hardware Instruction' error. In a privileged LXC I have no issues running detections, so mlapi is kept off the Internet and ZMES sends detection requests locally. ZM is not directly accessible from the internet either; I have a reverse proxy serving traffic and use Authelia for TOTP auth protection on the ZM web GUI frontend. Passing a GPU and TPU into an LXC has its issues, and if anyone is interested I can add a section documenting what that entails on a Proxmox setup. The same goes for the Authelia setup, it's awesome.

Grab the drivers you want (get the .run file) and then Cuda 10.2 (also a .run file). To grab the cuDNN libs you will have to create an Nvidia developer account. There are three choices of cuDNN libs for Cuda 10.2: 8.2.0, 8.2.1 and 8.2.2. As far as I know any of the three is suitable, but I will go with 8.2.1. The cuDNN libs are tar.gz archives.
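
If you prefer to pull the .run files straight onto the box with wget, something like the following works. The URLs below are my best recollection of where these releases live, so verify them on Nvidia's download pages; the cuDNN archive cannot be fetched this way because it sits behind the developer account login.

# Example URLs from the time of writing - double-check them on Nvidia's download pages
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/460.67/NVIDIA-Linux-x86_64-460.67.run
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
# cuDNN (cudnn-10.2-linux-x64-v8.2.1.32.tgz) has to be downloaded through your Nvidia developer account in a browser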

# First make the driver .run file executable, then run it
chmod +x ./NVIDIA-Linux-x86_64-460.67.run
sudo ./NVIDIA-Linux-x86_64-460.67.run

It takes a while to unpack itself and set up, though there is somewhat of a progress bar. The installer will ask you to accept the EULA by typing accept, and it may ask to automatically disable the Nouveau drivers, install some 32-bit libs, and a few things about xconfig or libglvnd. The default answers are mostly acceptable; the main thing is that at the end there are no errors. It will show you a list where you can select Drivers, Cuda toolkit, examples and docs. Make sure Cuda is deselected!
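
If you would rather disable Nouveau yourself instead of letting the installer do it, the usual Ubuntu approach is a modprobe blacklist. A sketch of that (the installer can handle it for you, so this is optional):

# Sketch: manually blacklist the Nouveau driver
cat <<'EOF' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
sudo update-initramfs -u
# reboot afterwards for the change to take effect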

Test that the drivers are up and running by first rebooting (sudo reboot) and then, once things come back up, running nvidia-smi, which should print a table showing the driver version and your GPU.
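
That is:

sudo reboot
# once the machine is back up
nvidia-smi
# The table header should show the driver version you just installed (460.67 here).
# Note: the "CUDA Version" nvidia-smi reports is the maximum the driver supports, not the toolkit we install below.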

Nvidia drivers are now installed!

Now we move on to the Cuda install! Navigate to the folder where you have the Cuda .run file and start by making that executable.

chmod +x ./cuda_10.2.89_440.33.01_linux.run
sudo ./cuda_10.2.89_440.33.01_linux.run

Same thing here: it takes a while to unpack and then greets you with an installer. Since we are using Cuda 10.2, the GCC version Ubuntu 21.04 ships by default (gcc-10) will not work to compile Cuda. The error message will be something like this ->

Failed to verify gcc version. See log at /var/log/cuda-installer.log for details.

# The cuda-installer.log file says something to this effect ->
[INFO]: Driver not installed.
[INFO]: Checking compiler version...
[INFO]: gcc location: /usr/bin/gcc
[INFO]: gcc version: gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1)
[ERROR]: unsupported compiler version: 10.3.0. Use --override to override this check.

--override will not work correctly, so we need to install and configure the system to use an older GCC version. GCC-6/G++-6 is what is needed.
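
If you want to confirm which compiler the Cuda installer will see before re-running it (it simply uses whatever gcc resolves to on your PATH):

which gcc
gcc --version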

*** NEWER VERSIONS OF CUDA NOTE ***

If you are using a newer version of CUDA and cuDNN libs, you DO NOT need GCC-6 to compile CUDA. GCC-9 or GCC-10 works fine, and you can use the default Ubuntu GCC to compile CUDA, OpenCV, DLib and ALPR.

Using GCC-6 to build CUDA

On Ubuntu 21.04 I was unable to find a repo that would let me cleanly install GCC 6, so installing the .deb packages manually is the only option. You can search the Ubuntu repos for the gcc-6 and g++-6 packages and they will tell you the dependencies. If you have followed all the commands so far, this is what you will need to do. This method may not work in the future, but as of writing (October 2021) it works.

See the GCC 6 package in the bionic repos here. It lists all the dependencies, soft and hard. Go to the bottom of the page and select the arch for your ZMES host (usually amd64). Before you do that, right-click some of the dependencies I have listed below and open their pages in new tabs. Download all the .deb packages I listed below and then follow the list IN ORDER to install gcc/g++ 6.

Select the correct architecture (most systems will be amd64)

After selecting your arch it will bring you to a list of mirrors you can download the package from. RIGHT CLICK one of the mirror links and select 'Save link as…'; the file chooser will pop up and allow you to save the .deb file. If you are using Chrome or one of its variants it will immediately deny the download; click the little arrow beside the Deny button and select 'Keep' to download the file.

Keep ‘harmful’ files
The other method is to right-click the link, select 'Copy link address', and then download the file with wget in the terminal where you are installing Cuda.

baudneo@ZMES-test:~$ wget http://mirrors.kernel.org/ubuntu/pool/universe/g/gcc-6/libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb
--2021-10-19 20:10:26--  http://mirrors.kernel.org/ubuntu/pool/universe/g/gcc-6/libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb
Resolving mirrors.kernel.org (mirrors.kernel.org)... 198.145.21.9, 2001:4f8:4:6f:0:1994:3:14
Connecting to mirrors.kernel.org (mirrors.kernel.org)|198.145.21.9|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://mirrors.edge.kernel.org/ubuntu/pool/universe/g/gcc-6/libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb [following]
--2021-10-19 20:10:26-- http://mirrors.edge.kernel.org/ubuntu/pool/universe/g/gcc-6/libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb
Resolving mirrors.edge.kernel.org (mirrors.edge.kernel.org)... 147.75.69.165, 2604:1380:1000:8100::1
Connecting to mirrors.edge.kernel.org (mirrors.edge.kernel.org)|147.75.69.165|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1412740 (1.3M) [application/octet-stream]
Saving to: 'libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb'

libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb 100%[==========================>] 1.35M 4.08MB/s in 0.3s

2021-10-19 20:10:26 (4.08 MB/s) - 'libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb' saved [1412740/1412740]
-----------------------------------------------------------
Here is a list of the .deb packages I needed to manually install gcc/g++ 6. Install them in the order of the list.

sudo apt install ./libisl19_0.19-1_amd64.deb
sudo apt install ./gcc-6-base_6.4.0-17ubuntu1_amd64.deb
sudo apt install ./cpp-6_6.4.0-17ubuntu1_amd64.deb
# You can grab this next package from the 21.04 repos, it handles a lot of the dependencies.
sudo apt install libgcc-6-dev
# Finally
sudo apt install ./gcc-6_6.4.0-17ubuntu1_amd64.deb
# GCC-6 is now installed, you can test by gcc-6 -v
baudneo@ZMES-test:~$ gcc-6 -v
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1)
# Now for G++ 6
sudo apt install ./libstdc++-6-dev_6.4.0-17ubuntu1_amd64.deb
sudo apt install ./g++-6_6.4.0-17ubuntu1_amd64.deb
# G++-6 is now installed! Test by g++-6 -v
baudneo@ZMES-test:~$ g++-6 -v
gcc version 6.4.0 20180424 (Ubuntu 6.4.0-17ubuntu1)
# Now it is time to configure the system to use gcc/g++ 6
# This assumes you do not have other versions of gcc and g++ installed for other projects; if you do, you should know how to use update-alternatives to point gcc at gcc-6 for the Cuda build ;)
sudo update-alternatives --remove-all gcc
sudo update-alternatives --remove-all g++
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 100
sudo update-alternatives --set cc /usr/bin/gcc
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 100
sudo update-alternatives --set c++ /usr/bin/g++
# When you want to revert these back to default gcc-10
sudo update-alternatives --remove-all gcc
sudo update-alternatives --remove-all g++
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 100
sudo update-alternatives --set cc /usr/bin/gcc
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 100
sudo update-alternatives --set c++ /usr/bin/g++
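
Before launching the Cuda installer again, a quick sanity check (just a sketch) that plain gcc now resolves to 6.x:

gcc --version                      # should now report gcc 6.4.0
update-alternatives --display gcc  # shows which alternative is active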

You can now install the Cuda 10.2 toolkit by running sudo ./cuda_10.2.89_440.33.01_linux.run

After typing in ‘accept’, make sure that the CUDA Toolkit is selected and that the drivers ARE NOT selected (unless you didn’t install drivers separately and want this bundled one), then install. Afterwards it will let you know if there were errors; if not, you need to add Cuda and the Cuda libs to some paths.

Make sure the driver is deselected if you installed a separate driver

Open your shell config file (.bashrc, .zshrc, etc.) and add the Cuda bin folder to your $PATH and the Cuda lib64 folder to LD_LIBRARY_PATH.

nano ~/.zshrc

export PATH=$PATH:/usr/local/cuda-10.2/bin
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64

# Exit the editor with ctrl+s then ctrl+x, then source the file to activate
source ~/.zshrc
# test cuda
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
# Cuda is installed!
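
As an optional alternative to keeping LD_LIBRARY_PATH in your shell rc (my suggestion, not part of the original walk-through): registering the Cuda lib directory with the dynamic linker makes the libs visible to daemons like mlapi that never source your .zshrc.

# Optional: make the Cuda 10.2 libs visible system-wide without LD_LIBRARY_PATH
# (the .conf file name is arbitrary)
echo "/usr/local/cuda-10.2/lib64" | sudo tee /etc/ld.so.conf.d/cuda-10-2.conf
sudo ldconfig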

cuDNN Libs

Navigate to the folder where you have the cuDNN tar.gz archive.

tar -zxvf cudnn-10.2-linux-x64-v8.2.1.32.tgz
# it will create a folder named 'cuda'
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
# cuDNN is now installed!
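
A quick way to confirm the copy worked (a sketch, assuming the cuDNN 8.x layout where the version macros live in cudnn_version.h):

grep -A 2 "#define CUDNN_MAJOR" /usr/local/cuda/include/cudnn_version.h
# should print CUDNN_MAJOR 8, CUDNN_MINOR 2 and CUDNN_PATCHLEVEL 1 for the 8.2.1 libs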

DO NOT REVERT TO GCC-10 YET; GCC-6 is still needed to compile OpenCV, DLib and the ALPR system against CUDA 10.2.

Install the Coral TPU Libs

I recommend installing the Coral libs before installing ZMES and mlapi.

# Make sure the TPU is unplugged from the USB port for now
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt update
# This is the std lib; you can use the max lib to overclock the TPU (std is fine, I've never used the max lib). I do recommend a USB3 port though, it takes 1 to 1.5 seconds off detections compared to a USB2 port.
sudo apt-get install libedgetpu1-std
# Now plug the TPU in (or unplug and replug it if it is already plugged in, so the new udev rule can take effect). Then install the python Coral wrapper libraries.
sudo apt-get install python3-pycoral
# add the www-data user to the plugdev group
sudo usermod -aG plugdev www-data
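
Before running the example below, you can confirm the OS sees the TPU with lsusb; as far as I know the Coral USB accelerator enumerates under the Global Unichip Corp. ID until the runtime first initializes it, after which it shows up as Google Inc.

lsusb
# typically "1a6e:089a Global Unichip Corp." before first use,
# and "18d1:9302 Google Inc." after the Edge TPU runtime has initialized it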

The Coral TPU is now installed and should work. You can test it by ->

mkdir coral && cd coral
git clone https://github.com/google-coral/pycoral.git
cd pycoral
bash examples/install_requirements.sh classify_image.py
python3 examples/classify_image.py \
  --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
  --labels test_data/inat_bird_labels.txt \
  --input test_data/parrot.jpg
# There should be some output like this ->
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
11.8ms
3.0ms
2.8ms
2.9ms
2.9ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781
