Installing multiple CUDA + cuDNN versions on the same machine for Tensorflow and Pytorch

Nguyễn Văn Lĩnh
Published in datatype · May 5, 2021

The need for multiple CUDA versions arises when we experiment with different deep learning projects that depend on different Tensorflow and Pytorch versions. Docker is one of the first solutions that comes to mind, but it is heavy, and Anaconda's licensing limits commercial use, so we need to do something better.

On Windows or Linux, the typical CUDA + cuDNN setup is:

  • Install the latest NVIDIA driver
  • Install the required CUDA version
  • Add the installed CUDA folders to the system PATH
  • Download & extract the required cuDNN version, then either copy its files into the CUDA folders or add the cuDNN folders directly to the system PATH (a minimal example follows this list).
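
As a minimal sketch of steps 3 and 4 on Linux (assuming a hypothetical install under /usr/local/cuda-11.0), the path settings look like this:

# make the CUDA binaries and libraries visible to the shell and the dynamic linker
export PATH=$PATH:/usr/local/cuda-11.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.0/lib64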

The benefit of Pytorch is that it does not require installing a specific cuDNN version separately. As of May 2021, Pytorch requires CUDA 10.2 or 11.1: https://pytorch.org/get-started/locally/
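
For example, the install selector on that page produces a pip command; at the time of writing it looked roughly like this (the 1.8.1+cu111 version strings are simply the then-current builds, not a requirement):

pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html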

Tensorflow is much stricter about the CUDA and cuDNN versions it requires. According to https://www.tensorflow.org/install/source#gpu,
tensorflow-2.4.0 with GPU support needs cuDNN 8.0 and CUDA 11.0.
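
For reference, the matching Tensorflow install is the standard pip package (since version 2.1, GPU support is included in the main tensorflow package):

pip install tensorflow==2.4.0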

You need to install the latest NVIDIA driver

CUDA, cuDNN, and the NVIDIA driver also need to maintain compatibility with each other:

The NVIDIA driver is the essential requirement for CUDA
A new NVIDIA driver (e.g. R450) supports “old” CUDA versions, but not vice versa
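
To see which driver is currently installed (and therefore which CUDA versions it can support), you can query it directly:

nvidia-smi --query-gpu=driver_version --format=csv,noheader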

Latest Nvidia driver on Ubuntu 20.04

Firstly, we need to remove all existing NVIDIA installations:

sudo apt-get purge '*nvidia*'
sudo apt autoremove

Check the available drivers and auto-install the recommended (latest) NVIDIA driver:

ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
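
After the installation finishes, a reboot is usually needed before the new kernel module is loaded; then nvidia-smi should report the new driver:

sudo reboot
# after logging back in:
nvidia-smi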

Installing and maintaining different CUDA versions

Now, when typing nvidia-smi, you will see something like this:

NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2

The CUDA version reported by nvidia-smi is the highest version the installed driver supports, not necessarily the toolkit that is installed; it is usually higher than what we need, and this is a common point of confusion.

In my case, I need to install CUDA 11.1 for Pytorch and CUDA 11.0 + cuDNN 8.0 for Tensorflow 2.4.

Download

Firstly, I need to download the CUDA toolkit 11.0 and 11.1 runfiles (not the .deb packages) from the CUDA toolkit archive.

The download links for the CUDA toolkit are:

wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run

I only need cuDNN 8.0 for CUDA 11.0; all cuDNN download links can be found on the NVIDIA cuDNN archive page.

Remember to download the .tgz file (not the .deb, again):

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.5/11.0_20201106/cudnn-11.0-linux-x64-v8.0.5.39.tgz

Install CUDA toolkit

We need to install only the CUDA toolkit, into a version-specific path named cuda-11.0 or cuda-11.1, rather than the default cuda path already used by the driver installation. If you put everything into the default cuda folder, you will very likely face this error:

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Installation command:

sudo sh cuda_11.0.3_450.51.06_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-11.0

sudo sh cuda_11.1.1_455.32.00_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-11.1
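
To confirm that each toolkit landed in its own folder, you can list the install directories and ask each nvcc for its version:

ls -d /usr/local/cuda-*
/usr/local/cuda-11.0/bin/nvcc --version
/usr/local/cuda-11.1/bin/nvcc --version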

Install cuDNN

I only need to install cuDNN 8.0 alongside CUDA 11.0. The simplest approach is to extract the cuDNN .tgz archive and copy everything from its include and lib64 folders into the CUDA 11.0 installation folder.
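
For completeness, the extraction step looks like this (the archive unpacks into a local cuda/ directory, which the copy commands below assume):

tar -xzvf cudnn-11.0-linux-x64-v8.0.5.39.tgz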

sudo cp cuda/include/cudnn*.h /usr/local/cuda-11.0/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-11.0/lib64
sudo chmod a+r /usr/local/cuda-11.0/include/cudnn*.h /usr/local/cuda-11.0/lib64/libcudnn*
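
To double-check which cuDNN version ended up in that folder (cuDNN 8 keeps its version macros in cudnn_version.h):

grep -A 2 '#define CUDNN_MAJOR' /usr/local/cuda-11.0/include/cudnn_version.h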

The default cuDNN installation paths (which may come along with an earlier NVIDIA installation) are:

/usr/include
/usr/lib/x86_64-linux-gnu/
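
To check whether a system-wide cuDNN already exists in those default locations:

ls /usr/include/cudnn*.h /usr/lib/x86_64-linux-gnu/libcudnn* 2>/dev/null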

Adding the CUDA installation to the system PATH

We can manually add the corresponding version of the CUDA installation to the system PATH and LD_LIBRARY_PATH.

The most convenient way to automate adding a specific CUDA version to the system PATH via .bashrc is the function from this link.

Simply add this to the end of your ~/.bashrc file

# add below to your env bash file.

function _switch_cuda {
    v=$1
    # prepend so the selected version takes precedence over any CUDA already on the PATH
    export PATH=/usr/local/cuda-$v/bin:$PATH
    export CUDADIR=/usr/local/cuda-$v
    export LD_LIBRARY_PATH=/usr/local/cuda-$v/lib64:$LD_LIBRARY_PATH
    nvcc --version
}

Then call this function in your bash session to switch to the corresponding CUDA version:

_switch_cuda 11.0 # change to the CUDA version you want to load

Please remember to switch the CUDA version before running your source code.
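
As a final sanity check (assuming Tensorflow 2.4 and Pytorch are installed in the active Python environment), confirm that each framework sees the GPU and the expected CUDA build:

_switch_cuda 11.0
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"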

