Installing multiple CUDA + cuDNN versions on the same machine for Tensorflow and Pytorch

Nguyễn Văn Lĩnh
Published in datatype · May 5, 2021

The need for multiple CUDA versions arises when we experiment with different deep learning projects that depend on different Tensorflow and Pytorch versions. Docker is one of the first solutions that comes to mind, but it is heavy, and Anaconda's licensing limits commercial use, so we need to do something better.

On Windows or Linux, the typical CUDA + cuDNN setup is:

  • Install the latest NVIDIA driver
  • Install the required CUDA version
  • Add the installed CUDA folders to the system PATH
  • Download & extract the required cuDNN version, then either copy its files into the CUDA folders or add the cuDNN folders directly to the system PATH (a minimal example follows this list).
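
As a minimal sketch of steps 3 and 4 on Linux (assuming a hypothetical install under /usr/local/cuda-11.0), the path settings look like this:

# make the CUDA binaries and libraries visible to the shell and the dynamic linker
export PATH=$PATH:/usr/local/cuda-11.0/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.0/lib64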

The benefit of Pytorch is that it does not require installing a specific cuDNN version separately. As of May 2021, Pytorch requires CUDA 10.2 or 11.1: https://pytorch.org/get-started/locally/
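
For example, the install selector on that page produces a pip command; at the time of writing it looked roughly like this (the 1.8.1+cu111 version strings are simply the then-current builds, not a requirement):

pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html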

Tensorflow is much stricter about the CUDA and cuDNN versions it requires. According to https://www.tensorflow.org/install/source#gpu,
tensorflow-2.4.0 with GPU support needs cuDNN 8.0 and CUDA 11.0.
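
For reference, the matching Tensorflow install is the standard pip package (since version 2.1, GPU support is included in the main tensorflow package):

pip install tensorflow==2.4.0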

You need to install the latest NVIDIA driver

CUDA, cuDNN, and the NVIDIA driver also need to maintain compatibility with each other:

The NVIDIA driver is the essential requirement for CUDA
A new NVIDIA driver (e.g. R450) supports “old” CUDA versions, but not vice versa
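
To see which driver is currently installed (and therefore which CUDA versions it can support), you can query it directly:

nvidia-smi --query-gpu=driver_version --format=csv,noheader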

Latest Nvidia driver on Ubuntu 20.04

Firstly, we need to remove all existing NVIDIA installations:

sudo apt-get purge '*nvidia*'
sudo apt autoremove

Check the available drivers and auto-install the recommended (latest) NVIDIA driver:

ubuntu-drivers devices
sudo ubuntu-drivers autoinstall
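
After the installation finishes, a reboot is usually needed before the new kernel module is loaded; then nvidia-smi should report the new driver:

sudo reboot
# after logging back in:
nvidia-smi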

Installing and maintaining different CUDA versions

Now, when typing nvidia-smi, you will see something like this:

NVIDIA-SMI 460.73.01 Driver Version: 460.73.01 CUDA Version: 11.2

The CUDA version reported by nvidia-smi is the highest version the installed driver supports, not necessarily the toolkit that is installed; it is usually higher than what we need, and this is a common point of confusion.

In my case, I need to install CUDA 11.1 for Pytorch and CUDA 11.0 + cuDNN 8.0 for Tensorflow 2.4.

Download

Firstly, I need to download the CUDA toolkit 11.0 and 11.1 runfiles (not the .deb packages) from the CUDA toolkit archive.

The download links for the CUDA toolkit are:

wget https://developer.download.nvidia.com/compute/cuda/11.0.3/local_installers/cuda_11.0.3_450.51.06_linux.run
wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda_11.1.1_455.32.00_linux.run

I only need cuDNN 8.0 for CUDA 11.0; all cuDNN download links can be found on the NVIDIA cuDNN archive page.

Remember to download the .tgz file (not the .deb, again):

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/8.0.5/11.0_20201106/cudnn-11.0-linux-x64-v8.0.5.39.tgz

Install CUDA toolkit

We need to install only the CUDA toolkit, into a version-specific path named cuda-11.0 or cuda-11.1, rather than the default cuda path already used by the driver installation. If you put everything into the default cuda folder, you will very likely face this error:

NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Installation command:

sudo sh cuda_11.0.3_450.51.06_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-11.0

sudo sh cuda_11.1.1_455.32.00_linux.run --silent --toolkit --toolkitpath=/usr/local/cuda-11.1
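
To confirm that each toolkit landed in its own folder, you can list the install directories and ask each nvcc for its version:

ls -d /usr/local/cuda-*
/usr/local/cuda-11.0/bin/nvcc --version
/usr/local/cuda-11.1/bin/nvcc --version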

Install cuDNN

I only need to install cuDNN 8.0 alongside CUDA 11.0. The simplest approach is to extract the cuDNN .tgz archive and copy everything from its include and lib64 folders into the CUDA 11.0 installation folder.
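
For completeness, the extraction step looks like this (the archive unpacks into a local cuda/ directory, which the copy commands below assume):

tar -xzvf cudnn-11.0-linux-x64-v8.0.5.39.tgz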

sudo cp cuda/include/cudnn*.h /usr/local/cuda-11.0/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda-11.0/lib64
sudo chmod a+r /usr/local/cuda-11.0/include/cudnn*.h /usr/local/cuda-11.0/lib64/libcudnn*
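
To double-check which cuDNN version ended up in that folder (cuDNN 8 keeps its version macros in cudnn_version.h):

grep -A 2 '#define CUDNN_MAJOR' /usr/local/cuda-11.0/include/cudnn_version.h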

The default cuDNN installation paths (which may come along with an earlier NVIDIA installation) are:

/usr/include
/usr/lib/x86_64-linux-gnu/
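
To check whether a system-wide cuDNN already exists in those default locations:

ls /usr/include/cudnn*.h /usr/lib/x86_64-linux-gnu/libcudnn* 2>/dev/null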

Adding the CUDA installation to the system PATH

We can manually add the corresponding version of the CUDA installation to the system PATH and LD_LIBRARY_PATH.

The most convenient way to automate adding a specific CUDA version to the system PATH via .bashrc is the function from this link.

Simply add this to the end of your ~/.bashrc file

# add below to your env bash file.

function _switch_cuda {
    v=$1
    # prepend so the selected version takes precedence over any CUDA already on the PATH
    export PATH=/usr/local/cuda-$v/bin:$PATH
    export CUDADIR=/usr/local/cuda-$v
    export LD_LIBRARY_PATH=/usr/local/cuda-$v/lib64:$LD_LIBRARY_PATH
    nvcc --version
}

Then call this function in your bash session to switch to the corresponding CUDA version:

_switch_cuda 11.0 # change to the CUDA version you want to load

Please remember to switch the CUDA version before running your source code.
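
As a final sanity check (assuming Tensorflow 2.4 and Pytorch are installed in the active Python environment), confirm that each framework sees the GPU and the expected CUDA build:

_switch_cuda 11.0
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"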

