Installing CUDA Toolkit 10.0 and cuDNN for Deep Learning with TensorFlow-GPU on Ubuntu 18.04+ LTS
Before starting this post, I’d like to thank Christian Janze, since his post How to install TensorFlow 1.13 with GPU support on Ubuntu 18.04 LTS + CUDA 10.0 helped me a lot in installing the CUDA toolkit on my own laptop, and this post is inspired by his.
Step 1 : Install Nvidia’s latest Linux display driver via PPA for your graphics card
First, check which graphics card your system has. In your terminal, type:
$ lspci | grep "VGA"
Note: on some hybrid laptops the Nvidia GPU is listed as a "3D controller" rather than "VGA", so if nothing Nvidia-related shows up, try lspci | grep -E "VGA|3D" instead.
The output will look something like this:
Looking at the output, I have a system with hybrid graphics: both Intel and Nvidia. More importantly, my system has an Nvidia graphics card, a GeForce GTX 1050 Ti Mobile. Should I be happy just yet?
It is good if your system has an Nvidia graphics card, but that’s just stage 1 cleared; we still need to check whether your graphics card supports CUDA. You can go to this link, CUDA GPUs, and see whether your graphics card is a CUDA-capable GPU or not.
Also, if your system has hybrid graphics like mine, open nvidia-prime-select via the super/windows key and select nvidia.
You can also check from the command line using:
$ prime-select query
If the answer is nvidia, you are good to go and can proceed to install the driver. If it is intel, switch with:
$ sudo prime-select nvidia
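Since switching profiles is just a decision based on the query result, the logic can be sketched as a small shell function. This is a hedged sketch: prime_action is a made-up helper name, and the real switch still needs prime-select installed on the system.

```shell
# prime_action: decide what to do from a prime-select query answer.
# Pure string logic, so it can be checked without touching the GPU.
prime_action() {
  case "$1" in
    nvidia) echo "nothing to do" ;;
    intel)  echo "run: sudo prime-select nvidia" ;;
    *)      echo "unknown profile: $1" ;;
  esac
}

# Intended usage on a real system:
#   prime_action "$(prime-select query)"
```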
How to check which is the latest driver to install?
Just go to Nvidia Drivers Download and enter your GPU configuration to find the latest display driver for it.
Note: 1050 Ti is part of the GeForce 10 series.
Now we come to the main part: installing the driver.
On Ubuntu 18.04+, add the graphics drivers PPA by running:
$ sudo add-apt-repository ppa:graphics-drivers/ppa
This will automatically update the repositories, after which you can run:
$ sudo apt install nvidia-driver-430 # For Acer Nitro Configuration.
Note: The latest driver for my card was 430, so I installed nvidia-driver-430. Also, if you face any dependency problems, try using aptitude instead of apt; it handles unmet dependencies and conflicts.
If you face any errors, such as your desktop not loading, see the following link:
How do I install Nvidia drivers (askubuntu)
Then reboot your laptop:
$ reboot
(or shut down with $ poweroff and power back on).
Then run the nvidia-smi command:
$ nvidia-smi
If it doesn’t work, it’s most probably because Secure Boot is enabled in UEFI mode.
To disable Secure Boot on Ubuntu, follow the instructions at:
Disable secure boot
Once Secure Boot is disabled, you’ll see "Booting in insecure mode" at the top-left corner of the Ubuntu GRUB menu, and nvidia-smi should now work:
$ nvidia-smi
The result should be something like below:
You’ll see some processes already running, such as Xorg and /usr/bin/gnome-shell, along with Driver Version: 430 (in my case) and CUDA Version: 10.2 or some other version. This can be confusing: the CUDA toolkit’s latest release is 10.1 (as of now), and we haven’t even installed the toolkit yet, so why does nvidia-smi report a CUDA version? The answer is that nvidia-smi shows the highest CUDA version the installed driver supports, not the version of any installed toolkit.
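If you want to pull those two version numbers out of nvidia-smi programmatically, here is a small sketch. The sample line below is an assumption of the usual header format, not output captured from your machine.

```shell
# Extract the driver and CUDA versions from nvidia-smi's header line.
extract_driver() { sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p'; }
extract_cuda()   { sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p'; }

# Intended usage on a real system:
#   nvidia-smi | extract_driver
# Demo on an assumed sample header line:
sample="| NVIDIA-SMI 430.50    Driver Version: 430.50    CUDA Version: 10.2 |"
echo "$sample" | extract_driver   # prints 430.50
echo "$sample" | extract_cuda     # prints 10.2
```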
Step 2 : Installing CUDA Toolkit 10.0 via runfile
Before going on to install CUDA Toolkit 10.0, we first have to offload any processes running on the GPU, such as Xorg and gnome-shell seen above.
Follow the instructions from the first answer at offloading gpu memory from other processes. Your GUI will be disabled after the first instruction and replaced with a CLI, so keep the instructions open on a separate screen.
Then run the $ nvidia-smi command; you should get an output like the one shown below, with no processes listed:
Go to the link below; when you reach the download page, go to the legacy releases and download the CUDA 10.0 runfile, since at the time of writing TensorFlow only supports CUDA 10.0.
Install it by running the following command:
$ sudo sh cuda_10.[YOURVERSION]_linux.run
Skip the licence agreement by pressing CTRL + C.
Don’t install the display driver packaged with the CUDA toolkit, i.e., answer "no" when it asks (we have already installed the latest driver, and the one bundled with the toolkit is older). Then wait for the installation to finish.
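Since the runfile’s exact name varies by build number, a small sketch can locate whatever 10.0 runfile you downloaded instead of typing the full version by hand. The filename here is a made-up stand-in created only for the demo.

```shell
cd "$(mktemp -d)"                        # demo in a scratch directory
touch cuda_10.0.130_410.48_linux.run     # stand-in for the real download
runfile=$(ls cuda_10.0.*_linux.run | head -n 1)
echo "$runfile"                          # prints cuda_10.0.130_410.48_linux.run
# On the real file you would then run:
#   sudo sh "$runfile"
```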
Set up the environment variables:
$ nano ~/.bashrc
(No sudo is needed here, since .bashrc is your own file.)
Add the following paths at the end:
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Press CTRL + O to save, then ENTER, then CTRL + X to exit nano.
Then load the .bashrc file by:
$ source ~/.bashrc
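A quick demo of the ${PATH:+:${PATH}} idiom used in the export lines above: it appends ":<old value>" only when the variable is already set, so an unset variable doesn’t leave a dangling colon (an empty PATH entry means "current directory", which is unsafe). DEMO is a throwaway variable, not anything CUDA-specific.

```shell
unset DEMO
a="/usr/local/cuda-10.0/lib64${DEMO:+:${DEMO}}"
echo "$a"   # prints /usr/local/cuda-10.0/lib64 (no trailing colon)

DEMO="/old/lib"
b="/usr/local/cuda-10.0/lib64${DEMO:+:${DEMO}}"
echo "$b"   # prints /usr/local/cuda-10.0/lib64:/old/lib
```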
Verify the installation of Nvidia’s CUDA Toolkit 10 compiler driver:
$ nvcc -V
Test the installation of Nvidia’s CUDA Toolkit 10
$ cd ~/NVIDIA_CUDA-10.0_Samples
$ make
After you see "Finished building CUDA samples", enter the following commands:
$ cd ~/NVIDIA_CUDA-10.0_Samples/bin/x86_64/linux/release
$ ./deviceQuery
If the output ends with Result = PASSED, then CUDA has been installed. If you reached this step, you have crossed the dungeon and are on your way to meet the princess. :)
Step 3 : Installing Nvidia cuDNN
Download them from Nvidia cuDNN (a free Nvidia Developer account is required).
Download the following files (the latest ones) to your Downloads folder. Watch out for the CUDA toolkit version at the end, e.g., CUDA 10.0.
- cuDNN Runtime library for Ubuntu 18.04 (Deb)
- cuDNN Developer library for Ubuntu 18.04 (Deb)
- cuDNN Code Samples and User Guide for Ubuntu 18.04 (Deb)
Install cuDNN:
# Replace with your versions of the .deb files.
$ cd ~/Downloads
$ sudo dpkg -i libcudnn7_7.6.0.64-1+cuda10.0_amd64.deb
$ sudo dpkg -i libcudnn7-dev_7.6.0.64-1+cuda10.0_amd64.deb
$ sudo dpkg -i libcudnn7-doc_7.6.0.64-1+cuda10.0_amd64.deb
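The install order above matters: the runtime library must go in before the dev package, and the dev package before the doc/samples package. A sketch that keeps that order without hard-coding version numbers (the files here are empty stand-ins created only for the demo):

```shell
cd "$(mktemp -d)"
touch libcudnn7_7.6.0.64-1+cuda10.0_amd64.deb \
      libcudnn7-dev_7.6.0.64-1+cuda10.0_amd64.deb \
      libcudnn7-doc_7.6.0.64-1+cuda10.0_amd64.deb
# One glob per package keeps the dependency order fixed.
for deb in libcudnn7_*.deb libcudnn7-dev_*.deb libcudnn7-doc_*.deb; do
  echo "would install: $deb"   # on a real system: sudo dpkg -i "$deb"
done
```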
Verify the cuDNN installation:
$ cp -r /usr/src/cudnn_samples_v7/ $HOME
$ cd $HOME/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN
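If the sample runs correctly, its output ends with "Test passed!". That makes the check easy to script; in this sketch, cudnn_ok is a made-up helper that just looks for the marker on stdin.

```shell
# cudnn_ok: succeed only if the success marker appears on stdin.
cudnn_ok() { grep -q "Test passed!"; }

# Intended usage on a machine with the compiled sample:
#   ./mnistCUDNN | cudnn_ok && echo "cuDNN verified"
```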
Step 4 : Install tensorflow-gpu
Now install tensorflow-gpu on your system.
Using pip (virtualenv or system-wide installation):
Follow the instructions as given in the link below:
https://www.tensorflow.org/install/pip
Using Conda Environment
I tried using TensorFlow in a conda virtual environment, but sadly it doesn’t show the GPU device. It would be great if anyone could try it out and tell me how it goes for them.
I had also earlier tried to install the CUDA toolkit and cuDNN from conda following Harveen Singh’s post, but it didn’t work out for me: after installing tensorflow-gpu in my conda environment and running conda list, I didn’t see cuDNN or the CUDA toolkit in the packages. It’d be great if it works for other users, since it’s a one-liner, but I sincerely think it’s worth learning the hard way of doing things, like installing the CUDA toolkit and cuDNN yourself.
How to check whether the GPU device is being used in your system?
Note: If you created a pip virtualenv, you’ll have to activate the environment first, like:
$ source tf_gpu/bin/activate
(if tf_gpu is the name of your environment and you made it in the $HOME directory).
In your Python shell, run:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
This completes your installation; you’ve finally met your princess, and now it’s time to tweak neurons and layers with her.
Also, your nvidia-smi output will look like the following:
Would love to hear from other users! You can connect with me on LinkedIn!