Deep Learning Software Installation Guide

How to install Python and Nvidia drivers, libraries and packages on a bare metal Ubuntu machine.

Due to my upcoming dissertation on Reinforcement Learning, I recently built an Ubuntu and Nvidia based DL computer. Although there were plenty of great guides (#thanks all), there were no comprehensive installation instructions. In addition, I had to read a lot of documentation to understand details and specifics, some of which was incomplete or contained syntax errors. I thus decided to document what I ended up doing by consolidating instructions from various sources (references provided).

This guide explains how to install:

  • Operating System (Ubuntu)
  • 4 drivers and libraries (GPU driver, CUDA, cuDNN and pip)
  • 5 Python DL libraries (TensorFlow, Theano, CNTK, Keras and PyTorch)

Each package relies on dependencies shown by the following diagram. Only one Python DL Library needs to be installed so feel free to omit sections as you choose.

Installation and Dependency Stack for Deep Learning Software

In more detail, here is what each of the components does:

  1. Ubuntu (v16.04.3): Operating system, schedules processes.
  2. Nvidia GPU driver (v375): Enables the OS to operate the GPU.
  3. CUDA (v8.0): GPU C library. Stands for Compute Unified Device Architecture.
  4. cuDNN (v6.0.21): DL primitives library based on CUDA. Stands for CUDA Deep Neural Network.
  5. pip (v9.0.1): Python package installer. Stands for Pip Installs Packages.
  6. TensorFlow (v1.3): DL library, developed by Google.
  7. Theano (v0.9.0): Mathematical library running on GPUs.
  8. CNTK (v2.2): DL framework developed by Microsoft Research.
  9. Keras (v2.0.8): DL wrapper with interchangeable backends. Can be used with TensorFlow, Theano or CNTK.
  10. PyTorch (v0.2.0): Dynamic DL library with GPU acceleration.
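
Since only one DL library is needed, it can help to work out which sections a given library actually requires. The following is a minimal Python sketch of that idea. The dependency edges in DEPENDS_ON are my assumption, read off the order of the list above rather than taken from the original diagram, and install_order is a hypothetical helper.

```python
from __future__ import print_function

# ASSUMPTION: these edges are inferred from the numbered list above,
# not copied from the original dependency diagram.
DEPENDS_ON = {
    "Ubuntu": [],
    "Nvidia GPU driver": ["Ubuntu"],
    "CUDA": ["Nvidia GPU driver"],
    "cuDNN": ["CUDA"],
    "pip": ["Ubuntu"],
    "TensorFlow": ["cuDNN", "pip"],
    "Theano": ["cuDNN", "pip"],
    "CNTK": ["cuDNN", "pip"],
    "Keras": ["TensorFlow"],  # Theano or CNTK would work as a backend too
    "PyTorch": ["cuDNN", "pip"],
}


def install_order(target, graph=DEPENDS_ON):
    """Return the components needed for `target`, dependencies first."""
    order = []

    def visit(name):
        if name in order:
            return
        for dependency in graph[name]:
            visit(dependency)
        order.append(name)

    visit(target)
    return order


print(install_order("PyTorch"))
```

Under these assumed edges, a PyTorch-only setup would skip the TensorFlow, Theano, CNTK and Keras sections.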

1. Install Ubuntu 16.04.3

Following this section results in a clean install, overwriting pre-existing partitions or OSs.

Version 16.04.3 was installed from a bootable USB because it was the latest LTS (long-term support) release at the time. When powering on the computer for the first time, boot from the USB by accessing the boot menu and selecting the USB.

My build featured two disks, a 1 TB SATA hard disk and a 256 GB SSD. Ubuntu was installed on the 1 TB hard disk, such that the SSD would be free for datasets, speeding up training. During installation, within the screen Installation Type, I selected Something else, allowing me to create the following three partitions.

Boot Partition (128 GB): Contains system files, program settings and documents.

Swap Partition (2x RAM size): For me, this was 128 GB. The memory is used to extend Kernel RAM as virtual memory.

User Partition (rest): The free space in my 1 TB hard drive was 744 GB.
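
The partition sizes above follow from simple arithmetic, sketched here in Python. plan_partitions is a hypothetical helper, the 64 GB of RAM is inferred from the 128 GB swap size, and 1 TB is taken as 1000 GB.

```python
from __future__ import print_function


def plan_partitions(disk_gb, ram_gb, boot_gb=128):
    """Split a disk into boot, swap (2x RAM) and user partitions, in GB."""
    swap_gb = 2 * ram_gb
    user_gb = disk_gb - boot_gb - swap_gb
    if user_gb <= 0:
        raise ValueError("disk is too small for this layout")
    return {"boot": boot_gb, "swap": swap_gb, "user": user_gb}


# ASSUMPTION: 64 GB of RAM, inferred from the 128 GB swap partition above.
print(plan_partitions(disk_gb=1000, ram_gb=64))
```

With these inputs the user partition comes out at 744 GB, matching the figure above.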

After installing, it is good practice to run the following commands, which refresh the package lists and upgrade the installed packages.

sudo apt-get update
sudo apt-get upgrade

2. Install Nvidia GPU driver

After installing Ubuntu, you may notice that the screen has an incorrect resolution, which you cannot change. This is because the GPU has no driver installed yet, so its video output is not configured.

There are two ways of installing the driver: from an Ubuntu repository or from source. The first method is easier but requires frequent reinstallation: when sudo apt-get update and sudo apt-get upgrade update the kernel, the Nvidia driver is not updated with it, and the GUI fails to load properly. Installing from source avoids this issue.

Installing v375 from a package (easier)

The following command adds the graphics drivers repository and, in its output, lists the driver versions compatible with your system. Two numbers are given: the latest and the long-lived release version number. The versions are listed at the start of the output, so be sure to scroll up.

sudo add-apt-repository ppa:graphics-drivers/ppa

Update the package list and install the driver. Within the second command, change <driver_number> to the version you want to install. Installing the latest long-lived release, 375, is recommended.

sudo apt-get update
sudo apt-get install nvidia-<driver_number>

Restart the computer to reconfigure the video output at startup.

sudo shutdown -r now

To test whether the driver is working, open Screen Display (press SUPERKEY, then type screen display). It should now recognize the monitor you are using, enabling you to change its configuration, resolution and orientation.

Installing v384.90 from Nvidia source (harder)

Download the latest driver version from the Nvidia website. For me, the options I selected were:

GeForce -> GeForce 10 Series -> GeForce GTX 1080 -> Linux 64 bit -> English (UK)

The following prerequisites are optional. They enable compilation for a 32-bit architecture and install the development files for the GUI (X.Org).

sudo apt-get install gcc-multilib xorg-dev

Press CTRL + ALT + F1 and log in. This switches from the GUI to a terminal. In order to rebuild the video output, the display manager must first be halted.

sudo service lightdm stop

If the command does not work, note that newer versions of Ubuntu manage services with systemctl instead of service, and may use a display manager other than lightdm. Then make the runfile executable and run it.

cd <download location>
chmod +x NVIDIA-Linux-x86_64-384.90.run
sudo ./NVIDIA-Linux-x86_64-384.90.run --dkms

When running, you may get a pre-install script failed message. This does not matter because the pre-install script contains one command: exit 1. Its purpose is merely to ensure that you really want to install the driver.

The option --dkms (which should be on by default) prevents a reinstallation of the driver when the kernel updates itself, by registering the driver as a DKMS (Dynamic Kernel Module Support) module. During a kernel update, DKMS triggers a rebuild of the driver module against the new kernel.

If the installation fails, a likely cause is that Secure Boot is still enabled in your computer’s BIOS. Restart the computer and disable Secure Boot from the BIOS option during startup.

If the installation succeeds, you can now restart the GUI.

sudo service lightdm start

To uninstall: sudo ./NVIDIA-Linux-x86_64-384.90.run --uninstall

Verification

Ensure that the following command recognises the correct GPU model.

nvidia-smi

Ensure the driver version number is the one you installed

cat /proc/driver/nvidia/version
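
The same check can be scripted. The following is a rough Python sketch that extracts the version number from the text of /proc/driver/nvidia/version. The layout of SAMPLE_LINE is an assumption about what the file looks like on a 384.90 install, and driver_version and installed_driver_version are hypothetical helper names.

```python
from __future__ import print_function
import re

# ASSUMPTION: illustrative first line of /proc/driver/nvidia/version.
SAMPLE_LINE = ("NVRM version: NVIDIA UNIX x86_64 Kernel Module  384.90  "
               "Tue Sep 19 19:17:35 PDT 2017")


def driver_version(text):
    """Return the dotted driver version found in `text`, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def installed_driver_version(path="/proc/driver/nvidia/version"):
    """Read the version file; return None if no driver is loaded."""
    try:
        with open(path) as handle:
            return driver_version(handle.read())
    except IOError:
        return None


print(driver_version(SAMPLE_LINE))
```

On a machine with the driver loaded, installed_driver_version() should return the version you installed. It returns None when the file is missing.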

3. Install CUDA 8.0

From the Nvidia website, download the .deb package for CUDA using the following system properties:

Linux -> x86_64 -> Ubuntu -> 16.04 -> .deb(local)

After navigating to the location of the .deb file, unpack it, update the package list and install CUDA using the following commands.

sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda

To prevent apt from marking these as dependencies which are later autoremoved, execute

sudo apt-mark manual cuda-\*

Add the CUDA binaries and libraries to the shell's search paths, such that they can be found by other applications.

echo 'export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc
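
The ${PATH:+:${PATH}} expansion in the export lines above appends a colon and the old value only when the variable is already set and non-empty, which avoids leaving a stray trailing colon. A minimal Python sketch of the same rule follows. prepend_to_path is a hypothetical helper written for illustration; it is not part of bash or CUDA.

```python
from __future__ import print_function


def prepend_to_path(directory, existing):
    """Mimic the shell expression DIR${VAR:+:${VAR}}.

    `existing` is the current value of the variable, or None/"" if unset.
    """
    if existing:
        return directory + ":" + existing
    return directory


print(prepend_to_path("/usr/local/cuda-8.0/bin", "/usr/bin:/bin"))
print(prepend_to_path("/usr/local/cuda-8.0/lib64", ""))
```

The second call shows the case that matters for LD_LIBRARY_PATH, which is often unset on a fresh install.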

To verify, ensure the version of the Nvidia CUDA Compiler (nvcc) matches that of CUDA using nvcc -V.

Restart the computer with sudo shutdown -r now to complete the installation.
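
The nvcc check can also be scripted. The sketch below, in Python, pulls the release number out of the nvcc -V output. SAMPLE_LINE is an assumption about the relevant output line for CUDA 8.0.61, and nvcc_release and installed_nvcc_release are hypothetical helper names.

```python
from __future__ import print_function
import re
import subprocess

# ASSUMPTION: illustrative last line of the `nvcc -V` output for CUDA 8.0.61.
SAMPLE_LINE = "Cuda compilation tools, release 8.0, V8.0.61"


def nvcc_release(text):
    """Return the release number (e.g. '8.0') found in `text`, or None."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_nvcc_release():
    """Run `nvcc -V`; return None if nvcc is missing or fails."""
    try:
        output = subprocess.check_output(["nvcc", "-V"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return nvcc_release(output.decode("utf-8", "replace"))


print(nvcc_release(SAMPLE_LINE))
```

If installed_nvcc_release() returns None after the steps above, the most likely cause is that the PATH change has not been picked up by the current shell.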

Optional: Test CUDA Installation

One way to test the installation is to run some examples. The following commands create a directory called test_CUDA and copy the example programs into it. The script cuda-install-samples-8.0.sh ships in the CUDA bin directory, so it is found through the PATH set above.

mkdir test_CUDA
cd test_CUDA
cuda-install-samples-8.0.sh .

Within the subdirectory NVIDIA_CUDA-8.0_Samples/3_Imaging/cudaDecodeGL is the file findgllib.mk. This file contains a hardcoded value of the Nvidia driver version on line 61, col 30, which should be changed from 367 to the driver version number you previously installed.
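
If you prefer not to edit the file by hand, the substitution can be sketched in Python. This assumes the hardcoded value appears in the form nvidia-367, which I have not verified against every copy of findgllib.mk. SAMPLE_LINE and set_driver_version are hypothetical and for illustration only.

```python
from __future__ import print_function
import re

# ASSUMPTION: illustrative line; check the real findgllib.mk before relying on it.
SAMPLE_LINE = 'UBUNTU_PKG_NAME = "nvidia-367"'


def set_driver_version(makefile_text, old="367", new="384"):
    """Replace nvidia-<old> with nvidia-<new> wherever it appears."""
    pattern = r"nvidia-" + re.escape(old) + r"\b"
    return re.sub(pattern, "nvidia-" + new, makefile_text)


print(set_driver_version(SAMPLE_LINE))
```

To apply it, read the file contents, pass them through set_driver_version with your own driver number, and write the result back.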

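Compile the examples. The command below assumes you are still in the cudaDecodeGL directory, so that ../.. is NVIDIA_CUDA-8.0_Samples.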

cd ../.. && make

You can now run examples within NVIDIA_CUDA-8.0_Samples to your heart’s content. Two particularly useful programs are found within NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release: ./deviceQuery prints out the GPU in use, and ./bandwidthTest its bandwidth.


4. Install cuDNN 6.0.21

From the Nvidia website, sign up to the developer program and agree to the terms. From the dropdown menu cuDNN v6.0.21 (April 27, 2017), for CUDA 8.0, download:

  1. cuDNN v6.0 Runtime Library for Ubuntu16.04 (Deb)
  2. cuDNN v6.0 Developer Library for Ubuntu16.04 (Deb)
  3. cuDNN v6.0 Code Samples and User Guide for Ubuntu16.04 (Deb)

.deb is preferred over .tar due to the format being more specific to Ubuntu, resulting in a cleaner install. The three packages are installed using the following commands:

sudo dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
sudo dpkg -i libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb
sudo dpkg -i libcudnn6-doc_6.0.21-1+cuda8.0_amd64.deb

Testing cuDNN

Copy the installed samples to a writable directory, then compile and run mnistCUDNN.

cp -r /usr/src/cudnn_samples_v6/ $HOME
cd $HOME/cudnn_samples_v6/mnistCUDNN
make clean && make
./mnistCUDNN

If all is well, the script should return Test passed!
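
As an additional check, the installed cuDNN version can be read from the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines in the cudnn.h header. The Python sketch below parses them from header text. The header location on your machine is an assumption (for example /usr/include/cudnn.h), and cudnn_version is a hypothetical helper.

```python
from __future__ import print_function
import re

# Illustrative excerpt of the version defines found in cudnn.h.
SAMPLE_HEADER = """
#define CUDNN_MAJOR      6
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 21
"""


def cudnn_version(header_text):
    """Return 'major.minor.patch' from cudnn.h text, or None if incomplete."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + name + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


print(cudnn_version(SAMPLE_HEADER))
```

Passing the contents of your own cudnn.h to cudnn_version should give 6.0.21 for the packages installed above.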

Uninstalling cuDNN

The following commands uninstall the three libraries. If you have copied the samples, also remove them with rm -r ~/cudnn_samples_v6.

sudo apt-get remove libcudnn6
sudo apt-get remove libcudnn6-dev
sudo apt-get remove libcudnn6-doc

5. Install pip 9.0.1

Pip is updated very frequently, around once every fortnight. An up-to-date pip is recommended.

The following commands install and upgrade pip to the latest version.

sudo apt-get install python-pip python-dev
sudo pip install --upgrade pip

To verify, ensure pip -V prints out the version number.


6. Install TensorFlow 1.3.0

pip install tensorflow-gpu==1.3.0

To validate, start Python by running python and ensure the following script prints Hello, TensorFlow!

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

7. Install Theano 0.10

Theano requires the following system dependencies

sudo apt-get install libopenblas-dev cmake git

and the following Python dependencies.

sudo pip install numpy scipy nose sphinx pydot-ng pycuda scikit-cuda cython

libgpuarray, which must be compiled from source, enables Theano to use the GPU. First download the source code.

git clone https://github.com/Theano/libgpuarray.git
cd libgpuarray

Compile into a folder called Build.

mkdir Build
cd Build
cmake .. -DCMAKE_BUILD_TYPE=Release
make
sudo make install

Then build and install the Python package (pygpu).

cd ..
python setup.py build
sudo python setup.py install

Add the following line to ~/.bashrc such that Python can find the library.

export LD_LIBRARY_PATH=/usr/local/lib/:$LD_LIBRARY_PATH

Finally, install Theano

sudo pip install git+https://github.com/Theano/Theano.git#egg=Theano

To verify, first create a test file test_theano.py with its contents copied from here. Then ensure THEANO_FLAGS=device=cuda0 python test_theano.py succeeds with Used the gpu.


8. Install CNTK 2.2

sudo pip install https://cntk.ai/PythonWheel/GPU/cntk-2.2-cp27-cp27mu-linux_x86_64.whl

Verify that python -c "import cntk; print(cntk.__version__)" prints out 2.2.
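
The wheel URL above is tied to a specific interpreter: cp27 means CPython 2.7 and cp27mu is its wide-unicode ABI, so a different Python version would need a different wheel. The Python sketch below splits a wheel filename into these tags, following the standard wheel naming convention. parse_wheel_filename is a hypothetical helper, not a pip API.

```python
from __future__ import print_function


def parse_wheel_filename(filename):
    """Split name-version[-build]-python-abi-platform.whl into its tags."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(suffix)].split("-")
    if len(parts) not in (5, 6):  # six parts when a build tag is present
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


print(parse_wheel_filename("cntk-2.2-cp27-cp27mu-linux_x86_64.whl"))
```

Comparing the python tag with the output of python --version is a quick way to spot a mismatched wheel before pip rejects it.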


9. Install Keras 2.0.8

sudo pip install keras

To verify, check that import keras succeeds from a Python session.


10. Install PyTorch 0.2.0

PyTorch runs on two libraries, torch and torchvision, which can be installed as follows.

sudo pip install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp27-cp27mu-manylinux1_x86_64.whl 
sudo pip install torchvision

To validate, the following script should print out a 5x3 tensor of floats. The values are uninitialized, so they are arbitrary rather than meaningful.

from __future__ import print_function
import torch
x = torch.Tensor(5, 3)
print(x)
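
The per-library checks in sections 6 to 10 can be collected into one script. The following Python sketch reports the version of each library it manages to import. It uses only the standard library, and installed_version and report are hypothetical helper names.

```python
from __future__ import print_function
import importlib


def installed_version(module_name):
    """Return the module's __version__, 'unknown' if absent, None if import fails."""
    try:
        module = importlib.import_module(module_name)
    except Exception:  # a broken GPU setup may raise more than ImportError
        return None
    return str(getattr(module, "__version__", "unknown"))


def report(module_names):
    """Print one line per module with its version or 'not installed'."""
    for name in module_names:
        print("{:<12}{}".format(name, installed_version(name) or "not installed"))
```

Calling report(["tensorflow", "theano", "cntk", "keras", "torch"]) after finishing the guide should list a version for every library you chose to install.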

Conclusion

By far the hardest part of the whole process was figuring out the dependencies between Nvidia Drivers and DL packages, and what the most effective long-term installation process was. The easiest part was installing Python packages, which were well-maintained and documented.

Although reading documentation and going through source code was time-consuming, it was very instructive in understanding how each package was built and functioned, and helped me gain an understanding of the whole Ubuntu ecosystem.

Finally, congratulations if you’ve made it this far, and thank you for reading!