Setting up your personal Deep Learning station with Ubuntu 16.04, CUDA 9.0, cuDNN 7.1.2 and tensorflow-gpu-1.7

Fabio M. Graetz
Apr 24, 2018 · 4 min read

I am a PhD candidate in theoretical astrophysics based in Berlin, Germany. I investigate the interaction between small embedded gravitational perturbers, such as moons or protoplanets, and thin, cold cosmic disks like planetary rings or protoplanetary disks. My other passion is artificial intelligence and machine learning. Recently I built a computer dedicated to Deep Learning so that I can train deeper models than my MacBook allows.

There are many very useful articles regarding hardware recommendations. I found the following particularly useful: Picking a gpu for deep learning and A Full Hardware Guide to Deep Learning.

I decided to buy a GTX 1080 Ti from EVGA. The Titan Xp is slightly faster but also significantly more expensive.

Given current GPU and RAM prices, I considered buying a pre-built or used gaming PC. However, many current gaming CPUs often found in these systems, such as the i7-8700K, offer only 16 PCI Express lanes! This is a significant constraint, because it leaves room for only a single GPU running on a full 16 PCIe lanes. It was important to me to be able to install a second GPU later on. I therefore chose an i7-6850K, which offers 40 PCIe lanes and supports two GPUs, each connected to 16 lanes. I also chose an ASRock X99 Taichi motherboard, 32 GB of RAM and an M.2 SSD from Samsung.
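The lane arithmetic behind this choice can be sketched in a few lines of Python. This is a simplified model: it assumes every GPU should get a full x16 link directly from the CPU and ignores lanes taken by M.2 drives or other devices. The function name is made up for illustration.

```python
def max_x16_gpus(cpu_pcie_lanes, lanes_per_gpu=16):
    """Number of GPUs that can each get a full-width link (simplified model)."""
    if cpu_pcie_lanes < 0 or lanes_per_gpu <= 0:
        raise ValueError("lane counts must be non-negative and the link width positive")
    return cpu_pcie_lanes // lanes_per_gpu


if __name__ == "__main__":
    # Lane counts as quoted in the text above.
    for name, lanes in [("i7-8700K", 16), ("i7-6850K", 40)]:
        print("{}: {} lanes -> {} GPU(s) at x16".format(name, lanes, max_x16_gpus(lanes)))
```

With 40 lanes, two GPUs at x16 use 32 lanes and leave 8 for other devices in this model.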

After a few hours of trying to get tensorflow-gpu, theano, pytorch etc. running, I figured out a pretty straightforward way to do it, starting from a freshly installed Ubuntu 16.04. My procedure is a combination of the following tutorials (1, 2) and the tensorflow documentation. Thanks a lot to the authors of those articles, you helped a lot. Unlike those two articles, the procedure presented here requires neither building tensorflow from source nor using “not officially supported” drivers.


Step 1:

I recommend starting with a fresh installation of Ubuntu 16.04. First, make sure that everything is up to date:

sudo apt-get update

sudo apt-get upgrade

Install git since we will use it later on:

sudo apt-get install git

Verify that your compatible GPU is found:

lspci | grep -i nvidia

If nothing is listed, try running update-pciids first (it typically needs sudo) and then repeat the command.
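If you would rather do this check from a script, the same filter can be sketched in Python. The helper names are my own, and the sketch only reproduces what grep -i does on the lspci output; it does not tell you whether the GPU is actually CUDA-capable.

```python
import shutil
import subprocess


def find_nvidia_devices(lspci_output):
    """Return the lspci lines that mention NVIDIA, ignoring case (like grep -i)."""
    return [line for line in lspci_output.splitlines() if "nvidia" in line.lower()]


def run_lspci():
    """Return lspci's output, or an empty string if lspci is not installed."""
    if shutil.which("lspci") is None:
        return ""
    result = subprocess.run(["lspci"], stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout


if __name__ == "__main__":
    devices = find_nvidia_devices(run_lspci())
    print("\n".join(devices) if devices else "No NVIDIA device listed.")
```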

Step 2: Installing CUDA

Download the repository package for the NVIDIA CUDA toolkit. I chose version 9.0, since 9.1 currently requires building tensorflow from source…

wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.0.176-1_amd64.deb

… and proceed to install it:

sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub

sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda-9.0

Reboot the system to load the drivers. Then add CUDA to the path by appending the following two lines to the end of the file ~/.bashrc:

export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
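The ${VAR:+:${VAR}} pattern in these exports appends a colon and the old value only if the variable is already set and non-empty, which avoids a stray trailing colon. A small Python sketch of the same logic, with a function name I made up for illustration:

```python
def prepend_dir(new_dir, current=""):
    """Mimic NEW${VAR:+:${VAR}}: add ':' + VAR only when VAR is set and non-empty."""
    return new_dir + (":" + current if current else "")


if __name__ == "__main__":
    # PATH usually has a value already, so the old entries are kept after a colon.
    print(prepend_dir("/usr/local/cuda-9.0/bin", "/usr/bin:/bin"))
    # LD_LIBRARY_PATH is often unset, so no colon is added.
    print(prepend_dir("/usr/local/cuda-9.0/lib64", ""))
```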

And type the following into your terminal:

source ~/.bashrc
sudo ldconfig
nvidia-smi

Your output should include the version of the driver that is used (currently 390):

NVIDIA-SMI 390.30 Driver Version: 390.30
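If you want to check the driver version in a script, a regular expression over the nvidia-smi header is enough. This is a minimal sketch that assumes the header keeps the "Driver Version: x.y" wording shown above; the function name is hypothetical.

```python
import re


def parse_driver_version(smi_output):
    """Extract the driver version string from nvidia-smi's header, or return None."""
    match = re.search(r"Driver Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    header = "NVIDIA-SMI 390.30 Driver Version: 390.30"
    version = parse_driver_version(header)
    print("driver:", version, "major:", int(version.split(".")[0]))
```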

Step 3: Installing cuDNN

Create a free account on https://developer.nvidia.com/cudnn and select: Download cuDNN v7.1.2 (Mar 21, 2018), for CUDA 9.0

Download these three files:

cuDNN v7.1.2 Runtime Library for Ubuntu16.04 (Deb)

cuDNN v7.1.2 Developer Library for Ubuntu16.04 (Deb)

cuDNN v7.1.2 Code Samples and User Guide for Ubuntu16.04 (Deb)

Next, cd into the Download folder and type the following to install the packages:

sudo dpkg -i libcudnn7_7.1.2.21-1+cuda9.0_amd64.deb

sudo dpkg -i libcudnn7-dev_7.1.2.21-1+cuda9.0_amd64.deb

sudo dpkg -i libcudnn7-doc_7.1.2.21-1+cuda9.0_amd64.deb

Let us verify the installation of cuDNN

cp -r /usr/src/cudnn_samples_v7/ $HOME
cd ~/cudnn_samples_v7/mnistCUDNN/
make clean && make
./mnistCUDNN

If it says Test passed! you have CUDA and cuDNN successfully installed on your machine.
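As an additional sanity check, you can read the version numbers out of the cuDNN header. The sketch below assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL and that it can be found at /usr/include/cudnn.h; the path is an assumption and may differ on your system, and the function name is my own.

```python
import os
import re


def parse_cudnn_version(header_text):
    """Read the CUDNN_MAJOR/MINOR/PATCHLEVEL #defines; return a tuple or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    header_path = "/usr/include/cudnn.h"  # assumed location; adjust for your install
    if os.path.exists(header_path):
        with open(header_path) as handle:
            print(parse_cudnn_version(handle.read()))
    else:
        print("Header not found at", header_path)
```

For the packages installed above, I would expect this to report version 7.1.2.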

Step 4: Download conda and create an environment for Deep Learning

Next I will show you how to install Python and the gpu version of tensorflow.

First, download and install anaconda:

wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh

bash Anaconda3-5.0.1-Linux-x86_64.sh

You have to agree to the license agreement and confirm the location of the installation. Next, upgrade conda:

conda upgrade -y --all

Important note: the flag is -y, followed by a space and then two hyphens directly in front of “all”.

Since I intend to use this machine for my PhD work in theoretical astrophysics as well, I will create a separate environment for Deep Learning in order to keep things apart.

Before creating the environment, I install nb_conda_kernels so that I can later choose the kernel I need in the Jupyter notebook.

conda install nb_conda_kernels

Create the environment with the following command and proceed to activate it:

conda create -n deeplearning pip python=3.6 ipykernel

source activate deeplearning
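To confirm from inside Python that the environment is really active, you can look at the CONDA_DEFAULT_ENV variable that conda sets on activation. A minimal sketch; the helper name is hypothetical, and it takes the environment mapping as a parameter only to make it easy to try out.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if there is none."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    print("Active conda environment:", active_conda_env())
```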

In the activated environment use pip to install the gpu version of tensorflow:

pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.7.0-cp36-cp36m-linux_x86_64.whl
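The wheel file name explains why the environment was created with python=3.6: the cp36 tag means the wheel was built for CPython 3.6. The sketch below splits the file name according to the usual name-version-python-abi-platform wheel convention. It only handles simple CPython tags, and both function names are made up for illustration.

```python
import sys


def wheel_tags(wheel_url):
    """Split a wheel file name into name, version, python, abi and platform tags."""
    filename = wheel_url.rsplit("/", 1)[-1]
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel file name: " + filename)
    # An optional build tag may sit after the version, so count tags from the end.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_interpreter(python_tag, version_info=sys.version_info):
    """True if a tag such as 'cp36' matches the given interpreter version."""
    return python_tag == "cp{}{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    url = ("https://storage.googleapis.com/tensorflow/linux/gpu/"
           "tensorflow_gpu-1.7.0-cp36-cp36m-linux_x86_64.whl")
    tags = wheel_tags(url)
    print(tags, "matches this interpreter:", matches_interpreter(tags["python"]))
```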

Let us check whether the installation was successful by training a fully connected network on the MNIST data set:

git clone https://github.com/tensorflow/tensorflow.git

python tensorflow/tensorflow/examples/tutorials/mnist/fully_connected_feed.py

The output should include a similar line…

Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10124 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)

…and the loss should be decreasing:

Step 0: loss = 2.30 (0.328 sec)
Step 100: loss = 2.16 (0.001 sec)
Step 200: loss = 1.94 (0.001 sec)
Step 300: loss = 1.58 (0.002 sec)
Step 400: loss = 1.32 (0.002 sec)
Step 500: loss = 1.01 (0.001 sec)
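If you save the training output to a file, the "loss should be decreasing" check can be automated. The sketch below assumes the log lines have the "Step N: loss = X" format shown above. Note that it tests for a strictly decreasing sequence, which is stricter than necessary, since real training runs can fluctuate a little. The function names are my own.

```python
import re


def parse_losses(log_text):
    """Collect the loss values from lines like 'Step 100: loss = 2.16 (0.001 sec)'."""
    pattern = r"^Step \d+: loss = ([0-9]+(?:\.[0-9]+)?)"
    return [float(value) for value in re.findall(pattern, log_text, re.MULTILINE)]


def is_decreasing(values):
    """True if every value is smaller than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    log = (
        "Step 0: loss = 2.30 (0.328 sec)\n"
        "Step 100: loss = 2.16 (0.001 sec)\n"
        "Step 200: loss = 1.94 (0.001 sec)\n"
    )
    losses = parse_losses(log)
    print(losses, "decreasing:", is_decreasing(losses))
```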

With tensorflow installed and the GPU working, we still have to install scikit-learn, pytorch, keras, opencv and theano:

conda install scikit-learn

conda install pytorch torchvision cuda80 -c soumith

pip install keras

pip install opencv-contrib-python

conda install theano

When trying to import theano in Python, I got an error that I could solve by running conda install mkl-service.
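To see at a glance which of these packages Python can find in the active environment, here is a small sketch using only the standard library. Note that the import names differ from some package names (scikit-learn is imported as sklearn, opencv as cv2, pytorch as torch). It only checks that a module can be located, so a module that is found but fails on import, like theano in my case, will still show up as available. The function name is hypothetical.

```python
import importlib.util


def check_imports(module_names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


if __name__ == "__main__":
    modules = ["tensorflow", "sklearn", "torch", "keras", "cv2", "theano"]
    for name, found in check_imports(modules).items():
        print("{:<12} {}".format(name, "found" if found else "MISSING"))
```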


I hope this helps you set up your Deep Learning machine. If you need help at a certain point, leave me a comment :)

Have fun!

Fabio M. Graetz

Written by

Theoretical Astrophysicist | Machine Learning Engineer at Merantix | Bespoke Shoemaking | Berlin
