Alex Ragalie
Nov 2, 2017

Build your own top-spec remote-access Machine Learning rig: a very detailed assembly and installation guide for a dual boot Ubuntu 16.04/Win 10 with CUDA 8 run on i7 6850K with 2x GTX 1080Ti GPUs

I will start by saying that there are already several very good (and recent) guides on how to install a Machine Learning rig from scratch, and I have used them heavily to set up my own. See this one, this one, this one and this one. My sincere thanks go to their authors for taking the effort to share their experience, as well as to everyone else in the space who is helpfully sharing their knowledge and working code.

The main reason I’m publishing this guide is that my specific configuration had some “quirks” during the install, which took many days to figure out and fix, so hopefully I’ll save you some time in case you go with something similar. Based on the extensive research I’ve done before building it, this seems to be among the best value-for-money rigs that you can buy at the top end of the price range.

Here is the config:

  • CPU: Intel Core i7 6850K (LGA 2011-v3, 3.60GHz, Unlocked)
  • Motherboard: Asus X99 Deluxe II
  • RAM: Corsair DDR4 Vengeance LPX Black 32GB 2-Kit 2x 16GB 3200M
  • GPU: 2x ASUS GeForce GTX 1080 Ti STRIX
  • SSD: Samsung 960 EVO 1TB
  • PSU: Corsair HX1000
  • CPU Cooler: Cooler Master Hyper 212 Evo (will be later on replaced by a full CPU+GPUs watercooled setup from EK)
  • Case: Phanteks Enthoo Evolv ATX
  • Monitor: Acer Predator XB271HU (27", 2560x1440) connected to the main (first PCIE slot) GPU via one of the DP slots
  • Keyboard and mouse: normal USB ones

In the interest of flexibility and the occasional gaming session (this is also a top-notch gaming rig after all!) I chose to go with a dual boot Ubuntu 16.04 and Windows 10 setup. I initially wanted to go with Win8 instead of 10, but installing the drivers on the boot USB for the Samsung NVMe SSD was an absolute nightmare (and it didn’t work at all in the end).

So here are the specific steps I took. I’m also sharing the links that were most helpful for me in getting things running. In case you get stuck or something doesn’t work, your best bet is to Google it, then Google it some more using slightly different keywords, and then Google it some more. I’m confident that you’ll get to a workable solution in the end.

1. Pre-Assembly preparation

Bootable USB sticks

You will need to create bootable USB sticks (if you have only one, I suggest you first create it to boot-install Windows 10 and then Ubuntu, and leave Ubuntu on it as you might need it later on); in my case I had one for Win and one for Ubuntu. Here is a very good guide on how to go about it; it says Windows7 but the creation of a bootable USB stick is identical for both Win7 and Win10.

Antistatic bracelet

One key thing to do before you start unpacking and assembling the hardware is to make sure that you’re using a method to avoid static electricity buildup. Make sure that you’re not doing the process on a carpet, and ideally buy yourself a simple and cheap antistatic bracelet. This is a key precaution, as otherwise you can easily render your components useless.

Obviously I didn’t prepare for this beforehand, so after spending 1h searching for different DIY solutions, I ended up taking a 2m unused 3mm headphone cable I had lying around, clipping one end to the grill of the PSU (which had to be plugged into the mains outlet, and I also made sure it was turned off!!!) and keeping the other end always touching my left wrist, held in place by 2 simple rubber bands used as bracelets.

You should also ensure that you have a good multi-head screwdriver, ideally with magnetic inserts so that your small screws are kept in place when positioning/taking them off.

An undisturbed working environment with at least 1h timeslots of continuous focus

Especially if you have a large family with kids running around, it’s important to mention that you need to allocate proper time, especially for the assembly stages. Some of the key steps will require ~1h of your undivided attention, after which you can take a break and do something else.

To set your expectations, the whole assembly process took me about 5h (I’ve not done this before), followed by about 3h to install the operating systems, followed by about 1 week of tinkering until all the software libraries were in place and working.

Prepare your working space

The whole operation will require a large table, on which you can leave things lying around for a few days undisturbed. You’ll also have multiple boxes that need to be stored until you get everything fixed.

I also recommend you do not throw away your packing boxes for a couple of weeks. This will ensure that, in case a component dies on you or is DOA (dead on arrival), it will be relatively straightforward to send it back.

2. Hardware Assembly

I suggest you follow the steps in the below order and first assemble stuff on the motherboard outside of the case, and then mount it inside the case with the 9 screws once everything is fixed; only after that put in the GPUs.

Note: I used the white cardboard protection the motherboard comes surrounded in (inside of the packaging box) as a working surface to lay the mobo on; this provided a semi-hard but also flexible surface on which I could push at will (and you’ll need to do some pushing, especially for screwing in the CPU cooler and for the RAM installation) without worrying that the back of the circuit board will be damaged.

CPU installation in the motherboard

A straightforward and simple process, with only one thing to watch out for: the golden arrow on the CPU needs to match the one engraved on the motherboard. One thing to also notice is that the 2 levers you need to open and then close require a degree of force as well, especially when closing them.

Follow these instructions and you should be ok.

CPU cooler installation

Important note here: I purposefully chose a cheap and easy to install/uninstall cooler, as I plan later on to have a full-cycle water-cooled system in place. I wanted nonetheless to have a working solution in place, in order to test all the components and ensure that both my processor and my GPUs were fine.

One other thing to note here: the level of noise from this machine is significant! When the GPU fans are running full-throttle it sounds like a small jet taking off, so you need to plan for this aspect. This is another reason why I’ll be switching to water-cooling, as the noise level + GPU performance (via overclocking) can be noticeably improved.

In terms of installing the cooler, it was slightly tricky, but nothing to worry about. My metal brackets came set for an AMD processor, so I just had to move them in the right position before they would fit on the slots for the Intel LGA 2011. In terms of steps, I recommend that you first do a dry-run and ensure that all the screw slots are matching and only then apply the thermal paste and start screwing it in.

And remember to take the sticker off from the bottom of the mounting slab!

Here is the guide which I found most helpful

And a few other guides:

Cooler Master Hyper 212 EVO fan installation on INTEL 2011 plus CPU and RAM install — Part 5

Cooler Master Hyper 212 EVO — Install and Unbox | LGA 2011!

SSD installation

I’m amazed at the size of SSDs these days! This small thing needs to go vertically into the motherboard, so here’s a very helpful and fully accurate guide for how to do it:

Vertical M.2 SSD installation (Crazy CPU Machine 3/10)

RAM installation

This is very straightforward as well, with 2 main mentions: first make sure that the unlocking tab on the right side of the RAM slot is unlocked (cocked to the right side) for each of the RAM slots before you try installing. Second, you will need to push relatively hard to slot the RAM in (first on the left side, and then on the right side to click into place). Push too hard and you break things though, so be mindful.

Here’s a guide : Ram installation on a X99, quad channel, motherboard (Crazy CPU Machine 4/10)

Case unpacking and setup

I opened up the case, took off the front and the back plates, as well as the top one, to make sure that I had unhindered access to all sides. The whole thing will get quite heavy as you install things in, so my reco is to set things up in such a way that you always have 360-degree access around your case, avoiding the need to move it around for the different cable setups.

PSU installation and power cables preparation

This step is a bit challenging, especially if you’ve never done this before. It took me about 2h of doing/redoing to ensure that I had everything set-up properly, and it was still a heart wrenching moment when I powered it on :)

I suggest you do several dry-runs of setting up the cables and bringing them on the front side to see that everything fits nicely, and also that you watch the video multiple times to understand what needs to go where…

Here is a tutorial: PSU installation & Show off video :) (Crazy CPU Machine 10/10)

And this one is helpful as well: Connectors & cabling (Crazy CPU Machine 8/10)

Some things which I would mention :

  • Each of the GPUs takes 2 separate PCIe power cables, one for each of its power connectors. This means that you must have 4 separate cables routed to the place where the 2 GPUs will be sitting.
  • I also decided to add the second power cable to the board for CPU overclocking. You must ensure beforehand that you’re selecting the proper splitting cable (as explained in the tutorial) for the job.
  • Linked to the above point: there’s a slight but noticeable difference (once you know where to look) between the “normal” PCIe cable and the CPU one, not only in the way it splits but also in the way it slots into the connectors. Make note of that and make sure you don’t force anything.
  • Speaking of forcing cables in: the biggest pain to put in was the main motherboard power cable. Due to the way in which my HX1000 cables came threaded together inside their “fancy” black housings (vs. flat as you see in the video), I had to do some significant bending and pushing/adjusting in order to have the cable slot into the motherboard and at the same time come to the back of the case in the place nearest to the motherboard. It’s hard to understand now, but you’ll see what I mean when you have to do it. The same was the case for connecting the power cables to the GPUs, as they’re so closely stuck together.

I suggest you do at this stage a quick power-on only for the motherboard, to ensure that it’s not DOA. Make sure it’s not touching anything metal and also make sure that there’s nothing that can fly into the fan of the CPU. Then connect the motherboard to the PSU via the main power cable and press the power button. Wait until the AA message shows on the LCD and then power it off.

Install the Motherboard into the case

Very simple, and you need just the 9 screws to do it. I initially didn’t know which are the proper screws (they come in the transparent plastic partitioned box inside the case packaging), so I went with the ones which have a slightly rounded head with a flat base.

My advice for this step would be to lay the case flat on its side and mount the motherboard that way. Also, as the motherboard will have the CPU cooler already mounted, make sure you set it in its place on the case mounts very gently, ensuring that you’ve first mounted the front panel overlay (the “face” of the motherboard in the back of your computer) into the back panel of the case.

Here is a tutorial for this stage: x99 Motherboard installation (Crazy CPU Machine 2/10)

Various other connections to the motherboard:

You will need to connect the CPU fan in the proper slot on the motherboard (the white one, as there’s a black one next to it as well). In the black one you need to connect the case fans, with the proper black cable coming out of the fan bridge on the back of the case.

You’ll also need to plug the RGB LEDs and the power-on/reset button connector of the case into the bridge provided with the motherboard (a small black thing with spikes on one side and text written on it).

Here is a guide for this part: Connectors & cabling (Crazy CPU Machine 8/10)

GPUs installation

Remember to take out the protective black covers over the PCI bridges of each GPU before slotting into the motherboard.

The watch-out here is that the GPUs are very heavy, so I recommend you install them with the case lying flat as well. Also notice that there will be very little space between the 2 GPUs, which makes things a bit complicated. Start with the upper one and move to the second one, and you should be fine.

Tutorial: Video Card & ThunderboltEX 3 (Crazy CPU Machine 9/10)

Note: don’t even try to mount the Thunderbolt 3 card as per this video, because it won’t fit between the 2 GPUs

In terms of slots, I installed the first on slot 1 and the second on slot 3, as instructed by the manual. This ensures that you’ll be able to run both of them at x16 speed (highest available).

In case you ever need to take the GPUs out of the motherboard, make sure that before you pull on the GPU you first press the locking tab on the motherboard so it unlocks! This would be a VEEEEEERY tricky thing to do for the second GPU, as you literally have only a couple of centimeters of space available. I recommend you DO NOT use a screwdriver, because if your hand slips then you’ll stick it directly into the motherboard! What I did was to take a long-handled painting brush (which has a rounded tail end), and use that to press the tab. Just to get a feel for where you need to press to unlock it and with what force, I suggest you test it out on the upper GPU, as there’s enough space there.

Here’s a video to show you what I mean (didn’t find one that shows the X99 mobo though, but it’s the same concept): How to remove GPU

Also install the SLI bridge that connects the 2 GPUs. In the X99 box there are 2 SLI bridges available, and I went with the nicer looking one. FYI, the SLI bridge is only useful for gaming and nothing else, as CUDA uses only the PCIe interface due to the speed.

Cabling and power-on

Connect all the power and connection cables which are left (GPUs + anything else you might have), and then power it on! One note though: do not yet connect your monitor to the GPU via the DP cable (nor the keyboard or mouse), as we’ll do that later on.

Here is again the same tutorial: PSU installation & Show off video :) (Crazy CPU Machine 10/10)

And this one is helpful as well: Connectors & cabling (Crazy CPU Machine 8/10)

You need to wait for the motherboard to do its thing, and then you’ll see “AA” as a message (on the LED display next to the power-on button located on the mobo) which means you’re good to go!

I suggest that at this stage you put back the front and back panel (and the top one if you’ve removed it as well) and enjoy the blinking lights :)

WHEWWWW! That was quite some work, so give yourself a big pat on the back.

Now onward to the fun stuff.

3. Installing Windows10 and Ubuntu 16.04

Once you’re sure that your motherboard is working, connect your main GPU (the upper one) to the monitor and then plug in your keyboard and mouse. Also connect the WIFI antenna with its 3 golden connectors and/or the LAN cable in case you have a wired connection.

A few things to set your expectations for this stage:

  • There will be MANY restarts and reboots involved
  • Hopefully for you, following this guide, things will work better the first time; for me it was a very frustrating process, involving many hours of googling and trying out stuff
  • My advice is to be patient and realize that you’re not the only one who has these issues, and everything generally has a solution
  • Keep in mind that most of the issues you will be facing at this stage of the installation can be due to the BIOS of the motherboard just as much as to Win/Ubuntu. If you’ve never done this before, it will be a good intro into the wonderful world of UEFI and booting challenges :)

Source and more info: https://www.lifewire.com/ultimate-windows-7-ubuntu-linux-dual-boot-guide-2200653

Initial BIOS configuration

The first time you start your computer you’ll get the BIOS screen. Things have come a very long way since the last time I saw a BIOS screen in the early 2000’s, so things are quite simple.

A few key things you must do:

  • Update your BIOS via the internal tool (Advanced->Tool->ASUS EZ Flash Utility-> select by Internet) ; it was straightforward for me, as it connects on its own to the Internet and downloads everything needed; do this step first as everything resets after update
  • Enable the XMP option, to get your RAM working at the full speed (main screen, middle left side)
  • Disable Fast boot (in the Boot menu)
  • Set your OS Type to Windows UEFI mode (in the Boot/Secure Boot submenu); here’s a link with explanation https://www.technorms.com/45538/disable-enable-secure-boot-asus-motherboard-uefi-bios-utility; if you ever want/need to disable Secure Boot entirely, use this link as well as it has a very good explanation
  • Bear in mind that the above setting needs to be changed to Other OS when we’ll install and configure Ubuntu, and it should be kept on this setting even afterwards

Windows 10 installation

Everyone suggests to start with Win instead of Ubuntu, and I think it’s a sensible suggestion. Make sure that you have enabled “Windows UEFI mode” in the BIOS before inserting the bootable USB stick.

A few notes:

  • On my 1TB SSD I have set up the following partitions:
      • Windows: C drive 250GB
      • Windows: All other system partitions it creates on its own (which you can’t change anyways)
      • Ubuntu: /home partition (ideally I should have made this at least 150GB as well, but I didn’t know and made it 50GB only)
      • Ubuntu: /swap partition (40GB)
      • Ubuntu: /root (50GB)
      • Ubuntu: /data ext4 partition (600GB)
  • Note on NTFS communication between Ubuntu and Windows: Ubuntu can read the data on Windows partitions, whereas the reverse doesn’t work; initially I created the /data partition as NTFS in Windows, but for the life of me couldn’t manage to mount it properly in Ubuntu due to permission issues. So my suggestion, in case you want the two OSs to have a shared partition, is to make it FAT32 and small enough that it doesn’t waste space, also understanding that FAT32 cannot manage single files larger than 4GB. I chose to avoid this entire hassle and transfer data between them via another method if needed (e.g. external disk).
  • Make sure that you never write onto the system Windows partitions from Ubuntu!! This has high chances of breaking your Windows partitions and losing your data/requiring a reinstall. Keep them separate and they play nice, combine things and you start having a mess... at least that’s my own experience
  • In order for SLI to work in Windows, you need to enable it in the settings. Instructions here: https://www.computerhope.com/issues/ch001167.htm

I suggest you don’t spend too much time installing everything in terms of software for Windows at this stage; there will be time for it later on. For now just make sure that everything works in terms of drivers, especially for the graphics cards. To make sure that the GPUs are ok and performing up to speed, I installed 3DMark Demo and ran a couple of benchmarks.

Once Windows is set-up and working properly, time to enjoy the fun of Ubuntu!

Ubuntu 16.04 installation

Make sure that you have enabled “Other OS” in the BIOS->Boot->Secure Boot menu, before inserting the bootable USB stick.

  • Insert the USB stick
  • Start the computer, press Del or F10
  • On the right side, go to “Boot Menu” or press F8
  • Select the USB stick and your computer will restart
  • BEFORE you click on “Try Ubuntu” to start installation, press “e” on your keyboard and edit the “quiet splash” text to nomodeset, otherwise the system does not start (it’s missing the display drivers). If for whatever reason “nomodeset” doesn’t work, add “nouveau.modeset=0” after quiet splash
  • Then the installation flow of Ubuntu begins
  • Follow the tutorial here: https://www.lifewire.com/ultimate-windows-7-ubuntu-linux-dual-boot-guide-2200653
  • I’ve provided more info on my setup in the Windows section in terms of my own partition choices

And this is where the fun really starts! The whole challenge revolves around the NVIDIA drivers, which for some reason seem to be a very debated and buggy topic on Ubuntu. Here is what you need to know before we go into the process:

  • By default, Ubuntu will not recognize the GPU and due to this, it will revert to “nouveau”, the generic NVIDIA driver, which for some reason fails to allow Ubuntu to start
  • The work-around for this is the “nomodeset” startup flag, but in some cases even that doesn’t work
  • Even after you install, your drivers will need a lot of work to get installed, and it’s not an easy process
  • Neither your LAN nor your wireless will be working directly when you get into Ubuntu (neither for the install -though it’s not needed- nor afterwards on the clean OS); to get around this the best option I found is via manual setup of the connection, which is another fun process in itself
  • After you get the driver installed, the process to install CUDA and cuDNN is also very involved, so prepare for many steps
  • All of your work will happen via CLI (command line interface) and not via the visual interface of Ubuntu; this makes everything much faster and more straightforward
  • In case you’re getting errors during the OS install, and sometimes even afterwards — especially for drivers install — one of the main culprits is the SecureBoot/UEFI thing; so make sure that you google properly and play around with this BIOS setting in case nothing else works

Now, after you’ve installed Ubuntu, my first reco is to change the Boot order in the BIOS (Boot menu option) to have the ubuntu partition boot first.

Then set the wait time for GRUB to only 2 seconds (to make us wait much less after each reboot)

Run this in your terminal:

sudo -H gedit /etc/default/grub

This will open up the grub configuration file. Look for the line starting with GRUB_TIMEOUT= (mine now reads GRUB_TIMEOUT=2).

Change the number to 2 or whatever time you desire (in seconds).

Then save the file (Ctrl+S in gedit; Ctrl+X then Y if you opened it with nano instead) and run

sudo update-grub
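If you would rather not open an editor at all, the same change can be sketched with sed. This is only an illustration under my own assumptions: set_grub_timeout is a helper name I made up, and the demo runs against a scratch file rather than the real /etc/default/grub.

```shell
# Sketch: set GRUB_TIMEOUT in a grub defaults file without opening an editor.
# set_grub_timeout is a hypothetical helper name, not something GRUB ships.
set_grub_timeout() {
    file="$1"
    seconds="$2"
    # Rewrite the value on the existing GRUB_TIMEOUT= line and keep a .bak backup.
    sed -i.bak "s/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=${seconds}/" "$file"
}

# Demo on a scratch copy so nothing on the system is touched.
sample="$(mktemp)"
printf 'GRUB_DEFAULT=0\nGRUB_TIMEOUT=10\n' > "$sample"
set_grub_timeout "$sample" 2
grep '^GRUB_TIMEOUT=' "$sample"
```

On the real machine you would point it at /etc/default/grub (that needs root) and still run sudo update-grub afterwards.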

Going forward, I’m only going to focus on how to install everything you need in Ubuntu, as this will be the main OS we’ll be using for Machine Learning. As mentioned in the beginning, Windows is there just in case we need it for other things. This is not meant to say that you can’t use Windows for ML, you certainly can, but it’s not in scope of this article.

4. Installing the basics in Ubuntu 16.04

Disable graphical interface

Tutorial: https://askubuntu.com/questions/16371/how-do-i-disable-x-at-boot-time-so-that-the-system-boots-in-text-mode/79682#79682

Run this in your terminal:

sudo nano /etc/default/grub

Find this line:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"

Change it to:

GRUB_CMDLINE_LINUX_DEFAULT="text"

Update GRUB:

sudo update-grub

sudo systemctl enable multi-user.target --force
sudo systemctl set-default multi-user.target

Note: You will still be able to use X by typing startx after you logged in. If you want it, here are the steps:

Run this in your terminal:

startx

After the interface starts, you’ll see just the background and an X as the mouse pointer. Right click the mouse and select “Open Terminal”, then type in:

/etc/init.d/lightdm start

If at boot time there are issues (nouveau) due to display driver after you disable the graphical interface, press “e” in GRUB and edit “text” option to nomodeset, same as we did at installation time

Get the wireless/LAN connection working

In case you want to use the Network-Manager for configuration, follow the steps at this link https://askubuntu.com/questions/55868/installing-broadcom-wireless-drivers

This for me was very hit and miss though (especially when we’ll set-up the static IP for remote connection), so I ended up going with the proven manual connection setup.

Links below:

https://unix.stackexchange.com/questions/253030/how-to-setup-network-without-wicd-or-networkmanager

https://askubuntu.com/questions/464507/ubuntu-14-04-server-wifi-wpa2-personal/464552#464552

Get Ubuntu up to date and install tmux

Here’s the source: https://github.com/fastai/courses/blob/master/setup/install-gpu.sh

sudo apt-get update

sudo apt-get --assume-yes upgrade

sudo apt-get --assume-yes install tmux build-essential gcc g++ make binutils

sudo apt-get --assume-yes install software-properties-common

sudo apt-get --assume-yes install git

Install Ubuntu System Dependencies

sudo apt-get install build-essential cmake git unzip pkg-config

sudo apt-get install libjpeg-dev libtiff5-dev libjasper-dev libpng12-dev

sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev

sudo apt-get install libxvidcore-dev libx264-dev

sudo apt-get install libgtk-3-dev

sudo apt-get install libhdf5-serial-dev graphviz

sudo apt-get install libopenblas-dev libatlas-base-dev gfortran

sudo apt-get install python-tk python3-tk python-imaging-tk

Install both Python 2.7 and Python 3 header files so that we can compile OpenCV with Python bindings

sudo apt-get install python2.7-dev python3-dev

Prepare our system to swap out the default drivers with NVIDIA CUDA drivers

sudo apt-get install linux-image-generic linux-image-extra-virtual

sudo apt-get install linux-source linux-headers-generic

Ensure .bash_profile takes information from .bashrc

Several of the settings we’ll do later on for the Python installs require you to add paths to .bashrc. But a login shell (which is what you get over SSH) reads the user’s .bash_profile rather than .bashrc, so .bash_profile must pull in .bashrc for those changes to take effect on every login.

nano ~/.bash_profile

Add this to the file:

if [ -f ~/.bashrc ]; then
source ~/.bashrc
fi
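If you end up redoing this step a few times (I did), it is handy to add the hook only when it is missing. Below is a minimal sketch of that idea; ensure_bashrc_hook is a hypothetical helper name of mine, and the demo writes to a scratch file instead of your real profile.

```shell
# Sketch: append the .bashrc hook to a profile file only if it is not there yet.
# ensure_bashrc_hook is a hypothetical helper name.
ensure_bashrc_hook() {
    profile="$1"
    # Create the file if it does not exist yet.
    touch "$profile"
    # grep -qF exits 0 when the literal text is already present, so we skip the append.
    if ! grep -qF 'source ~/.bashrc' "$profile"; then
        printf '%s\n' 'if [ -f ~/.bashrc ]; then' '    source ~/.bashrc' 'fi' >> "$profile"
    fi
}

# Demo on a scratch file; for real use you would pass "$HOME/.bash_profile".
scratch="$(mktemp)"
ensure_bashrc_hook "$scratch"
ensure_bashrc_hook "$scratch"   # the second call changes nothing
cat "$scratch"
```

Running it twice still leaves a single copy of the three lines, which is the whole point of the check.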

5. Installing CUDA 8

Source tutorial: https://www.pyimagesearch.com/2017/09/27/setting-up-ubuntu-16-04-cuda-gpu-for-deep-learning-with-python/

First disable the Nouveau kernel driver by creating a new file

sudo nano /etc/modprobe.d/blacklist-nouveau.conf

Add the following lines and then save and exit:

blacklist nouveau

blacklist lbm-nouveau

options nouveau modeset=0

alias nouveau off

alias lbm-nouveau off

Next let’s update the initial RAM filesystem and reboot the machine

echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf

sudo update-initramfs -u

sudo reboot
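Once the machine is back up, it is worth confirming that nouveau really stayed out of the kernel before moving on. Here is a small sketch of such a check; nouveau_loaded is a hypothetical helper name, and it reads /proc/modules, which holds the same information that lsmod prints.

```shell
# Sketch: report whether the nouveau module appears in a module list.
# nouveau_loaded is a hypothetical helper; it reads /proc/modules by default.
nouveau_loaded() {
    modules_file="${1:-/proc/modules}"
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q '^nouveau ' "$modules_file" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded: recheck the blacklist file and rerun update-initramfs"
else
    echo "nouveau is not loaded"
fi
```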

You will want to download the CUDA Toolkit v8.0 via the NVIDIA CUDA Toolkit website https://developer.nvidia.com/cuda-80-ga2-download-archive

From there, download the -run file which should have the filename cuda_8.0.61_375.26_linux-run or similar. To do this, simply right-click to copy the download link and use wget on your remote GPU box:

wget https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run

From there, unpack the -run file

chmod +x cuda_8.0.61_375.26_linux-run

mkdir installers

sudo ./cuda_8.0.61_375.26_linux-run -extract=`pwd`/installers

NOTE: we will be taking a small detour at this step from the tutorial in the initial link, as the driver there did not work for me. So I needed to install another, not “officially supported” driver, but which worked perfectly.

Install the NVIDIA kernel driver (using a special driver for GTX 1080Ti)

Source: https://blog.nelsonliu.me/2017/04/29/installing-and-updating-gtx-1080-ti-cuda-drivers-on-ubuntu/

Add the PPA to apt-get and update the index by running

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update

Now, we use it to install the desired driver versions (as of this writing) for 1080 Ti: Major version 381

sudo apt-get install nvidia-381

If in the future you want to update the drivers via apt-get:

First, remove the old drivers:

sudo apt-get purge 'nvidia*'

Now, just install the new driver with the PPA as detailed above and reboot.

Check that the driver and CUDA were installed (nvcc is only available once the CUDA Toolkit from the next step is installed and on your PATH)

nvcc --version # Checks CUDA version

nvidia-smi # Info about the detected GPUs
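A quick way to see which of the two tools the shell can actually find is sketched below. have_tool is a helper name I made up; it relies only on the shell built-in command -v, so it does not need a GPU to run.

```shell
# Sketch: check whether the driver and toolkit commands are on the PATH.
# have_tool is a hypothetical helper: exit status 0 if the command is found.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in nvidia-smi nvcc; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If nvidia-smi is found but nvcc is not, that usually just means the toolkit is not installed yet or its bin directory has not been added to PATH.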

Install the CUDA Toolkit and examples

cd installers

sudo ./cuda-linux64-rel-8.0.61-21551265.run

sudo ./cuda-samples-linux-8.0.61-21551265.run

Update your ~/.bashrc

sudo nano ~/.bashrc

with the below text (which should be pasted inside the file):

# NVIDIA CUDA Toolkit

export PATH=/usr/local/cuda-8.0/bin:$PATH

export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64/

Now, reload your ~/.bashrc ( source ~/.bashrc ) and then test the CUDA Toolkit installation by compiling the deviceQuery example program and running it:

source ~/.bashrc

cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery

sudo make

./deviceQuery

You should see “Result = PASS”. In case you did not, go back and redo all the steps of chapter 5.

6. Install cuDNN (CUDA Deep Neural Network library)

For this step, you will need to create a free account with NVIDIA and download cuDNN. For this tutorial I used cuDNN v6.0 for Linux, which is what TensorFlow required at the time of writing.

Next, untar the file and then copy the resulting files into lib64 and include respectively, using the -P switch to preserve sym-links:

cd ~

tar -zxf cudnn-8.0-linux-x64-v6.0.tgz

cd cuda

sudo cp -P lib64/* /usr/local/cuda/lib64/

sudo cp -P include/* /usr/local/cuda/include/

cd ~

And that’s it for cuDNN, you should be set.
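To double-check which cuDNN version ended up in place, you can read the version macros out of the copied header. The sketch below assumes the header sits at /usr/local/cuda/include/cudnn.h and defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on their own lines; cudnn_version is a hypothetical helper name.

```shell
# Sketch: print major.minor.patch from the version macros in a cuDNN header.
# cudnn_version is a hypothetical helper; the default path is an assumption.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    # The third field of each "#define NAME value" line is the number we want.
    awk '/^#define CUDNN_MAJOR/      {major=$3}
         /^#define CUDNN_MINOR/      {minor=$3}
         /^#define CUDNN_PATCHLEVEL/ {patch=$3}
         END {print major "." minor "." patch}' "$header"
}

# Only query the real header if it is actually there.
if [ -f /usr/local/cuda/include/cudnn.h ]; then
    cudnn_version
else
    echo "cudnn.h not found under /usr/local/cuda/include"
fi
```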

7. Setup remote connection

This section will be about the steps you need to take in order to connect remotely to your machine via a slim terminal, in my case a Macbook Pro. This provides an amazing level of freedom and flexibility, especially because you don’t need to be next to your machine for (almost) anything. Even startup of the machine can be done remotely via WOL (wake on lan).

Configure a static IP for your machine

Below are a couple of very good tutorials, so I won’t go into details

https://michael.mckinnon.id.au/2016/05/05/configuring-ubuntu-16-04-static-ip-address/

http://www.configserverfirewall.com/ubuntu-linux/ubuntu-set-static-ip-address/
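To give a rough idea of what those tutorials end up with, here is a minimal sketch that prints a static-address stanza in the /etc/network/interfaces format. Everything specific in it is a placeholder I made up for illustration: the helper name static_ip_stanza, the interface name enp5s0 and all of the addresses. Check your own interface name and router settings for the real values.

```shell
# Sketch: print an ifupdown stanza for a static address.
# static_ip_stanza is a hypothetical helper; all values passed below are placeholders.
static_ip_stanza() {
    iface="$1"; address="$2"; netmask="$3"; gateway="$4"; dns="$5"
    cat <<EOF
auto ${iface}
iface ${iface} inet static
    address ${address}
    netmask ${netmask}
    gateway ${gateway}
    dns-nameservers ${dns}
EOF
}

# Placeholder values: replace them with your own interface and network details.
static_ip_stanza enp5s0 192.168.1.50 255.255.255.0 192.168.1.1 192.168.1.1
```

The printed block is what would go into /etc/network/interfaces, after which the networking service (or the machine) needs a restart.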

Port forwarding

This step varies significantly depending on your own home/office network setup, but here are the principles:

  • In order to access your machine, it needs to have a static IP, and the router should make sure not to overwrite it (usually done via manual exception designation in the DHCP server settings of the router)
  • For the remote connection to reach the machine’s static IP, the router must be configured to do Port forwarding, in essence to take any request coming in from outside and send it to the specific port on the specific machine
  • For the remote connection to reach the router in the first place, you need to make sure that the modem is also configured to allow for Port forwarding

So knowing the steps above, start using your google fu and figure out how to do this for your own setup.
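Once the forwarding works, the client side can be reduced to a short alias in ~/.ssh/config. The sketch below only prints such an entry; ssh_host_entry is a hypothetical helper, and the alias mlrig, the address 203.0.113.10, the port 2222 and the user name are all placeholders rather than values from my setup.

```shell
# Sketch: print a ~/.ssh/config entry for reaching the rig through a forwarded port.
# ssh_host_entry is a hypothetical helper; every value passed below is a placeholder.
ssh_host_entry() {
    alias_name="$1"; public_host="$2"; forwarded_port="$3"; user_name="$4"
    cat <<EOF
Host ${alias_name}
    HostName ${public_host}
    Port ${forwarded_port}
    User ${user_name}
EOF
}

# Placeholder values: use your router's public address and the port you forwarded.
ssh_host_entry mlrig 203.0.113.10 2222 myuser
```

With an entry like this appended to ~/.ssh/config on the laptop, connecting becomes simply ssh mlrig.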

Configure SSH on both the ML rig as well as the slim client (Macbook Pro)

This topic is again very straightforward and fully documented on the web, so here are some useful links

https://askubuntu.com/questions/464507/ubuntu-14-04-server-wifi-wpa2-personal/464552#464552

http://ubuntuhandbook.org/index.php/2016/04/enable-ssh-ubuntu-16-04-lts/

https://www.digitalocean.com/community/tutorials/how-to-use-ssh-to-connect-to-a-remote-server-in-ubuntu

https://www.digitalocean.com/community/tutorials/how-to-set-up-ssh-keys--2

Prevent the SSH session from freezing on the Mac

Once you start using SSH, you might notice that your connection to the ML rig keeps breaking on the Mac. This is due to the standard timeout settings on both machines, which can be easily configured to keep the connection alive:

On the Mac:

sudo nano /etc/ssh/ssh_config

Then add to the file:

Host *
ServerAliveInterval 100

On the server:

sudo nano /etc/ssh/sshd_config

Then add to the file:

ClientAliveInterval 60
TCPKeepAlive yes
ClientAliveCountMax 10000
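To get a feel for what these server values mean: sshd sends a keepalive probe every ClientAliveInterval seconds and only drops the session after ClientAliveCountMax probes in a row go unanswered. The arithmetic below (a sketch, with a helper name of my own) shows that the settings above tolerate an unresponsive client for roughly a week.

```python
def unresponsive_timeout_seconds(client_alive_interval, client_alive_count_max):
    """Approximate time sshd waits before dropping a client that stops answering."""
    return client_alive_interval * client_alive_count_max


seconds = unresponsive_timeout_seconds(60, 10000)
days = seconds / 86400  # 86400 seconds in a day
print(seconds, "seconds, about", round(days, 1), "days")
```

If you would rather have dead sessions cleaned up sooner, a much smaller ClientAliveCountMax will do that; it is a trade-off you can tune to taste.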

From this point onwards you don’t need to be next to your rig anymore, and neither does it need to be plugged in to either a screen or the keyboard + mouse. Barring any specific errors or boot issues, it should be fine on its own with just the power connected and the wireless/LAN working, with you controlling it via SSH and terminal.

8. Install Python3

Everything following below can be done in several ways, and it all depends on your requirements, desires and environment configuration. I’ll just show you what I did to get a barebones installation that can do “most” of the tasks required for Machine Learning, at least at a basic level.

Install Anaconda3

wget https://repo.continuum.io/archive/Anaconda3-5.0.1-Linux-x86_64.sh

bash Anaconda3-5.0.1-Linux-x86_64.sh

source .bashrc

source .bash_profile

conda upgrade -y --all

Install Tensorflow

conda install -c conda-forge tensorflow

sudo apt install python3-pip

pip install tensorflow-gpu

Validate Tensorflow Installation

git clone https://github.com/tensorflow/tensorflow.git

python tensorflow/tensorflow/examples/tutorials/mnist/fully_connected_feed.py

You should see the loss decreasing during training:

Step 0: loss = 2.32 (0.139 sec)

Step 100: loss = 2.19 (0.001 sec)

Step 200: loss = 1.87 (0.001 sec)
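A decreasing loss shows that Tensorflow runs, but not necessarily that it runs on the GPUs. As an extra check, here is a small sketch that asks Tensorflow which GPU devices it can see. The function name is mine; it relies on the device_lib module that ships inside the tensorflow package, and returns None if Tensorflow cannot be imported in the current environment.

```python
def visible_gpus():
    """Return the names of the GPU devices Tensorflow can see, or None without Tensorflow."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


if __name__ == "__main__":
    print(visible_gpus())
```

On a rig like this one you would hope to see two entries, one per GTX 1080 Ti. An empty list usually points to the CPU-only package being picked up or to a CUDA/driver problem.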

Install PyTorch

conda install pytorch torchvision cuda80 -c soumith

Install Theano

conda install theano

Install Keras

pip install keras

Install openCV

Simple way to do it:

conda install -c conda-forge opencv=3.2.0

Now, in case that doesn’t work, google it a bit, as there are several other variants:

pip install opencv-contrib-python

or

conda install --channel https://conda.anaconda.org/menpo opencv3

Very complicated (manual) way to do it:

Source: https://medium.com/@debugvn/installing-opencv-3-3-0-on-ubuntu-16-04-lts-7db376f93961

9. Configure JupyterNB to work remotely

Configure it

# Create a ~/.jupyter/jupyter_notebook_config.py with settings

jupyter notebook --generate-config

jupyter notebook --port=8888 --NotebookApp.token='' # Start it

Create an SSH tunnel for Jupyter

Either create a script that runs this on the Macbook, or simply run it every time you want to connect from the Macbook to your remote ML rig

ssh -N -f -L localhost:8888:localhost:8888 -p xxxx (your remote port) xxxx.xxxx.xxx.xxx (your remote IP)

Now you can visit http://localhost:8888 in your laptop’s browser and start editing the notebooks on your deep learning machine!
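If you prefer to script the tunnel rather than retype it, here is a minimal Python sketch that assembles the same ssh command and runs it. The function names are mine, and the host and port in the usage line are placeholders for your own remote IP and forwarded SSH port.

```python
import subprocess


def build_tunnel_command(remote_host, ssh_port, local_port=8888, remote_port=8888):
    """Assemble the ssh command that forwards local_port to remote_port on the rig."""
    forward = "localhost:{}:localhost:{}".format(local_port, remote_port)
    # -N: no remote command, -f: go to background, -L: local port forwarding
    return ["ssh", "-N", "-f", "-L", forward, "-p", str(ssh_port), remote_host]


def open_tunnel(remote_host, ssh_port, **ports):
    """Run the tunnel command; raises CalledProcessError if ssh exits with an error."""
    subprocess.run(build_tunnel_command(remote_host, ssh_port, **ports), check=True)


# Usage (placeholders, replace with your own user, IP and port):
# open_tunnel("user@203.0.113.10", 2222)
```

Passing the command as a list avoids any shell quoting surprises, which is the main reason to do it this way rather than through a one-line string.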

In case the above command responds that the 8888 port is already in use (from a previous session), do the below to kill the process and then rerun the above

lsof -ti:8888 | xargs kill -9

Run Jupyter on boot for the ML rig

Create a startup_ipynb.sh file

cd ~

sudo nano startup_ipynb.sh

Add the below in the file:

export PATH=/home/xxx(your username)/anaconda3/bin:$PATH

jupyter notebook --config=/home/xxx(your username)/.jupyter/jupyter_notebook_config.py --no-browser

Open the rc.local file

sudo nano /etc/rc.local

Add this line (above the final exit 0, otherwise it will never run):

su xxx(your username) -c 'bash /home/xxx(your username)/startup_ipynb.sh'

Now every time you boot the machine, Jupyter Notebook should be running in the background. So just go to your slim client and do:

ssh -N -f -L localhost:8888:localhost:8888 -p xxxx (your remote port) xxxx.xxxx.xxx.xxx (your remote IP)

Same as for the previous step: you can visit http://localhost:8888 in your laptop’s browser and start editing the notebooks directly on your deep learning machine!

In case the above command responds that the 8888 port is already in use (from a previous session), do the below to kill the process.

lsof -ti:8888 | xargs kill -9

ssh -N -f -L localhost:8888:localhost:8888 -p xxxx (your remote port) xxxx.xxxx.xxx.xxx (your remote IP)

10. Set up PyCharm to work remotely

If you use PyCharm, then here is a guide to help you set it up, allowing you to have the full benefits of the PyCharm IDE while running the actual code remotely on the ML rig.

And that’s it in terms of setting things up. Hope you’ve enjoyed the process and everything is now working for you.

(for ML beginners) fast.ai setup

In case you are working your way through the lessons at fast.ai (which is an outstanding learning resource btw), here are some things you could do to get things working without too much hassle.

Create a separate conda environment for python2.7

As of the time I’m writing this article (November 2017), most of the course was written by Jeremy in Python 2.7, so the Python3 setup you’ve done if you’ve followed this guide will not work for it. The best approach in this case is to create a separate environment where you can run the lessons without issues.

Thanks to the ease of using conda + the kernel system of Jupyter Notebook this is very easy:

Allow the use of kernels, in case this is not already installed:

conda install nb_conda_kernels

Create a new python2.7 environment in conda, together with the required Jupyter NB kernel; “py27” is the name of the environment, and you can change that to whatever you like:

conda create -n py27 python=2.7 ipykernel

Restart your terminal and you should now see the new kernel showing in Jupyter Notebook (using the Kernel menu).

Installing the required libraries

Activate the python2.7 environment, in order to install the required libraries inside:

source activate py27

Make sure that you are inside this environment (its name shows between brackets in front of your username in terminal) before proceeding to the next steps.

Install scikit-learn:

conda install scikit-learn

Install bcolz:

conda install -y bcolz

Install theano:

pip install theano

Install keras dependencies (you might need to install a few others; in case you get errors then Google it and stackoverflow will tell you what else might be missing):

pip install numpy

pip install pillow

pip install h5py

Install keras (version 1.2.2, as the newer 2.x releases do not work well due to API changes that break the code in Jeremy’s notebooks):

pip install keras==1.2.2

mkdir ~/.keras

echo '{ "image_dim_ordering": "th", "epsilon": 1e-07, "floatx": "float32", "backend": "theano"}' > ~/.keras/keras.json
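The same file can also be written from Python, which sidesteps shell quoting problems when copy-pasting (curly quotes from a web page will produce an invalid JSON file). This is a minimal sketch with a helper name of my own; the settings are the same four values as in the echo command, and the optional home argument only exists to make it easy to try in a scratch directory first.

```python
import json
import os

# Same settings as the echo command: Theano backend with "th" image ordering
KERAS_CONFIG = {
    "image_dim_ordering": "th",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "theano",
}


def write_keras_config(home=None):
    """Write .keras/keras.json under home (defaults to the user's home directory)."""
    home = home or os.path.expanduser("~")
    keras_dir = os.path.join(home, ".keras")
    if not os.path.isdir(keras_dir):
        os.makedirs(keras_dir)
    path = os.path.join(keras_dir, "keras.json")
    with open(path, "w") as config_file:
        json.dump(KERAS_CONFIG, config_file, indent=2)
    return path


# Usage: write_keras_config() overwrites ~/.keras/keras.json
```

Note that it overwrites any existing keras.json, so if your Python3 environment also uses Keras with the Tensorflow backend, keep a copy of that file around.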

You might be required to install a few other libraries as well (you’ll see the prompts in JupyterNB) depending on your setup, but it should be straightforward.

In case you get stuck on some specific errors, do a Google search with fast.ai at the end of your error and you will most likely find an answer on the forums there.

One other side note: considering that there is still a large amount of python2.7 code in the field of ML, I recommend using either the environment you just created (py27) or even a separate one (perhaps with the most up-to-date versions of all the libraries) in order to be able to run that code properly. In my experience, converting 2.7 code to run in python3 can be quite a time-consuming (but extremely educational!) process, so if you’re pressed for time it is better to use a dedicated env for it.

Let the Machine Learning begin!