AMD Ryzen-based deep learning server build
Since I haven’t seen all that many AMD Ryzen-based builds (as compared to Intel-based builds) for deep learning, I thought I should contribute to the community by sharing my experience with such a build. My motivation to get my own deep learning server was similar to that of most deep learning / fast.ai enthusiasts who have posted their builds: I started off with Amazon AWS, which was great until I got serious about Kaggle and started to chalk up a monthly bill of around €140 (including the occasional “forgot-to-power-down EC2-instance”). I decided to go the AMD way, since the processor-cooler combo turned out to be slightly more affordable than the Intel equivalent.
My goal was to build a server that I could access from my MacBook Air from “anywhere with internet”. In what follows I describe the steps needed for the build, followed by a basic SSH server setup (accessible from the home network). Ensuring accessibility from “anywhere with internet” turned out to be quite challenging, since my router is behind a NAT server which gives me only an IPv6 address. I will therefore reserve that topic for another article, once I have tried out a couple of things myself.
The steps involved were similar to most build blogs, with one additional step for testing the parts before installing them in the case:
- Select the components
- Test the components
- Set up the hardware
- Install the OS (Ubuntu 18.04)
- Install the drivers and libraries for deep learning
- Set up remote access
Select the components
Tim Dettmers’ blog was really helpful in getting a feel for the “GPU-scene” and in deciding how to choose the rest of the components. The points that stuck in my head were:
- Decide your GPU requirements first and allocate most of your budget to this. The GPU speed determines your training time.
- The CPU is NOT the real workhorse here so you don’t necessarily need the fastest one. You don’t want to underestimate it either, since it’s doing the data augmentation calculations, which contribute to training time too.
- RAM is important! Get as much as you can.
I used PCPartPicker to check for compatibility of the components.
My part list: https://de.pcpartpicker.com/list/dbhQBb
CPU: AMD Ryzen 7 2700 3.2GHz 8-Core Processor
Motherboard: Asus - Prime X470-Pro ATX AM4 Motherboard
Memory: G.Skill - Ripjaws V Series 32GB (2 x 16GB) DDR4-2133 Memory
Storage: Samsung - 970 Evo 500GB M.2-2280 Solid State Drive
Video Card: Asus - GeForce GTX 1080 8GB TURBO Video Card
Case: Corsair - 200R ATX Mid Tower Case
Power Supply: SeaSonic - FOCUS Plus Platinum 650W 80+ Platinum Certified Fully-Modular ATX Power Supply
I bought most of my components from Mindfactory. The exceptions were the Asus GTX 1080 GPU, which was only available on Amazon at that time, and the CPU, which came from K&M Computer (a popular chain here in Germany).
I used my TV as a display and had a keyboard and mouse lying about at home, all of which I needed to install the software. I would recommend buying some thermal compound just in case it is needed during the build.
Testing the components
The parts took a few days to arrive and by the time I got all of them I was obviously itching to get started. Before getting started with the full-fledged build, I decided to do a basic build without the case, to make sure that the components are essentially working fine.
I found this video by JayzTwoCents really helpful. Here are the steps:
- Fix the CPU in the motherboard.
- Fix the RAM (I suggest fixing one stick only, to test just the bare minimum at this point). You should hear the click.
- Install the CPU cooler. This can be tricky because it requires a surprisingly large force to screw it in! It’s not quite an IKEA build. Take your time before you get started with this step, because you ideally don’t want to do it twice: should you need more than one attempt, you will need to re-apply thermal compound (I used the grain-of-rice method). I guess there are two possible orientations for the fan; I used the one with the AMD logo away from the RAM slot. Also think about how you want the wire from the fan to reach its power connector.
- Mount the graphics card. Again, wait for the “click” sound.
- Connect the power cables.
- Switch on the power supply (mains and the PSU).
- Switch on the motherboard. I needed to short the appropriate pins on it. There was no power button.
At this point, the CPU cooler fan started spinning, the LEDs on the motherboard came on, and finally the American Megatrends screen with the correct hardware information allowed me to heave a sigh of relief.
I then disassembled everything except the CPU and the CPU cooler fan and got ready for the final assembly.
Setting up the hardware
I followed the motherboard instructions and the Corsair case instructions for the final set up so I’m not going to repeat those steps here. The important thing to note is that you want to fix the Motherboard+CPU+Cooler unit (from your test build above) in the case first, and only then fix in the rest. It should all be quite easy and you will be able to do this with a lot more confidence now that you know, from the test-build above, that the parts basically work.
Installing the OS (Ubuntu 18.04)
To do this, one needs a bootable USB stick that can be created by downloading the disk image available from the Ubuntu downloads page. I used the desktop version.
Go to the BIOS (by pressing the F2 or DEL key when rebooting) and navigate to the options which allow you to boot from your bootable USB stick. It was a breeze from there. In less than 10 minutes I got to the friendly Ubuntu start screen.
To get things up-to-date, type the following in the terminal:
sudo apt-get update
sudo apt-get upgrade
Install GPU drivers and CUDA
There seemed to be so much material on Ubuntu 16.04 but hardly anything on Ubuntu 18.04. Finally this post came to the rescue. This is all that I needed:
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo ubuntu-drivers autoinstall
Then reboot, and once that’s done, go back to the terminal and type:
sudo apt install nvidia-cuda-toolkit gcc-6
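Before moving on, it may be worth confirming that the driver and the toolkit are actually visible from the terminal. The following is only a minimal sketch and not part of the original steps: the function name check_gpu_stack is made up for illustration, while nvidia-smi and nvcc are the standard tools that come with the driver and the CUDA toolkit.

```shell
# Hypothetical helper (not from the original post): report whether the NVIDIA
# driver and the CUDA compiler can be found, and print their versions if so.
check_gpu_stack() {
    missing=0
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Name, driver version and total memory of each detected GPU
        nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
    else
        echo "nvidia-smi not found: the driver does not seem to be installed"
        missing=1
    fi
    if command -v nvcc >/dev/null 2>&1; then
        # The line containing "release" carries the CUDA version number
        nvcc --version | grep release
    else
        echo "nvcc not found: the CUDA toolkit does not seem to be installed"
        missing=1
    fi
    return $missing
}
```

Calling check_gpu_stack after the reboot should list the GTX 1080 and a CUDA release line if everything went well; a "not found" message points to the step that needs repeating.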
Install cuDNN
For cuDNN, I used the TAR file method as described in the SDK documentation. The essential steps are:
- Download the TAR file from Nvidia,
(Note: I went to the “Archived cuDNN Releases” to find the version corresponding to my CUDA version. I think I had CUDA version 9.1, which meant that I needed cuDNN v7.1.3.)
- Unpack the tarball (the file name depends on the version you downloaded):
tar -xzvf cudnn-9.0-linux-x64-v7.tgz
- Copy those files to the right CUDA directory and change the permissions:
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
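To double-check which cuDNN version ended up in the CUDA directory, one option is to read the version macros from the header that was just copied. This is a sketch under the assumption that the header is a cuDNN 7 cudnn.h, which defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; the function name cudnn_version is hypothetical.

```shell
# Hypothetical helper: print the cuDNN version macros from the header
# copied above. The header path can be overridden with the first argument.
cudnn_version() {
    header="${1:-/usr/local/cuda/include/cudnn.h}"
    if [ ! -r "$header" ]; then
        echo "cannot read $header"
        return 1
    fi
    # Assumes a cuDNN 7 header, where these three macros hold the version
    grep -E '^#define CUDNN_(MAJOR|MINOR|PATCHLEVEL) ' "$header"
}
```

Running cudnn_version without arguments reads the default location used in the copy commands above; the three numbers printed together form the version.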
I followed the 5-step installation instructions in the documentation and it worked just as described. I liked that it included a way to test the installation.
Install the fastai library
I followed the conda installation instructions (for a GPU) in the fastai README and, as expected, in true fastai-style, it worked like a charm:
conda install -c pytorch pytorch-nightly cuda92
conda install -c fastai torchvision-nightly
conda install -c fastai fastai
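A quick way to see whether PyTorch can actually use the GPU is to ask it directly. The sketch below is an addition rather than part of the fastai instructions; check_torch_cuda is a made-up name, and it assumes that python on your PATH belongs to the conda environment used above.

```shell
# Hypothetical helper: ask PyTorch whether it can see the GPU.
# Run it inside the conda environment where fastai was installed.
check_torch_cuda() {
    if ! command -v python >/dev/null 2>&1; then
        echo "python not found on PATH"
        return 1
    fi
    python -c 'import torch; print("PyTorch", torch.__version__); print("CUDA available:", torch.cuda.is_available())'
}
```

If the last line does not report that CUDA is available, the driver or CUDA steps above are the first place to look.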
Set up remote access
I ran this command to install the OpenSSH server:
sudo apt install openssh-server
To configure the behaviour of the OpenSSH server, I used the SSH server documentation. I believe it’s best to use authentication with keys rather than with a password. There’s obviously a lot more one can customise and/or make secure.
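For the key-based login, the usual OpenSSH route is to create a key pair on the laptop and copy the public key to the server. The following is a rough sketch of that route, not the exact procedure I followed: the function name setup_ssh_key, the DRY_RUN switch and the default key location are assumptions made for illustration, while ssh-keygen and ssh-copy-id are standard OpenSSH tools.

```shell
# Hypothetical helper, run on the laptop: create a key pair if none exists
# and copy the public key to the server.
# With DRY_RUN=1 the commands are only printed, not executed.
setup_ssh_key() {
    target="${1:-}"                    # e.g. <remote_user>@<your_servers_local_IP>
    key="${2:-$HOME/.ssh/id_ed25519}"  # assumed default key location
    if [ -z "$target" ]; then
        echo "usage: setup_ssh_key user@host [keyfile]"
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then run=echo; else run=; fi
    # Generate a new ed25519 key pair unless the key file already exists
    [ -f "$key" ] || $run ssh-keygen -t ed25519 -f "$key"
    # Append the public key to ~/.ssh/authorized_keys on the server
    $run ssh-copy-id -i "$key.pub" "$target"
}
```

Once logging in with the key works, setting PasswordAuthentication no in /etc/ssh/sshd_config on the server and restarting the service with sudo systemctl restart ssh should switch off password logins altogether.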
Once you have the keys copied in the right places, you will be able to log in to your server from your home network with:
ssh <remote_user>@<your_servers_local_IP>
One way to get <your_servers_local_IP> is by logging in to your router and checking which computers are connected with which IP address.
You now have your SSH server up and running and can at least access it from your home network, provided it’s already switched on.
Enable the server to be booted up remotely
Wake-on-LAN allows one to start the server remotely (from the home network). This however requires your server to be connected to the router by a LAN cable. The steps are described here.
Note: You should use ifconfig to find out the Ethernet device name (and its MAC address). It’s not always eth0 like most of the articles show. It could be something like enp1s10.
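On a fresh Ubuntu 18.04 install, ifconfig may be missing until the net-tools package is installed. As an alternative, the device names and MAC addresses can be read straight from sysfs. This is a small Linux-only sketch; the function name list_macs is hypothetical.

```shell
# Hypothetical helper, run on the server (Linux only): print each network
# device name followed by its MAC address, read from sysfs.
# The optional argument overrides the sysfs directory.
list_macs() {
    netdir="${1:-/sys/class/net}"
    for dev in "$netdir"/*; do
        # Skip entries without a readable address file
        [ -r "$dev/address" ] || continue
        echo "$(basename "$dev") $(cat "$dev/address")"
    done
}
```

The output of list_macs also includes the loopback device lo, which can be ignored; the Ethernet device is the one whose name starts with something like en or eth.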
On my MacBook Air I used the wakeonlan application and not powerwake as described in the How-to above:
sudo apt install wakeonlan
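Sending the wake-up packet then comes down to calling wakeonlan with the MAC address of the server. The wrapper below is just a sketch with a made-up name, wake_server; it checks the format of the address before handing it to the tool. On macOS, where apt is not available, wakeonlan is usually installed through Homebrew (brew install wakeonlan), but treat that as an assumption about your setup.

```shell
# Hypothetical helper, run on the laptop: validate the MAC address and
# send the Wake-on-LAN magic packet with the wakeonlan tool.
wake_server() {
    mac="${1:-}"
    # Expect six colon-separated pairs of hex digits, e.g. 01:23:45:67:89:ab
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "not a valid MAC address: $mac"
        return 1
    fi
    if ! command -v wakeonlan >/dev/null 2>&1; then
        echo "wakeonlan not found on PATH"
        return 1
    fi
    wakeonlan "$mac"
}
```

With the MAC address found earlier, wake_server followed by that address should start the server, provided Wake-on-LAN has been enabled on it as described in the How-to.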
Now you can lazily switch on your server from your laptop within your home network and are all set to work remotely. One last note, on working with Jupyter Notebook, follows.
Set up Jupyter Notebook
For working with Jupyter Notebook remotely, you will have to set up some sort of port forwarding. Here’s an easy way to do it. The steps are:
- On the server, run Jupyter Notebook with:
jupyter notebook --no-browser --port=8080
- On the laptop set up a tunnel with:
ssh -N -L 8080:localhost:8080 <remote_user>@<remote_host>
- On the laptop, go to the browser and type:
http://localhost:8080/
You will probably need to enter the token, which you will get from the terminal on the server where Jupyter Notebook is running.
So now you should be all set to Kaggle or do your AI-prototyping. Hopefully this article motivates you to go ahead with your own build. I wish you good luck and I look forward to hearing the success stories (and idiosyncrasies!) of other builds.