How to set up a deep-learning-ready server with Intel NUC 8 + Nvidia eGPU + Docker

Interested in learning deep learning with PyTorch and/or Fast.ai? The easiest way to get started is a cloud-based solution (Google it, there are a lot!). However, if you want to invest for the long run, or simply want to get your hands dirty setting up a personal server, this post is for you!

After spending quite a lot of time researching and setting up my new Intel NUC Hades Canyon with the Nvidia GTX 1080 eGPU, I decided to write this post so that people like me can save some time and get going quickly.

To make the steps easy to follow, I have outlined the general process below. However, before answering ‘how’, I would like to explain ‘why’. Feel free to skip the first section if you trust my decisions. ;)

  1. Why?
  2. Install Ubuntu Server 18.04
  3. Install Bolt
  4. Install Nvidia driver
  5. Install Docker CE
  6. Install Nvidia Docker
  7. Run an ML-ready Docker image

1. Why eGPU, Intel NUC, and Docker?

External GPU

The name already implies its advantage: portability. A single gaming box can be used with any computer that has a Thunderbolt/USB-C port (e.g., MacBook, NUC 7/8), without worrying about compatibility. Here I use the Aorus GTX 1080 Gaming Box.

Intel NUC 8 Hades Canyon

This mini PC brings you an Intel Core i7 processor, AMD Vega graphics power, and a spicy port mix. PCMag pointed out NUC 8’s top pros:

  • Compact and quiet-running
  • Excellent overall CPU and GPU performance
  • AMD Vega graphics are VR-ready
  • Bristling with connectivity for its size (of course, it has a pair of Thunderbolt/USB-C ports)
  • Dual M.2 slots
  • VESA-mountable chassis

Docker

I would love to refer you to another blog post entitled “How Docker Can Help You Become A More Effective Data Scientist”, which explains everything you need to know about Docker and, of course, why it is so important for data scientists. If you don’t have time, here is my brief explanation:

Docker is like a virtual machine, except that what it runs is called a container. It’s worth mentioning that these are actually two different technologies (virtualization vs. containerization). Containers are more efficient in resource allocation: you don’t have to set aside, say, 4 GB of physical memory for each one. Consequently, you can run many containers simultaneously. Containers are also lightweight, so one can start in just a couple of seconds. Last but not least, you can easily share your development environment (as a container image), which might include dozens of libraries.

Docker has an image repository called Docker Hub, where you can share the images of your Docker containers with others publicly (or privately, you decide). This is how we can easily set up a machine learning development environment (with numpy, pandas, scikit-learn, pytorch, and everything else you need) in a couple of lines of code.


2. Ubuntu Server 18.04

It works flawlessly. I mention this in case you are concerned that CUDA 9 doesn’t support Ubuntu 18.04 while PyTorch doesn’t support CUDA 10 yet. That’s it; I assume you know how to proceed with the installation.

Don’t forget to update and upgrade everything after the installation.

$ sudo apt update
$ sudo apt upgrade

3. Bolt

Installing

As I am using an eGPU that connects to the computer via a Thunderbolt port, I need to install the bolt package first.

$ sudo apt install bolt

To check whether the GPU is recognized, run:

$ boltctl

The output should display your eGPU information, such as:

● GIGABYTE GV-N1080IXEB-8GD
├─ type: peripheral
├─ name: GV-N1080IXEB-8GD
├─ vendor: GIGABYTE
├─ uuid: 00ef31d4-XXXX-XXXX-ffff-ffffffffffff
├─ status: authorized
│ ├─ domain: domain0
│ └─ authflags: none
├─ authorized: Thu 29 Nov 2018 09:23:37 AM UTC
├─ connected: Thu 29 Nov 2018 09:23:37 AM UTC
└─ stored: no
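
If you would rather script this check than read the output by eye, the status line can be matched with grep. This is only a minimal sketch: the helper name tb_is_authorized is my own invention (it is not part of bolt), and it assumes the output format shown above.

```shell
# Hypothetical helper (not part of bolt): reads `boltctl` output on stdin
# and succeeds if at least one device reports "status: authorized".
tb_is_authorized() {
  grep -Eq 'status:[[:space:]]*authorized'
}

# Usage on the server (needs the bolt package installed):
#   boltctl | tb_is_authorized && echo "eGPU authorized" || echo "eGPU not authorized"
```

The pattern only matches the exact word "authorized" directly after "status:", so any other status value makes the check fail.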

Authorizing

If the eGPU is unauthorized, you need to authorize it by manually changing the content of /sys/bus/thunderbolt/devices/0-0/0-1/authorized from 0 to 1 (the device path may differ on your machine). This can be done with nano:

$ sudo nano /sys/bus/thunderbolt/devices/0-0/0-1/authorized

Don’t forget to save your changes!
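
An editor is not strictly needed for this step. Below is a sketch of a non-interactive alternative. The helper name tb_authorize is hypothetical, the default path is simply the one used in this post, and the SUDO variable is something I added only so the function can be tried on a scratch file without root. The bolt package also ships a `boltctl authorize <uuid>` command, which may be the more convenient route.

```shell
# Hypothetical helper: write "1" into a Thunderbolt "authorized" file
# if it currently contains "0".
# $1: path to the file (defaults to the path used in this post).
tb_authorize() {
  auth_file="${1:-/sys/bus/thunderbolt/devices/0-0/0-1/authorized}"
  if [ "$(cat "$auth_file")" = "0" ]; then
    # ${SUDO-sudo}: uses sudo unless SUDO is set (even to an empty string).
    echo 1 | ${SUDO-sudo} tee "$auth_file" > /dev/null
  fi
}

# Usage on the server:
#   tb_authorize
```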


4. Nvidia driver

Although it is recognized, the eGPU won’t work without a driver. First, we need to add the Nvidia PPA:

$ sudo add-apt-repository ppa:graphics-drivers/ppa

Now we can install the driver, which will take a while. You can try the latest driver; I went with version 396:

$ sudo apt install nvidia-driver-396

Don’t forget to reboot after the driver installation:

$ sudo reboot

Now you can check if the driver has been installed correctly with:

$ nvidia-smi

The output should be your eGPU status, something like:

Thu Nov 29 06:14:17 2018       
+-----------------------------------------------------------------+
| NVIDIA-SMI 396.54 Driver Version: 396.54 |
|-------------------------------+---------------+-----------------+
| GPU Name Persistence-M| Bus-Id Disp.A | ... |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | ... |
|===============================+===============+=================|
| 0 GeForce GTX 1080 Off | ... | N/A |
| 37% 30C P8 8W / 180W | .../ 8119MiB | 0% Default |
+-------------------------------+---------------+-----------------+

+-----------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=================================================================|
| No running processes found |
+-----------------------------------------------------------------+
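
For a more compact check (handy in scripts), nvidia-smi can also print selected fields as CSV. The wrapper below is a sketch: gpu_summary and the NVIDIA_SMI override variable are names I made up, while the --query-gpu and --format options belong to nvidia-smi itself.

```shell
# Hypothetical helper: print one CSV line per GPU (name, driver, total memory).
# NVIDIA_SMI can be overridden, e.g. to point at a different binary.
gpu_summary() {
  smi="${NVIDIA_SMI:-nvidia-smi}"
  if ! command -v "$smi" > /dev/null 2>&1; then
    echo "nvidia-smi not found: is the driver installed?" >&2
    return 1
  fi
  "$smi" --query-gpu=name,driver_version,memory.total --format=csv,noheader
}

# Usage on the server:
#   gpu_summary
```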

5. Docker CE

You can find the full instructions on the Docker website. I will only list the commands needed here.

Setting up the repository

$ sudo apt update
$ sudo apt-get install \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"

Installing

$ sudo apt-get install docker-ce
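
Before moving on, it may be worth confirming that the Docker daemon actually answers. A minimal sketch, where docker_ready and the DOCKER override variable are hypothetical names of mine:

```shell
# Hypothetical helper: succeed if the Docker CLI is installed and the
# daemon answers. DOCKER can be overridden; the default is "sudo docker".
docker_ready() {
  ${DOCKER:-sudo docker} version > /dev/null 2>&1
}

# Usage on the server:
#   docker_ready && echo "Docker is up" || echo "Docker is not responding"
#   sudo docker run hello-world    # Docker's own end-to-end test image
```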

6. Nvidia Docker

To take advantage of the GPU’s power inside containers, we also need Nvidia Docker. The official instructions are available on their GitHub. As before, we need to add their repository first and then install the package. I list all the needed commands here.

Adding repository

$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/ubuntu18.04/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt update

Installing

$ sudo apt install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd
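
One way to confirm that the installation registered the Nvidia runtime is to look for it in the "Runtimes:" line of `docker info`. The helper below is a sketch under that assumption; has_nvidia_runtime is a hypothetical name of mine.

```shell
# Hypothetical helper: reads `docker info` output on stdin and succeeds
# if the "Runtimes:" line lists a runtime named "nvidia".
has_nvidia_runtime() {
  grep -E '^[[:space:]]*Runtimes:' | grep -qw 'nvidia'
}

# Usage on the server:
#   sudo docker info 2>/dev/null | has_nvidia_runtime \
#     && echo "nvidia runtime registered" \
#     || echo "nvidia runtime missing"
```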

7. Run a machine-learning-ready Docker image

Now the magic happens: we don’t need to install everything from scratch. First, ask yourself what kind of environment you want; there are plenty of ready-made images to choose from.

Here I will show you how to run the paperspace/fastai image:

$ sudo docker run --runtime=nvidia -d -p 8888:8888 \
paperspace/fastai:cuda9_pytorch0.3.0

To fully understand the parameters, I again strongly recommend reading Hamel Husain’s article “How Docker Can Help You Become A More Effective Data Scientist”. When Docker cannot find the image paperspace/fastai:cuda9_pytorch0.3.0 locally, it automatically looks it up on Docker Hub, downloads it, and initializes a corresponding container. This will take a while, as the Fast.ai image includes a large dataset of cat and dog photos. But no worries, it only happens the first time.
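
As a quick orientation, the flags of the run command can be annotated as in the sketch below. The wrapper name run_fastai, the IMAGE variable, and the DOCKER override are hypothetical additions of mine; the flags themselves are the ones used above.

```shell
IMAGE="paperspace/fastai:cuda9_pytorch0.3.0"

# Hypothetical wrapper around the run command from this post.
# $1: host port to expose Jupyter on (default 8888).
run_fastai() {
  host_port="${1:-8888}"
  # --runtime=nvidia  : use the Nvidia runtime so the container can see the GPU
  # -d                : detached, i.e. run in the background
  # -p HOST:CONTAINER : publish container port 8888 on the chosen host port
  ${DOCKER:-sudo docker} run --runtime=nvidia -d \
    -p "${host_port}:8888" "$IMAGE"
}

# Usage on the server:
#   run_fastai          # Jupyter reachable on port 8888
#   run_fastai 9999     # Jupyter reachable on port 9999 instead
```

Mapping a different host port is useful when 8888 is already taken on the server; the container side stays at 8888 either way.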

Once the container is up and running, you might not notice any change (I told you, it is lightweight). To get the list of running containers, use the command:

$ sudo docker ps -a -f status=running

The output should be something like:

CONTAINER ID    IMAGE                                  COMMAND ...                 
bfddf9c3e0fa paperspace/fastai:cuda9_pytorch0.3.0 "jupyter...

Note the container ID; you will need it later.

When you run this container, it also starts a Jupyter Notebook server on port 8888. If you are running Ubuntu Server like me, you will need the Jupyter Notebook token to access it from another computer. To run the command jupyter notebook list inside the container and get the token, call:

$ sudo docker exec -it your_container_id jupyter notebook list

Remember to replace your_container_id with the ID given in the previous step.
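
If you prefer not to copy the ID by hand, the two steps can be chained. This is a sketch only: fastai_container_id, jupyter_token, and the DOCKER override are hypothetical names of mine, and the token extraction assumes that jupyter notebook list prints URLs of the form http://host:port/?token=... followed by the notebook directory.

```shell
IMAGE="paperspace/fastai:cuda9_pytorch0.3.0"

# Print the ID(s) of running containers that were started from $IMAGE.
fastai_container_id() {
  ${DOCKER:-sudo docker} ps -q -f "ancestor=$IMAGE"
}

# Hypothetical helper: read `jupyter notebook list` output on stdin and
# print the token part of each server URL it finds.
jupyter_token() {
  sed -n 's/.*[?&]token=\([^ &]*\).*/\1/p'
}

# Usage on the server:
#   cid=$(fastai_container_id)
#   sudo docker exec "$cid" jupyter notebook list | jupyter_token
```

The usage lines drop -it because the output is piped into another command rather than shown in an interactive terminal.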

That’s it! It’s time for you to test your setup in Jupyter Notebook. ;)


Thank you for reading to this point!

I hope that you find this tutorial useful. Feel free to reach out to me on Facebook or GitHub.


Originally published at gist.github.com.