Nvidia eGPU + MacOS + TensorFlow-GPU? The Definitive Setup Guide to Avoid Headaches

Uxío Piñeiro
xplore.ai
Jan 30, 2019
One of our eGPUs with a Nvidia GTX 1080ti mounted

We have already suffered through it; let us save you a couple of days of desperation.

Tim Cook and Nvidia are going to go to hell for this

— Myself after a lot of hours of fighting

At xplore.ai we build Artificial Intelligence products using tailor-made algorithms. We create Deep Learning solutions, mainly in the area of Computer Vision, for clients who hold large amounts of data and need more customized solutions than existing APIs can provide. Our team’s main choice of local development hardware is Apple MacBooks, so we mainly work on MacOS. Although we started training our Deep Learning models in the cloud (AWS), it became more and more expensive as our pace of innovation increased. So we decided to build our own high-end desktop machine, which we call Franky, where we train large architectures and big Machine Learning models.

Training a GAN (Generative Adversarial Network) on a training set of 100,000 images with a single GPU can take several weeks

Franky: i7-7700K, 64 GB RAM, RTX 2080Ti

The main problem appeared when we started growing as a company: only one model could be trained at a time, so there was soon a queue of teammates waiting to train their own models for their own tasks. So we decided to investigate the possibility of giving a GPU to each data team member, so they could use it with their own laptop. A GPU is normally attached to a motherboard, which is capable of handling such a monster (PCIe), which is inside a big case, which is not a laptop… but… welcome to the world of external GPUs.

Thunderbolt 3: Taking USB-C ports to the next level

Taking into account that we already had high-end GPUs available, our best option was the Razer Core X, but you can select yours with the help of the great eGPU community. I’m not going to lie, we chose the Core X because of its powerful specs and, of course, its sleek design. So far so good, it hasn’t disappointed us; it has proven to be a high-end tool and is amazingly silent.

Note: the Thunderbolt 3 cable included with the Razer Core X is quite short (0.5m). Consider this, plus the fact that a 1m version can easily cost 40eur, and that we couldn’t find a 2m version suitable for our scenario.

So far everything looks fine: buy the case online, have it delivered in a few days (how nice it is to live in this world of popularized e-commerce), easily plug in your graphics card and we are done 😊. Not so fast… this is only the beginning of our odyssey 😦.

Note: We work with CUDA software so AMD wasn’t an option. If you’re a lucky AMD user, you can close the post here and enjoy your new GPU, lucky you!

I’m sure you have heard about the ridiculous war between Nvidia and Apple which leaves us, the consumers, totally stranded. Nvidia no longer distributes official drivers for Apple hardware, and MacOS is no longer officially compatible with Nvidia GPUs. So if you like to be up to date with the latest Apple software, prepare to say goodbye to Mojave and its very high-end dark mode technology because, at least at the time I’m writing this, we need to downgrade to High Sierra.

Once we have our clean MacOS 10.13 ready to be eGPU boosted, we need to start tricking the system; again, the eGPU community has our backs. You can follow the official Purge-Wrangler.sh guide here or continue reading, because I’m going to go through the whole process as gently as I can.

Installing eGPU on MacOS

1. Disable SIP

Reboot the system into Recovery Mode (⌘+R during boot), then in the upper bar open Utilities > Terminal and:

csrutil disable

Reboot again, this time normally.
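A quick optional sanity check once you’re back in a normal session, to confirm SIP really is disabled before going any further:

csrutil status

It should report “System Integrity Protection status: disabled.”.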

2. Install Purge-Wrangler.sh

With the eGPU plugged in before login, open the Terminal app (iTerm is not supported):

curl -s "https://api.github.com/repos/mayankk2308/purge-wrangler/releases/latest" | grep '"browser_download_url":' | sed -E 's/.*"([^"]+)".*/\1/' | xargs curl -L -s -0 > purge-wrangler.sh && chmod +x purge-wrangler.sh && ./purge-wrangler.sh && rm purge-wrangler.sh

For future uses:

purge-wrangler

NOTE: The script will ask for permissions, don’t panic, this thing has been well tested. It shouldn’t break anything, but shit happens, so it’s not a bad idea to make a backup.

A very simple menu will be displayed in the terminal; select the setup you need (Nvidia eGPU installation is option 2) and reboot after the script has done its thing.

eGPU Plug / Unplug recommendations

Until MacOS natively supports hot-unplugging, you should unplug your eGPU only after shutdown, or while the Mac is off during a reboot. Theoretically you can plug in your eGPU at any moment, but it is recommended to do it before login.

As a first test, go to About This Mac from the Apple menu in the upper-left corner of your screen. You should see something like this:

See eGPU Detected Correctly
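If you prefer the terminal over About This Mac, you can also ask MacOS to list the graphics hardware it currently sees; the eGPU should appear as an additional entry:

system_profiler SPDisplaysDataType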

At this point you are very likely in one of these three situations:

  1. Everything went well and you’re already enjoying the power of your new eGPU-boosted Mac. Continue.
  2. Everything went well but you are experiencing a bit of lag in the MacOS UI when eGPU acceleration is in use. Try using DisableMonitor to disable the built-in Retina display; not fancy, but it worked for me. If you find a better solution please leave a comment with it.
  3. Something went wrong. Prepare yourself for a wonderful journey across the forums to identify your problem and find a solution for it.

Once you have it working, I personally recommend trying some eGPU-boosted applications; you can test it with a game or a benchmark like Valley Benchmark (so satisfying, by the way). But surely you want to test your eGPU boost with your heavy Deep Learning models.

I’d bet that if you had problems installing the eGPU, after all the time spent you are rethinking how worthwhile going through all that hell really was. But keep in mind that most of the way is done, congratulations! Maybe.

Remember when I wrote about that war between Nvidia and Apple? Well, its consequences didn’t end there. If you’re a Data Scientist who has worked a bit with TensorFlow you surely know this, but if that’s not the case let me remind you: TensorFlow GPU works with CUDA, Nvidia software, so since Nvidia hardware is no longer officially supported by Apple, it no longer makes sense for the TensorFlow team to provide official MacOS TensorFlow-GPU support.

Again we are going to trick the system. In the best case you will only need the latest tensorflow-gpu-macosx pre-built package, which you can try with:

pip3 install tensorflow-gpu-macosx
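If the install finishes cleanly, a quick import is usually enough to check that the wheel actually loads on your setup (this assumes the package exposes the usual tensorflow module name):

python3 -c "import tensorflow as tf; print(tf.__version__)"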

NOTE: If your specific configuration matches CUDA 10, cuDNN 7.4 and Python 3.6.8, we have this gift for you, our own TensorFlow 1.8 GPU build wheel. Download it and try:

pip install *.whl

If this is not your case, welcome to the tutorial on how to build TensorFlow GPU for your own configuration. This tutorial is based on this great repository by @zylo117, to which we have contributed. You can follow the process there or continue reading.

How to build Tensorflow GPU

1. You need to have a working CUDA and cuDNN environment. If you come here as a Deep Learning developer you have probably done this many times. If CUDA is installed you can check the version with:

nvcc -V

Otherwise: CUDA Toolkit | cuDNN installations (NVIDIA Developer account needed; follow the instructions)
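It can also save you a failed build later to confirm which cuDNN version you actually have. Assuming the cuDNN headers were copied into the default CUDA location (adjust the path if you installed them elsewhere):

grep -A 2 "define CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h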

2. Install Xcode / Command Line Tools 9.3+ (recommended; older versions may work)

You can easily check the installed version with:

gcc -v

Otherwise: Xcode | Command Line Tools installations (Apple Developer account needed)
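If you keep several toolchains around, it’s also worth checking which developer directory is currently active, since that is what gcc -v ends up reporting:

xcode-select -p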

NOTE: If you want to downgrade your Command Line Tools, it is recommended to remove the old files first:

rm -rf /Library/Developer/CommandLineTools/

3. Install CoreUtils & LLVM by Homebrew

If you don’t know what Homebrew is you should read about it. Once you have it:

brew install coreutils llvm
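One detail worth knowing: Homebrew installs llvm as keg-only, so it isn’t added to your PATH automatically. If the build later complains that it can’t find the LLVM tools, exporting the path is usually enough (the /usr/local prefix assumes a standard Homebrew install on an Intel Mac):

export PATH="/usr/local/opt/llvm/bin:$PATH"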

4. Install OpenMP

brew install cliutils/apple/libomp

Easy steps.

5. Install Bazel 0.16.1 from GitHub (newer or older versions may cause the build to fail)

Follow Bazel instructions.
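Once installed, double-check that 0.16.1 is the version actually on your PATH:

bazel version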

6. Clone Tensorflow-GPU-MacOS repo

git clone https://github.com/zylo117/tensorflow-gpu-macosx
cd tensorflow-gpu-macosx

7. Config (to your custom CUDA and cuDNN build)

./configure
#Please specify the location of python.: Accept the default option
#Please input the desired Python library path to use.: Accept the default option
#Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: n
#Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: n
#Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: n
#Do you wish to build TensorFlow with XLA JIT support? [y/N]: n
#Do you wish to build TensorFlow with GDR support? [y/N]: n
#Do you wish to build TensorFlow with VERBS support? [y/N]: n
#Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: n
#Do you wish to build TensorFlow with CUDA support? [y/N]: y
#Please specify the CUDA SDK version you want to use, e.g. 7.0.: Put yours
#Please specify the location where CUDA 9.1 toolkit is installed.: Accept the default option
#Please specify the cuDNN version you want to use.: Put yours
#Please specify the location where cuDNN 7 library is installed.: Accept the default option
##Please specify a list of comma-separated Cuda compute capabilities you want to build with.
##You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. (GTX10X0: 6.1, GTX9X0: 5.2)
#Please note that each additional compute capability significantly increases your build time and binary size.: 6.1,5.2,5.0,3.0
#Do you want to use clang as CUDA compiler? [y/N]: n
#Please specify which gcc should be used by nvcc as the host compiler.: Accept the default option
#Do you wish to build TensorFlow with MPI support? [y/N]: n
#Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified: Accept the default option
#Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: n
# If you haven't done this before:
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
export PATH=$DYLD_LIBRARY_PATH:$PATH

# bazel clean --expunge # Use this if you failed to compile before.
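Note that these exports only live in the current shell session. If you expect to rebuild or use the CUDA tools regularly, it may be convenient to persist them (assuming bash, the default shell on High Sierra):

echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bash_profile
echo 'export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/usr/local/cuda/extras/CUPTI/lib' >> ~/.bash_profile
echo 'export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH' >> ~/.bash_profile
echo 'export PATH=$DYLD_LIBRARY_PATH:$PATH' >> ~/.bash_profile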

8. Build

This step will take a few hours; it’s time to run the following command and watch one of the movies you’ve been procrastinating on forever 🍿.

bazel build --config=cuda --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package
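If the build brings your machine to its knees (Bazel will happily use every core and a lot of RAM), you can throttle it with the --jobs flag, which limits the number of concurrent build actions:

bazel build --jobs=4 --config=cuda --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package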

9. NCCL Patch

You can compile NCCL_OPS:

gcc -c -fPIC ./nccl_patched/nccl_ops.cc -o ./nccl_patched/_nccl_ops.o
gcc ./nccl_patched/_nccl_ops.o -shared -o ./nccl_patched/_nccl_ops.so

Then replace the original nccl lib:

mv ./bazel-bin/tensorflow/contrib/nccl/python/ops/_nccl_ops.so ./bazel-bin/tensorflow/contrib/nccl/python/ops/_nccl_ops.so.bk
cp ./nccl_patched/_nccl_ops.so ./bazel-bin/tensorflow/contrib/nccl/python/ops/

10. Build Python binding

./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Finally, install the Python wheel:

pip3 install /tmp/tensorflow_pkg/*.whl

And don’t forget to share your .whl with the world by forking + pull requesting here; you can save a lot of time for a lot of people, think about it that way.

Verify your tensorflow-gpu installation

python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
...
#Something like
...
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.7845
pciBusID: 0000:c4:00.0
totalMemory: 8.00GiB freeMemory: 5.62GiB
...
2019-01-25 18:22:02.776549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5402 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:c4:00.0, compute capability: 6.1)
>>> print(sess.run(hello))
b'Hello, TensorFlow!'

*Run this outside of the tensorflow-gpu-macosx dir.
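If you want a more explicit confirmation than the session log above, asking TensorFlow 1.x to enumerate its devices should list the GPU alongside the CPU:

python3 -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"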

If you got here, congratulations, you’ve made it! 👏👏👏

I hope you have found this guide useful. If so, don’t forget to clap, share and subscribe! Should you have any change requests for this guide, you can leave a comment below. I’ll try to answer / redirect everything as fast as I can.

Check out our services and explorations with Deep Learning, Machine Learning, Computer Vision and GANs on our LinkedIn page, Twitter and @xplore.ai Instagram account, and don’t forget to follow us on Medium so you don’t miss any posts like this in the future.
