Speed Up PyTorch by Building from Source on Ubuntu 18.04

Zhanwen Chen
Published in Repro Repo
Apr 4, 2019

In my experience, building PyTorch from source reduced training time from 35 seconds to 24 seconds per epoch for an AlexNet-like problem with CUDA, and from 61 seconds to 37 seconds on CPU-only. It is not clear to me why this is the case, and I may delve into this issue at a later time. For those who would like to experiment with this potential improvement, I wrote a guide that is hopefully clearer than the PyTorch official paragraphs on building from source. I will provide complete steps for both CUDA and CPU-only installs.

Note: at the time of writing, the latest Python is 3.7.3 and NumPy 1.16.3. Feel free to change these versions for your build.

1. With CUDA

As of this writing, PyTorch officially supports CUDA only up to 10.0; CUDA 10.1 will have to wait until a MAGMA package built against it becomes available.

When building anything, it’s safer to do it in a conda environment lest you mess up and pollute your system environment. Let’s first install the prerequisite packages. Make sure you are using this environment for the rest of the article.

conda create --name pytorch-build python=3.7.3 numpy=1.16.3
conda activate pytorch-build # or `source activate pytorch-build`
conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c pytorch magma-cuda100

Then, we want to point CMake (the build tool) at our conda environment so it finds the right dependencies and install location.

export CMAKE_PREFIX_PATH="$HOME/anaconda3/envs/pytorch-build"
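If your Anaconda lives somewhere other than `~/anaconda3`, a more portable alternative (not part of the original steps) is to derive the prefix from the active environment, since `conda activate` sets `CONDA_PREFIX`:

```shell
# Derive the prefix from whichever conda env is active; fall back to the
# hard-coded path used above if CONDA_PREFIX is unset.
export CMAKE_PREFIX_PATH="${CONDA_PREFIX:-$HOME/anaconda3/envs/pytorch-build}"
echo "CMAKE_PREFIX_PATH=$CMAKE_PREFIX_PATH"
```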

Some Anaconda versions ship a compatibility linker that can break builds. As a precaution, we temporarily rename it, and will rename it back later:

cd ~/anaconda3/envs/pytorch-build/compiler_compat
mv ld ld-old

Now, for some reason, PyTorch cannot find OpenMP out of the box, so we have to explicitly install OpenMP, a library for better CPU multi-threading:

sudo apt-get install libomp-dev
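If you want to confirm the library is actually visible to the linker afterwards, a quick check (my addition, not part of the original steps) is to look it up in the dynamic linker cache:

```shell
# Check the dynamic linker cache for libomp; print a hint if it is missing.
if ldconfig -p 2>/dev/null | grep -q libomp; then
  echo "libomp found"
else
  echo "libomp missing; run: sudo apt-get install libomp-dev"
fi
```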

Now that we’ve done all the prep work, download PyTorch code into your home folder for convenience.

cd ~
git clone --recursive https://github.com/pytorch/pytorch

You may be able to skip this paragraph, but for the sake of completeness: there used to be an issue with version 0.17.3 of the Intel ideep/mkl-dnn submodule on which PyTorch depended. Intel has since updated the submodule to 0.18.1, so you shouldn't have to deal with it. If your build does give you any trouble, you may be able to find a fix in the relevant PyTorch GitHub thread.

If you have multiple CUDA versions on your machine and want to specify which one to use (or your system fails to locate nvcc), you must set the path to nvcc via CUDA_NVCC_EXECUTABLE. For example, if plain nvcc doesn't work and you want to use CUDA 10.0, do:

export CUDA_NVCC_EXECUTABLE="/usr/local/cuda-10.0/bin/nvcc"
export CUDA_HOME="/usr/local/cuda-10.0"
export CUDNN_INCLUDE_PATH="/usr/local/cuda-10.0/include/"
export CUDNN_LIBRARY_PATH="/usr/local/cuda-10.0/lib64/"
export LIBRARY_PATH="/usr/local/cuda-10.0/lib64"
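A quick way to confirm the toolkit you just pointed at is the one being picked up (my own check, with paths you should adjust to your install):

```shell
# Show which nvcc is on PATH and print its version, or warn if none is found.
if command -v nvcc >/dev/null; then
  nvcc --version
else
  echo "nvcc not on PATH; relying on CUDA_NVCC_EXECUTABLE=${CUDA_NVCC_EXECUTABLE:-unset}"
fi
```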

Anyway, you can now build and install the library with the following steps:

export USE_CUDA=1 USE_CUDNN=1 USE_MKLDNN=1
cd ~/pytorch
python setup.py install

Now remember to rename back the Anaconda compiler linker:

cd ~/anaconda3/envs/pytorch-build/compiler_compat
mv ld-old ld

You are done!
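Before moving on, a one-line sanity check (my addition, not from the original steps) that the freshly built torch imports and sees your GPU:

```python
# Post-install sanity check: import the freshly built torch and report
# whether CUDA is usable. Falls back to a message if the import fails.
try:
    import torch
    status = f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"
except ImportError:
    status = "torch not importable; the build/install did not succeed"
print(status)
```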

2. Without CUDA (CPU-only)

This is easier than the CUDA version. Let's set up the environment the same way, minus the MAGMA-CUDA package. Make sure you are using this environment for the rest of the article.

conda create --name pytorch-build python=3.7.3 numpy=1.16.3
conda activate pytorch-build
conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing

Then, we want to point CMake (the build tool) at our conda environment so it finds the right dependencies and install location.

export CMAKE_PREFIX_PATH="$HOME/anaconda3/envs/pytorch-build"

Some Anaconda versions ship a compatibility linker that can break builds. As a precaution, we temporarily rename it, and will rename it back later:

cd ~/anaconda3/envs/pytorch-build/compiler_compat
mv ld ld-old

Now, for some reason, PyTorch cannot find OpenMP out of the box, so we have to explicitly install OpenMP, a library for better CPU multi-threading:

sudo apt-get install libomp-dev

Now that we’ve done all the prep work, download PyTorch code into your home folder for convenience.

cd ~
git clone --recursive https://github.com/pytorch/pytorch

You may be able to skip this paragraph, but for the sake of completeness: there used to be an issue with version 0.17.3 of the Intel ideep/mkl-dnn submodule on which PyTorch depended. Intel has since updated the submodule to 0.18.1, so you shouldn't have to deal with it. If your build does give you any trouble, you may be able to find a fix in the relevant PyTorch GitHub thread.

Anyway, you can now build and install the library with the following steps:

export USE_CUDA=0 USE_CUDNN=0 USE_MKLDNN=1
cd ~/pytorch
python setup.py install

Now remember to rename back the Anaconda compiler linker:

cd ~/anaconda3/envs/pytorch-build/compiler_compat
mv ld-old ld

You are done!
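As with the CUDA build, a quick sanity check (my addition) is worthwhile here. For a CPU-only build, torch should import and CUDA should report as unavailable, since we built with USE_CUDA=0:

```python
# Sanity check for the CPU-only build: torch should import, and CUDA should
# report as unavailable since we built with USE_CUDA=0.
try:
    import torch
    cuda_built = torch.cuda.is_available()
    print(f"torch {torch.__version__}, CUDA available: {cuda_built}")
except ImportError:
    cuda_built = None
    print("torch not importable; the build/install did not succeed")
```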

3. Verify your installation

Still in the pytorch-build environment, let's run some examples to make sure your installation is correct. First, let's build the torchvision library from source.

cd ~
git clone https://github.com/pytorch/vision.git
cd vision
python setup.py install

Next, we must install tqdm (a dependency for downloading torchvision datasets) with pip in order to run the MNIST example; otherwise the download will error out.

pip install tqdm

Now download the examples and run MNIST:

cd ~
git clone https://github.com/pytorch/examples.git
cd examples/mnist
python main.py
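If you want to reproduce the per-epoch timing comparison from the introduction, you can time the training loop yourself. Here is a minimal sketch: `train_one_epoch` is a hypothetical stand-in, and in practice you would wrap the epoch loop inside main.py the same way:

```python
import time

def train_one_epoch():
    # Hypothetical stand-in for one epoch of the MNIST training loop.
    total = 0
    for i in range(100_000):
        total += i * i
    return total

start = time.perf_counter()
train_one_epoch()
elapsed = time.perf_counter() - start
print(f"epoch time: {elapsed:.3f}s")
```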

Voilà!
