Build MXNet 1.9.1 with CUDA 11.7 and MKL BLAS using Visual Studio 2019 on Windows 10

Aug 28, 2022

1. Install Visual Studio 2019 with its C++ module (VS 2019 version 16.11.17)

2. Install NVIDIA CUDA 11.7 Update 1 and the corresponding cuDNN 8.4.1
2.1 Use the CUDA custom installation; only the development, runtime and Visual Studio integration components are needed
2.1.1 The display driver update can optionally be included
2.2 Extract cuDNN and merge its components into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7
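
The merge in 2.2 can also be scripted. The following is a minimal sketch, assuming the extracted cuDNN archive has bin, include and lib subfolders that mirror the toolkit; that layout and the function name merge_cudnn are assumptions, so check where the .lib files actually land (the toolkit keeps its import libraries under lib\x64).

```python
import shutil
from pathlib import Path


def merge_cudnn(cudnn_root, cuda_root, subdirs=("bin", "include", "lib")):
    """Copy each cuDNN subfolder over the same-named CUDA toolkit subfolder.

    The subfolder names are an assumption about the archive layout.
    Returns the list of subfolders that were actually copied.
    """
    copied = []
    for name in subdirs:
        src = Path(cudnn_root) / name
        if not src.is_dir():
            continue  # layout differs from the assumption; nothing to copy
        # dirs_exist_ok (Python 3.8+) merges into an existing directory tree
        shutil.copytree(src, Path(cuda_root) / name, dirs_exist_ok=True)
        copied.append(name)
    return copied
```

Run it from an admin prompt, since the target sits under C:\Program Files, for example merge_cudnn(r"C:\path\to\extracted\cudnn", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7"), where the first path is a placeholder.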

3. Install Chocolatey (in an admin PowerShell)

4. Install packages with choco (in an admin cmd)
4.1 choco install python --version=3.9.13 (by default, pip and setuptools will be installed)
4.1.1 It installs to c:\python39
4.2 choco install opencv
4.2.1 By default → c:\tools\opencv, ver 4.5.5
4.2.2 Set the environment variable OpenCV_DIR=C:\tools\opencv\build and add C:\tools\opencv\build\x64\vc14\bin and C:\tools\opencv\build\x64\vc15\bin to PATH
4.3 choco install cmake
4.3.1 → c:\program files\cmake, ver 3.24.0
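
Before moving on, it may help to confirm the settings from 4.2.2 took effect in a freshly opened cmd window. A small sketch follows; the function name missing_opencv_settings is made up for illustration and the default paths are simply the ones listed above.

```python
import os


def missing_opencv_settings(
    environ,
    opencv_dir=r"C:\tools\opencv\build",
    bin_dirs=(
        r"C:\tools\opencv\build\x64\vc14\bin",
        r"C:\tools\opencv\build\x64\vc15\bin",
    ),
):
    """Return a list of problems found; an empty list means all is set."""
    problems = []
    current = environ.get("OpenCV_DIR", "").rstrip("\\").lower()
    if current != opencv_dir.lower():
        problems.append("OpenCV_DIR is not set to " + opencv_dir)
    # Windows separates PATH entries with ';' and ignores case in paths
    entries = {p.rstrip("\\").lower() for p in environ.get("PATH", "").split(";")}
    for d in bin_dirs:
        if d.lower() not in entries:
            problems.append("PATH is missing " + d)
    return problems
```

Calling missing_opencv_settings(os.environ) in a new Python session should return an empty list once both settings are in place.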

5. Install oneMKL (MKLBLAS)
5.1 Download Intel oneAPI Math Kernel Library for Windows, 2022.1.0, from https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#inpage-nav-9-5-undefined
5.2 Install with the default options (including VS 2019 integration)

6. Get the MXNet 1.9.1 patch release source package, apache-mxnet-src-1.9.1-incubating.tar.gz, from https://mxnet.apache.org/get_started/download
6.0 Extract to c:\, rename the root folder to c:\mxnet
6.1 Fix UTF-8 issues
6.1.1 NEWS.md (under mxnet root)
6.1.1.1 line 1790 replace … with ... (ellipsis)
6.1.1.2 line 2314 replace … with ... (ellipsis)
6.1.2 np_pad_op-inl.h (mxnet\src\operator\numpy)
6.1.2.1 line 81 replace … with ... (ellipsis)
6.1.2.2 line 109 replace ‘constant’ with 'constant' (straight single quotes)
6.1.2.3 lines 118, 119, 120 replace all four curly quotes ‘ ’ with straight single quotes '
6.2 Create c:\mxnet\build
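
The edits in 6.1 can be done by hand, or with a short script such as the sketch below. Note that it is an approximation: it replaces every ellipsis character and curly single quote in the file, not only the ones on the listed lines, and the helper names asciify and fix_file are made up for illustration.

```python
from pathlib import Path

# Typographic characters named in step 6.1 and their ASCII stand-ins
REPLACEMENTS = {"\u2026": "...", "\u2018": "'", "\u2019": "'"}


def asciify(text):
    """Replace the ellipsis character and curly single quotes with ASCII."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


def fix_file(path):
    """Rewrite the file in place; return True if anything changed."""
    p = Path(path)
    # Work on bytes so that the original line endings are preserved
    original = p.read_bytes().decode("utf-8")
    fixed = asciify(original)
    if fixed != original:
        p.write_bytes(fixed.encode("utf-8"))
    return fixed != original
```

It would be applied to the two files from 6.1, i.e. fix_file(r"C:\mxnet\NEWS.md") and fix_file(r"C:\mxnet\src\operator\numpy\np_pad_op-inl.h").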

7. Run cmake
7.1 Point “source” to C:\mxnet
7.2 Point “binaries” to C:\mxnet\build
7.3 Click Configure
7.3.1 “generator”: Visual Studio 16 2019, “platform”: x64, use default …
7.3.2 Click Finish, then set the following options:
7.3.2.1 BLAS MKL
7.3.2.2 BUILD_CPP_EXAMPLES disable
7.3.2.3 BUILD_TESTING disable
7.3.2.4 MKL_INCLUDE_DIR C:\Program Files (x86)\Intel\oneAPI\mkl\2022.1.0\include
7.3.2.5 MKL_ROOT C:\Program Files (x86)\Intel\oneAPI
7.3.2.6 MXNET_CUDA_ARCH 8.6 (or 6.1;7.5;8.6 etc., depending on the NVIDIA GPU hardware)
7.3.2.7 USE_MKLDNN enable (as needed)
7.3.2.8 Configure again (this may take several rounds of adjusting options and clicking Configure)
7.4 Generate
7.5 Open project (vs 2019 will be launched)
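
The steps above use the cmake GUI. For reference, the same options could in principle be passed on the command line; the sketch below only assembles such a command from the values in step 7. This command-line route is an assumption and was not part of the original procedure, and the function name cmake_configure_command is hypothetical.

```python
def cmake_configure_command(
    source=r"C:\mxnet",
    build=r"C:\mxnet\build",
    cuda_arch="8.6",
    use_mkldnn=True,
):
    """Build a cmake argument list mirroring the GUI settings in step 7."""
    options = {
        "BLAS": "MKL",
        "BUILD_CPP_EXAMPLES": "OFF",
        "BUILD_TESTING": "OFF",
        "MKL_INCLUDE_DIR": r"C:\Program Files (x86)\Intel\oneAPI\mkl\2022.1.0\include",
        "MKL_ROOT": r"C:\Program Files (x86)\Intel\oneAPI",
        "MXNET_CUDA_ARCH": cuda_arch,
        "USE_MKLDNN": "ON" if use_mkldnn else "OFF",
    }
    # -G picks the generator, -A the platform, -S/-B the source and build dirs
    cmd = ["cmake", "-G", "Visual Studio 16 2019", "-A", "x64", "-S", source, "-B", build]
    cmd += [f"-D{key}={value}" for key, value in options.items()]
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True) would run the configure and generate steps in one go; opening the generated solution in VS 2019 is then done by hand.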

8. In VS 2019
8.1 Choose the Release x64 configuration
8.2 Menu Debug → ALL_BUILD Debug Properties
8.2.1 CUDA C/C++ → Target Machine Platform: 64-bit
8.2.2 CUDA C/C++ → Device → Code Generation: compute_86,sm_86
8.3 Build Solution; this takes about 1 to 2 hours or longer, depending on computer performance
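
The Code Generation value in 8.2.2 follows from the MXNET_CUDA_ARCH value chosen in 7.3.2.6. A tiny sketch of that mapping is shown below. The single-architecture case (8.6 to compute_86,sm_86) comes from the steps above; joining several architectures with semicolons is an assumption about how the Visual Studio field is filled in, and the function name is hypothetical.

```python
def code_generation(arch_list):
    """Turn '6.1;7.5;8.6' into 'compute_61,sm_61;compute_75,sm_75;compute_86,sm_86'."""
    entries = []
    for arch in arch_list.split(";"):
        digits = arch.strip().replace(".", "")
        if not digits.isdigit():
            raise ValueError("not a compute capability: " + repr(arch))
        entries.append(f"compute_{digits},sm_{digits}")
    return ";".join(entries)
```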

9. Generate op.h (for the C++ binding)
9.1 Copy the following DLLs to C:\mxnet\cpp-package\scripts
9.1.1 cudart64_110.dll, cudnn64_8.dll, cufft64_10.dll from C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin
9.1.2 libmxnet.dll, mxnet_86.dll from C:\mxnet\build\Release
9.1.3 mkl_rt.2.dll from C:\Program Files (x86)\Intel\oneAPI\mkl\2022.1.0\redist\intel64
9.1.4 zlibwapi.dll from http://www.winimage.com/zLibDll/zlib123dllx64.zip (see https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-zlib-windows)
9.2 Add os.add_dll_directory(r'C:\mxnet\cpp-package\scripts') in OpWrapperGenerator.py
9.3 In cmd window, C:\mxnet\cpp-package\scripts, run python OpWrapperGenerator.py libmxnet.dll
9.4 op.h will be generated in C:\mxnet\cpp-package\include\mxnet-cpp
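
Copying the DLLs in 9.1 by hand is easy to get wrong, so a helper that reports anything it could not find may be useful. This is only a sketch; the function name collect_dlls and the dictionary layout are made up for illustration.

```python
import shutil
from pathlib import Path


def collect_dlls(wanted, dest):
    """Copy DLLs into dest; wanted maps a source folder to DLL file names.

    Returns the full paths of the files that were not found.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    missing = []
    for src_dir, names in wanted.items():
        for name in names:
            src = Path(src_dir) / name
            if src.is_file():
                shutil.copy2(src, dest / name)
            else:
                missing.append(str(src))
    return missing
```

For example, collect_dlls({r"C:\mxnet\build\Release": ["libmxnet.dll", "mxnet_86.dll"], r"C:\Program Files (x86)\Intel\oneAPI\mkl\2022.1.0\redist\intel64": ["mkl_rt.2.dll"]}, r"C:\mxnet\cpp-package\scripts") covers 9.1.2 and 9.1.3; the CUDA DLLs from 9.1.1 can be added to the dictionary in the same way.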

10. Deploy
10.1 Method 1 — deploy directly
10.1.1 In an admin cmd, from c:\mxnet\python, run python setup.py install
10.2 Method 2 — create wheel and pip deploy (pip install wheel module first)
10.2.1 Update c:\mxnet\python\setup.py
10.2.1.1 Comment out
data_files=[('mxnet', [LIB_PATH[0]])],
10.2.1.2 Add line
include_package_data=True,
10.2.2 Create a text file MANIFEST.in
10.2.2.1 add line in MANIFEST.in
recursive-include mxnet *
10.2.3 Copy the following DLL files into the c:\mxnet\python\mxnet folder
cudart64_110.dll, cudnn64_8.dll, cufft64_10.dll, libmxnet.dll, mxnet_86.dll, mkl_rt.2.dll, opencv_world455.dll (from opencv\build\x64\vc14\bin)
zlibwapi.dll from http://www.winimage.com/zLibDll/zlib123dllx64.zip (see https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#install-zlib-windows)
10.2.4 In c:\mxnet\python run python setup.py bdist_wheel --plat-name win_amd64
10.2.5 Wheel will be created under C:\mxnet\python\dist with name: mxnet-1.9.1-py3-none-win_amd64.whl
10.2.6 Under dist subfolder run pip install mxnet-1.9.1-py3-none-win_amd64.whl
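
Since the point of Method 2 is to ship the DLLs inside the wheel, it can be worth checking that they really ended up there before installing. A wheel is a zip archive, so a sketch like the following lists the bundled DLLs; the function name dlls_in_wheel is made up for illustration.

```python
import zipfile


def dlls_in_wheel(wheel_path):
    """Return the sorted archive paths of all .dll files inside a wheel."""
    with zipfile.ZipFile(wheel_path) as whl:
        return sorted(n for n in whl.namelist() if n.lower().endswith(".dll"))
```

Running dlls_in_wheel(r"C:\mxnet\python\dist\mxnet-1.9.1-py3-none-win_amd64.whl") should list every file copied in 10.2.3; if the list is empty, the setup.py and MANIFEST.in changes in 10.2.1 and 10.2.2 are the first place to look.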
10.3 Method 3 — download all dependencies first and then install offline
10.3.1 Create a folder, e.g. dependencies
10.3.2 Create requirements.txt with following content
certifi==2022.6.15
charset-normalizer==2.1.0
graphviz==0.8.4
idna==3.3
numpy==1.23.1
requests==2.28.1
urllib3==1.26.11
10.3.3 Run pip download -r requirements.txt
10.3.4 When installing, under the dependencies folder run pip install --no-index --find-links ./ -r requirements.txt
10.3.5 Install the mxnet wheel as in 10.2.6
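
After an offline install, one way to confirm that the pinned versions from requirements.txt are the ones actually installed is sketched below. It only handles plain name==version lines like those in 10.3.2, and the function names parse_pins and mismatches are made up for illustration.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping anything else."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def mismatches(pins):
    """Return {name: (wanted, found)} for packages that differ or are absent."""
    bad = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            bad[name] = (wanted, found)
    return bad
```

With the file from 10.3.2, mismatches(parse_pins(open("requirements.txt").read())) should return an empty dict on the target machine.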

11. Validate
11.1 Run python
import mxnet as mx
a = mx.nd.ones(1)
b = mx.nd.ones(1,mx.gpu())
a.asnumpy()
b.asnumpy()
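
A slightly fuller check than the lines above is sketched here. It reports the version and GPU count and, if a GPU is visible, does one small computation on it. The helper name check_mxnet is made up; the mxnet calls it relies on (mx.context.num_gpus, mx.nd.ones with ctx, as_in_context, asscalar) are from the MXNet 1.x API.

```python
def check_mxnet(mx):
    """Summarize what an imported mxnet module can do on this machine."""
    report = {"version": mx.__version__, "gpus": mx.context.num_gpus()}
    cpu_ones = mx.nd.ones((2, 2), ctx=mx.cpu())
    report["cpu_sum"] = float(cpu_ones.sum().asscalar())
    if report["gpus"] > 0:
        # Move the CPU array to the first GPU and add it to a GPU array
        gpu_ones = mx.nd.ones((2, 2), ctx=mx.gpu(0))
        total = cpu_ones.as_in_context(mx.gpu(0)) + gpu_ones
        report["gpu_sum"] = float(total.sum().asscalar())
    return report
```

In a Python session, run import mxnet as mx and then print(check_mxnet(mx)). On a working GPU build the report should show cpu_sum 4.0 and gpu_sum 8.0; a missing gpu_sum means no GPU was detected.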

Written by Relax Li