Build OpenCV from source with CUDA for GPU access on Windows

Ankit Kumar Singh
Published in Analytics Vidhya
Oct 4, 2020

Introduction

OpenCV is an extremely useful library for computer vision. A common issue Python programmers face is that when the module is installed directly with “pip” or “conda”, it runs inference on the CPU. OpenCV's deep learning module, “DNN”, uses the CPU for its computation by default.
OpenCV with GPU access can improve performance several times over, depending on the GPU's capability. For this to work we have to compile the OpenCV source code with CUDA and cuDNN support for an Nvidia GPU, using tools like CMake and Visual Studio, which builds with Microsoft's MSVC C++ compiler rather than GCC.
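To make the CPU default concrete, here is a minimal sketch of how a DNN network is asked to use the GPU once a CUDA-enabled build exists. The helper name use_cuda_backend and the model file name are my own illustrations; the cv2 module is passed in as a parameter only so that the sketch can be read and checked without OpenCV installed.

```python
def use_cuda_backend(net, cv2_module):
    """Ask an OpenCV DNN network to run on the GPU instead of the CPU.

    net        -- a network object such as the one returned by cv2.dnn.readNet
    cv2_module -- the imported cv2 module (passed in for easy testing)
    """
    net.setPreferableBackend(cv2_module.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2_module.dnn.DNN_TARGET_CUDA)
    return net


# Typical use, assuming a CUDA-enabled build and a hypothetical model file:
#   import cv2
#   net = cv2.dnn.readNet("model.onnx")
#   net = use_cuda_backend(net, cv2)
```

With the pip or conda packages these two calls do not give GPU inference, because those builds are compiled without CUDA; that is the gap the rest of this post fills.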
The main reason I wrote this blog is that I spent a huge amount of time searching for installation instructions and found no proper documentation for my case. It's easy to install OpenCV on a Linux machine, but it is hard on Windows.
For installing OpenCV on a Linux machine:
https://www.pyimagesearch.com/2016/07/11/compiling-opencv-with-cuda-support/
This blog is for building OpenCV from source on a Windows machine.

Our Instance’s specifications

Just for reference, I am building OpenCV on an AWS EC2 g4.xlarge instance running a Windows AMI, with a Tesla T4 GPU (16 GB of memory) and a 4-core CPU. I’m accessing this instance through an RDP (Remote Desktop Protocol) connection from my local Linux machine.
Anyway, this blog applies to any Windows machine with a good Nvidia GPU.

Steps

The main idea here is to download the opencv and opencv-contrib source packages, then configure them with CMake and compile (build) them with Visual Studio in a folder named “build”.

1. Download and install Visual Studio 2019

1.1. Download the latest Community edition of Visual Studio (in my case, VS 2019): https://visualstudio.microsoft.com/downloads/
1.2. Select the “Desktop development with C++” workload, continue with the defaults, and install.

2. Download and install CMake (my version 3.18.3)

2.1. Download the installer from https://cmake.org/download/ and install it.

3. Install CUDA and cuDNN according to your GPU

3.1. Follow my other post: https://medium.com/@ankitkumar60323/installing-cuda-and-cudnn-on-windows-d44b8e9876b5
3.2. Look up the compute capability of your GPU (the “architecture binary” that CMake asks for later) on this Wikipedia page:
https://en.wikipedia.org/wiki/CUDA
In my case the architecture binary is 7.5, the CUDA version is 10.1, and the cuDNN version is 7.6.5.
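As a small illustration of that lookup, the sketch below maps a GPU model name to its compute capability. The table is deliberately tiny and is my own selection of well-known cards, not something from the original setup; for any other model, the Wikipedia table remains the source to check.

```python
# Compute capabilities for a few well-known Nvidia GPUs (illustrative only).
# Look up any other model in the table on https://en.wikipedia.org/wiki/CUDA
KNOWN_COMPUTE_CAPABILITY = {
    "Tesla K80": "3.7",
    "Tesla P100": "6.0",
    "GeForce GTX 1080": "6.1",
    "Tesla V100": "7.0",
    "Tesla T4": "7.5",
    "GeForce RTX 2080": "7.5",
}


def arch_bin_for(gpu_name):
    """Return the architecture binary for a GPU name, or None if it is not listed."""
    wanted = gpu_name.strip().lower()
    for name, capability in KNOWN_COMPUTE_CAPABILITY.items():
        if name.lower() == wanted:
            return capability
    return None


if __name__ == "__main__":
    print(arch_bin_for("Tesla T4"))  # prints 7.5, the value used in this post
```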

4. Uninstall Anaconda and install Python for all users

There will be path issues if we do not follow this part.
4.1. Go to “Installed programs” and uninstall Anaconda and Python.
4.2. Check the “system environment variables” and remove Anaconda and Python from the Path.

4.3. Now install Python, choosing “custom installation” and installing it for all users.

4.4. Check in the command prompt that “python” is detected. If it is not, add the path to the Python executable in the “system environment variables”.
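The checks in 4.2 and 4.4 can also be done from a script. This is only a sketch: the helper name suspicious_path_entries is mine, and matching on the word “conda” is a simple heuristic that could miss an unusually named install folder.

```python
import os
import shutil


def suspicious_path_entries(path_value, separator=os.pathsep):
    """Return the PATH entries that mention conda or Anaconda (case-insensitive)."""
    entries = [entry for entry in path_value.split(separator) if entry.strip()]
    # "anaconda" and "miniconda" both contain "conda", so one check covers them.
    return [entry for entry in entries if "conda" in entry.lower()]


if __name__ == "__main__":
    leftovers = suspicious_path_entries(os.environ.get("PATH", ""))
    print("Leftover conda entries on PATH:", leftovers or "none")
    # shutil.which shows which python the command prompt would actually start.
    print("python resolves to:", shutil.which("python"))
```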

5. Download and extract OpenCV 4.4 from GitHub

This is the OpenCV source code that we will build.
5.1. Follow this link: https://opencv.org/releases/

5.2. Click on “Sources” for version 4.4.0.
5.3. Extract the downloaded archive.

6. Download and extract opencv-contrib 4.4 from GitHub

This is the package of extra modules that is built along with OpenCV.
6.1. Go to https://github.com/opencv/opencv_contrib/tree/4.4.0 and download the zip.
6.2. Extract the downloaded archive.
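Steps 5 and 6 can be scripted as well. The sketch below assumes GitHub's usual archive URL pattern for a tag (github.com/<owner>/<repo>/archive/<tag>.zip); the function names are mine, and I did the downloads by hand through the browser, so treat this as an untested convenience.

```python
import urllib.request
import zipfile
from pathlib import Path


def archive_url(repo, tag="4.4.0"):
    """Build the GitHub zip-archive URL for a tagged release of an opencv repository."""
    return f"https://github.com/opencv/{repo}/archive/{tag}.zip"


def download_and_extract(repo, dest_dir, tag="4.4.0"):
    """Download one source archive and unpack it under dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / f"{repo}-{tag}.zip"
    urllib.request.urlretrieve(archive_url(repo, tag), str(zip_path))
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(str(dest))
    return zip_path


if __name__ == "__main__":
    # Only print the URLs here; call download_and_extract(repo, "some/folder")
    # to actually fetch and unpack the sources.
    for repo in ("opencv", "opencv_contrib"):
        print(archive_url(repo))
```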

7. Install numpy and uninstall opencv-python, opencv-contrib-python

Before compiling, make sure “numpy” is installed. Also make sure that “opencv-python” and “opencv-contrib-python” are uninstalled and are never installed again with “pip” in this environment; otherwise the pip package can replace the cv2 module we are about to build.
7.1. pip install numpy
7.2. pip uninstall opencv-python opencv-contrib-python
7.3. In my case these are the Python paths required by CMake:

PYTHON3_EXECUTABLE= C:/program files/python38/python.exe
PYTHON3_INCLUDE_DIR= C:/program files/python38/include
PYTHON3_LIBRARY= C:/program files/python38/libs/python38.lib
PYTHON3_NUMPY_INCLUDE_DIRS= C:/program files/python38/Lib/site-packages/numpy/core/include
PYTHON3_PACKAGES_PATH= C:/program files/python38/Lib/site-packages
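If you are unsure what these paths are on your machine, Python can report most of them itself. The sketch below is an assumption-laden helper (the name cmake_python_hints is mine): it guesses the import library location from the standard Windows layout, where it sits in a “libs” folder next to python.exe, so check the printed values against your disk before using them.

```python
import os
import sys
import sysconfig


def cmake_python_hints():
    """Collect the interpreter paths that the CMake configuration asks for."""
    paths = sysconfig.get_paths()
    version = f"{sys.version_info.major}{sys.version_info.minor}"
    hints = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": paths["include"],
        # Assumption: standard Windows layout, import library in <prefix>/libs.
        "PYTHON3_LIBRARY": os.path.join(sys.base_prefix, "libs", f"python{version}.lib"),
        "PYTHON3_PACKAGES_PATH": paths["purelib"],
    }
    try:
        import numpy
        hints["PYTHON3_NUMPY_INCLUDE_DIRS"] = numpy.get_include()
    except ImportError:
        # numpy is not installed yet; run step 7.1 first.
        hints["PYTHON3_NUMPY_INCLUDE_DIRS"] = None
    return hints


if __name__ == "__main__":
    for name, value in cmake_python_hints().items():
        print(f"{name}={value}")
```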

8. Make a “build” folder

8.1. Create an empty folder named “build”. This is where we will compile OpenCV and where the build output will be saved.

9. Make changes in opencv’s cmake file “OpenCVDetectPython.cmake”

By default, when we configure with CMake it picks Python 2 as the default interpreter. We need to change the code so that it picks Python 3 by default and uses Python 2 as the fallback.
9.1. In the extracted folder named “opencv-4.4.0”, open the file “opencv-4.4.0\opencv-4.4.0\cmake\OpenCVDetectPython.cmake”.
9.2. Edit the last code segment of the file. Replace this part:

if(PYTHON_DEFAULT_EXECUTABLE)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
elseif(PYTHON2INTERP_FOUND)
    # Use Python 2 as default Python interpreter
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON2_EXECUTABLE}")
elseif(PYTHON3INTERP_FOUND)
    # Use Python 3 as fallback Python interpreter (if there is no Python 2)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON3_EXECUTABLE}")
endif()

with this code:

if(PYTHON_DEFAULT_EXECUTABLE)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
elseif(PYTHON3INTERP_FOUND)
    # Use Python 3 as default Python interpreter
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON3_EXECUTABLE}")
elseif(PYTHON2INTERP_FOUND)
    # Use Python 2 as fallback Python interpreter (if there is no Python 3)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON2_EXECUTABLE}")
endif()
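To double-check the edit, a few lines of Python can confirm that the Python 3 branch now comes first. This is just a sketch with helper names of my own; it looks only at the elseif lines of the segment shown in this step, so pass it that segment (or the whole file, if the file has no other branches with these names).

```python
import re


def interpreter_branch_order(cmake_text):
    """Return the PYTHONnINTERP_FOUND names in the order their elseif branches appear."""
    return re.findall(r"elseif\s*\(\s*(PYTHON[23]INTERP_FOUND)\s*\)", cmake_text)


def prefers_python3(cmake_text):
    """True when the Python 3 branch comes before the Python 2 branch."""
    order = interpreter_branch_order(cmake_text)
    return bool(order) and order[0] == "PYTHON3INTERP_FOUND"


# The segment as it should look after the edit in step 9.2.
EDITED_SEGMENT = '''
if(PYTHON_DEFAULT_EXECUTABLE)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
elseif(PYTHON3INTERP_FOUND)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON3_EXECUTABLE}")
elseif(PYTHON2INTERP_FOUND)
    set(PYTHON_DEFAULT_AVAILABLE "TRUE")
    set(PYTHON_DEFAULT_EXECUTABLE "${PYTHON2_EXECUTABLE}")
endif()
'''

if __name__ == "__main__":
    print(prefers_python3(EDITED_SEGMENT))  # prints True
```

To check the real file, read it with pathlib.Path(...).read_text() and pass the text to prefers_python3.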

10. Configure OpenCV and opencv-contrib using CMake

Now we will configure OpenCV for our machine: the paths of CUDA and cuDNN, the GPU architecture, and so on.
10.1. Open the cmake-gui app which we installed earlier (step 2).
10.2. Give the path to the source code and the path to the “build” folder (step 8) for the binaries. In my case:
C:/Users/Administrator/Downloads/opencv-4.4.0/opencv-4.4.0
C:/Users/Administrator/Downloads/build

10.3. Hit Configure.
10.4. Select x64 as the optional platform and click Finish.

10.5. In the output pane below the Configure button, go to the section named “OpenCV modules”. In that section look at the value of “To be built”; python3 should appear somewhere in it. Also check that the paths described in step 7.3 are detected correctly.

10.6. If 10.5 is not satisfied, there is a path issue that you need to resolve first.

10.7. If everything above is all right, set the following variables in CMake, finding each one through the search box:

WITH_CUDA: Checked
OPENCV_DNN_CUDA: Checked
ENABLE_FAST_MATH: Checked
OPENCV_EXTRA_MODULES_PATH: the path to the “modules” directory inside the “opencv-contrib-4.4.0” directory (see step 6). In my case: C:\Users\Administrator\Downloads\opencv-contrib-4.4.0\opencv-contrib-4.4.0\modules

10.8. Hit the Configure button again and wait for the “Configuring done” output.
10.9. Now we will set some more variables, in a second round of configuration:

CUDA_FAST_MATH: Checked
CUDA_ARCH_BIN: 7.5 (this is strictly for my case)

If you want to know your GPU’s arch_bin, follow the Wikipedia link https://en.wikipedia.org/wiki/CUDA, look for the table, and select the value that matches your GPU model name.

10.10. Hit Configure again and wait for the “Configuring done” output.
10.11. Hit the Generate button and wait for the “Generating done” output.

10.12. Configuration and generation are done. Now you can close the cmake-gui app.
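For reference, the same settings can be written as a single cmake command line. I used the GUI, so this is a sketch I would expect to be equivalent rather than something I ran: it assumes the “Visual Studio 16 2019” generator (the CMake name for VS 2019) and passes both rounds of variables at once. The function name cmake_configure_command is mine.

```python
def cmake_configure_command(source_dir, build_dir, contrib_modules_dir, arch_bin):
    """Return a cmake command line, as a list, mirroring the GUI settings of step 10."""
    return [
        "cmake",
        "-S", source_dir,                # OpenCV source folder (step 5)
        "-B", build_dir,                 # the "build" folder (step 8)
        "-G", "Visual Studio 16 2019",   # generator for Visual Studio 2019
        "-A", "x64",                     # platform chosen in step 10.4
        "-DWITH_CUDA=ON",
        "-DOPENCV_DNN_CUDA=ON",
        "-DENABLE_FAST_MATH=ON",
        "-DCUDA_FAST_MATH=ON",
        f"-DCUDA_ARCH_BIN={arch_bin}",
        f"-DOPENCV_EXTRA_MODULES_PATH={contrib_modules_dir}",
    ]


if __name__ == "__main__":
    command = cmake_configure_command(
        "C:/Users/Administrator/Downloads/opencv-4.4.0/opencv-4.4.0",
        "C:/Users/Administrator/Downloads/build",
        "C:/Users/Administrator/Downloads/opencv-contrib-4.4.0/opencv-contrib-4.4.0/modules",
        "7.5",
    )
    # Print one argument per line; pass the list to subprocess.run(command, check=True)
    # to actually run the configuration.
    for argument in command:
        print(argument)
```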

11. Build and compile the project created by CMake with Visual Studio

Now we will compile and build the project which CMake generated in the folder named “build”.
11.1. Go to the “build” folder using the command prompt.
11.2. Type OpenCV.sln and hit Enter. This opens the solution in Visual Studio.

11.3. If Visual Studio opens with an error tab, follow these steps:

11.3.1. In VS go to Tools > Options.
11.3.2. In the Options panel navigate to “Projects and Solutions” > “Web Projects”.
11.3.3. Uncheck the last option in that tab.

11.3.4. Hit OK and restart Visual Studio by repeating steps 11.1 and 11.2.

11.4. In VS, change the build configuration from “Debug” to “Release”.

11.5. In the Solution Explorer on the right, expand “CMakeTargets”.
11.6. Right-click “ALL_BUILD” and hit Build. This will take around 30 minutes.
11.7. After 11.6 finishes, right-click “INSTALL” and hit Build. This finishes quickly.

11.8. Make sure there are no errors in 11.6 and 11.7. If there are, check that you selected Release mode (step 11.4).
11.9. Finally, you have built and compiled OpenCV with CUDA, cuDNN, and GPU access.
11.10. Close Visual Studio.
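The two builds in 11.6 and 11.7 can also be started without opening Visual Studio, through cmake's own build mode. Again this is a sketch of what I would expect to work, not what I ran; the function name is mine, while ALL_BUILD and INSTALL are the same targets used above.

```python
def cmake_build_commands(build_dir, config="Release"):
    """Return the two cmake commands matching steps 11.6 (ALL_BUILD) and 11.7 (INSTALL)."""
    return [
        ["cmake", "--build", build_dir, "--config", config, "--target", "ALL_BUILD"],
        ["cmake", "--build", build_dir, "--config", config, "--target", "INSTALL"],
    ]


if __name__ == "__main__":
    for command in cmake_build_commands("C:/Users/Administrator/Downloads/build"):
        # Print only; run each with subprocess.run(command, check=True) to build.
        print(" ".join(command))
```

Passing the configuration as “Release” plays the same role as switching the mode in step 11.4.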

12. Confirm that the installation was done correctly

12.1. Open the command prompt and start python.
12.2. Import cv2 and check that it reports a CUDA-enabled device.

12.3. If it does, you have installed OpenCV with GPU support successfully. Congratulations!
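One way to run this check from a script is sketched below. It uses cv2.getBuildInformation() and cv2.cuda.getCudaEnabledDeviceCount(); the helper names and the exact pattern I search for in the build information (“NVIDIA CUDA: YES”) are my assumptions, so look at the raw text if the result surprises you.

```python
import re


def build_info_has_cuda(build_info):
    """True when the build information text reports CUDA as enabled."""
    return re.search(r"NVIDIA CUDA:\s+YES", build_info) is not None


def cuda_report():
    """Summarise whether cv2 imports, was built with CUDA, and sees a GPU."""
    try:
        import cv2
    except ImportError:
        return {"cv2_importable": False, "cuda_in_build": False, "cuda_devices": 0}
    cuda_module = getattr(cv2, "cuda", None)
    devices = cuda_module.getCudaEnabledDeviceCount() if cuda_module is not None else 0
    return {
        "cv2_importable": True,
        "cuda_in_build": build_info_has_cuda(cv2.getBuildInformation()),
        "cuda_devices": devices,
    }


if __name__ == "__main__":
    print(cuda_report())
```

On a successful build I would expect cuda_in_build to be True and cuda_devices to be at least 1.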

Conclusion

This was a very long and hectic process. It would help if OpenCV had GPU detection built in, which would save us from building it again and again for new environments.
