End to End Text Recognition Model Deployment on CPU, GPU, and VPU With OpenVINO

Gopalakrishna Adusumilli · Published in The Startup · 7 min read · Oct 12, 2020
Text detection and text recognition result. Image credits

Hello everyone,

In this post, I will walk you through all the steps of installing OpenVINO along with its dependencies, and then running a text recognition model on the edge using an Intel CPU, integrated GPU (iGPU), and VPU.

What is special about this article?

* A consolidated guide to running inference with pre-trained models from the Intel OpenVINO model zoo.

* Addresses common errors encountered during OpenVINO toolkit installation and inference.

* Deploys the pre-trained model across Intel devices: CPU, GPU, and VPU.

Let’s get started:

Before getting into the details of how to deploy a text detection and recognition pipeline on the edge, let me share some brief background on OpenVINO and its workflow.

A quick intro to OpenVINO Toolkit

OpenVINO stands for Open Visual Inference and Neural Network Optimization. The toolkit addresses two key aspects: model optimization and inference.

Deep learning models are typically trained on powerful accelerators. But when it comes to real-time deployments, one may have to run them on low-compute devices at the edge or on on-prem servers, rather than deploying them in the cloud.

The OpenVINO toolkit addresses this by enabling deployment at the edge. Let's walk through the deployment workflow in the three steps below.

OpenVINO Workflow
  1. Train a Model
  • Train your model using any of the popular machine learning frameworks available out there. (In this article, we skip the training step by taking a pre-trained model from the model zoo.)
  • OpenVINO supports models trained on frameworks such as TensorFlow, Caffe, MXNet, Kaldi, and ONNX.
  • You can download a pre-trained model from the Open Model Zoo. I am selecting the text detection and recognition models for this article.

2. Model Optimizer

  • The Model Optimizer converts the trained model to an Intermediate Representation (IR) by removing redundant layers and operations that are not required once the model is frozen.
  • The imported model is converted into two Intermediate Representation (IR) files:
1. *.xml - describes the network topology
2. *.bin - contains the binary weight (and bias) data
  • Since I am downloading a pre-trained model from the model zoo, I already have the XML and BIN files. In a subsequent article, I will discuss how to generate IR files for custom models. (Stay tuned :))
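The IR .xml file is plain XML, so you can peek at a network's topology with nothing but the standard library. A minimal sketch; the snippet below uses a tiny hand-made fragment for illustration, not a real model file:

```python
import xml.etree.ElementTree as ET

# A tiny, hand-made fragment mimicking the IR .xml layout (illustration only).
ir_xml = """
<net name="demo" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="conv1" type="Convolution"/>
    <layer id="2" name="output" type="Result"/>
  </layers>
</net>
"""

def list_layers(xml_text):
    """Return (name, type) pairs for every layer in an IR .xml document."""
    root = ET.fromstring(xml_text)
    return [(layer.get("name"), layer.get("type")) for layer in root.iter("layer")]

print(list_layers(ir_xml))
# → [('input', 'Parameter'), ('conv1', 'Convolution'), ('output', 'Result')]
```

Running the same function on a real downloaded .xml (for example text-detection-0003.xml) lists the full network graph.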

3. Inference engine

The Inference Engine runs on the IR files (.xml and .bin) obtained from step 2, or on Intel pre-trained models already available in IR format.

The Inference Engine focuses on hardware-specific optimizations to speed up the model further.

The Inference Engine supports a wide range of Intel hardware:

  • CPU (Central Processing Unit)
  • GPU (Graphical Processing Unit)
  • VPU (Vision Processing Unit) like the Intel Neural Compute Stick 2 (NCS2)
  • FPGA (Field Programmable Gate Array).
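The Inference Engine identifies these targets by plugin name (CPU, GPU, MYRIAD for the NCS2, and so on). As a quick sketch of a fallback-style device picker; the preference order here is my own choice, not an OpenVINO convention, and with the real toolkit you would pass `IECore().available_devices` as the first argument:

```python
def pick_device(available, preferred=("MYRIAD", "GPU", "CPU")):
    """Return the first device from the preference list that is actually present."""
    for device in preferred:
        if device in available:
            return device
    raise RuntimeError("No supported inference device found")

# Stand-in for IECore().available_devices on a machine without an NCS2:
print(pick_device(["CPU", "GPU"]))
# → GPU
```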

I hope that gives you a brief intro to OpenVINO; for more details, see the OpenVINO documentation here.

Let’s start with the implementation…

Step 1: OpenVINO installation on Windows OS

Download (~193 MB)

  1. To download the OpenVINO toolkit, register via the link here.
  • Choose the OS (Windows) > select distribution (web download) > installer type (offline) > click "Register and Download".
  • Choose the Windows OS radio button > fill in your details.
  • You'll receive an e-mail with the download URL.

Installation

OpenVINO installation

Following the above steps, the OpenVINO toolkit .exe installer will be saved to your local Windows machine's Downloads folder (by default).

Installing the toolkit is easy; it is similar to any other software installation. During the installation procedure, you may see a notice about missing prerequisites: OpenVINO needs Microsoft VS2017, CMake, and Python 3.6 or greater, which we will install next.

Microsoft VS2017 installation

OpenVINO needs Microsoft VS2017 to build the C++ demo applications for the pre-trained models.

MSVC 2017 installer can be downloaded from here. Choose the Community version since it is free for students, researchers, and developers.

The vs_community.exe installer will be saved to the Downloads folder (by default); run it to install the software.

Navigate to Visual Studio Installer > Choose “Visual Studio Community” > Modify > In the tab “Workloads” > choose “Desktop development with C++” > Install while downloading (bottom left side).

P.S. I couldn't check the file size since I kicked off the download just before my bedtime 😳

CMake (~25.2 MB)

  • CMake can be downloaded from here
  • Choose Windows win64-x64 Installer (.msi)

Note: During installation select “Add CMake to the System Path”

Python (~ 25 MB)

  • Python IDLE can be downloaded from the Python site here
  • Choose the Python version (preferably 3.6 or greater) and your OS, and download the executable installer
  • Follow the installation instructions

Note: Choose “Add Python 3.x to the PATH”
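Once Python is on the PATH, it takes two lines to confirm the interpreter meets OpenVINO's 3.6+ requirement. A quick sanity-check sketch:

```python
import sys

def python_ok(version_info=sys.version_info):
    """True if the interpreter is at least Python 3.6 (OpenVINO's minimum)."""
    return tuple(version_info[:2]) >= (3, 6)

print(python_ok())
```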

Hurray..! We have successfully completed the OpenVINO installation.

Step 2: OpenVINO configuration for Model deployment and Inferencing

Open command prompt (cmd)

  1. Set up the environment variables to run the demo
"C:\Program Files (x86)\Intel\openvino_2021.1.110\bin\setupvars.bat"
Setting up environment variables

You will have to initialize the environment for every fresh session.

2. We need to build the demo applications for Windows OS

The Open Model Zoo contains demos for pre-trained models such as pose estimation, object detection, speech recognition, face recognition, etc.

You can check the available pre-trained models in the OpenVINO model zoo:

https://github.com/openvinotoolkit/open_model_zoo/tree/master/demos

This step builds all the demo .cpp files. The C++ sources of the pre-trained model demos are available in the directory below:

cd "C:\Program Files (x86)\Intel\openvino_2021.1.110\deployment_tools\open_model_zoo\demos"

Type the command below to build the demo .cpp files with Microsoft Visual Studio 2017:

build_demos_msvc.bat VS2017
Build MSVS 2017

You can check all the build files in the below directory

Documents\Intel\OpenVINO\omz_demos_build\intel64\Release

You can tinker with all the demo applications available in the model zoo.

3. As the last step, we need to add the OpenCV runtime DLLs to the build directory; OpenCV ships with the OpenVINO installation.

OpenCV is needed for the image processing tasks, so we copy the contents of the OpenCV bin folder into the model build directory.

OpenCV bin files can be located in

C:\Program Files (x86)\Intel\openvino_2021.1.110\opencv\bin

Copy the .dll files from the above folder to the below-mentioned directory path

C:\Users\gk129\Documents\Intel\OpenVINO\omz_demos_build\intel64\Release
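The copy can also be scripted instead of done by hand. A small Python sketch; the source and destination paths in the example comment are placeholders that you should adjust to your install and user folder:

```python
import shutil
from pathlib import Path

def copy_dlls(src_dir, dst_dir):
    """Copy every .dll from src_dir into dst_dir, returning the copied file names."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in Path(src_dir).glob("*.dll"):
        shutil.copy2(dll, dst / dll.name)  # copy2 preserves timestamps
        copied.append(dll.name)
    return copied

# Example (placeholder paths):
# copy_dlls(r"C:\Program Files (x86)\Intel\openvino_2021.1.110\opencv\bin",
#           r"C:\Users\<you>\Documents\Intel\OpenVINO\omz_demos_build\intel64\Release")
```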

Step3: Text detection and recognition model

Fetching the model

As mentioned earlier, we will not get into the nitty-gritty of developing the text recognition pipeline. Instead, we will use the Intermediate Representation (IR) files from the model zoo to run the text recognition model.

Download the Intermediate Representation files (.xml & .bin) for both the text detection and text recognition models.

  1. For the latest version of OpenVINO (2021), the IR files (.bin & .xml) for text detection and text recognition can be obtained from the links below (download both files for each model):

IR files for text detection

https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/1/text-detection-0003/FP16/

IR files for text recognition

https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/1/text-recognition-0012/FP16/

Note: We are using FP16 here. Half-precision floating-point numbers (FP16) have a smaller range; FP16 can deliver better performance where half precision is enough. (For more details, check here.)

2. Save the .bin & .xml files of both the text detection and recognition models in a folder of your choice. We need to pass the paths of these files to the text detection demo.
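Both models' download links follow the same URL pattern, so fetching another model or precision is just a matter of swapping path components. A hedged helper reflecting the 2021.1 links above; the pattern may differ for other releases:

```python
# Base path taken from the 2021.1 download links above.
BASE = "https://download.01.org/opencv/2021/openvinotoolkit/2021.1/open_model_zoo/models_bin/1"

def ir_urls(model, precision="FP16"):
    """Return the (.xml, .bin) download URLs for a given model name and precision."""
    prefix = f"{BASE}/{model}/{precision}/{model}"
    return f"{prefix}.xml", f"{prefix}.bin"

print(ir_urls("text-detection-0003"))
```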

Running the model

Open a command prompt

cd C:\xxx\Documents\Intel\OpenVINO\omz_demos_build\intel64\Release

Then run the text_detection_demo build output and check all the arguments it requires as input:

text_detection_demo.exe -h
  • -h prints the help message describing the model arguments

Deployment on the Edge: CPU

To run the text recognition model on the intel powered CPU

type the following command in the above command prompt

Note: pass the paths of the .xml files of the text detection and text recognition models

  • -i > path to the image file,
  • -m_tr > .xml file of the text recognition model,
  • -m_td > .xml file of the text detection model,
  • for the target devices (-d_tr, -d_td), choose CPU
text_detection_demo.exe -i C:\xxx\xxx\test.jpg -m_tr "C:\xxx\xxx\xxx\text-recognition-0012.xml" -m_td "C:\xxx\xxx\xxx\text-detection-0003.xml" -d_tr CPU -d_td CPU
Running the model on the edge: CPU
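The same invocation can be assembled programmatically, which makes switching target devices trivial. A sketch using the demo's flags as listed above; the file names are placeholders, and you would hand the list to subprocess.run once the demo is built:

```python
def build_cmd(image, tr_xml, td_xml, device="CPU",
              exe="text_detection_demo.exe"):
    """Assemble the text_detection_demo argument list for a given target device."""
    return [exe,
            "-i", image,
            "-m_tr", tr_xml,   # text recognition model
            "-m_td", td_xml,   # text detection model
            "-d_tr", device,   # recognition target device
            "-d_td", device]   # detection target device

cmd = build_cmd("test.jpg", "text-recognition-0012.xml",
                "text-detection-0003.xml", device="CPU")
print(" ".join(cmd))
```

Passing device="MYRIAD" or device="GPU" produces the NCS2 and iGPU variants shown below.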

Deployment on the VPU: Intel Neural Compute Stick 2(NCS2)

Intel Neural Compute stick (Image credits: Intel )

The Intel Neural Compute Stick 2 is a powerful plug-and-play, tiny USB-based Vision Processing Unit for AI model inferencing and rapid prototyping. The NCS2 can be integrated with SBCs like the Raspberry Pi and with other embedded boards.

If you would like to know more details about NCS2 you can check here.

To deploy on the Intel NCS2, the whole procedure remains the same up to the "Running the model" step.

Choose the target devices (-d_tr, -d_td) as MYRIAD:

text_detection_demo.exe -i C:\xxx\xxx\test.jpg -m_tr "C:\xxx\xxx\xxx\text-recognition-0012.xml" -m_td "C:\xxx\xxx\xxx\text-detection-0003.xml" -d_tr MYRIAD -d_td MYRIAD

Deployment on the GPU (Intel integrated GPU)

Many Intel CPUs come with an integrated GPU on board; we can utilize the iGPU for deployment with the following command:

text_detection_demo.exe -i C:\xxx\xxx\test.jpg -m_tr "C:\xxx\xxx\xxx\text-recognition-0012.xml" -m_td "C:\xxx\xxx\xxx\text-detection-0003.xml" -d_tr GPU -d_td GPU

Output

Text Recognition pipeline results

Hurray!!! We have completed the deployment.

Let’s get connected on LinkedIn

References:

  1. OpenVINO documentation
  2. MSVC compiler documentation
  3. Git repo of OpenVINO Model Zoo
  4. Intel NCS2 guide

