What is the OpenVINO toolkit?
OpenVINO (Open Visual Inference and Neural network Optimization) is a free, open-source toolkit developed by Intel for optimizing deep learning models. It is cross-platform: it runs on Windows, Linux, and macOS, and it is released under the Apache License 2.0.
OpenVINO supports frameworks and formats such as PyTorch, TensorFlow, Caffe, MXNet, ONNX, and Kaldi. It improves the performance of neural networks, CNNs in particular, on Intel hardware, so developers can build optimized solutions for AI problems. The Intel distribution of OpenVINO also ships with the well-known computer vision library OpenCV. The main components of the OpenVINO toolkit are the Deep Learning Deployment Toolkit (DLDT) and Open Model Zoo.
Deep Learning Deployment Toolkit (DLDT)
Deployment can be challenging because many different frameworks are used for different purposes. It is also challenging because inference often runs on platforms with restricted hardware and software, where using the training framework itself is not recommended. Instead, dedicated inference APIs optimized for the target hardware can be used. DLDT lets developers optimize the performance of their pre-trained models and consists of two main components: Model Optimizer and Inference Engine.
According to the official documentation, the deployment process consists of the following steps:
• Configure Model Optimizer for the framework that was used to train the model.
• Run Model Optimizer to produce an optimized Intermediate Representation (IR) of the model, based on the trained network topology, the weight and bias values, and other optional parameters.
• Test the model in the IR format using the Inference Engine in the target environment, with the provided Inference Engine sample applications.
• Integrate the Inference Engine into your application to deploy the model in the target environment.
Model Optimizer is a command-line tool that helps deep learning models execute optimally and accelerates the deployment of a trained model.
First, train the model with one of the frameworks the toolkit supports. Next, configure Model Optimizer for that same framework. Then pass it the trained network, with its weights and biases, as input. Finally, run Model Optimizer; it produces an Intermediate Representation (IR), which serves as the input to the Inference Engine.
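The final step can be sketched as a small Python helper that assembles a Model Optimizer command line without running it. This is a minimal sketch, not the toolkit's own tooling: it assumes the tool is exposed as the `mo` executable (older releases ship it as `mo.py`), and the file names `model.onnx` and `ir` are hypothetical placeholders.

```python
import shlex
from pathlib import Path


def build_mo_command(input_model, output_dir, model_name=None):
    """Build a Model Optimizer command line (as an argv list) without running it."""
    cmd = ["mo", "--input_model", str(input_model), "--output_dir", str(output_dir)]
    if model_name is not None:
        # Optional base name for the generated .xml/.bin pair.
        cmd += ["--model_name", model_name]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_mo_command(Path("model.onnx"), Path("ir"), model_name="model")
print(shlex.join(cmd))
# prints: mo --input_model model.onnx --output_dir ir --model_name model
```

Once the toolkit is installed, the resulting list could be handed to `subprocess.run(cmd, check=True)`; the exact set of available flags depends on the installed release.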
Intermediate Representation (IR):
The IR connects the components of the OpenVINO toolkit. It consists of two files that can be read and loaded: ‘.xml’ and ‘.bin’.
.xml: The topology file, an XML file that describes the network topology.
.bin: The trained data file, a binary file that contains the weights and biases.
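Because an IR is always a pair of files, it can be handy to check that both are present before loading a model. The helper below is a minimal sketch that assumes the two files share a base name and sit in the same directory, which is how Model Optimizer normally writes them; the function names are hypothetical.

```python
from pathlib import Path


def ir_files(xml_path):
    """Return the (.xml, .bin) pair for an IR, assuming both share a base name."""
    xml_path = Path(xml_path)
    if xml_path.suffix.lower() != ".xml":
        raise ValueError(f"expected an .xml topology file, got {xml_path.name}")
    return xml_path, xml_path.with_suffix(".bin")


def missing_ir_files(xml_path):
    """List whichever of the two IR files is absent on disk."""
    return [p for p in ir_files(xml_path) if not p.is_file()]
```

For example, `missing_ir_files("ir/model.xml")` returns an empty list only when both `ir/model.xml` and `ir/model.bin` exist.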
Inference Engine is a runtime engine that takes the IR produced by Model Optimizer, optimizes its execution for the hardware in use, and returns the inference results. The package includes runtime libraries, headers, and samples for guidance.
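The read, load, and infer sequence can be sketched in Python as follows. This is a sketch under stated assumptions: it targets the legacy Inference Engine Python API (`openvino.inference_engine.IECore`), which newer OpenVINO releases have replaced, and it takes the core object as a parameter so the function itself does not require OpenVINO to be installed. The function name, file paths, and input name are hypothetical.

```python
def run_inference(core, xml_path, bin_path, inputs, device="CPU"):
    """Load an IR and run one synchronous inference.

    `core` is assumed to behave like the legacy
    openvino.inference_engine.IECore object.
    """
    # Parse the topology (.xml) together with the weights (.bin).
    net = core.read_network(model=xml_path, weights=bin_path)
    # Prepare the network for the chosen device, e.g. "CPU".
    exec_net = core.load_network(network=net, device_name=device)
    # Run inference; `inputs` maps input names to input data.
    return exec_net.infer(inputs=inputs)


# Usage, assuming a release that still ships the legacy API
# (paths and the input name "input" are placeholders):
#   from openvino.inference_engine import IECore
#   results = run_inference(IECore(), "ir/model.xml", "ir/model.bin", {"input": image})
```

Passing the core in explicitly also makes it easy to swap in a stub when testing application code around the inference call.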
What is oneAPI?
oneAPI is an Intel-developed programming model and toolkit that simplifies CPU and GPU programming. It uses a language called DPC++ (Data Parallel C++) to express parallelism, so the same code can be reused on the CPU and on an accelerator such as a GPU from a single source language.
The aim of oneAPI is to unify the programming model and libraries and to simplify cross-architecture development. It also includes libraries for deep learning and data science.
The libraries oneAPI provides:
• oneDNN: Deep Neural Network Library used for deep learning applications.
• oneCCL: Collective Communications Library used for machine learning and deep learning projects.
• oneDAL: Data Analytics Library used to speed up big data analysis. It provides optimized algorithms for analytics processing.
What is the advantage of oneAPI?
Developers need to be able to program a variety of architectures, and having to maintain separate codebases written in different languages is an even greater burden. It makes development much slower and more complex.
In summary, different types of workloads call for different architectures. To reach high performance, a mix of SVMS (scalar, vector, matrix, and spatial) architectures has to be deployed across CPUs, GPUs, AI accelerators, and FPGAs, which complicates development. oneAPI reduces the complexity of maintaining separate codebases that use different languages, tools, and workflows.