Supercharge Your Deep Learning Models with OpenVINO™

Published in OpenVINO™ toolkit · 8 min read · Aug 3, 2023

Author: Anisha Udayakumar

Have you ever found yourself in this situation: you’ve developed an impressive deep learning model in a framework such as TensorFlow or PyTorch, and it’s incredibly accurate, yet its performance across various hardware devices leaves much to be desired. This was exactly the challenge I encountered while working on a project to streamline checkout lanes with smart queue management, built around the YOLOv8 PyTorch model.

That’s when I discovered OpenVINO™ 2023.0, a toolkit specifically designed to amplify the speed and efficiency of deep learning models across different hardware. However, OpenVINO™ 2023.0 is not just about enhancing performance; it also streamlines the often complex process of AI development. With its range of tools and optimizations, OpenVINO™ simplifies the conversion and deployment of models, making development more accessible and efficient.

In this guide, we’ll explore how OpenVINO™ 2023.0 enables the conversion of various deep learning model formats, such as TensorFlow, TensorFlow Lite, PaddlePaddle, and PyTorch, into OpenVINO’s own Intermediate Representation (IR) format. The goal is to give experienced developers and AI beginners alike a thorough understanding of the key aspects of OpenVINO™, its tools, and its transformative role in optimizing neural networks.

Figure 1: General workflow for deploying a TensorFlow/TensorFlow Lite model

This diagram shows a simplified workflow for deploying a pre-trained TensorFlow or TensorFlow Lite model using OpenVINO™ 2023.0. Each framework has its own strengths: TensorFlow is known for training complex deep learning models, while TensorFlow Lite enables lower latency and a smaller binary size on mobile and edge devices. Whichever you start from, OpenVINO™ streamlines the process, which consists of loading the TensorFlow or TensorFlow Lite model directly and converting it to the Intermediate Representation (IR) format using OpenVINO™.

Beyond improving performance, the OpenVINO™ toolkit also demystifies the intricate process of optimizing and deploying AI models. Before we dive in, let’s make sure all the required dependencies are installed.

# Installing the OpenVINO toolkit
!pip install -q "openvino>=2023.0.0"

# Installing additional Python libraries
!pip install requests tqdm
import urllib.request
import cv2
import os
import numpy as np

# OpenVINO's runtime module
from openvino.runtime import Core, serialize
from pathlib import Path

# Fetching the notebook_utils module which provides utility functions for visualizing results, converting segmentation maps to images etc.
urllib.request.urlretrieve(
    url='https://raw.githubusercontent.com/openvinotoolkit/openvino_notebooks/main/notebooks/utils/notebook_utils.py',
    filename='notebook_utils.py'
)
from notebook_utils import viz_result_image, segmentation_map_to_image, SegmentationMap, Label, download_file

Model Optimization

While the latest OpenVINO™ 2023.0 release offers the flexibility to deploy models without the need to convert to the Intermediate Representation (IR) format, there might be cases where conversion to IR can provide specific optimizations or compatibility benefits. Even though this conversion is optional, some developers might find it useful. In this section, we’ll define a function called optimize_model to convert various model formats to OpenVINO’s IR format if you choose to use this option.
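For comparison, the direct path is nearly a one-liner: in OpenVINO™ 2023.0, core.compile_model accepts a framework model file directly, so no IR files are written to disk. The sketch below assumes a local TensorFlow model at model/classification.pb, the same file we download later in this post:

from openvino.runtime import Core

core = Core()
# Compile the framework model directly for the target device without creating IR files
compiled_model = core.compile_model("model/classification.pb", device_name="CPU")

The rest of this section sticks with the explicit conversion path via the optimize_model function: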

def optimize_model(model_url, model_name, model_format, device_name="CPU"):
    # Check if model_url is a URL
    if model_url.startswith('http'):
        model_path = f"model/{model_name}"
        # Download the model if it doesn't exist locally
        if not os.path.exists(model_path):
            download_file(model_url, filename=model_name, directory='model')
    else:
        if not os.path.exists(model_url):
            raise FileNotFoundError(f"File {model_url} does not exist.")
        model_path = model_url

    # Initialize Core object
    core = Core()

    # Read the model
    model = core.read_model(model_path)

    # Compile the model for the specified device
    compiled_model = core.compile_model(model=model, device_name=device_name)

    # Serialize the model to Intermediate Representation (IR)
    serialize(model, xml_path=f"model/exported_{model_format}_model.xml")

    # Return the compiled model for immediate use
    return compiled_model

The function checks if the model exists locally, downloading it if necessary. Then, it uses OpenVINO’s Core API to read the model, compile it for the specified device, and serialize it into the IR format, making it ready for efficient deployment using OpenVINO.

Now, let’s optimize each deep learning model format using the `optimize_model` function:

Handling TensorFlow Models

Starting with OpenVINO™ 2023.0, there’s enhanced support for TensorFlow. Renowned for its comprehensive feature set, TensorFlow empowers developers to build and train complex AI models with ease. It’s particularly effective for neural networks and large-scale machine learning tasks. Let’s walk through an example of downloading a TensorFlow classification model and optimizing it using OpenVINO™. We utilize the optimize_model function defined earlier for this process.

tf_model_url = 'https://storage.openvinotoolkit.org/repositories/openvino_notebooks/models/002-example-models/classification.pb'
tf_model_name = 'classification.pb'
optimize_model(tf_model_url, tf_model_name, 'tf')

Working with TensorFlow Lite Models

OpenVINO™ 2023.0 provides comprehensive support for TensorFlow Lite, a framework designed specifically for fast and efficient execution of AI models on mobile and edge devices. TensorFlow Lite maintains low latency and a compact binary size, making it a fitting choice for environments with resource constraints. Now, let’s see how we can optimize a TensorFlow Lite model using OpenVINO™.

tflite_model_url = 'https://tfhub.dev/tensorflow/lite-model/inception_v4_quant/1/default/1?lite-format=tflite'
tflite_model_name = 'classification.tflite'
optimize_model(tflite_model_url, tflite_model_name, 'tflite')

Similar to the TensorFlow model, we’ve directly loaded the TensorFlow Lite model using core.read_model, compiled it for a specific device using core.compile_model, and then serialized it to OpenVINO’s Intermediate Representation (IR) format using the serialize function. The optimize_model function streamlines this process not only for TensorFlow and TensorFlow Lite, but also for other model formats. With OpenVINO’s unified API, this workflow becomes even more straightforward, enabling quick deployment of AI models regardless of their original format. The simplicity and consistency of this process will be further demonstrated as we explore other formats like PaddlePaddle and PyTorch next.

Processing PaddlePaddle Models

OpenVINO™ 2023.0 now offers support for models from PaddlePaddle. This open-source deep learning platform, created by Baidu, is known for its application in industrial use cases. It’s adept at handling various tasks, from machine translation and image classification to object detection.

In this example, we’re working with a specific PaddlePaddle model. Note that PaddlePaddle models come with a `.pdmodel` file, which describes the structure of the model, and a `.pdiparams` file, which contains the learned parameters:

paddle_model_url = 'https://storage.openvinotoolkit.org/repositories/openvino_notebooks/models/002-example-models/'
paddle_model_name = 'inference.pdmodel'
paddle_params_name = 'inference.pdiparams'

# Download PaddlePaddle model and its parameters
# Download the PaddlePaddle model and its parameters
download_file(paddle_model_url + paddle_model_name, filename=paddle_model_name, directory='model')
download_file(paddle_model_url + paddle_params_name, filename=paddle_params_name, directory='model')

# Convert the PaddlePaddle model to IR with the optimize_model function
optimize_model(paddle_model_url + paddle_model_name, paddle_model_name, model_format='paddle', device_name="CPU")

The PaddlePaddle model follows the same process: downloaded, read into OpenVINO, compiled, and serialized to the Intermediate Representation (IR) format. Now, it’s ready for inference with OpenVINO™. The process remains consistent and efficient, irrespective of the model format.

Importing PyTorch Models

PyTorch models can also be converted to OpenVINO’s IR format. PyTorch is a popular choice for deep learning research thanks to its dynamic and interactive nature, and with strong GPU acceleration it supports rapid prototyping and extensive customization.

The process of converting a PyTorch model to OpenVINO’s Intermediate Representation (IR) format does involve an extra step. We start by downloading the PyTorch model and then convert it to the ONNX format. Once we have our model in ONNX format, we load it into OpenVINO. We compile the model for the device we want to use and then serialize it into OpenVINO’s IR format. This provides a consistent workflow, even when working with different model formats.

We’ll start by importing the necessary libraries and downloading the LRASPP MobileNetV3 model from PyTorch for semantic segmentation:

# Installing ONNX
!pip install -q "onnx>=1.11.0"

import warnings
import torch
from torchvision.models.segmentation import lraspp_mobilenet_v3_large, LRASPP_MobileNet_V3_Large_Weights

# Define image dimensions for input
IMAGE_WIDTH = 780
IMAGE_HEIGHT = 520
DIRECTORY_NAME = "model"
BASE_MODEL_NAME = DIRECTORY_NAME + "/lraspp_mobilenet_v3_large"
weights_path = Path(BASE_MODEL_NAME + ".pt")

# Download the model if it hasn't been downloaded yet
print("Downloading the LRASPP MobileNetV3 model (if it has not been downloaded already)...")
download_file(LRASPP_MobileNet_V3_Large_Weights.COCO_WITH_VOC_LABELS_V1.url, filename=weights_path.name, directory=weights_path.parent)


# Load the model architecture (without pretrained weights)
model = lraspp_mobilenet_v3_large()
# Load the downloaded state dict into the model
state_dict = torch.load(weights_path, map_location='cpu')
model.load_state_dict(state_dict)
# Switch to evaluation mode for inference
model.eval()
print("Loaded PyTorch LRASPP MobileNetV3 model")

Exporting the PyTorch Model to ONNX:

Here, the PyTorch model is exported to the ONNX format. This involves providing a dummy input with the correct dimensions to the model, and then using torch.onnx.export() to convert the model to ONNX format. The exported ONNX model is saved to the specified path.

# Export PyTorch model to ONNX format
onnx_path = weights_path.with_suffix('.onnx')
if not onnx_path.parent.exists():
    onnx_path.parent.mkdir()

with warnings.catch_warnings():
    warnings.filterwarnings("ignore")
    if not onnx_path.exists():
        # Create a dummy input to match the input dimensions of the model
        dummy_input = torch.randn(1, 3, IMAGE_HEIGHT, IMAGE_WIDTH)
        torch.onnx.export(
            model,
            dummy_input,
            onnx_path,
        )
        print(f"ONNX model exported to {onnx_path}.")
    else:
        print(f"ONNX model {onnx_path} already exists.")

Loading ONNX Model with OpenVINO Runtime and Optimizing it:

This part of the code initializes OpenVINO’s Core object and reads the ONNX model using core.read_model(). It then compiles the model for a specific device (CPU in this case) using core.compile_model(). Finally, it calls the optimize_model() function to convert the ONNX model to OpenVINO’s Intermediate Representation (IR) format.

# Initialize OpenVINO and load the ONNX model using OpenVINO Runtime
core = Core()
# Read ONNX model with OpenVINO Runtime
model_onnx = core.read_model(model=onnx_path)

# Compile the ONNX model for a specific device (CPU) using OpenVINO
compiled_model_onnx = core.compile_model(model=model_onnx, device_name="CPU")

# Optimize the model by converting it to IR
optimize_model(str(onnx_path), onnx_path.stem, model_format='onnx', device_name="CPU")

Once your model has been converted and compiled, running inference is straightforward.
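As a minimal sketch, assuming the IR file that optimize_model wrote for the ONNX example above (model/exported_onnx_model.xml) and a random input matching the LRASPP model's 1 x 3 x 520 x 780 input shape (adapt the path and shape for your own model):

from openvino.runtime import Core
import numpy as np

core = Core()
# Load the serialized IR and compile it for the target device
ir_model = core.read_model("model/exported_onnx_model.xml")
compiled_ir_model = core.compile_model(model=ir_model, device_name="CPU")

# Run a forward pass on a random NCHW input with the model's expected shape
dummy_input = np.random.rand(1, 3, 520, 780).astype(np.float32)
result = compiled_ir_model([dummy_input])[0]
print(result.shape)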

Show Results

Now, let’s confirm that the segmentation results look as expected by running inference with the compiled ONNX model. Before proceeding, we need to load and preprocess an input image.

Normalize the Input Image

First, we need to normalize the input image to match the mean and standard deviation that were used when the model was trained.
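The normalize helper used in the snippet below is not among the utilities imported from notebook_utils earlier, so here is a minimal sketch of one, assuming the standard ImageNet mean and standard deviation that torchvision’s pretrained models expect:

def normalize(image: np.ndarray) -> np.ndarray:
    # Assumes an HWC, RGB, uint8 image; returns a float32 array normalized with ImageNet statistics
    image = image.astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    return (image - mean) / std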

image_filename = download_file(
    "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/coco_hollywood.jpg",
    directory="data"
)
image = cv2.cvtColor(cv2.imread(str(image_filename)), cv2.COLOR_BGR2RGB)

# Resize the image to the desired input shape
resized_image = cv2.resize(image, (IMAGE_WIDTH, IMAGE_HEIGHT))


# Normalize the resized image
normalized_image = normalize(resized_image)

# Reorder to NCHW layout and add a batch dimension to match the network input shape
normalized_input_image = np.expand_dims(np.transpose(normalized_image, (2, 0, 1)), 0)

Run Inference on the Input Image using ONNX Model

After preprocessing the input image, we can now run inference on the image using the ONNX model (`compiled_model_onnx`). This will give us the segmentation results (`res_onnx`).

# Run inference on the input image
res_onnx = compiled_model_onnx([normalized_input_image])[0]

Visualize the Segmentation Results

To visualize the segmentation results, we define VOC labels that map the predicted class indices to their corresponding names and colors. Then, we convert the segmentation results to a colormap image and display it alongside the original image.

The model is pre-trained on a subset of the MS COCO dataset, using the 20 object categories defined in the PASCAL VOC dataset plus a background class: background, airplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, and tv monitor. You can add more of these VOC labels to the mapping below as needed.

# Define the label mappings as a dictionary
voc_labels = {
    0: {"color": (0, 0, 0), "name": "background"},
    12: {"color": (64, 0, 128), "name": "dog"},
    15: {"color": (192, 128, 128), "name": "person"},
    # Add more VOC labels here as needed
}

# Convert the network result to a segmentation map
result_mask_onnx = np.squeeze(np.argmax(res_onnx, axis=1)).astype(np.uint8)

# Define the colormap for the classes
colormap = np.zeros((256, 3), dtype=np.uint8)
for class_index, label in voc_labels.items():
    colormap[class_index] = label["color"]

# Convert the segmentation map to a colored image using the defined colormap
colored_mask = colormap[result_mask_onnx]

# Display the result
viz_result_image(image, colored_mask, resize=True)

By following these steps for each type of model, you can easily convert your deep learning models to OpenVINO’s Intermediate Representation (IR) format. Now, you’re ready to leverage the power of OpenVINO™ 2023.0 to supercharge your deep learning models!

To learn more about OpenVINO™ new releases and more, we invite you to explore the OpenVINO DevCon Webinar Series.

Ready to experience the enhancements in OpenVINO™ 2023.0? Download the latest version and experience firsthand the improved integration with various deep learning frameworks. Download here!

Stay tuned for more updates and enhancements as OpenVINO™ continues to evolve, providing cutting-edge tools and technologies for deep learning model deployment and optimization.

Additional Sources:

OpenVINO Frontends

OpenVINO Supported Models Documentation

OpenVINO API Tutorial

Notices & Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
