Advanced Feature of pyACL — DVPP

Published in Huawei Developers, Dec 1, 2023

Introduction

Hi! In this article, we will look at what more we can do with the pyACL API. Our subject is data preprocessing, which is an important part of working with artificial intelligence models. It is widely used when training models, when running inference with trained models, and when developing applications.

If you want to start with the basics of pyACL, how it works, application flow charts, allocation of “NPU” resources, and simple examples, you can take a look at the Introduction of pyACL API article.

Typical Scenarios

The resolution and format of the source image or video can be processed to meet the model requirements. The following are some typical scenarios.

Video decoding and resizing

Let’s imagine we have a YOLOv3 model for an object detection task. The input video is encoded in H.264/H.265 and its resolution is 1920x1080. The YOLOv3 model requires an RGB or YUV input image with a resolution of 416x416. In this case, we need to decode the video into frames and then resize each frame to 416x416.

Image decoding, resizing, and format conversion

Let’s imagine we have a ResNet-50 model for a classification task. The input image is in JPEG format and its resolution is 1280x720. The ResNet-50 model requires an RGB input image with a resolution of 224x224. In this case, we need to decode the JPEG image, resize it to 224x224, and convert it to RGB.

Image cropping, resizing, and format conversion

Let’s imagine we have a ResNet-50 model for a classification task. The input image is in YUV420SP format and its resolution is 1280x720. The ResNet-50 model requires an RGB input image with a resolution of 224x224. In this case, we need to crop the image, resize it to 224x224, and convert it from YUV420SP to RGB.

We could give more examples like these. As we can see, the data preprocessing stage is important for AI models, because before we can run inference we need to make the input data meet the model’s input requirements.

Data Processing Modes

Ascend CANN provides two image/video data processing modes: DVPP and AIPP. DVPP stands for Digital Vision Pre-Processing and AIPP stands for Artificial Intelligence Pre-Processing. AIPP and DVPP can be used separately or together. When they are combined, DVPP is used first to decode, crop, and resize images or videos. However, due to DVPP hardware restrictions, the image format and resolution after DVPP may not meet the model requirements. AIPP is therefore used afterwards to perform color space conversion (CSC), image cropping, and border padding.

Function Support

pyACL provides several types of media data processing APIs.

Let’s look at what we can do with VPC, the Vision Preprocessing Core, which handles operations such as cropping and resizing. We start with the API call sequence, using a scenario where we crop and resize the input image before sending it to the model.

As the call sequence chart shows, we start by creating a channel for image data processing. After creating the channel, we create the ROI position configuration and then the resizing configuration.

Before starting the cropping and resizing stage, we need to allocate buffers for storing the input and output data. This means allocating a device buffer for every image. Once the buffers are in place, we are ready to perform cropping and resizing.
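To get a feel for how large these buffers are, here is a minimal sketch that computes the size of a YUV420SP image buffer. The helper names are our own, and the 16-pixel width and 2-pixel height alignment are assumptions based on commonly documented VPC constraints, so please check the DVPP restrictions for your own Ascend chip.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the nearest multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def yuv420sp_buffer_size(width: int, height: int,
                         width_align: int = 16, height_align: int = 2) -> int:
    """Return the bytes needed for one YUV420SP image after stride alignment.

    Assumption: width is aligned to 16 pixels and height to 2 pixels.
    These defaults are illustrative and may differ between chips.
    """
    width_stride = align_up(width, width_align)
    height_stride = align_up(height, height_align)
    # YUV420SP stores a full-resolution Y plane followed by an interleaved
    # UV plane that is half the size of the Y plane: 1.5 bytes per pixel.
    return width_stride * height_stride * 3 // 2


# Input image from the ResNet-50 scenario and the model input size.
print(yuv420sp_buffer_size(1280, 720))
print(yuv420sp_buffer_size(224, 224))
```

The returned value is the kind of size we would pass to the device buffer allocation call for each input and output image.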

In the cropping stage, we can call the asynchronous dvpp_vpc_crop_async API to crop a selected ROI from the input image. We can also crop and paste in one step: the dvpp_vpc_crop_and_paste function crops a selected ROI from the input image and pastes it into a specified area of the target image, which becomes the output image.

In the resizing stage, we can call the asynchronous dvpp_vpc_resize_async function to resize the input image. Because these calls are asynchronous, we shouldn’t forget to call the synchronize_stream function, which blocks until all tasks in the specified stream are complete.
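Before queuing a resize, it can help to check how strongly the image will be scaled. The sketch below computes the horizontal and vertical scale factors for the scenarios above. The supported range of 1/32 to 16 is an assumption taken from commonly cited VPC limits, and the function names are our own, so treat this as an illustration rather than the official rule.

```python
from fractions import Fraction

# Assumed VPC scaling limits; verify them for your Ascend chip.
MIN_SCALE = Fraction(1, 32)
MAX_SCALE = Fraction(16, 1)


def resize_ratios(src_w: int, src_h: int, dst_w: int, dst_h: int):
    """Return the exact (horizontal, vertical) scale factors."""
    return Fraction(dst_w, src_w), Fraction(dst_h, src_h)


def is_resize_supported(src_w: int, src_h: int, dst_w: int, dst_h: int) -> bool:
    """Check that both scale factors fall inside the assumed limits."""
    return all(MIN_SCALE <= ratio <= MAX_SCALE
               for ratio in resize_ratios(src_w, src_h, dst_w, dst_h))


# 1280x720 input resized to the 224x224 ResNet-50 input.
horizontal, vertical = resize_ratios(1280, 720, 224, 224)
print(float(horizontal), float(vertical))
print(is_resize_supported(1920, 1080, 416, 416))
```

Note that the two factors differ, so a plain resize changes the aspect ratio of the image. Cropping a suitable ROI first is one way to control this.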

After the cropping and resizing are finished, we need to deallocate the buffers that we allocated earlier. We can call the dvpp_free function to release them.

Lastly, we need to destroy the ROI configuration and the resize configuration. After that, we can destroy the channel and then the channel description.

To make the whole process clearer, let’s walk through a simple example of cropping with DVPP.

Cropping Example for DVPP

First, we need to import and initialize the acl library before we can start the cropping process.

Next, we need to allocate the runtime resources in this order: device, context, and stream.

As we mentioned in the development chart section, we need to define the crop area configuration. In this stage, we define the ROI of the crop area.
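When choosing the ROI coordinates, DVPP expects the left and top offsets to be even and the right and bottom offsets to be odd. Here is a minimal sketch of a helper (the Roi type and make_roi function are our own, not part of pyACL) that turns a position and size into coordinates that follow this rule. The field order left, right, top, bottom is assumed to match the argument order of the ROI configuration call.

```python
from typing import NamedTuple


class Roi(NamedTuple):
    """Inclusive crop rectangle in pixel coordinates."""
    left: int
    right: int
    top: int
    bottom: int


def make_roi(x: int, y: int, width: int, height: int) -> Roi:
    """Build an ROI whose left/top are even and right/bottom are odd.

    Left and top are snapped down and right and bottom are snapped up,
    so the result always covers the requested area. The caller still
    has to make sure that the ROI stays inside the input image.
    """
    left = x - (x % 2)            # snap down to an even value
    top = y - (y % 2)
    right = (x + width - 1) | 1   # snap up to an odd value
    bottom = (y + height - 1) | 1
    return Roi(left, right, top, bottom)


print(make_roi(0, 0, 224, 224))
print(make_roi(101, 51, 200, 100))
```

The resulting values can then be passed to the ROI configuration function when we define the crop area.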

We also need to create a data processing channel. We can create its description with the dvpp_create_channel_desc() function.

According to the data processing chart, the next stage is creating the input and output image descriptions, which we do after creating the channel description.

With everything defined, we can start the cropping process using DVPP. We call the dvpp_vpc_crop_async() function first, and then we need to wait for the task to finish. For this purpose, we can call the synchronize_stream() function.
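The two calls can be wrapped together as in the sketch below. The crop_and_wait function is our own helper, and the acl module is passed in as a parameter so that the snippet can be read without an Ascend device at hand. The argument order of dvpp_vpc_crop_async is written as we understand it from the pyACL documentation, so please verify it against your CANN version.

```python
def crop_and_wait(acl, channel_desc, input_desc, output_desc, roi_config, stream):
    """Queue an asynchronous VPC crop on the stream and wait for it to finish.

    All descriptors are expected to be created beforehand, as described
    in the previous steps.
    """
    # Queue the crop task; this returns before the crop has completed.
    ret = acl.media.dvpp_vpc_crop_async(
        channel_desc, input_desc, output_desc, roi_config, stream)
    if ret != 0:
        raise RuntimeError(f"dvpp_vpc_crop_async failed, ret={ret}")

    # Block until every task queued on this stream is done.
    ret = acl.rt.synchronize_stream(stream)
    if ret != 0:
        raise RuntimeError(f"synchronize_stream failed, ret={ret}")
```

Only after synchronize_stream returns successfully is it safe to read the cropped image from the output buffer.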

At this point, we need to destroy the resources we allocated. The input/output image descriptions, the buffers, the channel, and the channel description should be released in that order. We can check each step with its ret code: a ret value of 0 after a destroy call means that the step completed successfully.
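Checking every ret code by hand quickly becomes repetitive, so a small helper can do it for us. This is a minimal sketch with our own function names. It only relies on the convention described above that 0 means success, plus the fact that some pyACL calls return a (value, ret) pair.

```python
ACL_SUCCESS = 0


def check_ret(step: str, ret: int) -> None:
    """Raise an error if a pyACL call returned a non-zero ret code."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{step} failed with ret={ret}")


def unwrap(step: str, result):
    """Check a (value, ret) pair and return only the value."""
    value, ret = result
    check_ret(step, ret)
    return value


# A successful call passes silently; a failing one stops the program early.
check_ret("dvpp_free", 0)
print(unwrap("create_stream", ("stream-handle", 0)))
```

Failing fast like this makes it much easier to see which destroy or allocation step went wrong.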

Similarly, after destroying the descriptions, we need to release the stream, context, and device resources, in that order.

In the last part, we need to finalize the acl resources.
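The setup and teardown steps mirror each other, so one way to keep them in sync is a context manager. The sketch below is our own wrapper and not part of pyACL. It takes the acl module as a parameter, and it assumes the usual pyACL runtime calls: init, rt.set_device, rt.create_context, rt.create_stream, and their counterparts rt.destroy_stream, rt.destroy_context, rt.reset_device, and finalize.

```python
from contextlib import ExitStack, contextmanager


def _check(step, ret):
    """Raise an error if a pyACL call returned a non-zero ret code."""
    if ret != 0:
        raise RuntimeError(f"{step} failed with ret={ret}")


@contextmanager
def acl_runtime(acl, device_id=0):
    """Allocate device, context, and stream, and release them in reverse order."""
    with ExitStack() as stack:
        _check("init", acl.init())
        stack.callback(acl.finalize)

        _check("set_device", acl.rt.set_device(device_id))
        stack.callback(acl.rt.reset_device, device_id)

        context, ret = acl.rt.create_context(device_id)
        _check("create_context", ret)
        stack.callback(acl.rt.destroy_context, context)

        stream, ret = acl.rt.create_stream()
        _check("create_stream", ret)
        stack.callback(acl.rt.destroy_stream, stream)

        # ExitStack runs the callbacks last-in, first-out, so the stream is
        # destroyed first and acl.finalize() is called last.
        yield context, stream
```

On a machine with CANN installed, we could use it as `import acl` followed by `with acl_runtime(acl) as (context, stream):`, with the DVPP work placed inside the with block. Even if an error occurs in the middle, the resources are still released in the correct order.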

At the end of this pre-processing flow, we have our pre-processed image data, which we can use in the model inference stage.

Result

We learned the basics of the Digital Vision Pre-Processing feature with the pyACL API. We can use these asynchronous DVPP functions in the pre-processing stages on NPU devices. Using DVPP functions can give us faster pre-processing on NPU devices. For more details, you can check the references section.

See you in the next articles :)