Adapting the YOLOv7 Model to Ascend Processors

Hüseyin Çayırlı
Published in Huawei Developers
Jul 14, 2023
Figure: Ascend 310 Inference Processor

Introduction

Hello! In this article, we will talk about YOLOv7, one of the best deep learning-based object detection models, and how to run inference with it on Ascend processors.

Let’s get started!

Huawei's Ascend processors are purpose-built artificial intelligence processors. They are based on the Da Vinci architecture, which is designed to make training and running AI models faster and more efficient. Ascend processors deliver high performance in AI applications by rapidly processing multidimensional matrices. You can find detailed information in the article World of Huawei Ascend: Future with NPUs.

Today's state-of-the-art object detection models are highly accurate, but they need substantial processing power to be used in real-time applications. Ascend processors can provide the capacity needed to run object detection models in real time.

Below is an image of the Atlas 300I Pro Inference Card, a device on which you can develop an inference application using YOLOv7. Detailed information about the Atlas 300I Pro Inference Card and other Atlas devices can be found in the article titled 👨‍💻Ascend Full-Stack AI Software/Hardware Platform.

Figure: Atlas 300I Pro Inference Card

YOLOv7 is a popular real-time object detection model that joined the YOLO (You Only Look Once) family in July 2022. Its paper reports that it surpasses earlier real-time object detectors in both speed and accuracy.

The YOLOv7 model builds on the YOLOv4, Scaled-YOLOv4, and YOLOR models and adds several changes and improvements. These include:

  1. Extended Efficient Layer Aggregation Networks (E-ELAN): E-ELAN extends ELAN, an architecture designed for efficiency in terms of the number of parameters, the amount of computation, and the computational density.
  2. Model Scaling for Concatenation-Based Models: Model scaling adjusts attributes of the model, such as depth and width, to create versions at different scales for different inference speeds.
  3. Trainable Bag-of-Freebies: These are techniques that improve accuracy without increasing inference cost, although they may make training more expensive. One example is RepConv, a re-parameterizable convolution block that, much like a ResNet block with an extra 1x1 branch, combines a 3x3 convolution, a 1x1 convolution, and an identity connection during training. For inference, the branches are merged into a single convolution, so the accuracy gain comes at no extra inference cost.
  4. Deep Supervision: This adds an auxiliary output layer (auxiliary head) to the middle layers of the network during training, which the paper reports can significantly improve performance on many tasks. The YOLOv7 architecture includes a lead head responsible for the final output and an auxiliary head that only assists training.
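To make the RepConv idea from item 3 more concrete, here is a minimal single-channel sketch in plain Python. It is only an illustration under simplifying assumptions: one channel, no BatchNorm, and made-up weights. The function names are hypothetical and do not come from the YOLOv7 code base. It shows that the sum of a 3x3 branch, a 1x1 branch, and an identity branch equals one 3x3 convolution whose center weight absorbs the other two branches.

```python
def conv3x3_same(image, kernel):
    """Single-channel 3x3 convolution (cross-correlation) with zero padding."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += image[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def training_time_output(image, k3, k1):
    """Sum of the three training-time branches: 3x3 conv + 1x1 conv + identity."""
    conv = conv3x3_same(image, k3)
    return [
        [conv[y][x] + k1 * image[y][x] + image[y][x] for x in range(len(image[0]))]
        for y in range(len(image))
    ]


def merge_branches(k3, k1):
    """Fold the 1x1 weight and the identity (weight 1.0) into the 3x3 center."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0
    return merged


# Made-up example data (not from any trained model).
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, 0.0, -0.5], [1.0, 2.0, 1.0], [-0.5, 0.0, 0.5]]
k1 = 0.25

multi = training_time_output(image, k3, k1)
single = conv3x3_same(image, merge_branches(k3, k1))
same = all(abs(a - b) < 1e-9 for ra, rb in zip(multi, single) for a, b in zip(ra, rb))
print(same)  # True: one merged kernel reproduces the three-branch output
```

In a real network the same folding is done per channel and also absorbs the BatchNorm parameters of each branch, which this sketch leaves out.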

According to the paper, YOLOv7 is both faster and more accurate than earlier YOLO models: it outperforms other real-time detectors in mean average precision (mAP) on the COCO dataset across speeds ranging from 5 FPS to 160 FPS.

Figure: YOLO Models Comparison Chart

To run the YOLOv7 model on Ascend processors, you need to perform some preparation steps. ATC is the tool used to prepare the model, and ACL is used to run inference with it.

ATC (Ascend Tensor Compiler) is the model conversion tool for the Ascend architecture. It converts neural network models into offline models supported by Ascend AI processors. ATC supports popular model formats and applies optimizations during conversion. It also supports half-precision (FP16) computation, which reduces memory usage while keeping performance high.

ACL (Ascend Computing Language) is a collection of C and C++ API libraries for developing applications on Ascend processors. It provides the functions needed to write inference code for deep neural networks, such as device management, memory management, and model loading and execution. pyACL is the Python wrapper that makes the ACL functions available from Python.

For detailed information about ATC and ACL, you can refer to the article titled 👨🏼‍💻Core Components of Ascend Processors.

To run the YOLOv7 model on Ascend processors, you first need to convert the model using ATC and then perform inference using ACL. Now let’s examine these steps in more detail.

Running the YOLOv7 Model on Ascend Processors

1. First, we need to convert the model using ATC, which turns it into an offline model supported by the Ascend processor. ATC accepts popular model formats such as Caffe, MindSpore, TensorFlow, and ONNX. PyTorch is not among them, so the YOLOv7 model, which is in PyTorch format, must first be exported to ONNX. To do this, we install the required dependencies in a prepared virtual environment and then run the export with the appropriate parameters.
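As a rough sketch of the export in step 1, the snippet below builds the command line for the `export.py` script that ships with the official YOLOv7 repository. This is an assumption: the article's original command is not shown here, so the weights file name, the 640x640 input size, and the choice of the `--grid` flag (which includes the detection head's decoding in the exported graph) are illustrative. The helper `build_export_command` is hypothetical.

```python
import shlex


def build_export_command(weights="yolov7.pt", img_size=640, batch_size=1):
    """Return the argv list for YOLOv7's export.py (run inside the cloned repo)."""
    return [
        "python", "export.py",
        "--weights", weights,
        "--grid",                                   # export with the decoded detection output
        "--img-size", str(img_size), str(img_size),
        "--batch-size", str(batch_size),
    ]


command = build_export_command()
print(shlex.join(command))
# To actually run it (assumed repo location):
#   import subprocess
#   subprocess.run(command, check=True, cwd="path/to/yolov7")
```

Building the command as a list keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting problems.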

2. After the ONNX conversion is completed, we can convert the resulting yolov7.onnx model to an offline model (.om format) using ATC. The atc command needs a few parameters: the input and output model names, the type of Ascend processor the model will run on (Ascend310 or Ascend910), and the framework of the input model. For the framework parameter, the value 5 indicates that the model is in ONNX format. After running the command in the terminal, we get the offline model (.om) that we will run on the Ascend processor.
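The conversion in step 2 could look roughly like the sketch below, which assembles an `atc` command line. The file names, the input tensor name `images` (the name the YOLOv7 export script gives its input), and the 1x3x640x640 shape are assumptions for illustration; `build_atc_command` is a hypothetical helper. Note that `--output` takes the name without an extension, and ATC appends `.om` itself.

```python
import shlex


def build_atc_command(onnx_path="yolov7.onnx", output_name="yolov7",
                      soc_version="Ascend310", input_name="images",
                      input_shape=(1, 3, 640, 640)):
    """Return the argv list for converting an ONNX model to an offline model."""
    shape = ",".join(str(dim) for dim in input_shape)
    return [
        "atc",
        f"--model={onnx_path}",
        "--framework=5",                  # 5 means the input model is ONNX
        f"--output={output_name}",        # ATC writes <output_name>.om
        f"--soc_version={soc_version}",   # Ascend310 or Ascend910
        "--input_format=NCHW",
        f"--input_shape={input_name}:{shape}",
    ]


command = build_atc_command()
print(shlex.join(command))
```

If the model is exported with a different input name or size, the `--input_shape` value has to be changed to match.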

3. Now we can run our model on the Ascend processor. Inference with the offline model is done through pyACL. We use the ACLLite library, which wraps pyACL and simplifies the development of inference applications on Ascend devices. We also add the helper functions needed to prepare the inputs and process the outputs.

4. Before running the model with ACL, we need to define the file paths for the model and inputs.

5. The YOLOv7 model is defined using ACLLite. We pass the device ID of the Ascend processor to use and the file path of the model in .om format.
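Steps 4 and 5 might be sketched as follows. This is not the article's original code: the paths and device ID are placeholders, and the module and class names (`acllite_resource.AclLiteResource`, `acllite_model.AclLiteModel`) are assumed from the ACLLite package in the Ascend samples repository, which must be on the Python path. The ACLLite imports are placed inside the function so the file can still be imported on a machine without the Ascend toolkit.

```python
# Placeholder paths and device ID (hypothetical values).
MODEL_PATH = "model/yolov7.om"
IMAGE_PATH = "data/test.jpg"
DEVICE_ID = 0


def load_yolov7(model_path=MODEL_PATH, device_id=DEVICE_ID):
    """Initialize the Ascend device and load the offline model."""
    # Assumed module names from the Ascend samples' ACLLite package.
    from acllite_resource import AclLiteResource
    from acllite_model import AclLiteModel

    resource = AclLiteResource(device_id)
    resource.init()                    # sets up the device, context, and stream
    model = AclLiteModel(model_path)   # loads the .om file onto the device
    # Keep the resource alive for as long as the model is in use.
    return resource, model


def run_inference(model, input_tensor):
    """Run one forward pass; the model receives and returns lists of arrays."""
    return model.execute([input_tensor])


# On an Ascend device:
#   resource, model = load_yolov7()
#   outputs = run_inference(model, preprocessed_image)
```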

6. After the input image is read with cv2, it is preprocessed to match the input that the YOLOv7 model expects. Once it has passed through the preprocessing function, the image is ready for the model.
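The article's preprocessing function is not shown here, so the following is only a sketch of one common choice for YOLO models: letterbox resizing, where the image is scaled to fit a 640x640 square without distortion and the remainder is padded. The snippet computes just the geometry in plain Python; the function names and the 640 target size are assumptions. In a full pipeline these numbers would typically drive `cv2.resize` and `cv2.copyMakeBorder`, followed by BGR to RGB conversion, scaling to [0, 1], and reordering to NCHW.

```python
def letterbox_geometry(height, width, target=640):
    """Return (scale, new_w, new_h, pad_left, pad_top) for a square target."""
    scale = min(target / height, target / width)   # keep the aspect ratio
    new_w, new_h = round(width * scale), round(height * scale)
    pad_left = (target - new_w) // 2                # center the resized image
    pad_top = (target - new_h) // 2
    return scale, new_w, new_h, pad_left, pad_top


def to_original_box(box, scale, pad_left, pad_top):
    """Map an (x1, y1, x2, y2) box from letterboxed to original coordinates."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_left) / scale, (y1 - pad_top) / scale,
            (x2 - pad_left) / scale, (y2 - pad_top) / scale)


# A 1280x720 frame is halved to 640x360 and padded by 140 rows at the top.
scale, new_w, new_h, pad_left, pad_top = letterbox_geometry(720, 1280)
print(scale, new_w, new_h, pad_left, pad_top)  # 0.5 640 360 0 140
print(to_original_box((0, 140, 640, 500), scale, pad_left, pad_top))
```

The same scale and padding values are needed again after inference to map the predicted boxes back onto the original image.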

7. The preprocessed image is fed into the model, and we obtain the outputs with the execute function of the ACLLite library. Postprocessing is then applied to the model outputs to obtain the bounding boxes.
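As an illustration of the postprocessing in step 7, here is a small pure-Python sketch of confidence filtering followed by per-class non-maximum suppression (NMS). It assumes each output row has the layout [cx, cy, w, h, objectness, class scores...], which is what a YOLOv7 model exported with the decoded detection output typically produces. The thresholds (0.25 and 0.45) and all function names are assumptions, not the article's code.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decode_row(row):
    """Turn [cx, cy, w, h, objectness, class scores...] into (box, score, class_id)."""
    cx, cy, w, h, objectness = row[:5]
    class_scores = row[5:]
    class_id = max(range(len(class_scores)), key=class_scores.__getitem__)
    score = objectness * class_scores[class_id]
    box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
    return box, score, class_id


def postprocess(rows, conf_threshold=0.25, iou_threshold=0.45):
    """Confidence filter followed by greedy per-class NMS."""
    candidates = [d for d in map(decode_row, rows) if d[1] >= conf_threshold]
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score, class_id in candidates:
        # Keep a box unless a stronger box of the same class overlaps it too much.
        if all(k[2] != class_id or iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score, class_id))
    return kept


# Made-up rows with two classes.
rows = [
    [100.0, 100.0, 50.0, 50.0, 0.9, 0.1, 0.8],   # strong class-1 detection
    [102.0, 101.0, 50.0, 50.0, 0.8, 0.1, 0.7],   # near-duplicate of the row above
    [300.0, 300.0, 40.0, 40.0, 0.9, 0.9, 0.05],  # separate class-0 detection
    [500.0, 500.0, 40.0, 40.0, 0.1, 0.5, 0.5],   # below the confidence threshold
]
detections = postprocess(rows)
for box, score, class_id in detections:
    print(class_id, round(score, 2), box)
```

With these made-up rows, two detections survive: the near-duplicate is suppressed by NMS and the last row is dropped by the confidence filter. On real model output, the boxes would still need to be mapped from the model's input coordinates back to the original image.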

8. Using the resulting bounding boxes, you can draw the detected objects on the image and visualize the results.

You can access the source code for the implementation of this work from the provided link.
