Introduction to the pyACL API

Serkan Celik
Published in Huawei Developers
5 min read · Feb 8, 2023

Introduction

Hi there! This article guides developers through building deep neural network (DNN) applications on top of existing models, using the Python APIs provided by the Python Ascend Computing Language (pyACL).

Python Ascend Computing Language (pyACL) is a Python API library that wraps AscendCL using CPython. With it, you can use Python to manage execution and resources on Ascend AI Processors: device management, context management, stream management, and memory management, as well as model loading and execution, operator loading and execution, and media data processing.

Before the example, let’s take a closer look at the terms device, stream, and context.


Development Workflow

  1. Prepare the development environment and the operating environment.
  2. Analyze development requirements: determine the functions (such as data copy and model inference), commands, and APIs needed for your specific scenario.
  3. Create a code directory: create a directory to store code files, build scripts, test images, and model files.

Develop an app:

  1. Initialize pyACL; see pyACL Initialization and Deinitialization.
  2. Allocate runtime resources; see Runtime Resource Allocation and Deallocation.
  3. Transfer data; see Data Copy.
  4. (Optional) If image cropping and resizing are needed, add a data preprocessing step that outputs YUV420SP images as the input to model inference; see Media Data Processing V1.
  5. Perform model inference; see Basic Scenarios for Model Inference.
  6. After data processing is complete, release the runtime resources; see Runtime Resource Allocation and Deallocation.
  7. Deinitialize pyACL; see pyACL Initialization and Deinitialization.
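
The steps above pair up: whatever is initialized or allocated at the start has to be released at the end, even if inference fails in between. Here is a minimal sketch of that ordering. It assumes a `net` object with the `init_resource()` and `run()` methods used later in this article; the `release_resource()` method name and the `run_app()` helper are assumptions made for illustration.

```python
def run_app(net, images):
    """Run one inference pass and always release what was allocated."""
    net.init_resource()            # steps 1-2: initialize pyACL, allocate resources
    try:
        return net.run(images)     # steps 3-5: copy data, (preprocess), infer
    finally:
        # steps 6-7: runs on success and on failure alike
        net.release_resource()     # hypothetical name for the teardown method
```

The `try`/`finally` block is the whole point here: the device is reset and pyACL is deinitialized even when `run()` raises.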

Typical API Call Sequence


Let’s take a model inference scenario. First, we need to initialize the pyACL library so that it can control the NPU (Neural Processing Unit) device. Second, we have to allocate runtime resources. Once the runtime resources are allocated, we can move on to the inference phase of the model we will use.

We have to load the data to be inferred into device memory. To do this, we allocate space in device memory and copy the data into it. We also need to load the model that will perform the inference into device memory. Once both are in place, the data can be passed to the model.

After our data and model are loaded into NPU memory, we can run the model using pyACL API calls. When execution finishes, we can retrieve the outputs as NumPy objects with pyACL.

After all the stages are finished, we end the process by releasing the allocated resources. Let’s walk through an example with the ResNet-50 model for a better understanding.

pyACL Initialization & Runtime Resource Allocation

To get started, we first need to define the class variables: the device ID and the model path, both of which are required for initialization.

Let’s take a closer look at the Net class.
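
A minimal sketch of what such a class could look like is shown here. Only the device ID and the model path come from the text; the extra handle attributes and the example model path are assumptions for illustration.

```python
class Net:
    """Holds the settings and handles for one model on one NPU device."""

    def __init__(self, device_id, model_path):
        self.device_id = device_id    # index of the NPU to use, e.g. 0
        self.model_path = model_path  # path to the offline model file
        # Assumed attributes: handles that the allocation step fills in later.
        # None means "not allocated yet".
        self.context = None
        self.model_id = None
        self.model_desc = None


# Hypothetical path; point it at your own converted ResNet-50 model.
net = Net(device_id=0, model_path="./model/resnet50.om")
```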

The Net class keeps the device ID and the model path as class variables. Next, let’s look at the init_resource() function to understand the initialization and runtime resource allocation stages.

When you develop an application with the pyACL APIs, you must initialize pyACL first. Otherwise, errors may occur during the initialization of internal system resources, causing exceptions in other services. The acl.init() function returns a return code (ret) that indicates whether initialization succeeded, so we can check the initialization status with it.
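
Checking the return code can be wrapped in a small helper so that every call is verified the same way. This is a sketch, not the article's original code: the `check_ret()` and `initialize()` names are assumptions, and the `acl` module is passed in as a parameter so that the snippet can also be exercised on a machine without Ascend hardware.

```python
ACL_SUCCESS = 0  # pyACL calls report success with a return code of 0


def check_ret(message, ret):
    """Raise if a pyACL call returned a non-zero code."""
    if ret != ACL_SUCCESS:
        raise RuntimeError(f"{message} failed, ret={ret}")


def initialize(acl):
    """Initialize pyACL; `acl` is the module obtained with `import acl`."""
    check_ret("acl.init", acl.init())
```

On an Ascend host you would call it as `import acl` followed by `initialize(acl)`.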

After initialization, we need to allocate the device, context, and model in sequence to support the execution of computing and management tasks.
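
The sequence can be sketched as follows. It is written as a standalone function that receives the `acl` module as a parameter (an assumption made so the sketch can be tested without hardware) and returns the handles in a dictionary whose layout is also an assumption.

```python
def init_resource(acl, device_id, model_path):
    """Initialize pyACL, then claim the device, a context, and the model."""

    def check(name, ret):
        if ret != 0:
            raise RuntimeError(f"{name} failed, ret={ret}")

    check("acl.init", acl.init())
    check("acl.rt.set_device", acl.rt.set_device(device_id))

    context, ret = acl.rt.create_context(device_id)
    check("acl.rt.create_context", ret)

    # Load the offline model into device memory and read its description,
    # which later tells us the input and output buffer sizes.
    model_id, ret = acl.mdl.load_from_file(model_path)
    check("acl.mdl.load_from_file", ret)
    model_desc = acl.mdl.create_desc()
    check("acl.mdl.get_desc", acl.mdl.get_desc(model_desc, model_id))

    return {"context": context, "model_id": model_id, "model_desc": model_desc}
```

The order matters: the device must be set before a context can be created on it, and the model is loaded only after both exist.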

Data Transfer

We have allocated the runtime resources for the inference task. At this point, we need to prepare the data for model execution.

We send our image to the allocated NPU device so that the model can process it. Let’s look at the run() function to understand what happens there.

The run() function has three main stages. The first stage sends our data to the NPU device. The second stage executes the model. The third stage retrieves the outputs from the NPU device.

To understand how this works, we need to look more closely. What happens in the _data_interaction() function? Let’s take a look.

Data transfer starts in the for loop. We use the bytes_to_ptr() function to prepare the images for transfer to the NPU device: we first convert each input image to bytes, and then convert those bytes to a pointer. With the pointer in hand, we can copy the data to the NPU device using the acl.rt.memcpy() function.
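
The host-to-device copy described above can be sketched like this. The function name, the `acl` parameter, and the `{"buffer": ptr, "size": n}` layout for device buffers (assumed to have been allocated earlier with acl.rt.malloc()) are assumptions for illustration. Each payload is expected to be a bytes object, for example the result of calling `tobytes()` on a NumPy image.

```python
ACL_MEMCPY_HOST_TO_DEVICE = 1  # copy direction constant for acl.rt.memcpy


def copy_inputs_to_device(acl, payloads, device_buffers):
    """Copy each bytes payload into its matching device buffer."""
    if len(payloads) != len(device_buffers):
        raise ValueError("need exactly one payload per device buffer")
    for data, item in zip(payloads, device_buffers):
        if len(data) > item["size"]:
            raise ValueError("payload is larger than the device buffer")
        # bytes -> host pointer; `data` stays referenced while the copy runs.
        host_ptr = acl.util.bytes_to_ptr(data)
        ret = acl.rt.memcpy(item["buffer"], item["size"],
                            host_ptr, len(data),
                            ACL_MEMCPY_HOST_TO_DEVICE)
        if ret != 0:
            raise RuntimeError(f"acl.rt.memcpy failed, ret={ret}")
```

The size check before the copy is a defensive choice: it turns a would-be device-side error into a clear Python exception.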

At this stage, our data is loaded on the NPU device. To get inference results, we need to execute the model. Let’s look at the forward() function.

Once the model has executed successfully, we need to destroy the data buffers that we created.
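
A sketch of the execute-and-clean-up step is shown here. The helper names, the `acl` parameter, and the `{"buffer": ptr, "size": n}` buffer layout are assumptions; the pyACL calls are the dataset and execution APIs. Destroying a data buffer only removes the descriptor that wraps the memory, so the device memory holding the outputs stays valid for the copy back to the host.

```python
def _destroy_dataset(acl, dataset):
    """Destroy every data buffer descriptor in the dataset, then the dataset."""
    for i in range(acl.mdl.get_dataset_num_buffers(dataset)):
        acl.destroy_data_buffer(acl.mdl.get_dataset_buffer(dataset, i))
    acl.mdl.destroy_dataset(dataset)


def _build_dataset(acl, buffers):
    """Wrap device buffers in a dataset that acl.mdl.execute() accepts."""
    dataset = acl.mdl.create_dataset()
    for item in buffers:
        data_buffer = acl.create_data_buffer(item["buffer"], item["size"])
        _, ret = acl.mdl.add_dataset_buffer(dataset, data_buffer)
        if ret != 0:
            acl.destroy_data_buffer(data_buffer)
            _destroy_dataset(acl, dataset)
            raise RuntimeError(f"acl.mdl.add_dataset_buffer failed, ret={ret}")
    return dataset


def forward(acl, model_id, input_buffers, output_buffers):
    """Run the model synchronously, then drop the temporary descriptors."""
    inputs = _build_dataset(acl, input_buffers)
    try:
        outputs = _build_dataset(acl, output_buffers)
        try:
            ret = acl.mdl.execute(model_id, inputs, outputs)
            if ret != 0:
                raise RuntimeError(f"acl.mdl.execute failed, ret={ret}")
        finally:
            _destroy_dataset(acl, outputs)
    finally:
        _destroy_dataset(acl, inputs)
```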

After running the model, we need to retrieve the results from the NPU device. For this purpose, let’s look at the _data_from_device_to_host() function.

We can then read the results with the get_result() function. At this point, all execution stages are complete. We should release all allocated resources that we no longer use.
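
The device-to-host copy can be sketched as follows. The function names, the `acl` parameter, and the buffer layout are assumptions. The `top_k()` helper is only a stand-in for whatever post-processing get_result() performs: it assumes the model outputs float32 scores and uses the standard-library `array` module, where `numpy.frombuffer()` would be the NumPy equivalent.

```python
from array import array

ACL_MEMCPY_DEVICE_TO_HOST = 2  # copy direction constant for acl.rt.memcpy


def copy_output_to_host(acl, device_item):
    """Copy one output buffer ({"buffer": ptr, "size": n}) into Python bytes."""
    size = device_item["size"]
    host_ptr, ret = acl.rt.malloc_host(size)
    if ret != 0:
        raise RuntimeError(f"acl.rt.malloc_host failed, ret={ret}")
    try:
        ret = acl.rt.memcpy(host_ptr, size, device_item["buffer"], size,
                            ACL_MEMCPY_DEVICE_TO_HOST)
        if ret != 0:
            raise RuntimeError(f"acl.rt.memcpy failed, ret={ret}")
        # Copy into a fresh bytes object so the host buffer can be freed safely.
        return bytes(bytearray(acl.util.ptr_to_bytes(host_ptr, size)))
    finally:
        acl.rt.free_host(host_ptr)


def top_k(raw, k=5):
    """Rank float32 scores and return the k best (class_index, score) pairs."""
    scores = array("f")  # native 32-bit floats
    scores.frombytes(raw)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]
```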

pyACL Deinitialization & Runtime Resource Deallocation

After all required pyACL APIs have been called, or before the process exits, we must call the deinitialization APIs to deinitialize pyACL.
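
Teardown mirrors allocation in reverse order. Here is a minimal sketch, assuming the handles were obtained earlier with the matching allocation calls; the function name, its parameters, and the choice to collect failures instead of stopping at the first one are assumptions for illustration.

```python
def release_resource(acl, device_id, context, model_id, model_desc):
    """Release everything in reverse order of allocation.

    Returns a list of (call_name, ret) pairs for calls that failed, so one
    failing step does not prevent the remaining resources from being freed.
    """
    steps = [
        ("acl.mdl.unload", lambda: acl.mdl.unload(model_id)),
        ("acl.mdl.destroy_desc", lambda: acl.mdl.destroy_desc(model_desc)),
        ("acl.rt.destroy_context", lambda: acl.rt.destroy_context(context)),
        ("acl.rt.reset_device", lambda: acl.rt.reset_device(device_id)),
        ("acl.finalize", lambda: acl.finalize()),
    ]
    failures = []
    for name, call in steps:
        ret = call()
        if ret != 0:
            failures.append((name, ret))
    return failures
```

An empty list means the process released the model, the context, and the device, and then deinitialized pyACL cleanly.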

Result

We learned how to manage NPU resources with pyACL and how to run an inference scenario using a ResNet-50 example. If you want to see more advanced examples and learn more about pyACL and its features, you can check this link.

See you in future articles :)
