EdgeAI Performance Test on A12, A13 and Jetson TX2

Raymond Wong
Jan 15, 2020


Large crowd tracking is still a challenging AI topic.

Why is it important to be very fast?

Object tracking performance depends heavily on the performance of object detection and the quality of the algorithm. Higher-resolution images also demand a more efficient object detection algorithm. A more powerful edge processor therefore lets us achieve higher-quality tracking results.

An early result of the real-time EdgeAI object tracking algorithm running on an Apple A12, achieving around 10 FPS.

A13 is a clear winner

We ran a set of performance tests on three processors: Apple A12 Bionic, Apple A13 Bionic, and NVIDIA Pascal (the GPU in the Jetson TX2). EdgeAI uses these processors to run compute-intensive, neural-network-based machine learning algorithms. We concluded that the A13 Bionic has the highest performance: it is about 33% faster than the A12 Bionic and about 1020% faster than the NVIDIA Pascal.
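To make the "X% faster" wording concrete, here is a minimal Python sketch of how such a percentage can be derived from two FPS readings, and how a percentage maps back to a throughput ratio. The function names are illustrative, and the article does not say this is exactly how the figures were computed.

```python
def percent_faster(fps_fast: float, fps_slow: float) -> float:
    """Return how much faster `fps_fast` is than `fps_slow`, in percent."""
    if fps_slow <= 0:
        raise ValueError("fps_slow must be positive")
    return (fps_fast / fps_slow - 1.0) * 100.0


def speedup_factor(percent: float) -> float:
    """Convert an 'X% faster' figure into a throughput ratio."""
    return 1.0 + percent / 100.0


# Arbitrary illustrative inputs, not measurements from this test.
print(round(percent_faster(20.0, 15.0), 1))
print(round(speedup_factor(1020.0), 1))
```

Read this way, "1020% faster" corresponds to roughly 11.2 times the throughput, assuming the percentage is relative to the slower device.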

Test Specification

We used our EdgeAI object detection algorithm as the benchmark program to measure overall performance. The network has more than 100 hidden layers and predicts the objects and their bounding boxes from an input image.

In the first test, we issued 500 requests, each of which uploads a 460x460 px image of about 40 KB and recognizes the object name and position. Requests are submitted with a parallelism of 16.

In the second test, we again issued 500 requests, each of which recognizes the object name and position in an uploaded image and then removes the image. Requests are submitted with a parallelism of 16.
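The request pattern described above (500 requests, at most 16 in flight) can be sketched with a thread pool. This is only an illustration: `send_request` is a hypothetical stand-in for the real EdgeAI call, which the article does not show, and the actual test client may have been built differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

TOTAL_REQUESTS = 500  # from the test description
PARALLELISM = 16      # from the test description


def send_request(index: int) -> dict:
    """Hypothetical placeholder for one detection request.

    A real client would upload the image and parse the response here.
    """
    time.sleep(0.001)  # simulate a short round trip
    return {"request": index, "objects": []}


def run_batch(total: int = TOTAL_REQUESTS, workers: int = PARALLELISM):
    """Issue `total` requests with at most `workers` in flight.

    Returns the list of responses and the elapsed wall-clock seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(send_request, range(total)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_batch()
    print(f"{len(results)} requests in {elapsed:.2f} s")
```

Timing the whole batch, rather than each request, captures the effect of parallelism on overall throughput.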

In each test, we measure the time taken to complete all 500 requests and derive the average time to process a single image. The result is expressed in frames per second (FPS); the higher the FPS, the higher the performance.
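The conversion from total elapsed time to FPS is simple arithmetic. A minimal Python sketch follows; the function names are illustrative, and the 50-second figure in the example is made up rather than a measured value.

```python
def seconds_per_image(total_requests: int, elapsed_seconds: float) -> float:
    """Average wall-clock time spent per image."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return elapsed_seconds / total_requests


def frames_per_second(total_requests: int, elapsed_seconds: float) -> float:
    """Throughput in frames per second: the inverse of seconds per image."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return total_requests / elapsed_seconds


# Illustrative only: 500 requests finishing in 50 s gives 0.1 s per image, i.e. 10 FPS.
print(seconds_per_image(500, 50.0))
print(frames_per_second(500, 50.0))
```

Because requests run 16 at a time, this FPS reflects aggregate throughput rather than the latency of a single request.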

Processor Specification

To understand the speed of the neural processing engine, we studied the processor specifications. As far as we know, the compiler or interpreter may use the CPU, GPU, and AI accelerator to improve overall performance.

Performance Result

FPS is measured on various devices.

The measurements show that the A12X and A13 are comparable in performance, with the A12 slightly lower. The 256-CUDA-core NVIDIA Pascal GPU has the lowest performance.

Way Forward

The current test format depends on the efficiency of the EdgeAI API protocol, which may not be well optimized yet. We will fine-tune the protocol further to achieve better performance results. Non-edge GPUs and NPUs could also be included for comparison.
