👨‍💻 Ascend Full-Stack AI Software/Hardware Platform

Serkan Celik
Published in Huawei Developers
8 min read · Feb 9, 2023

Introduction

Hi there! Welcome to our Huawei Ascend Full-Stack AI Platform article. In this article, we will look at the architecture, devices, and software of the Huawei Ascend platform.

Artificial intelligence applications have appeared almost everywhere in recent years. High computing power is needed so that the AI models behind these applications can be trained faster and produce results sooner. Most people know that GPUs are indispensable infrastructure for AI workloads. But what about NPUs?

An NPU (Neural Processing Unit) is hardware purpose-built for the heavy computation of neural networks. Huawei's NPUs are based on its Da Vinci Architecture, which makes a real difference in artificial intelligence applications.

Huawei ships its NPUs in a device line it calls the Atlas family. Before we talk about the devices, let's take a closer look at the Da Vinci Architecture.

Huawei Da Vinci Architecture

During training, artificial intelligence models run mathematical operations over multi-dimensional matrices to learn and to produce results. Huawei developed the Da Vinci Architecture with this in mind: by dedicating more than 85% of the chip to cube computing units specialized for matrix math, it achieves performance that makes a difference in AI applications. Huawei first released the Ascend 910 and Ascend 310 chips based on the Da Vinci Architecture.
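To see why a chip built around matrix units pays off, it helps to count the arithmetic in a single matrix multiplication. The sketch below uses made-up layer sizes and the 22 TOPS INT8 figure quoted for the Ascend 310, purely as a back-of-the-envelope illustration:

```python
# Why matrix units dominate AI workloads: an (M, K) x (K, N) matrix
# multiplication needs M*N*K multiply-accumulate steps, i.e. 2*M*N*K
# arithmetic operations in total.

def matmul_ops(m: int, k: int, n: int) -> int:
    """Arithmetic operations (multiplies + adds) in an (m,k) x (k,n) matmul."""
    return 2 * m * k * n

# A hypothetical layer: 4096 x 4096 weights applied to a batch of 512 inputs.
ops = matmul_ops(512, 4096, 4096)
print(f"{ops / 1e9:.1f} GOPs for one 512 x 4096 x 4096 matmul")

# At 22 TOPS (the INT8 rating quoted for the Ascend 310), this single
# matmul takes well under a millisecond of pure compute time:
seconds = ops / 22e12
print(f"~{seconds * 1e3:.2f} ms at 22 TOPS")
```

Training repeats operations like this billions of times, which is why packing the die with matrix (cube) units rather than general-purpose cores is the core bet of the architecture.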

Da Vinci Architecture

Atlas 200 AI Accelerator Module

The Atlas 200 AI Accelerator Module uses the Ascend 310 AI processor to run video analysis and artificial intelligence applications. The Ascend 310 chip offers 22 TOPS of INT8 computing power. In addition to this high compute capacity, it operates at temperatures from -25°C to +80°C, so it can work in extreme conditions. Low power consumption, compact dimensions, and a wide operating temperature range make the Atlas 200 AI Accelerator Module a perfect fit for edge scenarios.


Atlas 200 DK AI Developer Kit

Want to run AI applications on the Ascend 310 NPU? Meet the Atlas 200 DK. This developer kit includes the Atlas 200 AI Accelerator Module in a well-designed hardware environment, with peripheral interfaces that let users quickly and easily tap into the module's processing capacity. Through these interfaces you can run your AI models on the Ascend software stack, and you can develop AI applications for the device using the MindStudio IDE developed by Huawei.


Atlas 300I Inference Card

If you need more computing power for inference scenarios, the Atlas 300I Inference Card can meet your needs. A single card provides 88 TOPS of INT8 computing power and supports real-time analysis of 80 channels of HD video (1080p, 25 FPS). Its high-capacity, high-bandwidth memory helps reduce the latency of AI applications.


Atlas 300I Pro Inference Card

With the Atlas 300I Pro Inference Card, we get 140 TOPS of INT8 computing power. Standing out with advantages such as superior computing power, ultra-high energy efficiency, high-performance feature retrieval, and secure boot, this inference card is a good choice for demanding AI applications.


Atlas 300V Pro Video Analysis Card

The Atlas 300V Pro can meet your needs if your video-analysis projects require processing many video streams at once. It integrates an AI Core and hardware codecs on a general-purpose processor, making it well suited to AI feature extraction and video/image encoding and decoding tasks. The Atlas 300V Pro supports real-time analysis of 128 channels of HD video (1080p, 25 FPS).


Atlas 300T Training Card

If you need to train your artificial intelligence models at high speed, the Atlas 300T Training Card can meet your needs. Powered by the Ascend 910 NPU, this card launched with the world's largest single-chip computing capacity, and with the CANN software stack you can train your models with high performance. A single card offers 280 TFLOPS of FP16 computing power, and its high-capacity memory delivers high performance during model training and gradient synchronization.


Atlas 500 AI Edge Station

Need an edge device for harsh conditions? The Atlas 500 AI Edge Station is designed for edge applications in harsh environments and covers a wide range of edge scenarios. Despite its fanless design, it has an outdoor operating range of -40°C to +70°C, and it delivers 22 TOPS of INT8 computing power. The bundled software also provides cloud-edge collaboration, making the Atlas 500 a good choice for edge scenarios.

Atlas 500 AI Edge Station (Intelligent Edge System)

Atlas 500 Pro AI Edge Server

The Atlas 500 Pro AI Edge Server is designed for edge applications, offering superior computing performance, strong environmental adaptability, easy maintenance, and cloud-edge collaboration. It supports up to four Atlas 300I Inference Cards to meet inference needs across multiple scenarios, giving it the capacity to analyze 320 channels of real-time HD video (1080p, 25 FPS). With its Kunpeng 920 processor, it is a server that can comfortably run demanding applications.


Atlas 800 Inference Server (Model 3000 — Model 3010)

If you want to scale up your computing capacity, the Atlas 800 Inference Server is the next step. It is widely used for AI inference in data centers and comes in two models: Model 3000 with an Arm-based Kunpeng 920 CPU, and Model 3010 with an x86-based Intel Xeon CPU. With support for up to eight Atlas 300I Inference Cards, it enables real-time analysis of 640 channels of HD video (1080p, 25 FPS). The Model 3000 takes advantage of Kunpeng's multi-core, low-power design to provide a highly efficient AI computing platform for inference scenarios.
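The channel counts quoted for these servers follow directly from the per-card rating. Taking the 80-channel (1080p, 25 FPS) figure of a single Atlas 300I card, a quick sanity check:

```python
# Video-analysis capacity scales linearly with card count, using the
# 80-channel (1080p, 25 FPS) rating quoted for one Atlas 300I card.
CHANNELS_PER_CARD = 80

def total_channels(cards: int) -> int:
    """HD video channels supported by a server holding `cards` Atlas 300I cards."""
    return CHANNELS_PER_CARD * cards

print(total_channels(4))  # Atlas 500 Pro, up to 4 cards: 320 channels
print(total_channels(8))  # Atlas 800, up to 8 cards: 640 channels
```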


Atlas 800 Training Server (Model 9000 — Model 9010)

We have seen the 300T Training Card and its high computing capacity. If we need even more computing power for model training, the Atlas 800 Training Server is a good choice. Like the inference server, it comes in two models: Model 9000 with an Arm-based Kunpeng 920 processor, and Model 9010 with an Intel Xeon Gold series processor; both are paired with Ascend 910 NPUs. The Atlas 800 Training Server houses eight Ascend 910 NPUs to deliver top performance, offering high computing density, high energy efficiency, and high network bandwidth.


Atlas 900 PoD

The Atlas 900 PoD is the core unit of an AI training cluster based on Huawei Ascend 910 NPUs and Kunpeng 920 processors. It offers powerful AI computing capability, excellent energy efficiency, and excellent scalability. An Atlas 900 cluster can scale up to 4,096 Ascend 910 processors; at that maximum capacity it reaches 1 EFLOPS of FP16 processing power.
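The headline EFLOPS figure can be sanity-checked from the chip count. Assuming roughly 256 TFLOPS of FP16 per Ascend 910 (a commonly quoted rating; treat the per-chip number as illustrative):

```python
# Rough check of the cluster-scale figure: 4096 Ascend 910 processors,
# assuming ~256 TFLOPS of FP16 each (illustrative per-chip number).
chips = 4096
flops_per_chip = 256e12  # 256 TFLOPS in FLOPS

total_flops = chips * flops_per_chip
print(f"{total_flops / 1e18:.2f} EFLOPS")  # ~1.05 EFLOPS
```

So 4,096 chips land right around the 1 EFLOPS mark the cluster advertises.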


At this point, we have covered the basics of the Huawei Atlas devices. As we know, the high computing capacity we get from the hardware has to be harnessed by software. To use this powerful hardware infrastructure efficiently, Huawei created a heterogeneous AI computing architecture.

Heterogeneous Compute Architecture

Huawei Ascend AI

If we have strong computing capacity in hardware, we also need to control that power; if we cannot, the devices will not run at full efficiency. To this end, Huawei developed its heterogeneous computing architecture and named it CANN (Compute Architecture for Neural Networks). CANN provides core components such as ACL, DVPP, and HCCL that unlock the computing power of Ascend NPUs.

ACL (Ascend Computing Language) is a unified programming interface that decouples software from hardware; HCCL (Huawei Collective Communication Library) enables efficient data transfer between Ascend AI processors in distributed training scenarios; and DVPP (Digital Vision Pre-Processing) uses hardware acceleration to improve parallelism during image preprocessing.
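To make HCCL's role concrete: the collective operation at the heart of distributed training is the all-reduce, where every worker contributes its local gradients and receives the global sum. The sketch below is a plain-Python illustration of that concept, not the HCCL API:

```python
# Conceptual sketch of the collective operation HCCL accelerates in
# distributed training: an all-reduce sums every worker's gradients and
# hands the identical result back to each worker, so all model replicas
# apply the same update. (Plain-Python illustration, not the HCCL API.)

def allreduce_sum(worker_grads: list[list[float]]) -> list[list[float]]:
    """Replace each worker's gradient vector with the element-wise global sum."""
    summed = [sum(vals) for vals in zip(*worker_grads)]
    return [summed[:] for _ in worker_grads]

# Four "workers", each holding a local gradient vector:
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(allreduce_sum(grads)[0])  # every worker ends up with [16.0, 20.0]
```

In a real cluster the same exchange happens over high-speed interconnects between NPUs, and libraries like HCCL exist to make that communication overlap with computation as much as possible.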

Supported AI Frameworks

We now have powerful computing hardware and the software to control it, but we still need an AI framework to work with neural networks. Huawei Ascend platforms support third-party AI frameworks such as TensorFlow and PyTorch. Huawei also has its own framework, MindSpore: a deep learning framework that aims for easy development, efficient execution, and coverage of all scenarios.


Conclusion

In this article, we took a first look at the Atlas products and Ascend software developed by Huawei. You can visit the link for more information.

See you in future articles :)
