Artificial Intelligence

AI & HPC Everywhere

Author: Pradeep Dubey, Intel Senior Fellow and director of the Parallel Computing Lab

Intel
Intel Tech

--

Today, across our customer base, we are seeing AI become part of a growing number of applications and workflows. In particular, we're watching AI and HPC converge: AI technologies are playing an increasingly important role in traditional HPC modeling and simulation applications, while AI applications are growing large enough to require HPC technologies.

There's a large and growing demand for deep learning training as more businesses look for ways to leverage ever more data in their applications. Models are getting bigger and more complex in the drive to improve their accuracy and usefulness. Training these models requires many iterations over ever-larger datasets, and retraining happens with greater frequency — all of which drives exponential growth in compute consumption and rising costs.

oneAPI

Many customers can get the AI performance they need inside the applications they're running by using Intel Xeon Scalable Processors, without discrete accelerators. And we're taking a generational leap forward in that category with our next product (codenamed Sapphire Rapids). We're delighted that on an early stepping of the Sapphire Rapids silicon, using a custom software stack for hardware capability testing, we're able to achieve deep learning inference throughput for ResNet-50 that is highly competitive with the Nvidia A30 GPU, all on a general-purpose CPU.[1]

But the future of advanced computing requires heterogeneous hardware to maximize the computing power needed for exascale-class workloads. That’s why Intel is investing in every kind of compute for AI, from CPU to GPU to FPGA and dedicated accelerators — what we call our XPU platform (where X stands for the variable in the equation).

Our goal at Intel is to make AI more performant by providing the greatest breadth and depth of AI-enabled compute, all on an open, standards-based, unified software stack: oneAPI.
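The "write once, target any device" idea behind oneAPI can be caricatured in a few lines of plain Python. To be clear, none of the names below are real oneAPI or SYCL APIs — this is only an illustrative toy of the pattern that SYCL/DPC++ expresses in C++ with `sycl::queue` and device selectors:

```python
# Toy illustration of oneAPI's "one code path, many devices" idea.
# These classes and functions are hypothetical stand-ins, not real
# oneAPI APIs; a real runtime would JIT-compile and dispatch the
# kernel to CPU, GPU, or FPGA hardware.

def vector_add(a, b):
    """The 'kernel': written once, independent of the target device."""
    return [x + y for x, y in zip(a, b)]

class Device:
    def __init__(self, name):
        self.name = name

    def submit(self, kernel, *args):
        # A real runtime would dispatch to the selected hardware here.
        return kernel(*args)

def select_device(preference="gpu"):
    # Stand-in for a SYCL-style device selector: fall back to the CPU
    # when the preferred device is unavailable.
    available = {"cpu": Device("cpu"), "gpu": Device("gpu")}
    return available.get(preference, available["cpu"])

q = select_device("gpu")
result = q.submit(vector_add, [1, 2, 3], [10, 20, 30])
print(q.name, result)  # gpu [11, 22, 33]
```

The point of the sketch is the separation of concerns: the kernel never mentions the device, and swapping CPU for GPU (or falling back when a device is absent) is a one-line change at the selection site.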

Codename: Ponte Vecchio

Ponte Vecchio (PVC) is the codename for Intel’s forthcoming GPU with industry-leading FLOPs and compute density to accelerate AI, HPC, and advanced analytics workloads.

This new microarchitecture is built for scalability and is designed to combine multiple process technologies — both internal and external — with advanced packaging technologies to uniquely tailor products to customer and market needs.

PVC A0 silicon is already delivering greater than 45 TFLOPS of FP32 throughput, greater than 5 TB/s of memory fabric bandwidth, and greater than 2 TB/s of connectivity bandwidth.[2] We also showed that our early Ponte Vecchio silicon is demonstrating industry-leading benchmark performance in both training and inference, including a demo of ResNet inference at over 43,000 images per second.[3]

Ponte Vecchio will be released in 2022 for the HPC and AI markets, and I'm sure we'll have more details to share in the coming months.

Aurora Technology for the Broader Community

We're also working to increase the performance and bandwidth between the different pillars: modeling and simulation (mod/sim), AI/ML, and Big Data.

In partnership with Argonne National Laboratory, we're bringing these technologies to Aurora, one of the world's first exascale computers. We hope this system will enable us and the broader community to solve some of the world's most important and challenging problems.

At the recent Intel ON event, Robert Wisniewski, Intel’s Chief Architect for HPC, and Argonne National Laboratory’s Venkatram Vishwanath talked about how exascale science enabled by HPC, AI, and Big Data is fueling a variety of fields, such as understanding protein interactions in the SARS-CoV-2 viral genome, mapping the connectivity of the brain, and speeding up drug development and personalized medical treatments.

See the Results

These technologies are just part of a larger portfolio that Intel continues to deliver around AI. For example: Intel and AWS recently announced availability of a new EC2 instance type powered by Habana Gaudi AI processors, architected from the ground up to increase deep learning training efficiency in the cloud. The Habana team also announced a new AI training solution featuring the Supermicro X12 Gaudi AI Training Server with the DDN AI400X2 Storage system to help remove storage bottlenecks often found in traditional network-attached storage. And more news is on the way.

If you and your organization are interested in seeing what Intel can offer today, sign up for the Intel DevCloud to access the oneAPI Base, HPC, AI, and Rendering toolkits via a terminal or your browser, complete with training modules using Jupyter Notebooks.

Notices & Disclaimers

Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Your costs and results may vary.

Intel technologies may require enabled hardware, software, or service activation.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

[1] intel.com/InnovationEventClaims

[2] intel.com/InnovationEventClaims

[3] intel.com/InnovationEventClaims
