The world of supercomputing has entered a new era with oneAPI

Gal Oren
Jan 10, 2023


The journey to speed up computer performance continues with the understanding that the shift in achieving performance is now moving from hardware to software. The new Intel oneAPI Center of Excellence at the Technion will help train the next generation of high-performance computing researchers and developers by expanding the use of cross-architecture programming in teaching, research, and development.

Aurora Supercomputer installation at Argonne National Laboratory. Credit: Intel

In June 2022, after more than a decade of anticipation, the world of supercomputing entered a new era: the Frontier supercomputer, inaugurated at Oak Ridge National Laboratory in the USA, passed the demanding HPL (High-Performance Linpack) benchmark, which solves a huge dense system of linear equations, and reached the top of the TOP500 list of the world's most powerful computers, with a sustained performance of 1.102 exaflops (exa is 10 to the 18th power, i.e., 1,000,000,000,000,000,000 floating-point operations per second). The computer is used for scientific breakthroughs and the development of cutting-edge technologies in almost every field where a computer can help, from understanding clean-energy mechanisms through drug discovery to innovative materials engineering. The next computer to be inaugurated on a similar scale will be the Aurora supercomputer at Argonne National Laboratory in the USA, with an estimated performance roughly twice that of Frontier. It is also widely believed that China already operates machines of this scale.

The road to this remarkable peak, unlike that of previous generations of supercomputers, was achieved with great effort. During the first 25 years of the High-Performance Computing era, the growth in computing power came almost exclusively from the exponential increase in the number of operations a single processor can perform, fulfilling two technological predictions: the very well-known Moore's law and the less well-known Dennard scaling. Dennard predicted in 1974 that as transistors get smaller, their power density remains constant; as a result, clock frequencies could keep rising, more instructions could be executed, and performance could increase continuously. In addition, in 1994 another route to higher performance opened up when the standardization of the Message Passing Interface (MPI) made it relatively easy to harness the aggregate power of many processors. Thus, for example, if a coarse-resolution climate calculation fit on a single processor with a given amount of memory, the same calculation could be performed (in roughly the same time) at a finer resolution or over a much larger area by distributing it across many processors.

However, Dennard scaling broke down around 2005 due to physical constraints. This led to a situation where, even though Moore's law has not stopped (the number of transistors still doubles every few years), performance no longer increases to the same extent, since the power consumption of a single processor could not keep growing sustainably. Accordingly, in order to preserve the overall increase in performance, the chip industry turned to new hardware architectures while adapting software to fit that dedicated hardware, which, of course, became more and more complex. One of the most important changes in this context was the transition from single-core to multi-core processors. Within a few years, the industry even introduced hardware with hundreds and even thousands of cores (many-core processors), each running at a lower frequency than a 'normal' computing core, with the goal of gaining performance from parallelism. Consequently, the era of accelerators and general-purpose graphics processing units (GPGPUs) was born, which enabled incredible breakthroughs in artificial intelligence in general and deep learning in particular. These advances created a new era in which heterogeneous computing, the combination of several hardware architectures in one computer, is the main way to continue increasing performance. However, to utilize such architectures' full potential, major and fundamental changes are required in the calculations and the way they are coded. "The free lunch is over," as Herb Sutter's influential 2005 article was headlined.

The understanding that the shift in achieving performance is now moving from the hardware to the software has permeated the entire industry, and Intel recently released as open source all the infrastructure required to work on such heterogeneous hardware, whether produced by Intel itself or by another company. This infrastructure is called oneAPI: an open, unified, cross-architecture programming model for CPUs and accelerator architectures (GPGPUs, FPGAs, and others). Standards-based, the programming model simplifies software development and provides uncompromising performance for accelerated computing without proprietary lock-in, while allowing integration of existing code. With oneAPI, developers can choose the best architecture for the specific problem they are trying to solve without having to rewrite the software for the next architecture and platform.

Together with two world-leading Israeli experts in parallel and distributed computing, Prof. Hagit Attiya from the Department of Computer Science at the Technion and Prof. Danny Hendler from the Department of Computer Science at Ben-Gurion University of the Negev, and in collaboration with Intel and the director of oneAPI in Israel, Mr. Guy Tamir, we were pleased to recently announce the establishment of a oneAPI Center of Excellence at the Technion, aimed at enabling studies and research in contemporary scientific computing that harness processors and accelerators of various types through oneAPI's cross-architecture programming.

To develop the next generation of High-Performance Computing researchers and developers, we teach cross-architecture programming with oneAPI on Intel's oneAPI developer cloud (Intel DevCloud). The center is also expanding to other leading universities (such as Tel-Aviv and Ben-Gurion) through a comprehensive new course that we developed (together with doctoral student Yehonatan Fridman, a oneAPI student ambassador), covering the basic and advanced uses of oneAPI and OpenMP for shared-memory parallelism, especially with accelerators. The Center of Excellence is also exploring new opportunities to improve performance through new hardware that supports a shared memory architecture between processors and accelerators. These studies are a natural extension of the Technion's long-standing leadership in parallel computing. The center also promotes code projects (two already in open source), identifies and promotes open-source HPC/AI applications that adopt oneAPI and OpenMP, and optimizes their performance. In addition to expanding the code base that supports oneAPI, this approach prepares our students for the (already present) future of heterogeneous computing.

Dr. Gal Oren is a visiting scientist at the Henry & Marilyn Taub Faculty of Computer Science, Technion — Israel Institute of Technology.
