The Future of Computing for Data Science
From current to next-decade technologies: a few examples
Classical computing has experienced remarkable progress, guided by Moore's law and the von Neumann architecture. Moore's law tells us that roughly every two years the number of transistors in a processor doubles, which translates into twice the performance or half the cost. This pace has slowed over the past decade, and emerging technologies are now appearing. We must rethink information technology (IT), and in particular move toward heterogeneous system architectures with dedicated accelerators, in order to meet the need for performance. It is also striking how advances in theoretical science over the last century help us advance computing today. If we take the example of data science workloads, many issues arise, such as unbounded computational demands. OpenAI has shown that since 2012 the compute required for the largest training runs has grown exponentially, with a 3.4-month doubling time. Progress in computing has been a key enabler of progress in AI.
AlexNet to AlphaGo Zero: A 300,000x Increase in Compute
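The headline growth factor follows directly from the 3.4-month doubling time. A quick sketch (the 62-month window between AlexNet and AlphaGo Zero is an assumption drawn from the roughly 2012 to late-2017 span):

```python
def compute_growth(months: float, doubling_time_months: float = 3.4) -> float:
    """Growth factor implied by exponential doubling every `doubling_time_months`."""
    return 2 ** (months / doubling_time_months)

# Roughly 62 months separate AlexNet (2012) from AlphaGo Zero (late 2017).
factor = compute_growth(62)
print(f"{factor:,.0f}x")  # on the order of 300,000x
```

This is how a 3.4-month doubling time compounds into the 300,000x figure in just over five years.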
From an emerging AI, we have entered the broad AI era, which is disruptive, pervasive, multi-task, multi-domain and multi-modal. To address these challenges and deal with computing limitations, scientists are working on AI that can train with much less data (e.g., improved algorithms, reduced precision) and on new ways to accelerate workloads (e.g., analog hardware, quantum). The data center of tomorrow will certainly be made up of heterogeneous systems running heterogeneous workloads, equipped with binary, biologically inspired and quantum accelerators. These architectures will be the foundation for addressing next-generation challenges. Like an orchestra conductor, the cloud in all its dimensions (hybrid cloud, multi-cloud, distributed cloud) will orchestrate these systems, helped by an AI layer for intelligent automation.
From Information Theory to Running Our Society
This doesn't mean it's the end of binary systems! Claude Shannon (1916–2001) was an engineer, a mathematician and one of the fathers of information theory. He worked for 20 years at MIT, and alongside his academic activities he worked at Bell Laboratories. The binary representation used by classical computers appeared in the middle of the 20th century, when mathematics and information were combined in a new way to form information theory, launching both the computer industry and telecommunications. The strength of binary lies in its simplicity and reliability. A bit is either zero or one, a state that can be easily measured, computed on, communicated and stored. Today's computers and high-performance computers are based on this principle and run our society and economy.
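Shannon's central quantity, entropy, measures information in bits: the average number of bits needed per symbol from a source. A minimal sketch of the standard formula:

```python
import math

def entropy(probabilities) -> float:
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # a fair coin carries exactly 1 bit
print(entropy([0.9, 0.1]))  # a biased coin carries less than 1 bit
```

This is why the bit is the natural unit for both computing and telecommunications: it is the currency in which Shannon's theory prices every message.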
Today we can easily give some examples of the incredible possibilities of binary systems. A first example is the IBM z15, with its chip etched at 14 nm and containing 9.1 billion transistors. A single system can process 19 billion encrypted transactions per day and one trillion web transactions per day. The IBM z15 systems installed across the globe process 87% of bank card transactions and 8 trillion payments per year. It is possible to deploy 2.4 million Docker containers on a single system. This consolidation capacity significantly reduces data center space and costs and delivers substantial energy savings, while keeping the strengths of IBM Z such as security, reliability and resiliency. We can also cite one of the most powerful computers in the world, Summit. This computer helps model supernovas and new materials, explore solutions against cancer, and study genetics and the environment. Summit is equipped with 9,216 IBM Power9 CPUs and 27,648 NVIDIA Tesla GPUs, delivers 200 petaflops (million billion floating-point operations per second) with 25 gigabytes per second between nodes, and has 250 petabytes of storage capacity. Even with this amazing cutting-edge technology, we cannot compute everything, and we cannot store everything. Strong bottlenecks exist, particularly with von Neumann-based architectures.
Today, in the data science field, the need for computing power is met by offloading workloads to GPUs or FPGAs (field-programmable gate arrays), which can implement the desired digital functions. But we need to find other ways to compute. One source of inspiration is progress in the field of neuroscience, which allows us to design a new type of processor directly inspired by the way the brain transmits information. Thanks to Santiago Ramón y Cajal, a Spanish histologist and neuroscientist who shared the 1906 Nobel Prize in Physiology or Medicine with Camillo Golgi, we understand better the architecture of the nervous system and the way neurons communicate with each other. The axon of a neuron transmits nerve impulses, called action potentials, to target cells.
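Neuromorphic research often abstracts this firing behavior into simple spiking-neuron models. A leaky integrate-and-fire sketch, where the membrane potential accumulates input and emits a spike when it crosses a threshold (the constants are illustrative, not taken from any particular chip):

```python
def leaky_integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Accumulate input current with leakage; emit a spike and reset
    when the membrane potential crosses the threshold."""
    potential, spikes = 0.0, []
    for current in inputs:
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing, like a neuron's refractory reset
        else:
            spikes.append(0)
    return spikes

print(leaky_integrate_and_fire([0.4, 0.4, 0.4, 0.0, 0.9, 0.3]))  # [0, 0, 1, 0, 0, 1]
```

Information travels as the timing of spikes rather than as stored numbers, which is what makes event-driven neuromorphic hardware so frugal with energy.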
In an article published in Science, IBM and its university partners described a new type of processor made up of a million in silico neurons. The chip consumes only 70 milliwatts and is capable of 46 billion synaptic operations per second per watt: literally a synaptic supercomputer that fits in the palm of a hand. Researchers have moved from neuroscience to supercomputers, a new computing architecture, a new programming language, algorithms, applications and now a new chip called TrueNorth. TrueNorth is what we call a neuromorphic CMOS integrated circuit. It is a manycore processor network with 4,096 cores, each containing 256 programmable simulated neurons, for a total of just over one million neurons. In turn, each neuron has 256 programmable synapses that carry signals, so the total number of programmable synapses is slightly more than 268 million; the chip comprises 5.4 billion transistors. Since memory, computation and communication are managed within each of the 4,096 neurosynaptic cores, TrueNorth bypasses the bottleneck of the von Neumann architecture and is very energy efficient, with a power density of 1/10,000 that of conventional microprocessors. The neuromorphic computing market is expected to grow from USD 6.6 million in 2020 to USD 272.9 million by 2027.
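The TrueNorth totals follow directly from the per-core counts; a quick arithmetic check:

```python
cores = 4096
neurons_per_core = 256
synapses_per_neuron = 256

neurons = cores * neurons_per_core        # 1,048,576: just over one million
synapses = neurons * synapses_per_neuron  # 268,435,456: slightly more than 268 million
print(f"{neurons:,} neurons, {synapses:,} synapses")
```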
DNA is also a source of inspiration. A team recently demonstrated, in a paper published in the journal Science, the ability to store digital data on a DNA molecule. They were able to store an operating system, a French film from 1895 (L'Arrivée d'un train à La Ciotat by Louis Lumière), a scientific article, a photo, a virus and a $50 gift card in DNA strands, and to retrieve the data without errors.
Indeed, a DNA molecule is intended by nature to store information. Genetic information is encoded in the four nitrogenous bases that make up a DNA molecule (A, C, T and G). Today it is possible to transcribe digital data into this code, and DNA sequencing then makes it possible to read back the stored information. Encoding is automated through software. A human DNA molecule contains about 3 billion nucleotides (nitrogenous bases), and one gram of DNA can store 215 petabytes of data. It would be possible to store all the data created by humans in one room. In addition, DNA can theoretically keep data in perfect condition for an extremely long time: under ideal conditions, it is estimated that DNA could still be deciphered after several million years thanks to "longevity genes," and it can withstand the most extreme weather conditions. The technology is still in the lab, and certainly faces challenges such as high cost and extremely long processing times, but it is definitely promising for storing data.
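Since each of the four bases can represent two bits, one byte maps to four nucleotides. A toy encoder illustrating the principle (real schemes such as the one in the Science paper add error-correcting redundancy and avoid problematic base runs; this sketch ignores both):

```python
BASES = "ACGT"  # two bits per base: A=00, C=01, G=10, T=11

def encode(data: bytes) -> str:
    """Transcribe bytes into a DNA strand, two bits per nucleotide."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in data
                   for shift in (6, 4, 2, 0))

def decode(strand: str) -> bytes:
    """Read the strand back into bytes (the 'sequencing' step)."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

The 2-bits-per-base density is also where the 215 petabytes per gram estimate ultimately comes from.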
The laws of physics also help us imagine the computers of tomorrow. Let's start with the atom. In an article published in Nature, IBM physicists and engineers described how they achieved the feat of writing and reading data in a single holmium atom, a rare-earth element. This is a symbolic step forward, but it proves that the approach works and that we might one day have atomic-scale data storage. To grasp what this means, imagine storing the entire iTunes library of 35 million songs on a device the size of a credit card. In the paper, the nanoscientists demonstrated the ability to read and write one bit of data on one atom. For comparison, today's hard disk drives use 100,000 to one million atoms to store a single bit of information.
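The scale of that improvement is easy to quantify: moving from 100,000 to one million atoms per bit down to a single atom is a gain of five to six orders of magnitude in storage density (the figures below simply restate the comparison above):

```python
import math

atoms_per_bit_hdd = (100_000, 1_000_000)  # conventional hard disk drives
atoms_per_bit_atomic = 1                  # the single-holmium-atom demonstration

for atoms in atoms_per_bit_hdd:
    gain = atoms / atoms_per_bit_atomic
    print(f"{gain:,.0f}x denser ({math.log10(gain):.0f} orders of magnitude)")
```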
Of course, we cannot avoid discussing quantum computing. Quantum bits, or qubits, combine physics with information and are the basic units of a quantum computer. Quantum computers use qubits in a computational model based on the laws of quantum physics, and they will certainly be capable of solving problems of great complexity. Beyond this technological progress, quantum computing opens the way to computational tasks whose complexity is beyond the reach of our current computers. Quantum computers will be very different from today's computers, not only in what they look like and how they are made but, more importantly, in what they can do.
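A qubit can be previewed classically as a normalized two-component state vector. Applying a Hadamard gate to |0⟩ produces an equal superposition, which a measurement collapses to 0 or 1 with probability 1/2 each. A minimal sketch in plain Python:

```python
import math

# A qubit state is a pair of amplitudes; |0> = (1, 0).
def apply(gate, state):
    """Apply a 2x2 gate matrix to a single-qubit state vector."""
    (a, b), (c, d) = gate
    return (a * state[0] + b * state[1], c * state[0] + d * state[1])

H = ((1 / math.sqrt(2),  1 / math.sqrt(2)),
     (1 / math.sqrt(2), -1 / math.sqrt(2)))  # Hadamard gate

state = apply(H, (1, 0))                  # equal superposition of |0> and |1>
probs = [abs(amp) ** 2 for amp in state]  # Born rule: measurement probabilities
print(probs)                              # ~[0.5, 0.5]
```

Simulating n qubits this way requires a vector of 2^n amplitudes, which is precisely why classical machines run out of room and why native quantum hardware is interesting.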
We can also quote a famous phrase from Rolf Landauer, a physicist who worked at IBM: "Information is physical." Computers are, of course, physical machines. It is therefore necessary to take into account the energy costs of computing, writing and reading bits of information, as well as the energy dissipated as heat. In a context where the links between thermodynamics and information raised many questions, Rolf Landauer sought to determine the minimum amount of energy necessary to manipulate a single bit of information in a given physical system. There is therefore a limit, discovered in 1961 and today called the Landauer limit, which states that any computing system must dissipate a minimum amount of heat and therefore consume a minimum amount of electricity. This research is fundamental because it shows that every computing system has a thermal and electrical floor it cannot go below: once a chip reaches this minimum consumption, it cannot release less energy. We are not there yet, but scientists explain that this limit will become especially relevant when designing quantum chips. Recent work by Charles H. Bennett at IBM has re-examined the physical basis of information and applied quantum physics to problems of information flow. His work has played a major role in developing the interconnection between physics and information.
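The Landauer limit itself is easy to evaluate: erasing one bit at temperature T dissipates at least k_B·T·ln 2 of energy. At room temperature:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact since the 2019 SI redefinition)

def landauer_limit(temperature_kelvin: float) -> float:
    """Minimum energy (joules) dissipated when erasing one bit of information."""
    return K_B * temperature_kelvin * math.log(2)

print(landauer_limit(300))  # ~2.87e-21 J at room temperature
```

The number is some ten orders of magnitude below what today's transistors dissipate per operation, which is why the limit matters mostly for the long-term physics of computing and for cryogenic quantum chips.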
Among the applications we need to compute, there are easy problems; hard problems that are intractable for classical computers and not suitable for a quantum computer either; and problems where quantum computing may offer a real advantage, such as factoring and simulating quantum mechanics. The latter category includes applications in chemistry, physics and materials discovery through quantum simulation; machine learning (classification); differential equations and network analysis through the solution of linear systems; and optimization, search, collision finding and graph properties. We can already explore fundamental computational building blocks such as quantum kernels and quantum neural networks (https://medium.com/qiskit/introducing-qiskit-machine-learning-5f06b6597526).
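The quantum-kernel idea can be previewed without quantum hardware: encode each data point as a quantum state through a feature map, then define the kernel as the squared overlap between states. A one-qubit sketch (the angle-encoding feature map below is an illustrative choice, not the Qiskit implementation):

```python
import math

def feature_map(x: float):
    """Angle-encode a scalar into a one-qubit state (cos(x/2), sin(x/2))."""
    return (math.cos(x / 2), math.sin(x / 2))

def quantum_kernel(x: float, y: float) -> float:
    """Kernel value as the squared overlap |<psi(x)|psi(y)>|^2."""
    ax, bx = feature_map(x)
    ay, by = feature_map(y)
    return (ax * ay + bx * by) ** 2

print(quantum_kernel(0.3, 0.3))      # identical points overlap fully: ~1.0
print(quantum_kernel(0.0, math.pi))  # orthogonal states: ~0.0
```

Such a kernel matrix can be fed to any classical kernel method (e.g., a support vector machine); the hoped-for quantum advantage comes from feature maps on many qubits that are hard to evaluate classically.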
The future of computing will be built with heterogeneous systems made up of classical (binary, or bit-based) computing, biologically inspired computing and quantum (qubit-based) computing. These heterogeneous components will be orchestrated and deployed by a cloud architecture that masks complexity while allowing the secure use and sharing of private and public systems and data. AI will also bring opportunities for progress when applied to the management and optimization of the entire data center of the future.
The Future of Computing: Bits + Neurons + Qubits, Dario Gil and William M. J. Green, https://arxiv.org/pdf/1911.08446.pdf
OpenAI, AI and Compute, https://openai.com/blog/ai-and-compute/#fn1
Natterer, F., Yang, K., Paul, W. et al. Reading and writing single-atom magnets. Nature 543, 226–228 (2017). https://doi.org/10.1038/nature21371