For decades, chip-makers have ridden the renowned Moore’s Law to deliver successive generations of chips that pack more compute capability and consume less power. Those gains are now slowing to a halt. Researchers around the world are proposing alternative architectures to keep producing systems that are faster and more energy efficient. This article surveys those alternatives and explains why one of them may have the edge in keeping the chip design industry from stalling.
Moore’s Law and Its Dusk
Moore’s law, or to put it differently, the savior of chip-makers worldwide, was coined by Dr. Gordon Moore, co-founder of Intel Corp, in 1965. The law states that the number of transistors on a chip doubles roughly every two years (Moore’s original 1965 paper observed yearly doubling; he revised the rate to every two years in 1975). But why the savior of chip-makers? The law was so powerful during the semiconductor boom that “people would auto-buy the next latest and greatest computer chip, with full confidence that it would be better than what they’ve got,” said former Intel engineer Robert P. Colwell. Back then, writing a program with poor performance was not a big issue: the programmer knew that Moore’s law would ultimately bail them out.
The problem we face today is that the law is nearly dead! Or, to avoid offending Moore’s fans, as Henry Samueli, chief technology officer of Broadcom, puts it:
“It’s graying, it’s aging. It’s not dead, but you’re going to have to sign Moore’s Law up for AARP.”
As we approach the atomic scale, it has become difficult for chip-makers to shrink transistors any further, and the industry has already started to feel the heat. Intel announced in 2015 that new generations of its chips would now be released roughly every 2.5 years, and it has indicated that transistors may keep shrinking only for about another five years.
So what is causing the long-standing law to fail? It’s quantum mechanics, silly! Let’s delve a little deeper into how a processor works. A processor understands only machine code, so whatever it stores or processes is in the form of 1s and 0s. These states of 1 or 0 are held by logic gates, which are in turn made up of transistors. The job of a transistor is to regulate the flow of electrons (by creating a potential barrier) so as to hold a particular state in a logic gate. As features shrink toward the nanometer scale, only a handful of atoms across, it becomes difficult to regulate electron flow: even in the presence of a potential barrier, electrons keep crossing due to a phenomenon called quantum tunneling. As a result, leakage current grows significantly, making the architecture inefficient.
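To get a feel for why tunneling explodes as barriers shrink, here is a minimal back-of-the-envelope sketch using the textbook WKB approximation for a rectangular barrier. The 1 eV barrier height and the widths are illustrative values chosen for the example, not real transistor parameters.

```python
# Rough WKB estimate of electron tunneling probability through a rectangular
# potential barrier: T ~ exp(-2 * kappa * d), where kappa depends on the
# barrier height and d is its width. Values are illustrative only.
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # 1 electronvolt in joules

def tunneling_probability(barrier_ev, width_nm):
    """WKB transmission through a rectangular barrier of given height/width."""
    kappa = math.sqrt(2 * M_E * barrier_ev * EV) / HBAR  # decay constant, 1/m
    return math.exp(-2 * kappa * width_nm * 1e-9)

for width in (5.0, 2.0, 1.0):  # barrier widths in nm
    print(f"{width} nm barrier -> T = {tunneling_probability(1.0, width):.3e}")
```

Halving the barrier width multiplies the leakage probability by many orders of magnitude, which is exactly why the last few nanometers of scaling are so punishing.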
Researchers and companies are trying to come up with alternatives to avoid hitting rock bottom in the realm of computer architecture. A former head of Intel’s manufacturing group suggested that the company would adopt new materials and alter the structure of the transistor to gain added control over the current flowing through it. Meanwhile, with deep learning progressing and ever more complex algorithms being developed, there is growing demand for chips that can perform heavy matrix computations efficiently.
The following new areas are being explored by researchers around the globe:
- Quantum Computing: It harnesses the ability of a subatomic particle to exist in more than one state at any given time. Unlike conventional bits, which store either a 0 or a 1, quantum bits (qubits) can exist in superpositions of both states, so a quantum computer can represent and process far more information than a conventional one while using less energy.
- Carbon Nanotubes: These are microscopic sheets of carbon rolled into cylinders, and they are being aggressively explored by IBM. In a paper published in the journal Science, IBM researchers described a new way of building transistors using carbon nanotubes that could be significantly smaller than the silicon transistors we have today.
- Parallel Architectures: This approach has been widely used in the past decade to circumvent the performance barrier. Highly parallel architectures such as GPUs are being developed to carry out many operations simultaneously. Unlike a Von Neumann architecture that executes instructions serially on a single core, GPUs run many concurrent threads across multiple cores to speed up processing considerably. Focus is also shifting toward energy-efficient FPGAs as a replacement for GPUs in some workloads.
- Neuromorphic Hardware: This encompasses any electrical device that mimics the natural biological structures of our nervous system. The goal is to impart cognitive abilities to a machine by implementing neurons in silicon. Thanks to its much better energy efficiency and inherent parallelism, it is being considered as an alternative to conventional architectures and power-hungry GPUs.
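The essence of the parallel-architecture approach above can be sketched in a few lines: a matrix-vector product decomposes into row computations that share no state, which is exactly the kind of data parallelism a GPU exploits by assigning one thread per output element. This is a conceptual sketch in plain Python (a thread pool standing in for GPU cores), not a real GPU kernel.

```python
# Conceptual sketch of data parallelism: each output element of a
# matrix-vector product depends on only one row, so all rows can be
# computed concurrently with no synchronization between them.
from concurrent.futures import ThreadPoolExecutor

def row_dot(row, vec):
    # One independent unit of work; on a GPU this would be one thread.
    return sum(a * b for a, b in zip(row, vec))

def parallel_matvec(matrix, vec, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: row_dot(row, vec), matrix))

A = [[1, 2], [3, 4], [5, 6]]
x = [10, 1]
print(parallel_matvec(A, x))  # -> [12, 34, 56]
```

Python threads will not actually speed this up because of the interpreter lock; the point is only the decomposition into independent work items, which is what dedicated parallel hardware accelerates.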
Among the areas mentioned above, quantum computing and carbon nanotubes are still in the early stages of development: they are not yet viable enough to replace silicon completely, let alone reach commercial production. GPUs have been in use for a long time now, but they consume a lot of energy. Neuromorphic hardware is at an intermediate stage of development, but it offers a highly promising answer to the coming performance crisis.
The human brain is arguably the most energy-efficient, lowest-latency computing system on Earth. It processes complex information faster, and far more efficiently, than any computer, largely thanks to an architecture of densely packed neurons transmitting signals through their synapses. Neuromorphic engineering aims to realize this architecture and performance in silicon. The term was coined by Carver Mead in the late 1980s to describe systems containing analog/digital circuits that mimic the neuro-biological elements of the nervous system, and many research facilities have since been investing in chips that do exactly that.
IBM’s neuromorphic chip, TrueNorth, has 4,096 cores, each containing 256 neurons, with each neuron having 256 synapses to communicate with the others. Because the architecture is so close to the brain’s, it is very energy efficient. Similarly, Intel’s Loihi boasts 128 cores, each with 1,024 neurons. The APT group at the University of Manchester recently unveiled SpiNNaker, the world’s largest neuromorphic supercomputer, built entirely from neuromorphic cores. BrainChip is another company developing similar chips for applications in data centers, cyber security, and fin-tech. The Human Brain Project is a massive EU-funded project investigating how to build new algorithms and computers that mimic the way the brain works.
All of these systems have one thing in common — all are highly energy efficient.
TrueNorth draws 1/10,000th of the power density of a conventional Von Neumann processor.
This gigantic difference comes from the asynchronous nature of the on-chip processing, much like in a human brain. Not every neuron needs to be updated at every time step; only the ones currently active consume power. This is called event-driven processing, and it is the single most important property that makes neuromorphic systems a viable alternative to conventional architectures.
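The event-driven idea can be shown in a toy sketch: work is done only for neurons that actually receive a spike, whereas a clocked simulator would touch all of them every tick. The neuron count, threshold, and weights below are arbitrary illustrative values.

```python
# Toy sketch of event-driven processing: membrane state is updated only for
# neurons that receive a spike this step, instead of sweeping every neuron
# at every clock tick.
potentials = {n: 0.0 for n in range(8)}  # membrane potentials of 8 toy neurons
THRESHOLD = 1.0

def deliver(events):
    """events: list of (neuron_id, weight) spikes arriving this step."""
    fired = []
    for nid, w in events:               # only touched neurons consume work/power
        potentials[nid] += w
        if potentials[nid] >= THRESHOLD:
            fired.append(nid)
            potentials[nid] = 0.0       # reset after firing
    return fired

print(deliver([(2, 0.6), (5, 1.2)]))    # -> [5]; neurons 0, 1, 3, ... untouched
print(deliver([(2, 0.5)]))              # -> [2]; 0.6 + 0.5 crosses threshold
```

In hardware the same principle means silent neurons draw essentially no dynamic power, which is where the orders-of-magnitude efficiency gap comes from.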
Spiking Neural Network
The dense network of neurons interconnected by synapses on a neuromorphic chip is known as a Spiking Neural Network (SNN). Neurons communicate with each other by transmitting impulses through synapses. The chips mentioned above realize this network in hardware, but there is also a strong emphasis on simulating it in software, both to evaluate performance and to tackle pattern recognition and other deep learning applications.
Spiking Neural Networks encode information in the temporal domain in the form of spike trains; that is, the timing between consecutive spikes carries the information in the network. The behavior of the network’s most basic element, the neuron, is governed by a differential equation, and its input arrives as discrete spikes in time rather than continuous values. Because of these idiosyncrasies, the methodology used to train an SNN also differs from that of existing artificial neural networks: instead of gradient descent, a more biologically plausible form of Hebbian learning is used, known as Spike-Timing-Dependent Plasticity (STDP).
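As a preview of what later posts in this series will build up properly, here is a minimal sketch of a leaky integrate-and-fire (LIF) neuron driven by discrete input spikes, together with an illustrative pair-based STDP weight update. All constants (time constant, threshold, learning rates) are arbitrary choices for the example, not parameters from any particular chip or paper.

```python
# Minimal leaky integrate-and-fire (LIF) neuron, Euler-integrated, plus a
# pair-based STDP rule. Constants are illustrative, not from any real system.
import math

TAU_M, V_REST, V_TH = 10.0, 0.0, 1.0   # membrane time constant (ms), rest, threshold

def simulate_lif(input_spikes, weight, t_max=50, dt=1.0):
    """Integrate dv/dt = -(v - V_REST)/TAU_M + weight * input(t)."""
    v, out_spikes = V_REST, []
    for t in range(int(t_max / dt)):
        i_in = weight if t in input_spikes else 0.0  # discrete spikes, not continuous input
        v += dt * (-(v - V_REST) / TAU_M + i_in)     # leak toward rest + input drive
        if v >= V_TH:
            out_spikes.append(t)
            v = V_REST                               # reset after emitting a spike
    return out_spikes

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Pair-based STDP: potentiate when pre fires before post, else depress."""
    dt = t_post - t_pre
    if dt >= 0:
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)

print(simulate_lif(input_spikes={5, 6, 7}, weight=0.5))  # a burst pushes v over threshold
print(stdp_dw(10, 15) > 0)   # causal pair -> potentiation (True)
print(stdp_dw(15, 10) < 0)   # anti-causal pair -> depression (True)
```

Note how the spike timing, not the spike amplitude, carries the information: the same three input spikes spread far apart in time would leak away without ever crossing the threshold.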
All of this may seem esoteric at first, and it takes time to get the hang of an SNN’s network dynamics. Since the domain is still at an early stage, the available documentation is also not thorough.
This series of blogs aims to develop an understanding of SNNs from scratch, with each element of the network explained in depth and implemented in Python. Existing Python libraries for SNNs will also be discussed.
Follow Computational Neuroscience to gain more insight into this interesting domain. Also, feel free to contribute to this publication; collaborations are welcome.