NeuronLink: An Efficient Chip-to-Chip Interconnect for Large-Scale Neural Network Accelerators

Large-scale neural networks are deployed on multi-core processors organized via networks-on-chip (NoCs) to process their many neurons. These neural network chips are connected through chip-to-chip interconnection networks to increase the efficiency of neuron processing. In this research, the authors propose inter-chip and intra-chip communication methods for neural network processors. They implement four connected NoC-based deep neural network chips on four FPGAs to test the proposed techniques, and conclude that the proposed interconnection networks can manage the data traffic inside deep neural networks very efficiently.

Almost every industry is adopting highly efficient deep neural networks (DNNs) to make its products intelligent. DNNs have many applications involving machine intelligence, such as image recognition, object detection, and speech recognition. They stack many layers of neurons, which demands very high computational power, and deploying them on hardware is even more challenging; considerable optimization is required to run them. GPUs, CPUs, ASICs, and FPGAs can accelerate DNN processing, but these platforms can be energy-inefficient because they offer high speed at the cost of heavy resource usage. Networks-on-chip (NoCs) offer an energy-efficient alternative: a NoC processes multiple neurons in parallel, and data can be exchanged efficiently from one neuron to another.

The NoC-based design paradigm provides:

1. Energy efficiency — Reduces off-chip memory accesses.

2. Scalability — Computation resources are independent of the data flow.

3. Flexibility — Handles different data flows through flexible interconnection.

In this blog, a lightweight and efficient chip-to-chip interconnection scheme, together with the virtual-channel router optimization methods used for on-chip interconnection in NoCs, is explained.


NeuronLink:

NeuronLink is a chip-to-chip interconnect paradigm that includes both inter-chip and intra-chip connections. The architecture of NeuronLink is presented in Fig. 1. It includes a physical layer, a data link layer, and a transaction layer implemented by the NoCs.

Fig. 1 Architecture of NeuronLink

First, packets are received by the data link layer from the transaction layer. The header of a packet carries its priority, multicast type, and destination address; the body flits are stored in virtual channels (VCs). The credit management (CRM) element then selects which VC to send for further processing, and the packet arrives at the physical layer, which receives both data and commands. To handle high-priority commands, an asynchronous handshake approach is used. The encoder then adds the packet header and a check header to the data. At the receiver side, the physical medium attachment layer processes the data, and an elastic buffer can be used to synchronize the data from the recovered clock. A command is analyzed by the CRM unit in the data link layer, and the data is sent to VCs according to address and priority.
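The header fields and the credit-managed VC selection described above can be sketched in Python. This is a toy model, not the paper's actual hardware: the VC-assignment rule, the credit counts, and the field encodings are illustrative assumptions.

```python
from dataclasses import dataclass
from collections import deque

@dataclass
class Packet:
    priority: int        # higher value = more urgent
    multicast_type: int  # 0 = unicast, 1 = multicast (assumed encoding)
    dest: int            # destination node address
    body_flits: list     # payload flits

class DataLinkLayer:
    def __init__(self, num_vcs=4, credits_per_vc=8):
        self.vcs = [deque() for _ in range(num_vcs)]
        # Credits model free buffer slots per VC (toy back-pressure).
        self.credits = [credits_per_vc] * num_vcs

    def accept(self, pkt: Packet) -> bool:
        """Store a packet from the transaction layer into a VC."""
        vc = pkt.priority % len(self.vcs)  # toy VC-assignment rule
        if self.credits[vc] > 0:
            self.vcs[vc].append(pkt)
            self.credits[vc] -= 1
            return True
        return False  # no credit left: caller must retry later

    def select(self):
        """CRM step: hand the physical layer the packet at the head of
        whichever non-empty VC holds the highest-priority head packet."""
        best = None
        for vc in self.vcs:
            if vc and (best is None or vc[0].priority > best[0].priority):
                best = vc
        return best.popleft() if best else None
```

A caller would `accept()` packets as they arrive from the transaction layer and repeatedly `select()` to drain them toward the physical layer in priority order.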

Implementation of NeuronLink in DNN Accelerator:

Fig. 2 shows the general architecture of the DNN accelerator. It contains four chips, each consisting of 16 processing nodes connected through the NeuronLink interconnect scheme. Each chip has a PCIe interface for high-bandwidth off-chip data transmission. Every processing node contains:

1. eDRAM buffers — Store input features.

2. Four digital processing units — Perform shift-and-add, pooling, and activation operations.

3. Eight analog processing units — Perform in-situ MAC operations.

Each analog processing unit (ALU) consists of crossbar arrays, ADCs, and DACs.
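The in-situ MAC performed inside such an analog unit amounts to a vector-matrix product computed by physics: DACs drive input voltages onto the crossbar rows, weights are stored as cell conductances, and each column current sums the voltage-conductance products (Ohm's and Kirchhoff's laws) before an ADC digitizes it. A minimal, idealized numerical sketch (ignoring noise, quantization, and bit-slicing):

```python
def crossbar_mac(inputs, conductances):
    """Idealized in-situ MAC on a resistive crossbar.

    inputs: row voltages (after the DAC), length R.
    conductances: R x C matrix of cell conductances (stored weights).
    Returns the C column currents (what the ADCs would digitize):
        I_c = sum_r V_r * G[r][c]
    i.e. Ohm's law per cell, Kirchhoff summation per column.
    """
    cols = len(conductances[0])
    return [sum(v * g_row[c] for v, g_row in zip(inputs, conductances))
            for c in range(cols)]
```

Each column thus yields one dot product per read cycle, which is why crossbars are attractive for the dense MAC workloads that dominate DNN layers.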

Fig. 2 Large Scale DNN Accelerator. (a) General Architecture. (b) Mapping inside the NoC based Chip. (c) The architecture of a single processing node. (d) The architecture of ALU

To accelerate the ResNet18 model, it is mapped as shown in Fig. 3. Other models can be mapped similarly to accelerate them with the proposed DNN accelerator.

Fig. 3 Mapping of the ResNet18 model

The circular boxes represent routers, whereas the square boxes represent processing nodes. The number inside each box is the layer number in the ResNet18 model. Arrows show the direction of data movement, and the green circle marks the transfer of data from the local processing node to DRAM.
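As a rough illustration of the mapping idea, the sketch below assigns consecutive layers to processing nodes across the four 16-node chips. It is a deliberate simplification: the real mapping in Fig. 3 also accounts for residual (shortcut) connections and each layer's crossbar demand, which this toy assignment ignores.

```python
def map_layers_to_nodes(num_layers, chips=4, nodes_per_chip=16):
    """Assign layer i to node i (in order) across the accelerator.

    Returns {layer_index: (chip_index, node_index_within_chip)}.
    Layers wrap around if there are more layers than nodes.
    """
    total_nodes = chips * nodes_per_chip
    mapping = {}
    for layer in range(num_layers):
        node = layer % total_nodes
        chip, local = divmod(node, nodes_per_chip)
        mapping[layer] = (chip, local)
    return mapping
```

With 18 layers and 64 nodes, the first 16 layers land on chip 0 and the rest spill onto chip 1, so only the final residual stages generate inter-chip traffic in this simplified view.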

The proposed interconnect scheme, NeuronLink, has low power consumption and low circuit complexity compared with other interconnects. Its implementation cost is also low, and it simplifies the data flow. NeuronLink-based DNN accelerators achieve better power and area efficiency than previous NoC-based DNN accelerators such as Eyeriss-v2 and DaDianNao.


References:

1. S. Xiao et al., “NeuronLink: An Efficient Chip-to-Chip Interconnect for Large-Scale Neural Network Accelerators,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 9, pp. 1966–1978, Sept. 2020.

2. K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in Proc. Eur. Conf. Comput. Vis. (ECCV), Oct. 2016, pp. 630–645.

3. M. F. Reza and P. Ampadu, “Energy-efficient and high-performance NoC architecture and mapping solution for deep neural networks,” in Proc. 13th IEEE/ACM Int. Symp. Netw.-Chip, Oct. 2019, pp. 1–8.



