Reimagining Machine Learning in Hardware

Motivation

Machine Learning is a computational technique that allows machines to solve specific tasks without being explicitly programmed. In the recent past, it has given society practical image and speech recognition, self-driving cars, efficient web search, health predictions, and more. Typically, these algorithms are implemented and deployed in software. However, with the advent of a plethora of smart devices, there is a pressing need to emulate such algorithms in hardware. A major challenge in matching the performance of a software implementation is that statistical variations in VLSI devices diminish the accuracy of the weights. The first algorithm we developed to combat this issue is the Liquid State Machine with Dendritically Enhanced Readout (we call our baby LSM-DER).

Architecture

LSM-DER is composed of an input layer (spikes or currents), a large and sparsely interconnected pool of spiking neurons termed the liquid, and a memoryless readout. The synapses of the liquid carry random weights, while those of the readout are binary. VLSI mismatch, a curse for other systems, can be exploited here to generate the random weights. Moreover, the binary weights of the readout are more resistant to noise and mismatch than their high-resolution counterparts. Hence, LSM-DER is a hardware-friendly architecture.

Architecture of Liquid State Machine with Dendritically Enhanced Readout (LSM-DER)
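
To make the dataflow concrete, here is a minimal rate-based sketch of the architecture in Python. The liquid size L, branch count M, synapses per branch K, the tanh liquid dynamics, and the squared dendritic nonlinearity are all illustrative assumptions, not the exact settings of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

L, M, K = 100, 10, 5   # liquid neurons, dendritic branches, synapses/branch (assumed)

# Sparse random recurrent weights for the liquid: in hardware, VLSI
# mismatch itself could supply this randomness.
p_connect = 0.1
W_liq = rng.normal(0.0, 1.0, (L, L)) * (rng.random((L, L)) < p_connect)

def liquid_state(x, steps=20, leak=0.9):
    """Crude rate-based stand-in for the spiking liquid: a leaky
    recurrent network driven by input x; returns its final state."""
    s = np.zeros(L)
    for _ in range(steps):
        s = leak * s + np.tanh(W_liq @ s + x)
    return s

# Dendritically enhanced readout: every synapse has weight 1 (binary),
# so the readout is defined purely by WHICH afferent feeds each synapse.
conn = rng.integers(0, L, (M, K))      # afferent index per synapse

def readout(state, conn):
    z = state[conn].sum(axis=1)        # per-branch sum of its K inputs
    return (z ** 2).sum()              # dendritic nonlinearity + somatic sum

x = rng.random(L)                      # toy input drive
print(readout(liquid_state(x), conn))
```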

Learning Rule

The synapses in the liquid have random weights and are typically not trained. The readout, on the other hand, is trained in a task-specific manner. Since the readout uses binary synapses, learning happens through the formation and elimination of synapses. Hence, learning involves network rewiring (NRW) of the readout network, similar to the structural plasticity observed in biological neural networks. The key idea of the learning algorithm is to remove the synapse that contributes most to the classification error and replace it with a new synapse formed from a different afferent line.

A toy animation depicting structural plasticity, a connection-based learning rule
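
Below is a minimal, self-contained sketch of this rewiring loop, using the same toy readout as in the sketch above: a greedy search that tries a few random replacement afferents for each synapse and keeps the single swap that lowers the error most. The hidden target, the median-threshold classifier, and all sizes are illustrative stand-ins for the fitness function actually used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
L, M, K = 100, 10, 5                              # sizes assumed, as before

X = rng.random((200, L))                          # toy "liquid states"
y = np.where(X[:, :3].sum(axis=1) > 1.5, 1, -1)   # hidden toy target

def error(conn, X, y):
    z = X[:, conn].sum(axis=2)                    # branch sums: (samples, M)
    o = (z ** 2).sum(axis=1)                      # readout output
    pred = np.where(o > np.median(o), 1, -1)      # toy threshold classifier
    return (pred != y).mean()

def rewire_step(conn, n_cand=8):
    """One NRW step: try replacing each synapse's afferent line with a
    few random candidates; keep the swap that reduces error the most."""
    best_err, best = error(conn, X, y), None
    for b in range(M):
        for k in range(K):
            for a in rng.integers(0, L, n_cand):
                trial = conn.copy()
                trial[b, k] = a
                e = error(trial, X, y)
                if e < best_err:
                    best_err, best = e, trial
    return conn if best is None else best

conn = rng.integers(0, L, (M, K))
for _ in range(20):
    conn = rewire_step(conn)
print("training error:", error(conn, X, y))
```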

Task I: Classification

Machine Learning algorithms are usually employed to solve two types of tasks: classification and regression. Hence, we evaluate our algorithm on one of each type. The first benchmark we consider is the Spike Train Classification problem. Two Poisson spike trains with a mean frequency of 20 Hz and a length of 0.5 s serve as the templates for classes 1 and 2. These spike trains are used as input to the LSM, and the readout is trained to identify each class. Next, a jittered version of each template is generated by shifting each spike within the template by a random amount, and the task is to correctly identify the class from which the jittered train was drawn.

Task I: Classification of jittered spike trains
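
A minimal sketch of the data generation for this task: homogeneous Poisson templates at 20 Hz over 0.5 s, plus Gaussian jitter. The 4 ms jitter width is an assumed value for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
RATE, T = 20.0, 0.5                  # 20 Hz mean rate, 0.5 s (from the post)

def poisson_spike_train(rate, duration):
    """Spike times of a homogeneous Poisson process, via exponential ISIs."""
    t, spikes = 0.0, []
    while True:
        t += rng.exponential(1.0 / rate)
        if t > duration:
            return np.array(spikes)
        spikes.append(t)

def jitter(spikes, sigma=0.004):
    """Shift each spike by Gaussian noise (sigma = 4 ms, assumed)."""
    return np.sort(np.clip(spikes + rng.normal(0.0, sigma, spikes.size), 0.0, T))

template = {1: poisson_spike_train(RATE, T),   # class-1 template
            2: poisson_spike_train(RATE, T)}   # class-2 template
trial = jitter(template[1])                    # a noisy class-1 example
```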

Task II: Regression

The next task is a regression problem. Four Poisson spike trains, whose firing rates are modulated by a randomly chosen function r(t) = A + B sin(2πft + α) lying in the range (0, 1), are injected into the liquid. At any time t, the job of the network is to output the normalized sum of the input rates averaged over the last 30 ms.
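
A sketch of the input and target construction, with assumed bin size, frequency range, and peak firing rate (the post only pins down the form of r(t) and the 30 ms window):

```python
import numpy as np

rng = np.random.default_rng(7)
DT, T = 0.001, 1.0                    # 1 ms bins over 1 s (assumed)
t = np.arange(0.0, T, DT)

def random_rate():
    """Random r(t) = A + B*sin(2*pi*f*t + alpha), kept inside (0, 1)."""
    f, alpha = rng.uniform(0.5, 5.0), rng.uniform(0.0, 2 * np.pi)
    B = rng.uniform(0.1, 0.4)
    A = rng.uniform(B, 1.0 - B)       # guarantees A ± B stays in (0, 1)
    return A + B * np.sin(2 * np.pi * f * t + alpha)

rates = np.stack([random_rate() for _ in range(4)])    # four input channels
peak = 100.0                                           # Hz at r = 1 (assumed)
spikes = rng.random(rates.shape) < rates * peak * DT   # binned Poisson spikes

# Target: normalized sum of input rates, averaged over the last 30 ms
# (the ramp-up during the first 30 ms is left as-is in this sketch).
win = int(0.030 / DT)
target = np.convolve(rates.sum(axis=0) / 4.0, np.ones(win) / win)[: t.size]
```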

The performance of LSM-DER on both tasks is compared with a state-of-the-art LSM variant, the Liquid State Machine with Parallel Perceptron Readout (LSM-PPR).

Task II: Approximation of spike rates

Performance

An important parameter in LSM-PPR is the number of readout neurons, denoted by n. The number of synapses in the readout layer is L × n; hence, larger values of n require more synaptic resources. We plot the variation in error for both tasks as n is increased from 1 to 60. The training and testing error of LSM-DER is plotted as a constant line and is always lower than that obtained by LSM-PPR. Note that the synaptic resource consumed by LSM-DER is the same as that for n = 1, i.e., a single perceptron. Hence, we can conclude that, at equal resources (n = 1), LSM-DER attains 3.3 and 2.4 times lower error than LSM-PPR with high-resolution weights on Tasks I and II respectively. Moreover, LSM-PPR requires 40-60 times more synapses to reach error levels comparable to LSM-DER.
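
The resource comparison reduces to simple arithmetic; a quick sketch, taking L = 100 liquid neurons purely for illustration:

```python
L = 100                          # liquid size (illustrative value)

def ppr_synapses(n):
    return L * n                 # LSM-PPR readout: L synapses per perceptron

der_synapses = ppr_synapses(1)   # LSM-DER uses the single-perceptron budget

# LSM-PPR needs n around 40-60 to match LSM-DER's error, so:
for n in (40, 60):
    print(f"n = {n}: {ppr_synapses(n) // der_synapses}x the synapses of LSM-DER")
```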

Performance of LSM-DER compared to the state-of-the-art algorithm

Mismatch resilience

To analyze the stability of the algorithms, VLSI statistical variations are incorporated during the testing phase of the simulation. We show the performance of both LSM-DER and LSM-PPR when the non-idealities are included for Task I. In the figure, the bars corresponding to τs, cni, and I0 denote the performance degradation when statistical variations of the synaptic time constant, the dendritic nonlinearity, and the synaptic kernel amplitude are included individually. Finally, to imitate the true scenario, we consider the simultaneous inclusion of all the non-idealities, marked by (…). The figure shows that when all the variations are included, the MAE of LSM-DER and LSM-PPR increases by 0.0233 and 0.0470 respectively. We conclude that modifying the connections of binary synapses in LSM-DER results in more robust VLSI implementations than adapting high-resolution weights in LSM-PPR.
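
A sketch of how such test-time variation can be injected, assuming multiplicative lognormal mismatch with a 10% spread on each parameter (the nominal values and spread are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)

# Nominal circuit parameters (illustrative values only).
nominal = {"tau_s": 8e-3, "c_ni": 0.5, "I0": 1.0}

def with_mismatch(params, cv=0.10):
    """Return a test-time copy of params, each scaled by a unit-mean
    lognormal factor with coefficient of variation cv (assumed 10%)."""
    sigma = np.sqrt(np.log(1.0 + cv ** 2))
    return {k: v * rng.lognormal(-0.5 * sigma ** 2, sigma)
            for k, v in params.items()}

# Vary one parameter at a time, then all simultaneously, as in the figure.
for name in nominal:
    one = dict(nominal)
    one[name] = with_mismatch({name: nominal[name]})[name]
    print(name, one)
print("all varied:", with_mismatch(nominal))
```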

LSM-DER is pretty resilient to mismatch

Check out our papers to learn more

1. S. Roy, A. Banerjee and A. Basu, “Liquid State Machine with Dendritically Enhanced Readout for Low-Power, Neuromorphic VLSI Implementations,” IEEE Transactions on Biomedical Circuits and Systems, vol. 8, pp. 681–695, Oct. 2014. [pdf]

2. S. Roy, A. Basu and S. Hussain, “Hardware efficient, neuromorphic dendritically enhanced readout for liquid state machines,” in Proceedings of the 2013 IEEE Biomedical Circuits and Systems Conference (BioCAS), Rotterdam, The Netherlands, Oct. 2013, pp. 302–305. [pdf]

Also, check out my website for more Machine Learning stuff.

Disclaimer: Views expressed in this post are my own personal perspectives, not those of my employer.
