# Notes on “Supervised learning in spiking neural networks with FORCE training”

Paper available here.

# Abstract

This paper demonstrates the applicability of the FORCE supervised learning method (not to be confused with the REINFORCE algorithm from the reinforcement learning literature) to *spiking neural networks* (SNNs). FORCE is used to train SNNs to mimic dynamical systems, classify inputs, and store discrete sequences, and to model two biological circuits. FORCE-trained SNNs reproduce complex behaviors comparable to those of the circuits that inspired them and yield information not readily available from pharmacological manipulations / spike-timing statistics.

# Introduction

Claim: humans don’t use control theory / machine learning (ML) / etc. to solve problems; they use SNNs. (This is self-evident, but that’s not to say concepts from control theory / ML aren’t implemented in biological SNNs.)

A broad class of methods has been developed for enforcing desired behavior / dynamics onto SNNs. These top-down techniques begin with an intended task for an SNN and determine what the synaptic strengths should be to achieve it. Dominant approaches include the FORCE method, spike-based / predictive coding networks, and the Neural Engineering Framework (NEF).

While the NEF and spike-based coding approaches succeed in creating functional SNNs, they aren’t agnostic toward the underlying network of neurons. Additionally, to apply either approach, the task must be specified as closed-form differential equations. Despite these constraints, both methods have led to a resurgence in top-down analysis of network function.

FORCE training is agnostic toward both the SNN and the task the network is trained to solve. FORCE training takes any high-dimensional dynamical system (a *reservoir*) and uses its dynamics to compute. The target behavior doesn’t need to be specified in differential equations; all that’s needed is a supervising error signal. FORCE is therefore applicable to many more types of systems and tasks. Unfortunately, FORCE training has only been implemented in networks of firing rate-based neurons, with little work done on SNN implementation.

It is shown that FORCE training can be applied to SNNs, and is robust against different implementations, neuron models, and supervising signals.

# Results

## FORCE training weight matrices

The authors explore applying FORCE to train SNNs to perform arbitrary tasks. The synaptic weight matrix in these networks is a sum of a set of static weights (set to initialize the network into chaotic spiking) and a set of learned weights (determined online using the Recursive Least Squares (RLS) supervised learning method). RLS minimizes the squared error between the network dynamics and the target dynamics; training is considered successful if the network dynamics mimic the target dynamics post-training.
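The RLS update at the core of FORCE can be sketched as follows. This is a minimal illustration, not the paper's code; the variable names and the regularization constant are assumptions:

```python
import numpy as np

def rls_step(phi, P, r, err, lam=1.0):
    """One recursive-least-squares update of the learned decoder.

    phi : (N, k) learned output weights (decoder)
    P   : (N, N) running estimate of the inverse correlation matrix
    r   : (N,)  filtered spike trains / firing rates at this step
    err : (k,)  network output minus target at this step
    Returns the updated (phi, P).
    """
    Pr = P @ r
    c = 1.0 / (lam + r @ Pr)           # scalar gain for this sample
    P = (P - c * np.outer(Pr, Pr)) / lam
    phi = phi - c * np.outer(Pr, err)  # shrink the instantaneous error
    return phi, P
```

`P` is typically initialized to `I / alpha`, where `alpha` acts as a combined learning-rate / regularization parameter.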

## FORCE trained rate networks learn using chaos

To demonstrate the method and compare with SNN implementations, FORCE is applied to a network of rate-based neurons (rate equations). The static weight matrix initializes high-dimensional chaotic dynamics, which form a suitable reservoir that allows the network to learn from a target signal quickly. RLS is activated after a short initialization period, and, post-training, the network is able to reproduce a 5 Hz sinusoidal oscillator with slight frequency / amplitude error.
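A minimal sketch of the kind of chaotic rate reservoir described here (tanh rate equations with random Gaussian weights; the parameter values are illustrative, not the paper's):

```python
import numpy as np

def simulate_rate_reservoir(N=200, T=500, dt=0.05, g=1.5, seed=1):
    """Autonomous rate network: dx/dt = -x + g * W @ tanh(x).

    With random Gaussian W (variance 1/N) and gain g > 1, the network
    typically generates the high-dimensional chaotic activity that
    serves as the reservoir FORCE taps into.
    Returns the (T, N) array of state trajectories.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
    x = rng.normal(0.0, 0.5, N)
    xs = np.empty((T, N))
    for t in range(T):
        x = x + dt * (-x + g * (W @ np.tanh(x)))
        xs[t] = x
    return xs
```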

## FORCE trained spiking networks learn using chaos

FORCE training is implemented in SNNs with different spiking neuron models (namely, the theta, leaky integrate-and-fire (LIF), and Izhikevich models) to compare the robustness of the method across models.
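Of the three, the Izhikevich model is the most involved; a vectorized Euler step might look like the following (standard regular-spiking parameters; the step size and reset values are textbook defaults, not necessarily the paper's):

```python
import numpy as np

def izhikevich_step(v, u, I, dt=0.5, a=0.02, b=0.2, c=-65.0, d=8.0):
    """One Euler step of the Izhikevich model (regular-spiking params).

    dv/dt = 0.04 v^2 + 5 v + 140 - u + I
    du/dt = a (b v - u)
    On a spike (v >= 30 mV): v <- c, u <- u + d.
    Operates on numpy arrays, so a whole population steps at once.
    Returns (v, u, spiked), where spiked is a boolean array.
    """
    v = v + dt * (0.04 * v**2 + 5.0 * v + 140.0 - u + I)
    u = u + dt * a * (b * v - u)
    spiked = v >= 30.0
    v = np.where(spiked, c, v)
    u = np.where(spiked, u + d, u)
    return v, u, spiked
```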

First, to demonstrate the chaotic nature of the networks, single spikes are deleted. After deletion, spike trains immediately diverge, indicating chaotic dynamics. All neuron models exhibit bimodal *interspike-interval (ISI) distributions* indicative of possible transitions to rate chaos for different proportions of static and learned weights.

FORCE training in rate-based networks works by quickly stabilizing the firing rates; subsequent weight changes then maintain (“stabilize”) the learned dynamics. For comparably fast learning, RLS has to be applied on a faster time scale in SNNs than in rate-based networks. The learning rate was made fast enough to stabilize the spiking basis during the first presentation of the supervising signal.

With these modifications, FORCE is successfully applied to train SNNs to mimic various oscillator types (sinusoids at different frequencies, sawtooth waves, the Van der Pol oscillator, and oscillators with noisy teaching signals). All neuron types were able to learn all oscillator types (with minor modifications to the theta neuron model). FORCE training was robust to initial chaotic network states. Oscillators with higher (lower) frequencies are learned over larger parameter regions in networks with faster (slower) synaptic decay time constants. For SNNs, the authors observe that, in some cases, systems with dominant eigenvalues outperform systems without, and in other cases, the opposite is true.

For the considered target dynamics, the Izhikevich model had the greatest accuracy and fastest training times, apparently due to its spike frequency adaptation, which operates on a long time scale. The long time scale affords the SNN reservoir a greater memory capacity, allowing the learning of longer signals.

Since oscillators are simple dynamical systems, the authors considered two more complicated target dynamics: a low-dimensional chaotic system, and the statistical classification of inputs applied to a network of neurons. An SNN of theta neurons was able to reproduce the butterfly attractor and Lorenz-like trajectories post-training. Since the supervising dynamics were more complex, training took longer and required more neurons (5,000 neurons and 45 seconds of training). The SNN achieved comparable performance to a network of rate-based neurons, and both networks were able to reproduce the stereotypical Lorenz tent map. However, in general, the rate-based network performed somewhat better. The authors also show that populations of neurons can be FORCE-trained to classify inputs, similar to a feed-forward network.
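A Lorenz supervisor for the chaotic-system task can be generated by direct integration; a simple Euler sketch (the step size and initial condition are illustrative):

```python
import numpy as np

def lorenz_supervisor(T=10000, dt=0.001, sigma=10.0, rho=28.0,
                      beta=8.0 / 3.0):
    """Integrate the Lorenz system to use as a teaching signal.

    Returns an array of shape (T, 3): the 3-dimensional supervisor
    (x, y, z) at each time step.
    """
    xyz = np.array([1.0, 1.0, 1.0])
    out = np.empty((T, 3))
    for t in range(T):
        x, y, z = xyz
        xyz = xyz + dt * np.array([sigma * (y - x),
                                   x * (rho - z) - y,
                                   x * y - beta * z])
        out[t] = xyz
    return out
```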

## FORCE training spiking networks to reproduce complex signals

The authors wondered whether FORCE-trained networks could encode signals similar to those from naturally-occurring spatio-temporal sequences (e.g., songbird singing). This is formulated as very long oscillations that are repeatedly presented to the network.

The first pattern considered is a sequence of pulses in a 5-dimensional supervising signal, corresponding to the notes in the first four bars of Ode to Joy. A network of Izhikevich neurons was able to reproduce the bars. Networks of all three neuron types were able to reproduce the bars when using larger synaptic decay time constants.
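A sketch of how such a pulse supervisor could be constructed. The melody transcription, pulse shape, and durations below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def pulse_supervisor(note_sequence, pitches, pulse_len=50, gap=10):
    """Build a k-dimensional pulse supervisor from a note sequence.

    Each pitch gets its own output dimension; while a note sounds, its
    dimension carries a pulse (here a half-sine) and the rest stay 0.
    """
    idx = {p: i for i, p in enumerate(pitches)}
    step = pulse_len + gap
    sup = np.zeros((len(note_sequence) * step, len(pitches)))
    pulse = np.sin(np.linspace(0.0, np.pi, pulse_len))
    for n, note in enumerate(note_sequence):
        t0 = n * step
        sup[t0:t0 + pulse_len, idx[note]] = pulse
    return sup

# One common transcription of the Ode to Joy opening, rhythm simplified:
melody = list("EEFGGFEDCCDEEDD")
sup = pulse_supervisor(melody, pitches="CDEFG")  # 5-dimensional supervisor
```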

The networks displayed some *stereotypical* (as opposed to *random*) errors, occurring in places where sub-sequences of notes were not unique.

## FORCE trained networks can reproduce songbird singing

The authors construct a circuit that reproduces a birdsong (in the form of a spectrogram) recorded from an adult zebra finch. The learned singing behavior of these birds is attributed to two primary nuclei: HVC and the robust nucleus of the arcopallium (RA). The RA-projecting neurons in HVC form a chain of spiking activity, with each neuron firing only once, at a specific time in the song. This chain of firing is transmitted to the RA circuit, wherein each neuron bursts multiple times during the song.

The authors focus on a single network of RA neurons, and model the HVC chain of inputs as a series of successive pulses (not FORCE-trained for simplicity). The pulses are fed into a population of Izhikevich neurons that are successfully FORCE-trained to reproduce the spectrogram of the recorded birdsong. With certain parameter settings, the spiking statistics of RA neurons are reproduced both qualitatively and quantitatively.

Manipulations are made to the excitatory synapses that alter the network’s excitatory-inhibitory balance. The network is robust to down-scaling excitatory weights, still reproducing the song at a lower intensity. Upscaling excitatory weights by 15% or more drastically reduced song performance, and by 20% or more replaced the singing with high-intensity, seizure-like activity. A similar result is observed experimentally via injection of bicuculline into RA.

## High-dimensional temporal signals improve FORCE training

The authors hypothesize the performance of the songbird network was associated with the precise, clock-like inputs from the HVC, and that similar inputs could aid in encoding / replay of other types of signals. To test this, they removed the HVC input pattern and found the replay of the learned song was destroyed, similar to experimental lesioning of this area in adult canaries. These types of signals are then referred to as high-dimensional temporal signals (HDTS).

A network of Izhikevich neurons is FORCE-trained to internally generate its own HDTS while simultaneously being trained to reproduce the first bar of Ode to Joy. The network is able to learn both signals at once, with less training time and greater accuracy than without the HDTS. Another network is FORCE-trained to learn the *first four bars* in addition to its own 64-dimensional HDTS, again learning both signals without any error in the sequence of notes. Thus, internally generated HDTS can make FORCE training faster / more accurate / more robust to longer signals.
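One way to construct an HDTS of this kind, assuming (as the text suggests) that its components tile the interval with successive pulses:

```python
import numpy as np

def make_hdts(n_steps, m):
    """High-dimensional temporal signal: m components tiling the interval.

    Component i carries a single sine bump over the i-th sub-interval
    and is zero elsewhere, so at most one component is active at a time,
    acting as a clock that splits neurons into temporal assemblies.
    """
    hdts = np.zeros((n_steps, m))
    edges = np.linspace(0, n_steps, m + 1).astype(int)
    for i in range(m):
        a, b = edges[i], edges[i + 1]
        hdts[a:b, i] = np.sin(np.linspace(0.0, np.pi, b - a))
    return hdts
```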

## FORCE-trained encoding and replay of an episodic memory

The authors wanted to know whether HDTS input signals could help networks learn natural high-dimensional signals. They trained a network of Izhikevich neurons to learn a 1,920-dimensional supervisor corresponding to the pixels of an 8-second movie clip. The HDTSs were either generated by a separate network or fed directly as input into an encoding / replay network. In the former case, an HDTS can be easily learned by a network, and in the latter case, the HDTS can be freely manipulated. The HDTS could also be learned jointly with the movie scene, constituting a 1,920 + 64 dimensional supervisor.
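Assembling such a pixel supervisor, optionally joined with an HDTS, is straightforward; a sketch (the frame size below is chosen so that H × W = 1,920, matching the stated dimension, and is otherwise an assumption):

```python
import numpy as np

def movie_supervisor(frames, hdts=None):
    """Flatten movie frames into a per-step supervisor.

    frames : (T, H, W) grayscale clip -> (T, H*W) supervisor, or
    (T, H*W + m) when a (T, m) HDTS is appended so that both signals
    are learned jointly.
    """
    T = frames.shape[0]
    sup = frames.reshape(T, -1).astype(float)
    if hdts is not None:
        sup = np.concatenate([sup, hdts], axis=1)
    return sup
```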

The network was successfully trained to replay the movie clip in both cases. The HDTS inputs were necessary for both training and replay: networks could still replay individual frames from the clip without the HDTS, but the order of frames was incorrect and apparently chaotic.

Replay network performance decreased approximately linearly with the proportion of neurons removed from the network, with the amplitude of mean activity decreasing as well. The HDTS network was much more sensitive: randomly lesioning 10% of HDTS neurons stopped the network’s output.

It is speculated that compressed or reversed replay of an event might be important for memory consolidation. The authors wonder if networks trained with HDTS could replay the movie clip in accelerated time by compressing the HDTS in time post-training. The replay network is able to replay the movie in compressed time *up to a compression factor of 16x* with accuracy dropping sharply for further compression. The effect of the time compression on the mean activity was to introduce *high-frequency oscillations*, whose frequency *scaled linearly with the degree of compression*. With increasing compression, *large waves of synchronized activity* emerged in the mean population activity. Reverse replay was also successful, obtained by *reversing the order of the HDTS components*.
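These replay manipulations amount to simple transformations of the HDTS; a sketch, where integer subsampling of the time axis as the compression mechanism is my assumption:

```python
import numpy as np

def manipulate_hdts(hdts, compression=1, reverse=False):
    """Replay-time manipulation of an HDTS of shape (T, m).

    Compression subsamples the time axis by an integer factor, so the
    same sequence of components unfolds `compression`-times faster;
    reversing flips the component order, driving replay backwards.
    """
    out = hdts[::compression]
    if reverse:
        out = out[:, ::-1]
    return out
```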

Compression of task-dependent spike sequences has been observed experimentally; e.g., a sequence of neuronal cross-correlations recorded in rats during a spatial sequence task reappeared in compressed time during sleep, with compression ratios between 5.4 and 8.1. This is similar to the compression ratios achieved by the movie-replay networks without incurring significant replay error.

# Discussion

Main take-away: FORCE-training can be used to train initially chaotic SNNs to mimic the functions / dynamics of populations of neurons. This is agnostic to neuron model and supervising signal.

High-dimensional temporal signals (HDTS) are used to aid networks to learn complicated dynamics by separating neurons into time-dependent assemblies. These signals made FORCE-training faster and more accurate, and by manipulating them, replay networks could compress and even reverse replay.

The HDTS conferred a slow oscillation on the mean population activity reminiscent of slow theta oscillations in the hippocampus, which are associated with memory (and have been proposed to serve as a *clock for memory formation*).

The authors argue that they’ve shown an example of how natural stimuli (serving as proxies for memories) can be bound to underlying oscillations in a population of neurons, forcing the neurons to fire in distinct temporal assemblies. This mirrors experimental results showing that theta power is predictive of correct replay. Blocking hippocampal theta oscillations has been found to disrupt learning, similar to how blocking the HDTS prevents learning and accurate replay in networks trained with the HDTS present.

FORCE-trained networks could be used to elucidate hippocampal function, and can also be constructed to represent different components of the hippocampal circuit.

FORCE training allows one to use any sufficiently complicated dynamical system for universal computation. In SNNs, this is difficult because of the balance between the fixed, chaos-inducing weights and the learned feedback weights. If the former are too strong, the feedback weights cannot control the network’s behavior; if they are too weak, the dynamics no longer form a suitable reservoir.

At present, all top-down procedures for constructing functional SNNs (including FORCE training) need more work in order to be considered biologically plausible. However, synapse parameters should not be considered biologically implausible simply because the methods that generated them are.

FORCE-trained rate-based networks have been successful in accounting for / predicting experimental data. Therefore, FORCE-trained SNNs may be useful for generating predictions involving voltage traces, spike times, and neuronal parameters.