Reverse engineering RNN reveals line attractor dynamics — Read a Paper
Following the recent trend of not stopping at performance analysis but extending it to understand the interpretability and dynamics of deep learning models, this paper studies the dynamics of Recurrent Neural Networks (RNNs) from a dynamical systems point of view. The observations put forth in the work are:
Training reduces the dimensionality of hidden unit activity
The RNN is trained on benchmark sentiment analysis datasets, and the hidden unit activities are collected at each stage of training. At each stage, PCA (Principal Component Analysis) is applied to check how many components are required to capture the activity. As training proceeds, the number of PCA components required to explain the variance decreases. This shows that training not only implicitly imposes parsimony on the model but also manages to achieve generalization.
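A minimal sketch of this measurement, assuming the hidden states at one checkpoint have been stacked into a matrix (the array `hidden_states` is a stand-in, not the paper's data):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical collected data: hidden states of shape (num_samples, hidden_dim),
# gathered by running the RNN over the dataset at a given training checkpoint.
hidden_states = np.random.randn(5000, 128)  # stand-in for real recorded activity

pca = PCA().fit(hidden_states)
cum_var = np.cumsum(pca.explained_variance_ratio_)
# Number of components needed to explain, say, 90% of the variance;
# the paper's observation is that this count shrinks as training proceeds.
n_components = int(np.searchsorted(cum_var, 0.90) + 1)
print(n_components)
```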
State spaces for different classes physically separate
When training starts, the state vectors for examples of different classes lie very close to each other; as training proceeds, they drift further apart. This shows that the feature extraction underlying the model's ability to distinguish classes is also reflected in the geometry of the state space, and it is clearly interpretable in a low-dimensional visualization.
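As a hedged illustration of such a visualization (again with stand-in arrays), one can project the final hidden state of each example onto the top two principal components and color the points by class:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Stand-ins: final hidden states per example and their sentiment labels.
final_states = np.random.randn(1000, 128)
labels = np.random.randint(0, 2, size=1000)  # 0 = negative, 1 = positive

proj = PCA(n_components=2).fit_transform(final_states)
for cls, color in [(0, "tab:red"), (1, "tab:blue")]:
    pts = proj[labels == cls]
    plt.scatter(pts[:, 0], pts[:, 1], s=5, c=color, label=f"class {cls}")
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.legend()
plt.show()
# Early in training the two clouds overlap; later they separate cleanly.
```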
Existence of fixed points
As the trained RNN processes inputs, its state vector moves along a set of fixed points. From a dynamical systems point of view, a stable fixed point acts like a sink that attracts the system and holds it at that particular point. To kick the system out of the point, a sufficiently large force must be applied; otherwise the state either stays put, or moves away briefly and rapidly falls back. From a deep learning perspective, this force corresponds to the saliency of the input example. When a data sample is neutral or has low salience (a feeble positive or negative sentiment), the model sitting at the fixed point is unaffected. On the other hand, when a strongly positive or negative review is encountered, the model jumps out of the fixed point and settles at another one. This also accounts for the robustness of the model against noise.

Moreover, the manifold of all such fixed points is one-dimensional. That gives another super cool conclusion: the trained model not only settles into a low-dimensional state space but also forms a 1D manifold of fixed points, i.e. a line attractor.
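To make the picture concrete, here is a toy linear system (my own illustration, not the paper's trained RNN) with a line attractor: one eigenvalue of the transition matrix is exactly 1, so inputs are integrated along that direction, while perturbations in the other direction decay:

```python
import numpy as np

# Toy 2D linear dynamics h_{t+1} = A h_t + b x_t.
# A has eigenvalues {1.0, 0.5}: a line of fixed points along [1, 0] and a
# decaying transverse direction, i.e. a line attractor.
A = np.array([[1.0, 0.0],
              [0.0, 0.5]])
b = np.array([1.0, 1.0])

h = np.zeros(2)
inputs = [0.0, 0.0, 2.0, 0.0, 0.0, -1.0, 0.0]  # mostly neutral, two salient tokens
for x in inputs:
    h = A @ h + b * x
    print(h)
# Neutral inputs (x = 0) leave the position along the line unchanged, while
# salient inputs shift the state along the line, where it then stays put.
```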
Analysis of fixed points
To find approximate fixed points numerically, the paper defines a loss function q(h) = (1/N)·‖h − F(h, 0)‖², where F is the RNN state update and 0 is the zero input. The criterion simply minimizes the difference between a hidden state and the state obtained when a zero input is applied to it. When h is close to F(h, 0), the point can be treated as approximately fixed, and the function is minimized over h using standard optimization techniques. This is consistent with the definition of a fixed point: it is immune to neutral (zero) input and quickly attracts the system back. The authors also verified that the points are indeed fixed by perturbing the system with slightly positive/negative examples and checking the rate at which it falls back.
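A minimal sketch of such a fixed point finder, assuming a GRU cell stands in for the model's update function F (in the paper, the optimization runs over many candidate states sampled from real trajectories of the trained network):

```python
import torch

# Stand-in for the trained update function F(h, x); in practice this would be
# the trained model's recurrent cell, and h would be initialized from hidden
# states visited on real data rather than at random.
rnn = torch.nn.GRUCell(input_size=64, hidden_size=128)
for p in rnn.parameters():
    p.requires_grad_(False)  # freeze the network; we optimize only the states

h = torch.randn(32, 128, requires_grad=True)  # 32 candidate fixed points
x0 = torch.zeros(32, 64)                      # the neutral (zero) input

opt = torch.optim.Adam([h], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    q = ((h - rnn(x0, h)) ** 2).mean()  # q(h) = (1/N) * ||h - F(h, 0)||^2
    q.backward()
    opt.step()
# Candidates where q is ~0 are approximate fixed points of the zero-input dynamics.
```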
Stability of fixed point
The authors also performed a stability analysis to characterize the nature of each fixed point. Briefly, a first-order Taylor expansion of the hidden state update around a fixed point h*, namely F(h* + Δh, 0) ≈ h* + J·Δh with Jacobian J = ∂F/∂h evaluated at h*, linearizes the dynamics. Eigendecomposition of J then yields eigenvalues with left and right eigenvectors, which are used to check the stability of the fixed point: modes with eigenvalue magnitude below one decay back toward the point, while a mode with magnitude near one is marginally stable and traces the attractor.
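Continuing the sketch above, the Jacobian at one approximate fixed point can be obtained with autograd and then eigendecomposed (variable names carry over from the previous block and are illustrative):

```python
import torch

h_star = h[0].detach()     # one approximate fixed point from the sketch above
x_zero = torch.zeros(64)

def F(state):
    # GRUCell expects a batch dimension; add and strip it around the update.
    return rnn(x_zero.unsqueeze(0), state.unsqueeze(0)).squeeze(0)

# First-order Taylor expansion: F(h* + dh) ~ h* + J dh, with J = dF/dh at h*.
J = torch.autograd.functional.jacobian(F, h_star)  # shape (128, 128)

eigvals, right_vecs = torch.linalg.eig(J)
# Left eigenvectors are the rows of the inverse of the right-eigenvector matrix.
left_vecs = torch.linalg.inv(right_vecs)
# |eigenvalue| < 1: perturbations along that mode decay back to the fixed
# point; a mode with |eigenvalue| ~ 1 is the slow direction along the attractor.
print(eigvals.abs().sort(descending=True).values[:5])
```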
The experiments are repeated across various RNN types, architectures, and datasets to demonstrate universality. The parsimony and linear approximability of the learned dynamics of RNNs invite many parallel analogies. One interesting parallel with neuroscience is the attractor dynamics believed to be observed in the hippocampus [Ref 1] (known to play a vital role in memory and learning), which is thought to use simple low-dimensional grid cell codes to form complex higher-dimensional representations [Ref 2].
Hope you enjoyed reading :)
References
- Tom J. Wills, Colin Lever, Francesca Cacucci, Neil Burgess, John O’Keefe (2005). Attractor Dynamics in the Hippocampal Representation of the Local Environment. Science, Vol. 308, Issue 5723, pp. 873–876.
- Klukas M, Lewis M, Fiete I (2020). Efficient and flexible representation of higher-dimensional cognitive variables with grid cells. PLoS Comput Biol 16(4): e1007796.