Paper Reading #1: Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies by Linzen et al.

Why this paper: Recurrent Neural Networks (RNNs) and their family of neural networks are known to be very good at many language modelling tasks, e.g. machine translation, word embeddings etc. However, what is not known is what these building blocks learn internally in the case of language that makes them so good at these tasks. This is particularly important because Long Short-Term Memory networks (a kind of RNN) are known to memorise text (or sequences of words); however, to learn language concepts, e.g. subject-verb agreement, number prediction or even morphology (not covered in this paper but covered in [1]), intuitively you need more knowledge (or supervision).
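To make the number prediction task concrete, here is a minimal sketch of how one training example could be built: the model sees only the words before the verb and has to guess whether the verb is singular or plural. The tiny verb lexicon, the class name and the function name are my own assumptions for illustration; this is not the preprocessing used in [2].

```python
from typing import List, NamedTuple


class NumberPredictionExample(NamedTuple):
    prefix: List[str]  # words before the verb; the only input the model sees
    label: str         # "singular" or "plural", the number of the hidden verb


# Hypothetical toy lexicon mapping a few verbs to their grammatical number.
VERB_NUMBER = {
    "is": "singular",
    "was": "singular",
    "are": "plural",
    "were": "plural",
}


def make_example(sentence: str) -> NumberPredictionExample:
    """Cut the sentence at the first known verb and label it with that verb's number."""
    words = sentence.lower().split()
    for i, word in enumerate(words):
        if word in VERB_NUMBER:
            return NumberPredictionExample(words[:i], VERB_NUMBER[word])
    raise ValueError("no verb from the toy lexicon found in the sentence")


example = make_example("The keys to the cabinet are on the table")
print(example.prefix)  # ['the', 'keys', 'to', 'the', 'cabinet']
print(example.label)   # plural
```

Note that the noun closest to the verb ("cabinet") is singular while the subject ("keys") is plural, so a model that only memorises recent words would get this example wrong.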

What is interesting: Recently, a lot of attention has been given to breaking neural networks with adversarial attacks and to making neural networks robust to them. However, since how neural networks work is still not completely understood, this paper is timely in exploring what neural networks learn in the case of language. Also, pure linguists are skeptical about whether neural networks can solve language, and according to the results in the paper, we need more expressive architectures to achieve this.

How to reproduce results:

Code is available at [2]. To reproduce the results, follow these steps.

Afterthoughts:

If we care about solving language using neural networks, then we have to start thinking along the lines of more expressive architectures. There is some work at NIPS 2017 in this direction, [3] and [4], but more work in this direction is required.

References:

  1. http://aclweb.org/anthology/P17-1184
  2. https://github.com/anupamme/rnn_agreement
  3. https://einstein.ai/research/learned-in-translation-contextualized-word-vectors
  4. https://arxiv.org/pdf/1705.08039.pdf