Summary: Do latent tree learning models identify meaningful structure in sentences? (TACL 2018)

Dheeru Dua
UCI NLP
Oct 23, 2018

Authors: Adina Williams, Andrew Drozdov, Samuel R. Bowman

Recently, there has been a surge in using latent tree models to jointly parse a sentence and learn a downstream task such as sentiment analysis, textual entailment, or translation. This paper sheds light on what latent tree models actually learn. The authors apply several evaluation techniques, each focused on a facet of structure that a syntactic parser should capture (more in the results section).

Setup: The authors compare two tree-based models, SPINN and ST-Gumbel. SPINN is a shift-reduce-style parser: it is trained on externally supplied parse trees, learning a tracking classifier with targets {shift, reduce} and a TreeLSTM that composes the top two stack elements at each reduce step. ST-Gumbel considers every adjacent pair of nodes as a candidate phrase and uses a hard gating function to select one merge at a time, producing a single latent tree representation. ST-Gumbel therefore learns its parse trees with only weak supervision from the downstream textual entailment task, while SPINN uses true parse-tree annotations to learn parsing and textual entailment jointly. Slight variations of SPINN and ST-Gumbel, as well as simple LSTM-based baselines, are also considered.
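To make the bottom-up, hard-selection idea concrete, here is a minimal sketch of an ST-Gumbel-style composition loop. It uses toy stand-ins: vector averaging replaces the TreeLSTM cell, a random query vector replaces the learned scoring function, and names like `build_tree` are illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
score_query = rng.normal(size=DIM)  # stand-in for the learned scoring vector

def compose(left, right):
    """Toy composition; the actual model uses a binary TreeLSTM cell."""
    return 0.5 * (left + right)

def gumbel_argmax(scores):
    """Hard selection with Gumbel noise; training uses the straight-through
    Gumbel-Softmax estimator so gradients can flow through this discrete choice."""
    noise = -np.log(-np.log(rng.uniform(size=scores.shape)))
    return int(np.argmax(scores + noise))

def build_tree(word_vecs, words):
    nodes = list(word_vecs)
    spans = list(words)  # keep strings so we can print the induced bracketing
    while len(nodes) > 1:
        # Every adjacent pair is a candidate parent; score each candidate.
        candidates = [compose(nodes[i], nodes[i + 1]) for i in range(len(nodes) - 1)]
        scores = np.array([c @ score_query for c in candidates])
        k = gumbel_argmax(scores)                       # pick one merge per step
        nodes[k:k + 2] = [candidates[k]]                # replace the pair with its parent
        spans[k:k + 2] = [f"({spans[k]} {spans[k + 1]})"]
    return nodes[0], spans[0]

words = "the cat sat on the mat".split()
vecs = [rng.normal(size=DIM) for _ in words]
root_vec, bracketing = build_tree(vecs, words)
print(bracketing)  # the induced binary bracketing; the structure is latent
```

The root vector plays the role of the sentence representation fed to the entailment classifier, which is the only source of training signal for the tree choices.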

Results:

  • Accuracy on textual entailment: ST-Gumbel beats all the baselines.
  • Grammar induction: The authors compare the constituents induced by these models with PTB-style parses (a sketch of this kind of bracketing comparison appears after this list). They found that the models clearly outperform random trees only on rare constituent types like INTJ (interjections) and LST (list markers); on more common types like ADJP, NP, and PP they do only slightly better than random trees.
  • The SPINN model overwhelmingly creates left-branching trees. Trees from ST-Gumbel are more balanced, but they do not favor linguistic phrases and instead prefer nominal compositions.
  • ST-Gumbel learns a decent strategy for handling negation but is not very consistent in its treatment of function words like determiners and prepositions.
  • Consistency: Upon introducing an outlier, the authors found that the two models above are just as brittle as a simple LSTM baseline.
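The grammar-induction comparison above boils down to scoring the overlap between induced and reference bracketings. Below is a small sketch of that kind of unlabeled constituent-F1 computation; the nested-tuple tree format and helper names are assumptions for illustration, not the paper's evaluation code.

```python
def spans(tree, start=0):
    """Return (set of constituent spans, subtree length) for a binary tree
    given as nested 2-tuples of strings."""
    if isinstance(tree, str):          # a leaf covers a single token
        return set(), 1
    left, right = tree
    left_spans, left_len = spans(left, start)
    right_spans, right_len = spans(right, start + left_len)
    total = left_len + right_len
    return left_spans | right_spans | {(start, start + total)}, total

def f1(pred, gold):
    """Unlabeled bracketing F1 between a predicted and a reference tree."""
    p_spans, _ = spans(pred)
    g_spans, _ = spans(gold)
    overlap = len(p_spans & g_spans)
    if overlap == 0:
        return 0.0
    precision = overlap / len(p_spans)
    recall = overlap / len(g_spans)
    return 2 * precision * recall / (precision + recall)

# "the cat sat" bracketed two ways, compared against a pretend reference parse.
gold = (("the", "cat"), "sat")
print(f1((("the", "cat"), "sat"), gold))   # 1.0: identical bracketings
print(f1(("the", ("cat", "sat")), gold))   # 0.5: only the root span matches
```

Evaluation details, such as whether the trivial whole-sentence span is counted, vary across papers; the same machinery also supports comparing against left-branching, right-branching, and random-tree baselines.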

Thoughts: The authors found that the latent tree model ST-Gumbel learns shallow parses that still provide a useful signal compared to the true parses used in SPINN. However, I think this could simply be because the downstream task used for training and evaluation was textual entailment. Textual entailment does require semantic understanding, but a shallow parser is probably enough to solve the task. A study of how different tasks, such as reading comprehension, machine translation, and classification, leverage sentence syntax would be an interesting direction.
