Awesome AI Papers: Ranking Sentences for Extractive Summarization with Reinforcement Learning

James Lee
Nurture.AI
Apr 23, 2018 · 4 min read

This article is part of a weekly series of AI paper summaries. Check out more at the nurture.ai Medium publication or the official nurture.ai website.

Overview

Visualization of the model used in this paper.

The paper Ranking Sentences for Extractive Summarization with Reinforcement Learning proposes an extractive summarization model trained with reinforcement learning: it ranks sentences by their summary-worthiness using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metric, a common measure of summarization quality. A learning agent ranks the sentences and then constructs a summary from the top-ranked ones.
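ROUGE measures n-gram overlap between a generated summary and one or more reference summaries. As a rough illustration of the idea (the paper uses the standard ROUGE toolkit, not this toy function), here is a minimal ROUGE-1 recall computation in Python:

```python
# Minimal sketch of ROUGE-1 recall: the fraction of reference unigrams that
# also appear in the candidate summary. Illustrative only; the full ROUGE
# package also covers ROUGE-2, ROUGE-L, precision and F-scores.
from collections import Counter

def rouge_1_recall(candidate: str, reference: str) -> float:
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum(min(cand_counts[w], c) for w, c in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

print(rouge_1_recall("the cat sat on the mat", "the cat lay on the mat"))  # ≈ 0.83
```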

Previous state-of-the-art & its problems

Previous state-of-the-art extractive summarization models relied heavily on cross-entropy training, which introduces two discrepancies. The first is a disconnect between the training objective and the objective of the task: cross-entropy training maximizes the likelihood of ground-truth labels (whether a sentence belongs in the summary), whereas the task objective is to generate summaries with high ROUGE scores. The second discrepancy stems from the reliance on ground-truth labels themselves, since documents do not inherently come with sentence labels indicating whether or not each sentence should be included in a summary.

Insight of the paper

A simple Reinforcement Learning Agent

The authors realized that they needed to bridge the two discrepancies above. To do so, they introduced a reinforcement learning agent that directly maximizes the ROUGE score: the agent is trained to select sentences using the ROUGE evaluation metric as its reward.
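A minimal sketch of that idea in PyTorch, assuming a policy network `extractor` that maps encoded sentences to per-sentence selection probabilities and a `rouge_fn` reward function (the names and shapes here are illustrative, not the authors' code):

```python
import torch

def reinforce_step(extractor, optimizer, sentence_encodings, sentences,
                   reference_summary, rouge_fn):
    """One policy-gradient (REINFORCE) update with ROUGE as the reward."""
    probs = extractor(sentence_encodings)           # (num_sentences,) inclusion probabilities
    dist = torch.distributions.Bernoulli(probs)
    actions = dist.sample()                         # 1 = put this sentence in the summary
    summary = " ".join(s for s, a in zip(sentences, actions.tolist()) if a == 1.0)
    reward = rouge_fn(summary, reference_summary)   # e.g. an average of ROUGE scores
    # Scale the log-probability of the sampled selection by its reward and
    # minimize the negative, i.e. reinforce selections that score well.
    loss = -dist.log_prob(actions).sum() * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```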

How the insight was harnessed

The model consists of three components: a sentence encoder, a document encoder, and a sentence extractor with a learning agent. The sentence encoder is a Convolutional Neural Network (CNN) that encodes each sentence into a continuous representation. Temporal narrow convolutions are applied to each sentence: a kernel filter K of width h is slid over every possible window of words to produce feature maps, and max-pooling over time is then applied to those feature maps. The authors used kernel widths of 2 and 4, repeating this process three times to construct the sentence representations.
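A minimal PyTorch sketch of such a sentence encoder (layer sizes and the embedding dimension are placeholders, not the paper's exact hyper-parameters):

```python
import torch
import torch.nn as nn

class SentenceEncoderCNN(nn.Module):
    """Temporal narrow convolutions with kernel widths 2 and 4 over word
    embeddings, followed by max-pooling over time. Illustrative dimensions."""
    def __init__(self, embed_dim=100, num_filters=50, kernel_widths=(2, 4)):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, kernel_size=w) for w in kernel_widths
        )

    def forward(self, word_embeddings):
        # word_embeddings: (batch, sentence_length, embed_dim)
        x = word_embeddings.transpose(1, 2)                           # (batch, embed_dim, length)
        pooled = [conv(x).max(dim=2).values for conv in self.convs]   # max-pooling over time
        return torch.cat(pooled, dim=1)                               # (batch, num_filters * 2)
```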

The document encoder generates a document representation by composing the sequence of sentence representations. A Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) cells is used to mitigate the vanishing gradient problem. The sentences are fed in reverse order, so that the network processes the sentences at the top of the document last; these tend to be the most important.
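A corresponding sketch of the document encoder, again with assumed dimensions rather than the paper's settings:

```python
import torch
import torch.nn as nn

class DocumentEncoder(nn.Module):
    """LSTM run over sentence representations, fed in reverse document order.
    Sizes are illustrative, not the paper's hyper-parameters."""
    def __init__(self, sent_dim=100, hidden_dim=200):
        super().__init__()
        self.lstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True)

    def forward(self, sentence_vectors):
        # sentence_vectors: (batch, num_sentences, sent_dim)
        reversed_sents = torch.flip(sentence_vectors, dims=[1])  # last sentence first
        outputs, (h_n, c_n) = self.lstm(reversed_sents)
        return outputs, (h_n, c_n)  # per-step states and the final document representation
```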

Similarly, the sentence extractor consists of an RNN with LSTM cells and a softmax layer. Conditioned on the document representation generated by the document encoder and on previously labelled sentences, the sentence extractor selects the candidate sentences to be used in the summary. These candidates are chosen according to a ranking produced by the reinforcement learning agent, which is trained to rank sentences in a way that directly optimizes (maximizes) the ROUGE evaluation metric.
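A simplified sketch of the extractor, assuming it is initialised with the document encoder's final state and emits an inclusion probability per sentence (shapes and sizes are again assumptions for illustration):

```python
import torch
import torch.nn as nn

class SentenceExtractor(nn.Module):
    """LSTM over sentence vectors, conditioned on the document representation,
    with a softmax scoring each sentence as include / exclude. Illustrative only."""
    def __init__(self, sent_dim=100, hidden_dim=200):
        super().__init__()
        self.lstm = nn.LSTM(sent_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 2)  # logits for exclude / include

    def forward(self, sentence_vectors, doc_state):
        # doc_state: (h_0, c_0) taken from the document encoder's final state,
        # each of shape (1, batch, hidden_dim)
        outputs, _ = self.lstm(sentence_vectors, doc_state)
        return torch.softmax(self.score(outputs), dim=-1)[..., 1]  # p(include) per sentence
```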

Results

After putting the model through a number of tests, the authors concluded that there were no significant differences between their method, termed REFRESH, and a gold standard of human-written summaries. However, they note that the summaries generated by REFRESH were still distinguishable from the hand-crafted ones.

Competing approaches

Summarization models come in two flavours: extractive systems, which rank sentences by their probability of appearing in the summary, and abstractive systems, which involve various text rewriting operations (e.g. substitution, reordering).

Multi-document Abstractive Summarization Using ILP Based Multi-sentence Compression (Banerjee et al., 2016) studies an interesting approach to document summarization based on Integer Linear Programming (ILP). The ILP formulation selects the best shortest paths in a word graph to maximize the information content and linguistic quality of the summary. At the time, the method beat all baseline approaches.

Industry Implications

Flowery language, exciting phrases, and the occasional buzzword might serve to drive marketing reach, but a good summarization model can cut down the number of redundant words we have to read before we get to the important content. The work in this paper pushes forward a different aspect of written content: quality.

With the ability to quickly extract the important information from a written piece, we can accrue relevant information at a faster rate, reducing the time and effort required to search for it.

Another sector where this work has implications is education, where content is quality- and fact-driven. Similarly, there will be exciting use cases in fields like politics, economics, literature, and many more.

Interested in reading more? Head over to nurture.ai to view more weekly paper summaries and to discuss interesting questions left open by the paper.
