Summarize documents by combining extractive and abstractive steps

Summarization

Edward Ma
DataSeries


In NLP, there are two approaches to text summarization. The first, the extractive approach, is the simpler one: it extracts key words or sentences from the article. It has some limitations, and its performance has proved to be not very good. The second, the abstractive approach, generates new sentences based on the given article. It requires more advanced techniques.

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher rouge scores.

Rather than summarizing the document with either the extractive or the abstractive approach alone, Subramanian et al. (2019) propose using both. The summary above was generated by their proposed model.
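To make the two-step idea concrete, here is a minimal sketch of an extract-then-generate pipeline. It is not the implementation of Subramanian et al.: the word-frequency sentence scorer is a simple stand-in for their extractive step, the " <summary> " separator is a hypothetical marker, and `generate` is a placeholder for whatever language model you would call to write the final summary.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_top_sentences(text, k=2):
    """Extractive step (stand-in): keep the k sentences whose words are,
    on average, the most frequent in the document."""
    sentences = split_sentences(text)
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    freq = Counter(w for words in tokenized for w in words)
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]
    # Pick the k best-scoring sentences, then restore document order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def build_conditioning_input(text, k=2, separator=" <summary> "):
    """Join the extracted sentences into the text the language model
    is conditioned on. The separator is an assumed, hypothetical marker."""
    return " ".join(extract_top_sentences(text, k)) + separator


def summarize(text, generate, k=2):
    """Abstractive step: `generate` is any callable that maps a prompt
    string to generated text (a hypothetical model wrapper)."""
    return generate(build_conditioning_input(text, k))
```

For example, `summarize(article, my_model_generate, k=5)` would pass only the five highest-scoring sentences to the model, where `my_model_generate` is your own function wrapping a language model. The point of the design is that the generator sees a short, relevant excerpt instead of the full document, which may be thousands of words long.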

Overview


Edward Ma
DataSeries

Focuses on natural language processing and data science platform architecture. https://makcedward.github.io/