Leveraging BERT for Extractive Text Summarization on Lectures

Research Paper Summary

Prakhar Mishra
Analytics Vidhya

Leveraging BERT for Extractive Text Summarization on Lectures | Research Paper Summary | TechViz — The Data Science Guy
Image by Author

Background and Introduction

Automatic text summarization is the process of shortening a set of data computationally, to create a subset that represents the most important or relevant information within the original content. In addition to text, images and videos can also be summarized — Wikipedia.

There are two ways to compress/summarize any given text: extractive and abstractive. Extractive summarization can be seen as the task of scoring and ranking the sentences in a document based on certain metrics and then picking the top-k sentences as the representative summary of the input document. Abstractive summarization, on the other hand, can be seen as generating a summary by rephrasing or using new words instead of simply extracting the important sentences, while still retaining the main essence of the document and remaining linguistically correct.
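To make the "score, rank, pick top-k" idea concrete, here is a minimal sketch of an extractive summarizer in Python. Note the assumptions: it scores each sentence by the average document-level frequency of its words, which is only a simple stand-in metric and not the BERT-based method from the paper. The function names (`split_sentences`, `tokenize`, `extractive_summary`) and the naive regex sentence splitter are hypothetical helpers made up for illustration.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def extractive_summary(text, k=2):
    sentences = split_sentences(text)
    # Word counts over the whole document act as the scoring signal.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score (highest first) and keep the top k.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    # Restore document order so the summary reads naturally.
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    doc = "BERT embeds sentences. BERT clusters sentences. Cats sleep."
    print(extractive_summary(doc, k=2))
```

Swapping the `score` function for a different metric changes the ranking but leaves the overall extract-and-select structure the same, which is what distinguishes extractive from abstractive approaches: every output sentence is copied verbatim from the input.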

In this blog, we will go through the paper from Georgia Institute of Technology titled Leveraging BERT for Extractive Text Summarization on Lectures, which focuses on an extractive summarization technique. If you are interested in learning about abstractive summarization, make sure to check out this video.
