Abstractive Text Summarization Using Transformers

An exhaustive explanation of Google’s Transformer model: from theory to implementation

Rohan Jagtap
The Startup


This article extends the ‘Transformers Explained’ post, an in-depth explanation of the Transformer model introduced by Google Research. If you have already read it, skip to the contents. If you are new to the model, consider reading that post first to understand how the Transformer works.

This article is a step-by-step guide to building an abstractive text summarizer that generates news article headlines, using the Transformer model in TensorFlow. The contents of this post are as follows:

Contents

  1. A Brief Introduction to Abstractive Summarization
  2. The Dataset
  3. Preprocessing
  4. Utility Functions
  5. The Model
  6. Training the Model
  7. Inference
  8. Conclusion

A Brief Introduction to Abstractive Summarization

Summarization is the ability to explain a larger piece of literature in short…
