Text classification with transformers in TensorFlow 2: BERT
Introduction
Transformer-based language models have shown strong progress on a wide range of natural language processing (NLP) benchmarks, and combining transfer learning with large-scale transformer language models is becoming a standard in modern NLP. In this article, we first give a brief theoretical introduction to the transformer architecture and the text classification problem. We then demonstrate how to fine-tune a pre-trained BERT model for text classification in TensorFlow 2 with the Keras API.
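To give a sense of what the fine-tuning step looks like in practice, here is a minimal sketch in TensorFlow 2 with Keras. It assumes the Hugging Face transformers library as the source of the pre-trained model and tokenizer, and uses the "bert-base-uncased" checkpoint and a two-sentence toy dataset purely for illustration; the article's own setup may differ in these details.

```python
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

# Load a pre-trained BERT with a fresh classification head (2 labels here).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFBertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Toy data; in practice this would be a real labelled corpus.
texts = ["great movie, loved it", "terrible plot and wooden acting"]
labels = [1, 0]

# Tokenize into the input_ids / attention_mask tensors BERT expects.
encodings = tokenizer(
    texts, truncation=True, padding=True, max_length=128, return_tensors="tf"
)

# Standard Keras compile/fit loop; a small learning rate is typical for fine-tuning.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(dict(encodings), tf.constant(labels), epochs=3, batch_size=2)
```

The key design point is that the pre-trained encoder weights are reused and only lightly updated (hence the small learning rate), while the classification head on top is trained from scratch on the target task.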
Text classification — problem formulation
Classification, in general, is the problem of identifying the category of a new observation. We have a dataset D, which contains sequences of text from documents and can be written as

D = {X_1, X_2, …, X_N},

where X_i is, for example, a single text segment and N is the number of such text segments in D.
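To make the notation concrete, the sketch below shows a toy version of such a dataset in Python. The example sentences and category names are invented for illustration; each X_i is paired with the category a classifier would be trained to predict.

```python
# A toy instance of the dataset D defined above (contents are illustrative only).
D = [
    ("The match ended in a dramatic penalty shootout.", "sports"),
    ("The central bank raised interest rates again.", "finance"),
    ("A new exoplanet was discovered last week.", "science"),
]

N = len(D)  # number of text segments in D
for X_i, label in D:
    print(f"{label:>8}: {X_i}")
```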
An algorithm that implements classification is called a classifier. Text classification tasks can be divided…