Text classification with transformers in Tensorflow 2: BERT

David Mráz
Atheros
10 min read · May 5, 2020


Introduction

Transformer-based language models have shown promising progress on a number of natural language processing (NLP) benchmarks. Combining transfer learning with large-scale transformer language models is becoming a standard in modern NLP. In this article, we give the necessary theoretical introduction to the transformer architecture and the text classification problem. We then demonstrate how to fine-tune the pre-trained BERT model for text classification in TensorFlow 2 with the Keras API.
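Before getting into BERT itself, the overall fine-tuning pattern can be sketched in plain Keras. This is an illustrative stand-in only: the vocabulary size, layer sizes, and the Embedding/pooling stack are assumptions invented here, and in the real setup the encoder part would be the pre-trained BERT model rather than a randomly initialized embedding layer.

```python
import tensorflow as tf

# Placeholder assumption, not a value from the article
VOCAB_SIZE = 10_000

# Minimal sketch of the "encoder + classification head" shape that
# fine-tuning follows; with BERT, the first two layers are replaced
# by the pre-trained transformer encoder.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 64),       # stand-in for the encoder
    tf.keras.layers.GlobalAveragePooling1D(),        # pool token vectors
    tf.keras.layers.Dense(2, activation="softmax"),  # classification head
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# A batch of 3 token-id sequences of length 8 -> probabilities over 2 classes
dummy_batch = tf.zeros((3, 8), dtype=tf.int32)
probs = model(dummy_batch)
print(probs.shape)  # (3, 2)
```

The key design point carried over to the BERT case is that only the final dense layer is task-specific; everything below it can be initialized from pre-trained weights and fine-tuned end to end.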

Text classification — problem formulation

Classification, in general, is the problem of identifying the category of a new observation. We have a dataset D containing sequences of text in documents:

D = {X_1, X_2, …, X_N}

where X_i can be, for example, a text segment and N is the number of such text segments in D.
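The formulation above can be made concrete with a toy example. The text segments and the keyword rule below are invented for illustration; a trivial rule-based function stands in for a learned classifier mapping each X_i to a category.

```python
# The dataset D is a collection of N text segments X_i (invented examples)
D = [
    "the match went to extra time",    # X_1
    "the central bank raised rates",   # X_2
    "a stunning goal in injury time",  # X_3
]
N = len(D)  # number of text segments in D

def classify(x: str) -> str:
    """A trivial keyword rule standing in for a learned classifier."""
    sports_words = {"match", "goal", "injury"}
    return "sports" if sports_words & set(x.split()) else "finance"

labels = [classify(x) for x in D]
print(N, labels)  # 3 ['sports', 'finance', 'sports']
```

In the rest of the article, this hand-written rule is replaced by a model whose parameters are learned from labeled examples.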

The algorithm that implements classification is called a classifier. Text classification tasks can be divided…
