Text Multi-Classification with Flair NLP

Effortlessly Classify Text with Flair’s Cutting-Edge Technology

Ahmad Suhail

Published in Tensor Labs

8 min read · Mar 27, 2023

Classifying the document type based on document content

This article walks you through an end-to-end text classification pipeline built with Flair, an NLP library.

Are you tired of manually classifying large amounts of text data? Are you struggling to reach good accuracy on your text classification dataset? Look no further, because Flair is here. Introducing Flair NLP, a great tool for multi-class text classification.

Hey there, fellow ML Engineer! I hope you are enjoying your career as an ML Engineer. If you have stumbled upon this article on Flair, chances are (haha) you couldn’t get much help from its documentation.

Let’s start this journey with a text classification punch line:
“Why was the text classification algorithm always calm and composed? Because it had a good classifier!”

Supervised machine learning (ML) involves various important tasks, among which text classification holds great significance. This task involves tagging or categorizing documents to enable automatic and efficient structuring and analysis of text in a cost-effective manner. Text classification is a fundamental process in Natural Language Processing, with diverse applications such as sentiment analysis, spam detection, topic-labeling, and intent detection.

In this article, you and I will go through a text classification pipeline from preprocessing to training to inference. We’ll dive into the benefits and how-tos of using Flair NLP for your next text classification project. Get ready to streamline your workflow and achieve more precise results! Let’s get started.

The Colab notebook can be accessed here: https://colab.research.google.com/drive/1Jpg48czQyKlOgaGzIhT7yll5NU2G82Pt?usp=sharing

Please note that this article assumes familiarity with NLP concepts such as word embeddings, stopwords, and so on.

What is Flair

Flair is a simple, easy-to-use open-source Natural Language Processing library that covers an array of tasks such as sentiment analysis, named entity recognition (NER), part-of-speech tagging, text classification, and so on. One of its main aims is to be easy to use, be it in research or development.

At its core, Flair offers contextual string embeddings: token representations produced by a character-level language model, so the same word gets a different vector depending on the text around it. This means they capture not only general syntactic and semantic information about a piece of text but also how each word is used in its context.

Flair can be as easy to use as just 7 lines of code.

from flair.data import Sentence
from flair.models import SequenceTagger

# create a sentence
sentence = Sentence('Ahmad lives in Islamabad')

# load the NER tagger
tagger = SequenceTagger.load('ner')

# predict the named entities in the sentence
tagger.predict(sentence)

# print the predicted named entities
for entity in sentence.get_spans('ner'):
    print(entity)

In just 7 lines of code, you have loaded a pre-trained Named Entity Recognition (NER) model and used it to predict the entities in a sentence. But we are here for more than this, right?

From Data Preprocessing to Classifying Text

The beauty of the Flair library lies in its simplicity. Its intuitive API makes it easy even for beginners to get started quickly, without detailed knowledge of how the neural networks work underneath their code. On top of that, there are many cool features, such as the ability to swap and combine embeddings, which allow quick experimentation. What’s more interesting is the contrast with more traditional workflows built around libraries such as spaCy or NLTK, where feature engineering was often a separate manual step before training. With Flair, data scientists can put their creativity into training models directly while the library takes care of the low-level aspects.

E-Commerce Dataset

For this short tutorial, I am using an e-commerce dataset from Kaggle, but the main idea is to help you understand how to use Flair for your own classification problem.

This dataset consists of two columns, category and product description.

The dataset has 50,000+ rows. When you are performing text classification, it is important to check for any NaN values in the dataset. Everyone has their own way of dealing with NaN values; for text classification, I like to drop the rows that contain them.

Now that we have dropped the only NaN value, we will perform the text preprocessing steps for better text classification.
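The NaN check and drop described above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's exact code: the tiny in-memory frame and the column names category and description are assumptions standing in for the real Kaggle CSV, which you would load with pd.read_csv.

```python
import pandas as pd

# Toy stand-in for the Kaggle data; in practice load the real file with
# pd.read_csv(...). The column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "category": ["Books", "Household", "Books"],
        "description": ["A gripping mystery novel", None, "A history of Rome"],
    }
)

# Count missing values per column before deciding how to handle them.
nan_counts = df.isna().sum()
print(nan_counts)

# Drop every row that has a missing value, then renumber the index.
clean_df = df.dropna().reset_index(drop=True)
print(len(df), "->", len(clean_df))
```

Resetting the index after dropping rows keeps later positional lookups and splits free of gaps.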

Data Preprocessing

First, let’s import the necessary libraries for this tutorial. You will need to install the Flair library. You’ll also need the NLTK library, which is my go-to for NLP preprocessing tasks.

Let’s create functions for processing the text so that we can also use them at inference time. The preprocess_text function converts the text to lowercase, then removes any special characters and extra spaces. Then we use NLTK’s English stopword list to remove words that carry little meaning.

Typical stopwords are ‘the’, ‘an’, ‘am’, ‘I’, and so on. After removing the stopwords, we use NLTK’s lemmatizer to reduce inflected forms of a word to a single base form. For example, given ‘fetch’ and ‘fetches’, the word ‘fetches’ is converted to ‘fetch’.
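The steps above can be sketched as a small function. This is one possible shape for preprocess_text, not necessarily the notebook's version: the stopword list and the lemmatizer are passed in as parameters (an illustrative design choice), and the hypothetical helper build_nltk_preprocessor wires in NLTK's stopwords and WordNetLemmatizer. The toy stopword set and toy lemmatizer at the bottom exist only so the example runs without NLTK data.

```python
import re
from typing import Callable, Iterable


def preprocess_text(
    text: str,
    stop_words: Iterable[str],
    lemmatize: Callable[[str], str],
) -> str:
    """Lowercase, strip special characters, drop stopwords, lemmatize."""
    stop_set = set(stop_words)
    text = text.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # split() with no arguments also collapses runs of whitespace.
    tokens = [tok for tok in text.split() if tok not in stop_set]
    return " ".join(lemmatize(tok) for tok in tokens)


def build_nltk_preprocessor() -> Callable[[str], str]:
    """Wire preprocess_text to NLTK's English stopwords and WordNet lemmatizer.

    NLTK is imported here so this module still imports without it installed.
    Needs a one-time nltk.download("stopwords") and nltk.download("wordnet").
    """
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    stop_words = set(stopwords.words("english"))
    lemmatizer = WordNetLemmatizer()
    return lambda text: preprocess_text(text, stop_words, lemmatizer.lemmatize)


def toy_lemmatize(word: str) -> str:
    """Naive stand-in for a real lemmatizer: strip a trailing 'es'."""
    return word[:-2] if word.endswith("es") else word


# Quick check with toy stand-ins (no NLTK needed).
toy_stop_words = {"the", "an", "am", "i"}
cleaned = preprocess_text("The dog FETCHES   the ball!!", toy_stop_words, toy_lemmatize)
print(cleaned)
```

Keeping the function free of global state makes it easy to reuse the exact same cleaning at inference time.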

After applying the text processing function to the data frame, our data will now look like this.

Our next step is to convert this clean description into a form that Flair can read. Flair’s classification corpus expects data in FastText format, where each line starts with a __label__ prefix followed by the class name and then the text.
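As a minimal sketch of that conversion, the helper below turns a (label, text) pair into one FastText-style line. The function name to_fasttext_line and the choice to join multi-word labels with underscores are assumptions for illustration; the key point is that the label token must not contain whitespace, because the line is split on spaces when it is read back.

```python
def to_fasttext_line(label: str, text: str) -> str:
    """Format one example as '__label__<label> <text>'."""
    # The label token cannot contain whitespace in this format, so multi-word
    # category names are joined with underscores (an illustrative choice).
    safe_label = "_".join(label.split())
    # Keep each example on a single line.
    single_line_text = " ".join(text.split())
    return f"__label__{safe_label} {single_line_text}"


rows = [
    ("Books", "gripping mystery novel"),
    ("Clothing & Accessories", "cotton summer dress"),
]
lines = [to_fasttext_line(label, text) for label, text in rows]
for line in lines:
    print(line)
```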

Congratulations!!! You have completed the steps required for text preprocessing. Now the only tasks left are training and inference. So on we go.

Training the Dataset

Our next step is to split the dataset into train, validation (dev), and test sets. For this, we will use scikit-learn’s trusted train_test_split function.

You will need to save your split text files in the data_folder path, with each file named appropriately, i.e. train.txt, dev.txt, and test.txt. This is because the corpus initializer automatically searches for the train, dev, and test splits in that folder. Flair can also load CSV files through its CSVClassificationCorpus class.
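The split-and-save step can be sketched as below. To keep the example dependency-free, it uses a plain seeded shuffle from the standard library as a stand-in for scikit-learn's train_test_split; the function names, the 80/10/10 proportions, and the temporary folder are all assumptions for illustration.

```python
import random
import tempfile
from pathlib import Path


def split_lines(lines, dev_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle and split lines into train/dev/test lists."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_dev = int(len(shuffled) * dev_frac)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


def save_splits(data_folder, train, dev, test):
    """Write train.txt, dev.txt and test.txt into data_folder."""
    folder = Path(data_folder)
    folder.mkdir(parents=True, exist_ok=True)
    for name, split in (("train", train), ("dev", dev), ("test", test)):
        (folder / f"{name}.txt").write_text("\n".join(split) + "\n", encoding="utf-8")


# Toy FastText-formatted lines standing in for the real dataset.
lines = [f"__label__Books sample description {i}" for i in range(20)]
train, dev, test = split_lines(lines)

# A temporary folder keeps the demo self-contained; use your own path instead.
data_folder = tempfile.mkdtemp()
save_splits(data_folder, train, dev, test)
print(len(train), len(dev), len(test))
```

For an imbalanced dataset, scikit-learn's train_test_split with its stratify argument is worth preferring over a plain shuffle, since it keeps class proportions similar across the splits.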

Now for the moment we have been waiting for: training our Flair model.

First, load the text corpus using Flair’s ClassificationCorpus. Then we will make a label dictionary from this corpus. The label dictionary collects the distinct labels found in the dataset. In this example, these are the encoded labels: 0: Books, 1: Clothing & Accessories, 2: Electronic, 3: Household.

Here we are also converting our clean text to embeddings using the DistilBERT model. This is one of the most exciting features of Flair: you can swap in a different embedding model with very little code change. With many other approaches, you have to manually create features for text classification using a TF-IDF vectorizer, word counts, or word embeddings, whereas the Flair library handles this step for you. You can use BERT, RoBERTa, GloVe, or other embedding models that Flair supports.

Lastly, after setting up the embeddings, we pass the classifier and the text corpus to the ModelTrainer to start training.

Since our dataset is slightly imbalanced, we will use Flair’s built-in ImbalancedClassificationDatasetSampler to address this problem.

After executing this command, your model training will start. Please make sure you use a GPU for training to reduce training time. Haha, otherwise you’ll be waiting forever. Keep the learning rate low, because a learning rate that is set too high can make the loss diverge instead of decrease.
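To see why a high learning rate causes divergence, here is a toy illustration that has nothing to do with Flair itself: plain gradient descent on f(x) = x², with step sizes chosen purely for demonstration. With a small step each update shrinks x toward the minimum at 0; with a step that is too large each update overshoots and the value grows.

```python
def gradient_descent(lr: float, steps: int = 20, x0: float = 1.0) -> float:
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x  # derivative of x**2
        x = x - lr * grad
    return x


small_lr_result = gradient_descent(lr=0.1)  # each step multiplies x by 0.8
large_lr_result = gradient_descent(lr=1.1)  # each step multiplies x by -1.2
print(abs(small_lr_result), abs(large_lr_result))
```

Real loss surfaces are far messier than a parabola, but the same overshooting effect is what shows up as a loss curve that jumps around or blows up.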

Keep the batch size reasonable, usually 16 or 32, so that you do not overload the GPU memory.
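Putting the training steps above together, a hedged sketch might look like this. It is not the notebook's exact code: the function name train_classifier, the label_type value "category", the distilbert-base-uncased checkpoint, and the learning rate of 5e-5 are assumptions, and keyword arguments can differ between Flair versions, so check them against the version you have installed. The Flair imports live inside the function so the snippet can be loaded without Flair present.

```python
# Hyperparameters in the spirit of the article: low learning rate, modest
# batch size, two epochs. The exact learning rate is an assumption.
TRAINING_ARGS = {
    "learning_rate": 5e-5,
    "mini_batch_size": 16,
    "max_epochs": 2,
}


def train_classifier(data_folder: str, output_folder: str, label_type: str = "category"):
    """Fine-tune a DistilBERT-based Flair text classifier on FastText-formatted files."""
    # Imported lazily so this module can be imported without Flair installed.
    from flair.datasets import ClassificationCorpus
    from flair.embeddings import TransformerDocumentEmbeddings
    from flair.models import TextClassifier
    from flair.samplers import ImbalancedClassificationDatasetSampler
    from flair.trainers import ModelTrainer

    # Looks for train.txt, dev.txt and test.txt inside data_folder.
    corpus = ClassificationCorpus(data_folder, label_type=label_type)
    label_dict = corpus.make_label_dictionary(label_type=label_type)

    # fine_tune=True lets the transformer weights be updated during training.
    embeddings = TransformerDocumentEmbeddings("distilbert-base-uncased", fine_tune=True)
    classifier = TextClassifier(
        embeddings, label_dictionary=label_dict, label_type=label_type
    )

    trainer = ModelTrainer(classifier, corpus)
    trainer.fine_tune(
        output_folder,
        # Oversamples rare classes to counter the slight imbalance.
        sampler=ImbalancedClassificationDatasetSampler,
        **TRAINING_ARGS,
    )
    return classifier
```

A call would then look like train_classifier("path/to/data_folder", "path/to/output_folder"), with both paths replaced by your own.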

These are our results after training the model for 2 epochs. Don’t worry, you can stop the training early if you don’t have enough time. The trainer saves your best model to the output folder you specified.

As you can see, the model achieved 96% accuracy on the test dataset.

Inference (Predicting the Label)

Now that we have trained our model and have our hands on the weights file, we are ready to make some predictions.

Import the Flair library’s TextClassifier and Sentence classes. TextClassifier loads your trained weights and gets your model ready for predictions.

Let’s take a random product description from the internet. In this case, I am using an Apple product description for the prediction. Will the model be able to predict that this description is about a laptop (electronic)?

The model successfully predicted the category type of the product description.

The label’s value gives you the predicted label. The length of the sentence gives you its token count. The label’s score gives you the confidence of the prediction.

As you can see, the model predicted the product category with 99% confidence.
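The inference steps described above can be sketched as follows. The function names predict_category and format_prediction are hypothetical, model_path stands for wherever your trained .pt file was saved, and the Flair imports are kept inside the function so the snippet loads without Flair installed. The numbers in the final line are placeholders to show the output format, not real model output.

```python
def predict_category(model_path: str, text: str):
    """Return (label, confidence, token_count) for one product description."""
    # Imported lazily so this module can be imported without Flair installed.
    from flair.data import Sentence
    from flair.models import TextClassifier

    classifier = TextClassifier.load(model_path)
    sentence = Sentence(text)
    classifier.predict(sentence)

    top = sentence.labels[0]  # predicted label attached to the sentence
    return top.value, top.score, len(sentence)


def format_prediction(label: str, score: float, n_tokens: int) -> str:
    """Render a prediction tuple as a one-line summary."""
    return f"{label} ({score:.0%} confidence, {n_tokens} tokens)"


# Formatting demo with placeholder values; a real call would pass the result
# of predict_category(model_path, cleaned_description) instead.
summary = format_prediction("Electronic", 0.99, 42)
print(summary)
```

Remember to run the description through the same preprocessing used for training before passing it to the model, otherwise the input will not match what the classifier saw.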

Better than ML algorithms?

It has been argued that the Flair NLP library outperforms traditional machine learning algorithms when it comes to text classification. The usual reasoning is that Flair’s contextual embeddings take into account the context of words in a sentence, which is crucial for understanding the meaning of the text, whereas bag-of-words features such as TF-IDF ignore word order. In addition, Flair is able to handle different types of text, such as tweets and blog posts, which makes it quite versatile.

Ultimately, it is difficult to say which approach is best without knowing the specific text classification task and dataset. Different approaches may excel or fail depending on the size and complexity of the dataset. Therefore, it is important to experiment with different methods and compare results to determine which works best for your particular task.

Summary

Beyond text classification, Flair provides a range of components and tools that can be used to build and train NLP models, including embeddings, dataset loaders, training utilities, and evaluation metrics. With its flexible and customizable architecture, Flair is a powerful tool for anyone looking to build and train state-of-the-art NLP models for a variety of tasks.