Text Classification with Transformers: Zero-Shot Classification Pipeline

Seffa B
3 min read · Jun 11, 2023


Classifying text can be challenging, especially when no labeled data is available, and annotating texts by hand is slow. Transformers and their zero-shot models offer a way around this: they can assign labels you choose at prediction time, without any task-specific training data. In this tutorial, we will use the Transformers library to classify text with the zero-shot classification pipeline.

Transformers is a Python library that gives developers access to state-of-the-art pre-trained transformer models for natural language processing (NLP). Models such as BERT, GPT, and RoBERTa have achieved strong results on many NLP tasks, including text classification, named entity recognition, and machine translation. The library wraps these models in an easy-to-use interface, so they can be applied with only a few lines of code.

Install the Required Library: Transformers
To begin, we need to install the Transformers library. Open your terminal or command prompt and execute the following pip command:

pip install transformers
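Before moving on, it can help to confirm that the install worked. The pipeline also needs a deep learning backend such as PyTorch (pip install torch) to run the model. The snippet below is only a sketch of one way to check both; the helper name is_installed is made up for this example and is not part of any library.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "torch" is checked as an assumed backend; swap in the one you use.
    for name in ("transformers", "torch"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If either package is reported as missing, install it with pip before continuing.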

Testing the Zero-Shot Pipeline:
Once the library is installed, we can test the zero-shot classification pipeline. In this example, we will classify three texts using three custom labels: “Sport,” “Politics,” and “Environment.”

Start by importing the necessary modules:

from transformers import pipeline

Next, we will initialize the zero-shot classification pipeline using the pipeline function from the Transformers library:

classifier = pipeline("zero-shot-classification")
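When no model is given, the pipeline falls back to a default checkpoint (facebook/bart-large-mnli in the library versions we are aware of) and downloads it on first use. Naming the model explicitly can make runs easier to reproduce. A minimal sketch, where build_classifier is a hypothetical wrapper and not part of the library:

```python
def build_classifier(model_name: str = "facebook/bart-large-mnli"):
    """Create a zero-shot classification pipeline for the given checkpoint.

    The first call downloads the model weights, so it needs network access.
    """
    # Imported here so that defining the helper does not load the library.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_name)
```

Calling classifier = build_classifier() then gives a classifier that can be used in the same way as the one created above.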

Now, let’s classify some sample texts with the pipeline we just created. Replace the sentences in 'texts' with your own:

texts = [
    "Djokovic has qualified for his sixth French Open final!",
    "On March 23, 2010, President Obama signed the Affordable Care Act into law, putting in place comprehensive reforms that improve access to affordable health coverage for everyone and protect consumers from abusive insurance company practices",
    "The goal was to show that scientists from various disciplines, diverse cultures and countries at different stages of development could find common ground about the conditions for triggering climate action in the current economic context"
]

labels = ["Sport", "Politics", "Environment"]

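for text in texts:
    # Score this text against every candidate label.
    result = classifier(text, labels)
    print(f"Text: {text}")
    print("Predicted Labels:")
    # Labels and scores come back as two parallel lists.
    for label, score in zip(result["labels"], result["scores"]):
        print(f"- {label}: {score}")
    print()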
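Each call returns a dictionary with the keys sequence, labels, and scores, with the labels ordered from highest to lowest score. If only the best match matters, a small helper can pull it out. This is just a sketch: top_prediction is a name chosen here for illustration, and the sample dictionary uses made-up scores purely to show the shape of a result.

```python
def top_prediction(result: dict) -> tuple[str, float]:
    """Return the highest-scoring (label, score) pair from a pipeline result."""
    # Pair up labels and scores, then pick the pair with the largest score.
    # The pipeline already sorts them, but max() keeps this safe either way.
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


# Hand-written example with the same shape as a pipeline result.
# The scores are invented for illustration, not real model output.
sample = {
    "sequence": "Djokovic has qualified for his sixth French Open final!",
    "labels": ["Sport", "Politics", "Environment"],
    "scores": [0.9, 0.06, 0.04],
}
label, score = top_prediction(sample)
print(f"{label} ({score:.2f})")
```

With a real pipeline, the same helper can be applied to each result inside the loop, for example top_prediction(classifier(text, labels)).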

In this example, the classifier pipeline scores each text against the custom labels specified in the labels list. The predicted labels and their corresponding confidence scores are then printed for each text.

When we ran this code, each sentence was correctly classified.
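The pipeline also accepts a list of texts in a single call, and a multi_label=True argument that scores each label independently, so a text can match more than one label. The sketch below wraps both options. Note that classify_texts is a hypothetical helper, and the 0.5 cutoff is an arbitrary assumption that should be tuned for your data. The classifier is passed in as a parameter, so any zero-shot pipeline object will do.

```python
def classify_texts(classifier, texts, labels, multi_label=False, threshold=0.5):
    """Return, for each text, the labels whose score is at least `threshold`."""
    results = classifier(texts, candidate_labels=labels, multi_label=multi_label)
    # A single string yields one dict; a list of strings yields a list of dicts.
    if isinstance(results, dict):
        results = [results]
    return {
        r["sequence"]: [
            label
            for label, score in zip(r["labels"], r["scores"])
            if score >= threshold  # assumed cutoff, not a library default
        ]
        for r in results
    }
```

For example, classify_texts(classifier, texts, labels, multi_label=True) would return a dictionary mapping each input text to the list of labels that cleared the cutoff.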

Conclusion
In this tutorial, we explored text classification with transformers and the zero-shot classification pipeline. We introduced the Transformers library, installed it, and demonstrated how to use the zero-shot classifier to classify text into custom labels without the need for manual annotation. This approach saves significant time and effort, making it a valuable tool for a wide range of text classification tasks.


Seffa B

I'm a Data Scientist. I'm passionate about topics related to Data Science. I'm also a Sports Enthusiast.