Python Libraries for Sentiment Analysis — a study on what to choose

5 min read · Apr 8, 2024

Sentiment analysis has long been a core topic in the NLP community. How should you approach it, and which library should you choose? This post walks through four Python options. Each has its own strengths, and the right choice depends on your use case, for example whether you care more about accuracy or about running time.

VADER (Valence Aware Dictionary and sEntiment Reasoner)

One popular tool for performing sentiment analysis in Python is the NLTK (Natural Language Toolkit) Vader library.

Vader is designed specifically for analyzing sentiment in social media text, such as tweets, comments, and reviews. It combines a lexicon of sentiment-related words with a set of rules for interpreting the sentiment intensity of a given text.

When analyzing a piece of text with Vader, it provides four key metrics:

Compound Score: This is a normalized, weighted composite score that ranges from -1 (most negative) to +1 (most positive). It represents the overall sentiment of the text.
Negative Score: The proportion of the text that is identified as having a negative sentiment.
Neutral Score: The proportion of the text that is identified as having a neutral sentiment.
Positive Score: The proportion of the text that is identified as having a positive sentiment.

Here’s an example of sentiment analysis using NLTK Vader:

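# The "!" prefix is notebook syntax; in a shell, run "pip install nltk"
!pip install nltk
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# Download the VADER lexicon (needed the first time)
nltk.download('vader_lexicon')

# Initialize the sentiment analyzer
analyzer = SentimentIntensityAnalyzer()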

# Sample text
text = "The movie climax was god awful! Great acting and an engaging plot though."

# Get the sentiment scores
scores = analyzer.polarity_scores(text)

# Print the scores
print(f"Compound Score: {scores['compound']}")
print(f"Negative Score: {scores['neg']}")
print(f"Neutral Score: {scores['neu']}")
print(f"Positive Score: {scores['pos']}")
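
The compound score is often turned into a discrete label. Below is a minimal sketch of that step, assuming the commonly cited cut-offs of ±0.05 (a convention suggested by VADER's authors, not something NLTK enforces). The function name `label_from_compound` and the `threshold` parameter are hypothetical and not part of NLTK.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score in [-1, 1] to a coarse label.

    The 0.05 threshold is an assumed convention; tune it for your data.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hand-picked values for illustration (not actual VADER output)
for value in (0.62, -0.41, 0.0):
    print(value, label_from_compound(value))
```

In practice you would pass `scores['compound']` from the example above into this helper.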

TextBlob

TextBlob is a Python library that provides a simple and intuitive interface for performing various natural language processing (NLP) tasks, including sentiment analysis. Unlike NLTK Vader, which is specifically designed for sentiment analysis on social media text, TextBlob offers a more general-purpose approach to text processing.

When performing sentiment analysis with TextBlob, it returns a named tuple containing two key values: polarity and subjectivity. The polarity score ranges from -1.0 to 1.0, where a negative score indicates a negative sentiment, and a positive score indicates a positive sentiment. A score close to 0 suggests a neutral sentiment.

On the other hand, the subjectivity score ranges from 0.0 to 1.0. A subjectivity score of 0.0 suggests that the text is very objective, while a score of 1.0 indicates a highly subjective text. This score can be useful in determining whether the text expresses an opinion or a factual statement.

Here’s an example of how to use TextBlob for sentiment analysis:

from textblob import TextBlob

# Sample text
text = "The movie was a disappointment. The acting was mediocre, and the plot was unoriginal."

# Create a TextBlob object
blob = TextBlob(text)

# Get the sentiment scores
sentiment = blob.sentiment

# Print the scores
print(f"Polarity: {sentiment.polarity}")  # Negative sentiment
print(f"Subjectivity: {sentiment.subjectivity}")  # Subjective text
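
To make the two numbers easier to read, they can be mapped to short descriptions. This is only a sketch: the helper `describe_scores`, the neutral band of 0.1, and the subjectivity cut-off of 0.5 are illustrative assumptions, not part of TextBlob.

```python
def describe_scores(polarity: float, subjectivity: float,
                    neutral_band: float = 0.1,
                    subjective_cutoff: float = 0.5) -> str:
    """Turn TextBlob-style scores into a short description.

    neutral_band and subjective_cutoff are assumed values for illustration.
    """
    if polarity > neutral_band:
        tone = "positive"
    elif polarity < -neutral_band:
        tone = "negative"
    else:
        tone = "neutral"
    kind = "subjective" if subjectivity >= subjective_cutoff else "objective"
    return f"{tone}, {kind}"


# Hand-picked values for illustration (not actual TextBlob output)
print(describe_scores(-0.45, 0.8))
print(describe_scores(0.02, 0.1))
```

With the example above, you could call `describe_scores(sentiment.polarity, sentiment.subjectivity)`.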

Stanza

Stanza is a natural language processing (NLP) toolkit developed by the Stanford NLP Group, offering a wide range of functionality, including sentiment analysis. Its core pipeline supports more than 60 languages, but pretrained sentiment models are available for only a small subset of them (English among them), and it takes a straightforward, coarse-grained approach to sentiment classification.

When using Stanza’s sentiment analysis capabilities, the output is a discrete value ranging from 0 to 2, representing negative, neutral, and positive sentiment, respectively. Unlike some other sentiment analysis tools that provide a continuous or finer-grained sentiment score, Stanza’s sentiment method does not differentiate between varying degrees of negativity or positivity.

This means that Stanza does not provide nuanced sentiment ratings such as “slightly negative” or “overly positive.” Instead, it assigns a sentiment label based on the overall sentiment polarity detected in the input text. If you require a more granular sentiment analysis or need to capture subtle variations in sentiment intensity, Stanza may not be the most suitable tool for your use case.

However, if a simple, categorical sentiment classification is enough and you already rely on Stanza for other processing, it can be a good choice. Check first that a sentiment model exists for your language; where one does, coarse-grained labels may suffice for applications such as social media monitoring, customer feedback analysis, and opinion mining.

Note: Installation and first use take longer than with the other libraries because of Stanza's model downloads and dependencies.

Here’s an example of how to perform sentiment analysis using Stanza in Python:

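!pip install stanza
import stanza

# Download the English models (needed the first time; recent versions
# may also do this automatically when the pipeline is created)
stanza.download('en')

# Initialize the Stanza pipeline
nlp = stanza.Pipeline('en', processors='tokenize,sentiment')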

# Sample text
text = "The movie was great. The acting was mediocre, and the plot was unoriginal."

# Run sentiment analysis
doc = nlp(text)

# Print the sentiment of each sentence (0 = negative, 1 = neutral, 2 = positive)
for sentence in doc.sentences:
    print(f"Sentence: {sentence.text}")
    print(f"Sentiment: {sentence.sentiment}")
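
Because Stanza labels each sentence separately, a document-level view needs some aggregation. The sketch below shows one possible approach: count the labels and compute a mean shifted to the range -1 to 1. The names `STANZA_LABELS` and `summarize_sentiments` are hypothetical helpers, and averaging discrete labels is an assumption of this example, not something Stanza does itself.

```python
from collections import Counter
from typing import Dict, Sequence

# Mapping of Stanza's integer output to readable labels
STANZA_LABELS = {0: "negative", 1: "neutral", 2: "positive"}


def summarize_sentiments(scores: Sequence[int]) -> Dict[str, object]:
    """Aggregate per-sentence sentiment values (0, 1, or 2)."""
    if not scores:
        raise ValueError("at least one sentence score is required")
    counts = Counter(STANZA_LABELS[s] for s in scores)
    # Shift the mean from [0, 2] to [-1, 1] so that 0 means neutral overall
    mean_shifted = sum(scores) / len(scores) - 1.0
    return {"counts": dict(counts), "mean": mean_shifted}


# Hand-picked values for illustration (not actual Stanza output)
print(summarize_sentiments([2, 1, 0, 0]))
```

With the pipeline above, the input could be built as `[s.sentiment for s in doc.sentences]`.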

Flair

Flair is an NLP library built on top of PyTorch, offering a wide range of capabilities beyond sentiment analysis. It supports tasks such as named entity recognition (NER) and part-of-speech (POS) tagging, and provides specialized support for biomedical data. One of Flair’s strengths is its configurability, which allows users to train custom text analysis models tailored to their specific needs.

However, compared to some other sentiment analysis tools, Flair’s setup and usage for sentiment analysis is slightly more involved. It does not return a polarity score directly, so an additional step is needed to obtain one.

To perform sentiment analysis with Flair, you first load a pre-trained sentiment model, wrap the text in a Sentence object, and call the model's predict method. The result is a label (POSITIVE or NEGATIVE) and a confidence score attached to the sentence, rather than a single signed polarity value. This makes Flair's sentiment analysis slightly less accessible out of the box than libraries with a more streamlined interface.

Here’s an example of how to use Flair for sentiment analysis:

!pip install flair
from flair.models import TextClassifier
from flair.data import Sentence

# Load the pre-trained sentiment analysis model
sentiment_model = TextClassifier.load('en-sentiment')

# Sample text
text = "The movie was a disappointment. The acting was mediocre, and the plot was unoriginal."

# Create a Sentence object
sentence = Sentence(text)

# Perform sentiment analysis
sentiment_model.predict(sentence)

# Get the predicted label and the model's confidence in that label
sentiment_score = sentence.labels[0].score
sentiment_label = sentence.labels[0].value

# Print the results
print(f"Sentence: {text}")
print(f"Sentiment Score: {sentiment_score}")
print(f"Sentiment Label: {sentiment_label}")
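
Flair reports a label plus a confidence rather than a signed score, so comparing it with VADER or TextBlob takes one more conversion. A minimal sketch of that step follows, assuming the labels are the strings POSITIVE and NEGATIVE. The helper `signed_polarity` is hypothetical and not part of Flair, and treating confidence as intensity is a simplification.

```python
def signed_polarity(label: str, confidence: float) -> float:
    """Convert a (label, confidence) pair into a value in [-1, 1].

    Assumes the label is "POSITIVE" or "NEGATIVE" (case-insensitive).
    """
    normalized = label.upper()
    if normalized == "POSITIVE":
        return confidence
    if normalized == "NEGATIVE":
        return -confidence
    raise ValueError(f"unexpected label: {label!r}")


# Hand-picked values for illustration (not actual Flair output)
print(signed_polarity("NEGATIVE", 0.98))
print(signed_polarity("POSITIVE", 0.75))
```

With the example above, this would be called as `signed_polarity(sentiment_label, sentiment_score)`.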

Written by Yash Raj Poddar