NLP: Pre-trained Sentiment Analysis
Let’s evaluate some pretrained sentiment analysis tools provided in various Python NLP libraries.
NLTK
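import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')  # one-time download of the VADER lexicon
sentence = "The food was really good."  # example input; any string works
sid = SentimentIntensityAnalyzer()
sid.polarity_scores(sentence)  # dict with 'neg', 'neu', 'pos' and 'compound' scores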
NLTK’s VADER sentiment analysis tool uses a bag-of-words approach (a lookup table of positive and negative words) with some simple heuristics (e.g. adjusting the intensity of the sentiment when modifiers such as “really”, “so” or “a bit” are present).
The advantage of this approach is that sentences containing negated positive words (e.g. “not happy”, “not good”) still receive a negative sentiment, thanks to a heuristic that flips the sentiment of the word following a negation. Simpler sentiment analysis tools that just average the sentiments of the individual words would miss details like this.
The disadvantage of this approach is that out-of-vocabulary (OOV) words that the tool has not seen before (e.g. typos) will not be classified as positive or negative.
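To make the trade-off concrete, here is a minimal sketch of a lexicon scorer with and without the negation and intensifier heuristics. The word lists and scores are made-up toy values, and the function names are hypothetical; this is not VADER's actual lexicon or algorithm, just an illustration of the idea.

```python
# Toy lexicon and modifier tables: hypothetical values, NOT VADER's real data.
LEXICON = {"happy": 2.0, "good": 1.5, "bad": -1.5, "awful": -2.5}
BOOSTERS = {"really": 1.5, "so": 1.5}   # scale the next sentiment word
NEGATIONS = {"not", "never", "no"}      # flip the next sentiment word


def tokens(text):
    """Lowercase, split on whitespace and strip basic punctuation."""
    return [word.strip(".,!?") for word in text.lower().split()]


def naive_score(text):
    """Average the lexicon scores of known words, ignoring context."""
    scores = [LEXICON[w] for w in tokens(text) if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def heuristic_score(text):
    """Like naive_score, but negations and boosters modify the next word."""
    scores = []
    multiplier = 1.0
    for word in tokens(text):
        if word in NEGATIONS:
            multiplier *= -1.0
        elif word in BOOSTERS:
            multiplier *= BOOSTERS[word]
        elif word in LEXICON:
            scores.append(LEXICON[word] * multiplier)
            multiplier = 1.0
        else:
            multiplier = 1.0  # an unrelated word cancels pending modifiers
    return sum(scores) / len(scores) if scores else 0.0


print(naive_score("I am not happy"))      # 2.0: the negation is ignored
print(heuristic_score("I am not happy"))  # -2.0: the negation flips the score
print(heuristic_score("I am hapy"))       # 0.0: the typo is out of vocabulary
```

The last call shows the OOV weakness: a word missing from the lookup table contributes nothing, no matter how close it is to a known word.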
TextBlob
TextBlob’s sentiment analysis works in a similar way to NLTK’s, using a bag-of-words classifier, but it has the advantage of including subjectivity analysis too (how factual or opinionated a piece of text is).
from textblob import TextBlob
TextBlob(sentence).sentiment
However, it doesn’t include all of the heuristics that NLTK’s VADER has, so it may not intensify or negate a sentence’s sentiment in the same way.
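TextBlob’s sentiment property returns a pair of numbers: a polarity in the range -1.0 to 1.0 and a subjectivity in the range 0.0 to 1.0. As a rough sketch of how a lexicon can yield both at once, the toy example below stores a (polarity, subjectivity) pair per word and averages them. The lexicon entries and the function name toy_sentiment are hypothetical and only illustrate the shape of the result, not TextBlob's internals.

```python
from collections import namedtuple

# Mirrors the shape of TextBlob's result: (polarity, subjectivity).
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

# Hypothetical word -> (polarity, subjectivity) entries, for illustration only.
TOY_LEXICON = {
    "great": (0.75, 0.75),
    "terrible": (-1.0, 1.0),
    "late": (-0.25, 0.5),
}


def toy_sentiment(text):
    """Average the polarity and subjectivity of every known word in the text."""
    words = [word.strip(".,!?") for word in text.lower().split()]
    hits = [TOY_LEXICON[word] for word in words if word in TOY_LEXICON]
    if not hits:
        # No opinion words found: treat the text as neutral and factual.
        return Sentiment(0.0, 0.0)
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return Sentiment(polarity, subjectivity)


print(toy_sentiment("The food was great!"))
print(toy_sentiment("The train departs at noon."))
```

A sentence with no opinion words comes out with zero subjectivity, which is the "factual" end of the scale.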
Flair
Flair’s sentiment classifier is based on a character-level LSTM neural network, which takes sequences of letters and words into account when predicting.
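!pip3 install flair
import flair
# Load the pre-trained English sentiment classifier (downloaded on first use)
flair_sentiment = flair.models.TextClassifier.load('en-sentiment')
s = flair.data.Sentence(sentence)
flair_sentiment.predict(s)  # attaches the predicted label to s in place
total_sentiment = s.labels  # list of labels, each with a value and a confidence score
total_sentiment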
The network has learnt to take negations into account, as well as intensifiers.
But probably one of its biggest advantages is that it can also predict a sentiment for OOV words that it has never seen before (such as typos).
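One intuition for why character-level models cope with typos is that a misspelt word still shares most of its character sequences with the intended word, even though an exact word lookup fails. The sketch below illustrates that intuition with character trigram overlap. It is not how Flair works internally (Flair learns its representations with a neural network), and the helper names and vocabulary are hypothetical.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, with boundary markers."""
    padded = f"<{word.lower()}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nearest_known_word(word, vocabulary):
    """Pick the vocabulary word whose character trigrams overlap most."""
    target = char_ngrams(word)
    return max(vocabulary, key=lambda known: jaccard(target, char_ngrams(known)))


vocabulary = ["happy", "angry", "table"]
typo = "hapy"

print(typo in vocabulary)                    # False: an exact lookup fails
print(nearest_known_word(typo, vocabulary))  # happy
```

A word-level lookup table sees “hapy” as completely unknown, whereas anything operating on characters still has plenty of signal to work with.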
DeepMoji
This last one isn’t technically a sentiment analysis tool, because it predicts emojis for a sentence. However, I’ve included it here because this type of classification demonstrates an awareness of sentiment (and even emotion) from the model.
!git clone https://github.com/huggingface/torchMoji
import os
os.chdir('torchMoji')
!pip3 install -e .
!python3 scripts/download_weights.py
!python3 examples/text_emojize.py --text "{sentence}"
The classifier is also neural, but is slightly more sophisticated (a deep bi-LSTM with an attention mechanism).
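For readers unfamiliar with attention, the rough idea is that the model scores each word’s hidden state, turns the scores into weights with a softmax, and summarises the sentence as the weighted average of the states. The sketch below shows that pooling step in plain Python. The hidden states and the attention vector are made-up numbers and the function names are hypothetical; in the real model these values are learned and the states come from the bi-LSTM.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_pool(hidden_states, attention_vector):
    """Weighted average of hidden states, weighted by a score per state."""
    scores = [
        sum(h * w for h, w in zip(state, attention_vector))
        for state in hidden_states
    ]
    weights = softmax(scores)
    pooled = [
        sum(weight * state[i] for weight, state in zip(weights, hidden_states))
        for i in range(len(attention_vector))
    ]
    return pooled, weights


# One made-up 2-dimensional hidden state per word of "not very good".
hidden_states = [[0.9, 0.1], [0.2, 0.3], [0.1, 0.8]]
attention_vector = [2.0, 0.5]  # assumed here; learned in a real model

pooled, weights = attention_pool(hidden_states, attention_vector)
print(weights)  # the first word ("not") receives the largest weight
print(pooled)
```

With these toy numbers the first word dominates the summary, which is how an attention layer lets a few important words drive the prediction.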