Simplifying Sentiment Analysis using VADER in Python (on Social Media Text)

An easy-to-use Python library built especially for sentiment analysis of social media text.

Parul Pandey
Sep 23, 2018 · 8 min read

“If you want to understand people, especially your customers…then you have to be able to possess a strong capability to analyze text.” - Paul Hoffman, CTO, Space-Time Insight

The 2016 US Presidential Elections were important for many reasons. Apart from the political aspect, the major use of analytics during the entire canvassing period garnered a lot of attention. During the elections, millions of Twitter data points, belonging to both Clinton and Trump, were analyzed and classified with a sentiment of either positive, neutral, or negative. Some of the interesting outcomes that emerged from the analysis were:

This is the power that sentiment analysis brings to the table, and it was quite evident in the U.S. elections. Well, the Indian elections are around the corner too, and sentiment analysis will have a key role to play there as well.


What is Sentiment Analysis?


Sentiment Analysis, or Opinion Mining, is a sub-field of Natural Language Processing (NLP) that tries to identify and extract opinions within a given text. The aim of sentiment analysis is to gauge the attitudes, sentiments, evaluations, and emotions of a speaker or writer based on the computational treatment of subjectivity in a text.

Why is sentiment analysis so important?

Businesses today are heavily dependent on data. The majority of this data, however, is unstructured text coming from sources like emails, chats, social media, surveys, articles, and documents. The micro-blogging content coming from Twitter and Facebook poses serious challenges, not only because of the amount of data involved, but also because of the kind of language used to express sentiments, i.e., short forms, memes, and emoticons.

Sifting through huge volumes of this text data is difficult as well as time-consuming. Also, it requires a great deal of expertise and resources to analyze all of that. Not an easy task, in short.

Sentiment Analysis is also useful for practitioners and researchers, especially in fields like sociology, marketing, advertising, psychology, economics, and political science, which rely a lot on human-computer interaction data.

Sentiment Analysis enables companies to make sense of this data by automating the entire process! They are thus able to extract vital insights from a vast unstructured dataset without having to go through it manually.

Why is Sentiment Analysis a Hard Task to Perform?

Though it may seem easy on paper, Sentiment Analysis is actually a tricky subject. There are various reasons for that:

“The intent behind the movie was great, but it could have been better”.

The above sentence consists of two polarities, i.e., Positive as well as Negative. So how do we conclude whether the review was Positive or Negative?

“The best I can say about the movie is that it was interesting.”

Here, the word ‘interesting’ does not necessarily convey positive sentiment and can be confusing for algorithms.

These are a few of the problems encountered not only with sentiment analysis but with NLP as a whole. In fact, these are some of the open problems of the Natural Language Processing field.

VADER Sentiment Analysis

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. VADER uses a combination of a sentiment lexicon and a set of rules. A sentiment lexicon is a list of lexical features (e.g., words) which are generally labelled according to their semantic orientation as either positive or negative.

VADER has been found to be quite successful when dealing with social media texts, NY Times editorials, movie reviews, and product reviews. This is because VADER reports not only whether a sentiment is positive or negative but also how positive or negative it is.
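To make the idea of a lexicon with intensities concrete, here is a minimal sketch of a lexicon lookup. The mini-lexicon, its valence values, and the function name `raw_valence` are all hypothetical and chosen only for illustration; this is not VADER's actual lexicon or scoring code, which also applies rules on top of the lookup.

```python
# Hypothetical mini-lexicon: token -> valence, where the sign gives the
# polarity and the magnitude gives the intensity. Values are illustrative.
TOY_LEXICON = {
    "great": 3.1,
    "good": 1.9,
    "bad": -2.5,
    "terrible": -3.0,
}


def raw_valence(sentence, lexicon=TOY_LEXICON):
    """Sum the valence of every known token; unknown tokens count as 0."""
    tokens = sentence.lower().split()
    return sum(lexicon.get(token.strip(".,!?"), 0.0) for token in tokens)


for sentence in ["The phone is good.", "The phone is great.", "The phone is bad."]:
    print("{:<22} {:+.1f}".format(sentence, raw_valence(sentence)))
```

Because each word carries a magnitude as well as a sign, "great" ends up more positive than "good" rather than both being simply "positive".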

It is fully open-sourced under the MIT License. The developers of VADER used Amazon’s Mechanical Turk to get most of their ratings. You can find complete details on their GitHub page.

[Figure: methods and process approach overview]

Advantages of using VADER

VADER has a lot of advantages over traditional methods of Sentiment Analysis, including:

The source of this article is a very easy-to-read paper published by the creators of the VADER library. You can read the paper here.

Enough talk. Let us now see how VADER analysis works in practice, for which we will have to install the library first.

Installation

The simplest way is to install from PyPI using pip on the command line. Check their GitHub repository for a detailed explanation.

> pip install vaderSentiment

Once VADER is installed, let us create a SentimentIntensityAnalyzer object:

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyser = SentimentIntensityAnalyzer()

Working & Scoring

Let us test our first sentiment using VADER now. We will use the polarity_scores() method to obtain the polarity indices for the given sentence.

def sentiment_analyzer_scores(sentence):
    # Score the sentence, then print it left-aligned and dash-padded to 40 characters
    score = analyser.polarity_scores(sentence)
    print("{:-<40} {}".format(sentence, str(score)))

Let us check how VADER performs on a given review:

sentiment_analyzer_scores("The phone is super cool.")
The phone is super cool.---------------- {'neg': 0.0, 'neu': 0.326, 'pos': 0.674, 'compound': 0.7351}

Putting it in tabular form:

Sentiment metric    Value
Positive            0.674
Neutral             0.326
Negative            0.0
Compound            0.7351

[Figure: compound score metric]

Read here for more details on the VADER scoring methodology.
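In practice the compound value is often turned into a single label. The sketch below uses cut-offs of +0.05 and -0.05, which are the thresholds commonly suggested in the VADER documentation; treat them as a convention you can tune rather than a fixed rule. The helper name `label_from_compound` is hypothetical and not part of the library.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Score dictionary copied from the "The phone is super cool." example
scores = {'neg': 0.0, 'neu': 0.326, 'pos': 0.674, 'compound': 0.7351}
print(label_from_compound(scores['compound']))
```

The neg, neu, and pos values, by contrast, are proportions of the text falling into each category, so they are more useful when you want a breakdown rather than a single verdict.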

VADER analyses sentiments primarily based on certain key points:

Notice how the overall compound score increases as the number of exclamation marks increases.
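The effect of exclamation marks can be sketched without the library. The snippet below is a simplified imitation of how the open-source reference code appears to work: each exclamation mark adds a fixed boost to the raw valence (capped after a few marks), and the result is squashed into the -1 to +1 range. The constants (0.292 per mark, a cap of four marks, alpha = 15) are quoted from my reading of that code and should be treated as assumptions, and the raw valence of 1.9 is an illustrative value rather than a real lexicon lookup.

```python
import math

ALPHA = 15                 # normalization constant (assumed)
EXCLAMATION_BOOST = 0.292  # boost per "!" (assumed)
MAX_EXCLAMATIONS = 4       # marks beyond this add nothing (assumed)


def normalize(raw_score, alpha=ALPHA):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return raw_score / math.sqrt(raw_score * raw_score + alpha)


def compound_with_emphasis(raw_valence, text):
    """Push the raw valence further from zero for each "!", then normalize."""
    boost = min(text.count("!"), MAX_EXCLAMATIONS) * EXCLAMATION_BOOST
    if raw_valence > 0:
        raw_valence += boost
    elif raw_valence < 0:
        raw_valence -= boost
    return round(normalize(raw_valence), 4)


RAW_VALENCE = 1.9  # illustrative raw valence for a mildly positive sentence
for marks in range(6):
    text = "The food here is good" + "!" * marks
    print("{:<28} {}".format(text, compound_with_emphasis(RAW_VALENCE, text)))
```

Because the boost is added before normalization, each extra mark has a slightly smaller effect than the previous one, and under these assumptions the score stops changing once the cap is reached.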

Handling Emojis, Slang, and Emoticons

VADER performs very well with emojis, slang, and acronyms in sentences. Let us see each with an example.

Emojis:

sentiment_analyzer_scores('I am 😄 today')
sentiment_analyzer_scores('😊')
sentiment_analyzer_scores('😥')
sentiment_analyzer_scores('☹️')
sentiment_analyzer_scores('💘')

# Output
I am 😄 today---------------------------- {'neg': 0.0, 'neu': 0.476, 'pos': 0.524, 'compound': 0.6705}
😊--------------------------------------- {'neg': 0.0, 'neu': 0.333, 'pos': 0.667, 'compound': 0.7184}
😥--------------------------------------- {'neg': 0.275, 'neu': 0.268, 'pos': 0.456, 'compound': 0.3291}
☹️-------------------------------------- {'neg': 0.706, 'neu': 0.294, 'pos': 0.0, 'compound': -0.34}
💘--------------------------------------- {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}

Slang:

sentiment_analyzer_scores("Today SUX!")
sentiment_analyzer_scores("Today only kinda sux! But I'll get by, lol")

# Output
Today SUX!------------------------------ {'neg': 0.779, 'neu': 0.221, 'pos': 0.0, 'compound': -0.5461}
Today only kinda sux! But I'll get by, lol {'neg': 0.127, 'neu': 0.556, 'pos': 0.317, 'compound': 0.5249}

Emoticons:

sentiment_analyzer_scores("Make sure you :) or :D today!")

# Output
Make sure you :) or :D today!----------- {'neg': 0.0, 'neu': 0.294, 'pos': 0.706, 'compound': 0.8633}

We saw how VADER can easily detect sentiment from emojis and slang, which form an important component of the social media environment.
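One way to picture the emoji handling: as far as I understand the reference implementation, each emoji is replaced with a short textual description before the usual word-level scoring runs. The sketch below imitates that step using Python's built-in Unicode names as a stand-in for VADER's own emoji description file, so the exact wording will differ from what the library uses. The function name `emoji_to_words` is hypothetical.

```python
import unicodedata

VARIATION_SELECTOR = "\ufe0f"  # invisible modifier that often follows emoji such as ☹️


def emoji_to_words(text):
    """Replace each symbol character with its lower-cased Unicode name."""
    pieces = []
    for ch in text:
        if ch == VARIATION_SELECTOR:
            continue  # drop the modifier, it carries no meaning of its own
        if unicodedata.category(ch) == "So":  # "Symbol, other" covers most emoji
            pieces.append(" " + unicodedata.name(ch, "").lower() + " ")
        else:
            pieces.append(ch)
    # Collapse the extra spaces introduced around the descriptions
    return " ".join("".join(pieces).split())


for text in ["I am 😄 today", "😊", "😥", "☹️"]:
    print(repr(emoji_to_words(text)))
```

Seen this way, a mixed result for 😥 is less surprising: its Unicode name is "disappointed but relieved face", which contains both a negative and a positive word.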

Conclusion

The results of VADER analysis are not only remarkable but also very encouraging. They highlight the benefits that VADER can bring to micro-blogging sites, where the text is a complex mix of words, slang, emoticons, and emojis.


Additional Resources

Here are some additional resources worth mentioning for in-depth Sentiment Analysis

Parul Pandey

Written by

Data Science+Community+Evangelism @H2O.ai

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
