Natural Language Processing and Tweet Sentiment Analysis

Who’s down with NLP?

5 min read · Aug 2, 2018

If you’ve ever asked Siri, Alexa or Google a question (and received a response) then you’ve experienced the magic of Natural Language Processing.

Natural Language Processing, or NLP, is a field of study at the intersection of computer science, artificial intelligence, and linguistics. Through NLP, computers are able to extract meaning from natural human language.

As innate as language may seem to us, teaching a computer to understand language presents many challenges. As it turns out, the way we speak is highly ambiguous. Different phrases can have the same meaning (how’s it going? vs. what’s up?), just as one word can have multiple meanings (mouse 🐭 vs. mouse 🖱). Computers don’t like ambiguity. So how does Natural Language Processing work?

source: http://thinkdifferent.typepad.com/edulog/artificial_intelligence/

Natural Language Processing once relied on human analysis. This involved hand-coding sets of rules to get machines to “learn” language patterns. Today, NLP relies on machine learning algorithms to make statistical inferences from text. With this approach, the more text a computer processes, the more language patterns it can learn and, generally, the more accurate it becomes.

NLP Tasks

Natural Language Processing breaks down into several tasks, each focused on extracting meaning from a particular attribute of language. Some examples of NLP tasks include:

Separating text into sentences, words and morphemes
Tagging parts of speech
Finding the meaning of each word within a given context
Translating text from one human language to another
Converting database information into human-readable language
Answering questions asked in human-readable language
Analyzing words for sentiment
Converting spoken language into written text
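To make the first task in the list concrete, here is a naive sketch of sentence and word splitting in plain Ruby. The method names split_sentences and tokenize are made up for this illustration, and the regular expressions are a deliberate simplification: real NLP tools also handle abbreviations, emoji, morphemes and much more.

```ruby
# Naive sentence and word splitting with regular expressions.
# Illustration only: this is not how a production NLP library tokenizes text.

# Split after '.', '!' or '?' when followed by whitespace.
def split_sentences(text)
  text.strip.split(/(?<=[.!?])\s+/)
end

# Pull out runs of letters, digits and apostrophes, or single punctuation marks.
def tokenize(sentence)
  sentence.scan(/[[:alnum:]']+|[^[:alnum:]\s]/)
end

text = "How's it going? I saw a mouse. It ran away!"
split_sentences(text).each do |sentence|
  puts tokenize(sentence).inspect
end
```

Even this tiny example hints at the ambiguity problem: the tokenizer can tell you that "mouse" is a word, but not which kind of mouse it is.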

NLP in action

Let’s take a look at what some of the NLP tasks above actually look like. In this section, we’re going to make sense of some text through the lens of an NLP program. We’ll be doing this through Google’s Natural Language API in Ruby.

The Natural Language API breaks up text into its constituent words and punctuation (called tokens) and then provides information on each part. You can use the API to perform the following tasks on a chunk of text:

Syntax analysis: identify parts of speech
Entity recognition: label entities by type (person, location, event, etc.)
Sentiment analysis: get the overall sentiment of a block of text
Content classification: classify documents into predefined categories

Upon reviewing the Natural Language API docs, I became most interested in sentiment analysis. Let’s delve deeper into how this feature works.

Sentiment Analysis

The sentiment score is a numerical interpretation of the overall emotional leaning of the text. Score values range from -1.0 (negative sentiment) to 1.0 (positive sentiment).

Here’s the code to analyze the sentiment of a string of text:
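As a rough sketch of what such a request can look like, the snippet below calls the API's REST endpoint directly with Ruby's standard library rather than going through a client gem. It assumes you authenticate with an API key stored in an environment variable named GOOGLE_API_KEY (a name chosen for this example), and the method names are hypothetical helpers, not part of any Google library.

```ruby
require "json"
require "net/http"
require "uri"

# REST endpoint for sentiment analysis in v1 of the Natural Language API.
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"

# Build the JSON request body for a plain-text document.
def sentiment_request_body(text_content)
  JSON.generate({
    document: { type: "PLAIN_TEXT", content: text_content },
    encodingType: "UTF8"
  })
end

# Pull the document-level score and magnitude out of a JSON response body.
def parse_sentiment(response_body)
  sentiment = JSON.parse(response_body).fetch("documentSentiment")
  { score: sentiment["score"], magnitude: sentiment["magnitude"] }
end

# Send the text to the API. Needs network access and a valid API key.
def analyze_sentiment(text_content, api_key)
  uri = URI(ENDPOINT)
  uri.query = URI.encode_www_form(key: api_key)
  response = Net::HTTP.post(uri, sentiment_request_body(text_content),
                            "Content-Type" => "application/json")
  raise "API request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  parse_sentiment(response.body)
end

# Only call the API when a key is available (GOOGLE_API_KEY is an assumed name).
if ENV["GOOGLE_API_KEY"]
  text_content = "I love sunny days!"
  puts analyze_sentiment(text_content, ENV["GOOGLE_API_KEY"]).inspect
end
```

Keeping the request building and response parsing in separate methods means you can swap in any string for text_content and check the parsing logic without touching the network.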

I switched the text_content variable from the code above to the two strings below just to test things out. I bet you can guess which sentence got the higher (and therefore more emotionally positive) score.

Here are the results:

Analyzing a single sentence is cool and all, but what can this sentiment analysis reveal about larger texts?

Getting Sentiment Analysis Scores for Top Twitter Accounts

For the next step, I combined all of a person’s tweets into one file, and then ran the sentiment analysis API on this text. In order to achieve this, I used the Twitter API along with the Twitter Ruby gem.

I researched the Twitter accounts with the most followers, and picked 5 accounts to pull tweet data from (highlighted below).

Top 20 most followed Twitter Accounts, with the 5 accounts I will be analyzing for sentiment highlighted. Source: https://en.wikipedia.org/wiki/List_of_most-followed_Twitter_accounts

Here’s an example code snippet you can use to return a user’s (in this case, Katy Perry’s) last 50 tweets. Just make sure you apply for a Twitter developer account and install the twitter gem first!

gem install twitter
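A minimal sketch of that fetch, assuming the twitter gem's REST client and its user_timeline method. The four environment variable names are placeholders for your own developer credentials, and build_twitter_client and recent_tweet_texts are hypothetical helper names used only here.

```ruby
# Build a REST client with the twitter gem.
# The environment variable names below are placeholders, not required names.
def build_twitter_client
  require "twitter"
  Twitter::REST::Client.new do |config|
    config.consumer_key        = ENV.fetch("TWITTER_CONSUMER_KEY")
    config.consumer_secret     = ENV.fetch("TWITTER_CONSUMER_SECRET")
    config.access_token        = ENV.fetch("TWITTER_ACCESS_TOKEN")
    config.access_token_secret = ENV.fetch("TWITTER_ACCESS_TOKEN_SECRET")
  end
end

# Return the text of a user's most recent tweets.
# `client` is anything that responds to user_timeline, such as a Twitter::REST::Client.
def recent_tweet_texts(client, handle, count: 50)
  client.user_timeline(handle, count: count).map(&:text)
end

# Only hit the network when credentials are present.
if ENV["TWITTER_CONSUMER_KEY"]
  puts recent_tweet_texts(build_twitter_client, "katyperry")
end
```

Passing the client in as an argument keeps the fetching logic easy to try out with a stand-in object before you have real credentials.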

After getting the list of tweet text from each of the 5 Twitter accounts I selected (katyperry, BarackObama, TheEllenShow, cnnbrk and realDonaldTrump), I saved each chunk of text in a separate file. Then, I loaded each tweet file into the text_content variable in the Natural Language Processor file (above).
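The save-and-load step can be sketched in plain Ruby as follows. The file layout (one .txt file per handle, one tweet per line) and the helper names save_tweets and load_text_content are assumptions made for this example, and the demo uses made-up tweets in a temporary directory.

```ruby
require "fileutils"
require "tmpdir"

# Write one account's tweets to "<dir>/<handle>.txt", one tweet per line.
def save_tweets(handle, tweet_texts, dir: "tweets")
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{handle}.txt")
  File.write(path, tweet_texts.join("\n"))
  path
end

# Read a saved file back into a single string, ready to use as text_content.
def load_text_content(path)
  File.read(path)
end

# Demo with placeholder tweets in a temporary directory.
Dir.mktmpdir do |dir|
  path = save_tweets("example_user", ["Hello world!", "Ruby is fun."], dir: dir)
  text_content = load_text_content(path)
  puts text_content
end
```

Repeating this for each handle gives you one text file per account, and each file's contents can be passed to the sentiment analysis as a single block of text.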

And now, for the results…

Here’s the sentiment score for the last 50 tweets of each user I analyzed:

sentiment analysis scores for last 50 tweets per user

Conclusion

While I was surprised by some of the tweet analysis results, it was interesting to see a machine quantify something as subjective and abstract as emotion.

In summary, Natural Language Processing is an ever-growing field that allows computers to make sense of natural human language. A number of interesting Natural Language Processing APIs exist now, so you can easily test some NLP functionality for yourself. Considering the plethora of readily available text information, the possibilities are endless!

Sources:

Cloud Natural Language | Google Cloud (cloud.google.com)

Docs (developer.twitter.com)

Learn to Use the Twitter API with Ruby (www.rubyguides.com)

NLP - Wikipedia (en.wikipedia.org)

List of most-followed Twitter accounts - Wikipedia (en.wikipedia.org)