Natural Language Processing and Tweet Sentiment Analysis

Who’s down with NLP?

Cassandra Corrales
Aug 2, 2018

If you’ve ever asked Siri, Alexa or Google a question (and received a response) then you’ve experienced the magic of Natural Language Processing.

Natural Language Processing, or NLP, is a field of study at the intersection of computer science, artificial intelligence, and linguistics. Through NLP, computers are able to extract meaning from natural human language.

As innate as language may seem to us, teaching a computer to understand language presents many challenges. As it turns out, the way we speak is highly ambiguous. Different phrases can have the same meaning (how’s it going? vs. what’s up?), just as one word can have multiple meanings (mouse 🐭 vs. mouse 🖱). Computers don’t like ambiguity. So how does Natural Language Processing work?

Words can have multiple meanings. Source: http://thinkdifferent.typepad.com/edulog/artificial_intelligence/

Natural Language Processing once relied on human analysis. This involved hand-coding sets of rules to get machines to “learn” language patterns. Today, NLP relies on machine learning algorithms to make statistical inferences from text. Under this approach, the more text a computer processes, the more language patterns it can learn and, in general, the more accurate it becomes.

NLP Tasks

There are several Natural Language Processing tasks that focus on dissecting and extracting meaning from a particular language attribute. Some examples of NLP tasks include:

  • Separating text into sentences, words and morphemes
  • Tagging parts of speech
  • Finding the meaning of each word within a given context
  • Translating text from one human language to another
  • Converting database information into human-readable language
  • Answering questions asked in human-readable language
  • Analyzing words for sentiment
  • Converting spoken language into written text
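
To make the first task in the list concrete, here is a minimal sketch of splitting text into sentences and words with plain Ruby regular expressions. The method names `split_sentences` and `split_words` are hypothetical, and the rules are deliberately naive: a real NLP library handles abbreviations, emoji and other edge cases far better.

```ruby
# Naive sentence splitting: break after ., ! or ? when whitespace follows.
def split_sentences(text)
  text.split(/(?<=[.!?])\s+/)
end

# Naive word splitting: keep runs of letters, digits and apostrophes,
# and drop all other punctuation.
def split_words(sentence)
  sentence.scan(/[[:alnum:]']+/)
end

text = "How's it going? I saw a mouse. It was tiny!"
split_sentences(text).each do |sentence|
  puts "#{sentence} -> #{split_words(sentence).inspect}"
end
```

Even this toy version hints at the ambiguity problem: the word "mouse" comes out as a token, but nothing here says whether it is the animal or the device.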

NLP in action

Let’s take a look at what some of the NLP tasks above actually look like. In this section, we’re going to make sense of some text through the lens of an NLP program. We’ll be doing this through Google’s Natural Language API in Ruby.

The Natural Language API breaks up text into its constituent words and punctuation (called tokens) and then provides information on each part. You can use the API to perform the following tasks on a chunk of text:

  • Syntax analysis: identify parts of speech
  • Entity recognition: label entities by type (person, location, event, etc.)
  • Sentiment analysis: get the overall sentiment of a block of text
  • Content classification: classify documents into predefined categories

Upon reviewing the Natural Language API docs, I became most interested in sentiment analysis. Let’s delve deeper into how this feature works.

Sentiment Analysis

The sentiment score is a numerical interpretation of the overall emotional leaning of the text. Score values range from -1.0 (negative sentiment) to 1.0 (positive sentiment).
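
One way to read a score is to bucket it into a coarse label. The sketch below does that in Ruby; the `sentiment_label` method and the 0.25 cutoff are illustrative assumptions of mine, not values defined by the API, so adjust them to taste.

```ruby
# Width of the "neutral" band around zero. This cutoff is an assumption
# chosen for illustration, not something the API prescribes.
NEUTRAL_BAND = 0.25

# Map a sentiment score in [-1.0, 1.0] to :positive, :neutral or :negative.
def sentiment_label(score)
  raise ArgumentError, "score must be between -1.0 and 1.0" unless score.between?(-1.0, 1.0)

  if score >= NEUTRAL_BAND
    :positive
  elsif score <= -NEUTRAL_BAND
    :negative
  else
    :neutral
  end
end

[0.8, 0.1, -0.6].each do |score|
  puts "#{score} -> #{sentiment_label(score)}"
end
```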

Here’s the code to analyze the sentiment of a string of text:
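
A minimal sketch of that kind of call, assuming the google-cloud-language gem (version 1.0 or later) and a service account key referenced by the GOOGLE_APPLICATION_CREDENTIALS environment variable. The helper names `sentiment_score` and `build_language_client` and the sample sentence are placeholders of mine, so treat this as an approximation rather than the exact code used for this post.

```ruby
# Returns the document-level sentiment score for a piece of text.
# `client` is any object that responds to analyze_sentiment the way the
# google-cloud-language LanguageService client does.
def sentiment_score(client, text_content)
  response = client.analyze_sentiment(
    document: { content: text_content, type: :PLAIN_TEXT }
  )
  response.document_sentiment.score
end

# Builds a real API client. Needs `gem install google-cloud-language`
# and credentials in GOOGLE_APPLICATION_CREDENTIALS.
def build_language_client
  require "google/cloud/language"
  Google::Cloud::Language.language_service
end

# Only call the API when credentials are configured.
if ENV["GOOGLE_APPLICATION_CREDENTIALS"]
  text_content = "I love learning about natural language processing!" # placeholder text
  puts sentiment_score(build_language_client, text_content)
end
```

Passing the client in as an argument keeps the scoring method easy to try out with a stand-in object before spending any API quota.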

I switched the text_content variable from the code above to the two strings below just to test things out. I bet you can guess which sentence got the higher (and therefore more emotionally positive) score.

Here are the results:

Analyzing a single sentence is cool and all, but what can this sentiment analysis reveal about larger texts?

Getting Sentiment Analysis Scores for Top Twitter Accounts

For the next step, I combined all of a person’s tweets into one file, and then ran the sentiment analysis API on this text. In order to achieve this, I used the Twitter API along with the Twitter Ruby gem.

I researched the Twitter accounts with the most followers and picked five to pull tweet data from (highlighted below).

Top 20 most followed Twitter Accounts, with the 5 accounts I will be analyzing for sentiment highlighted. Source: https://en.wikipedia.org/wiki/List_of_most-followed_Twitter_accounts

Here’s an example code snippet you can use to return a user’s (in this case, Katy Perry’s) last 50 tweets. Just make sure you apply for a Twitter developer account and install the twitter gem first!

gem install twitter
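
With the gem installed, fetching a timeline might look like the sketch below. It assumes the twitter gem's `Twitter::REST::Client` and its `user_timeline` method; the environment variable names and the helper names `recent_tweet_texts` and `build_twitter_client` are my own placeholders, so swap in however you store your API keys.

```ruby
# Returns the text of a user's most recent tweets.
# `client` is any object that responds to user_timeline the way
# Twitter::REST::Client does.
def recent_tweet_texts(client, screen_name, count: 50)
  client.user_timeline(screen_name, count: count).map(&:text)
end

# Builds a real client from environment variables.
# The variable names below are assumptions; use your own.
def build_twitter_client
  require "twitter"
  Twitter::REST::Client.new do |config|
    config.consumer_key        = ENV.fetch("TWITTER_CONSUMER_KEY")
    config.consumer_secret     = ENV.fetch("TWITTER_CONSUMER_SECRET")
    config.access_token        = ENV.fetch("TWITTER_ACCESS_TOKEN")
    config.access_token_secret = ENV.fetch("TWITTER_ACCESS_TOKEN_SECRET")
  end
end

# Only hit the API when keys are configured.
if ENV["TWITTER_CONSUMER_KEY"]
  puts recent_tweet_texts(build_twitter_client, "katyperry")
end
```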

After getting the list of tweet text from each of the five Twitter accounts I selected (katyperry, BarackObama, TheEllenShow, cnnbrk and realDonaldTrump), I saved each chunk of text in a separate file. Then, I loaded each tweet file into the text_content variable in the Natural Language Processor file (above).
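
The save-and-load step needs nothing beyond Ruby's standard library. Here is a rough sketch; the helper names `save_tweets` and `load_text_content`, the file name and the sample tweets are all placeholders, and a temporary directory stands in for wherever you keep your files.

```ruby
require "tmpdir"

# Joins tweets into one block of text (one tweet per line) and writes it to a file.
def save_tweets(path, tweet_texts)
  File.write(path, tweet_texts.join("\n"))
end

# Reads the saved tweets back as a single string, ready for sentiment analysis.
def load_text_content(path)
  File.read(path)
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "katyperry.txt")
  save_tweets(path, ["first placeholder tweet", "second placeholder tweet"])
  text_content = load_text_content(path)
  puts text_content
end
```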

And now, for the results…

Here’s the sentiment score for the last 50 tweets of each user I analyzed:

sentiment analysis scores for last 50 tweets per user

Conclusion

While I was surprised by some of the tweet analysis results, it was interesting to see a machine quantify something as subjective and abstract as emotion.

In summary, Natural Language Processing is an ever-growing field that allows computers to make sense of natural human language. A number of interesting Natural Language Processing APIs exist now, so you can easily test some NLP functionality for yourself. Considering the plethora of readily available text information, the possibilities are endless!
