Photo credit : Pixabay

Exploring Sentiment Analysis

Understanding Text Mining - Part 2

Lorna Maria A
Jan 19, 2018 · 4 min read


This article is Part 2 of Understanding Text Mining. If you just landed here, Part 1 is available here.

One application of text mining is sentiment analysis. Before we can carry out sentiment analysis on our mined text, we need to clean and prepare the data set, as we saw in Part 1.

Understanding Sentiment Analysis

Sentiment analysis: the study of extracted information to identify reactions, attitudes, context and emotions. As one of the applications of text mining, sentiment analysis exposes the attitudes in the mined text.

It is based on word polarities: it takes positive and negative words into account, and neutral words are dismissed.

Table showing word polarity examples

Sentiment analysis is done based on lexicons. In simple terms, a lexicon is a vocabulary, say the English lexicon. In this context, a lexicon is a selection of words, each labelled with one of the two polarities, that can be used as a metric in sentiment analysis.

There are many different lexicons that can be used, depending on the context of the data you are working with. It is also possible to create a custom lexicon, depending on how much customisation you would like for your data.
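To make the idea of a lexicon concrete, here is a minimal sketch in base R. The words, their values and the helper `word_polarity()` are all made up for illustration; a real lexicon holds thousands of entries.

```r
# A toy lexicon: each word carries a polarity of +1 (positive) or -1 (negative).
# The words and values are hypothetical and only serve as an illustration.
toy_lexicon <- data.frame(
  word  = c("good", "great", "happy", "bad", "awful", "sad"),
  value = c(1, 1, 1, -1, -1, -1),
  stringsAsFactors = FALSE
)

# Look up the polarity of each word; words missing from the lexicon are neutral (0).
word_polarity <- function(words, lexicon) {
  values <- lexicon$value[match(tolower(words), lexicon$word)]
  values[is.na(values)] <- 0
  setNames(values, words)
}

polarities <- word_polarity(c("great", "table", "awful"), toy_lexicon)
print(polarities)
```

Here "table" is not in the lexicon, so it scores 0 and has no effect on the overall sentiment.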

In this article, we shall make use of the syuzhet package. While there are a number of packages for sentiment analysis on CRAN, the syuzhet package is great to learn with because it brings together some of the most common lexicons, such as nrc, bing and afinn.

We also make use of ggplot2 to visualise the results of the sentiment analysis.
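As a quick sketch of what having several lexicons in one package looks like, the snippet below scores the same text with each lexicon through syuzhet's `get_sentiment()` function. The two sentences are hypothetical, and the lexicons use different scales, so the numbers are best compared by sign rather than by size.

```r
library(syuzhet)

# Hypothetical sentences, used only for illustration.
sentences <- c(
  "I love this wonderful talk.",
  "The ending was terrible and sad."
)

# Score the same sentences with each lexicon available through get_sentiment().
methods <- c("syuzhet", "bing", "afinn", "nrc")
scores <- sapply(methods, function(m) get_sentiment(sentences, method = m))
rownames(scores) <- c("sentence_1", "sentence_2")

# One row per sentence, one column per lexicon.
print(scores)
```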

How does Sentiment analysis work?

In simple terms, sentiment analysis is performed as an intersection of a term-document matrix (built from the mined text) and a lexicon of choice.

The first step is to have a term-document matrix and a lexicon of your choice.
Then form an intersection between the two sets.
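The two steps above can be sketched in base R as follows. This is only an illustration of the idea, not how syuzhet works internally: the documents and the lexicon are made up, and the small matrix stands in for the term-document matrix that the tm package would build from real text.

```r
# Two tiny "documents" standing in for cleaned, mined text (hypothetical data).
docs <- c(
  doc1 = "great talk great ideas",
  doc2 = "bad sound and a sad ending"
)

# A toy lexicon with the two polarities (made up for illustration).
lexicon <- c(great = 1, good = 1, bad = -1, sad = -1)

# Step 1: build a small term-document matrix (rows are terms, columns are documents).
tokens <- strsplit(docs, " ", fixed = TRUE)
terms <- sort(unique(unlist(tokens)))
tdm <- sapply(tokens, function(tok) tabulate(match(tok, terms), nbins = length(terms)))
rownames(tdm) <- terms

# Step 2: the intersection keeps only the terms that also appear in the lexicon.
shared_terms <- intersect(rownames(tdm), names(lexicon))

# Weight each shared term's count by its polarity and sum per document.
doc_scores <- colSums(tdm[shared_terms, , drop = FALSE] * lexicon[shared_terms])
print(doc_scores)  # doc1 scores 2, doc2 scores -2
```

Only "great", "bad" and "sad" survive the intersection; every other term is neutral and drops out of the score.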

Hands-on with Sentiment analysis

Example one: This is a simple example where we extract emotions from a sentence. We load the sentence, split it into words with the strsplit() function to form a character vector, and pass that vector to the get_nrc_sentiment() function from the syuzhet library. This function takes in new_sentence and compares it with the nrc emotion lexicon to return a score for each emotion.

Example two: This second example uses a TED talks data set that was downloaded from Kaggle under the name transcript.csv. It was cleaned with the tm package following the steps in Part 1 and carried forward for sentiment analysis in this Part 2.
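A minimal sketch of these steps could look like the code below. The sentence is a hypothetical stand-in, and the final `colSums()` step is an assumption about how the word-level scores might be summarised rather than part of the description above.

```r
library(syuzhet)

# A hypothetical sentence, used only for illustration.
sentence <- "I am happy to share this wonderful but slightly scary journey"

# strsplit() returns a list, so take its first element to get a character vector of words.
new_sentence <- strsplit(sentence, " ")[[1]]

# One row per word, one column per nrc emotion plus the negative/positive polarities.
word_scores <- get_nrc_sentiment(new_sentence)

# Sum the columns to get the emotion scores for the whole sentence.
sentence_scores <- colSums(word_scores)
print(sentence_scores)
```

The result has ten entries: the eight nrc emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) followed by the negative and positive polarities.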

Plot 1: Shows distinct emotions
Plot 2: Shows the combination of emotions under two polarities.
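For readers who want to reproduce plots of this kind, here is a rough sketch with ggplot2. It is not the code from the repository: the `talks` vector is a hypothetical stand-in for the cleaned TED transcripts, and the plot titles and axis labels are assumptions.

```r
library(syuzhet)
library(ggplot2)

# Hypothetical stand-in for the cleaned TED transcripts from Example two.
talks <- c(
  "We can build a happy and hopeful future together.",
  "The terrible loss left everyone angry and afraid.",
  "What a surprising and wonderful discovery!"
)

# Score every text against the nrc lexicon and total each column.
nrc_scores <- get_nrc_sentiment(talks)
totals <- data.frame(
  sentiment = factor(colnames(nrc_scores), levels = colnames(nrc_scores)),
  count = colSums(nrc_scores)
)

# Split the totals into the eight emotions and the two polarities.
is_polarity <- totals$sentiment %in% c("negative", "positive")
emotions <- totals[!is_polarity, ]
polarities <- totals[is_polarity, ]

# Plot 1: distinct emotions.
plot_emotions <- ggplot(emotions, aes(x = sentiment, y = count, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  labs(title = "Emotions", x = NULL, y = "Word count")

# Plot 2: the two polarities.
plot_polarities <- ggplot(polarities, aes(x = sentiment, y = count, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  labs(title = "Polarities", x = NULL, y = "Word count")

print(plot_emotions)
print(plot_polarities)
```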

The code from this example can be accessed from this repository.


We have applied our sentiment analysis tricks to mined text to come up with a clear description of the emotions attached to text data.

This could be a whole project that helps you gain insights into how and when to talk to your audience, what they feel about a certain topic, product or service, and how you can better interact with them.

Now, go ahead and choose an article, dataset or campaign that you want to try sentiment analysis on, and follow the steps.
Happy coding! I am always here to help <- @lornamariak

This story is published in The Startup, Medium’s largest entrepreneurship publication followed by 288,884+ people.
