Analysis of Twitter Data Using R - Part 1 : Twitter Authentication

Twitter is a popular social platform for sharing emotions and activities, and for finding a massive amount of information from around the web. It is also a rich, open source of data for text and social web analysis. Among the different software tools that can be used to analyze Twitter, R offers a wide variety of options.

R is a programming language and software environment for statistical computing and graphics. It is open source and available across platforms such as Windows, Mac, and Linux. It is now used in a variety of applications, including visualization and data mining. For detailed information about the R language, see the R Project website.

In this first part of the series, we will see how to get authentication from Twitter in order to extract data from it. In part 2 we will build a word cloud from the Twitter feed, and in part 3 we will perform sentiment analysis on the Twitter data.

Step 1. Twitter Authentication for extracting tweets

Creating a Twitter Application

The first step in Twitter analysis is to create a Twitter application. This application lets you connect your R console to Twitter through the Twitter API. The steps for creating your Twitter application are:

1. Open the Twitter developer site and log in with your Twitter account.

2. Your profile picture now appears in the upper right corner with a drop-down menu. Open the menu and navigate to “My Applications”.

3. Click on “Create a new application”.

4. Give your application a name and a short description, and provide your website’s URL or your blog address. Leave the Callback URL blank for now. Complete the remaining formalities and create your Twitter application.

5. Scroll down and click on the “Create my access token” button.

6. Once all the steps are done, the created application will show as below.

I have removed the Access Token and Consumer keys provided by Twitter from the above image to keep my app secure.

Please note the Consumer Key, Consumer Secret, Access Token and Access Token Secret values, as they will be used in R later.

Once the Twitter application is ready, we can move on to programming in R to extract data from Twitter.

Step 2. Install and Load R Packages

R comes with a standard set of packages. For extracting tweets and handling authentication we will need the following additional packages:

twitteR : Provides an interface to the Twitter web API.

ROAuth : Provides an interface to the OAuth 1.0 specification, allowing users to authenticate via OAuth to the server of their choice.

stringr : Fast and friendly string manipulation.

plyr : A set of clean and consistent tools that implement the split-apply-combine pattern in R.

Let’s start by installing and loading all the required packages. Open your R console and run the following.

# Install the required packages (only needed once)
install.packages(c("RColorBrewer", "tm", "wordcloud", "base64enc",
                   "ROAuth", "plyr", "stringr", "twitteR"))

# Load the packages into the current session
library(RColorBrewer)
library(wordcloud)
library(tm)
library(twitteR)
library(ROAuth)
library(plyr)
library(stringr)
library(base64enc)
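Calling install.packages on every run downloads packages again even when they are already present. Here is a minimal sketch of one way to avoid that, using only base R. The helper name missing_packages is hypothetical and not part of any package, and the package list simply mirrors the one used in this step.

```r
# Hypothetical helper: return the entries of `required` that are not installed.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

required <- c("RColorBrewer", "tm", "wordcloud", "base64enc",
              "ROAuth", "plyr", "stringr", "twitteR")
to_install <- missing_packages(required)

# Install only what is absent. The interactive() guard is an assumption of
# this sketch, so that scripted runs never trigger a download.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

After this, the library() calls can be run as usual.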

Step 3. Getting a curl Certificate

Download the curl CA certificate bundle and save it in the folder of your choice.

download.file(url="http://curl.haxx.se/ca/cacert.pem",destfile="cacert.pem")
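If the script is run repeatedly, the certificate does not need to be fetched every time. The sketch below downloads a file only when it is not already on disk. The function name fetch_once and its downloader argument are assumptions made for illustration; by default it delegates to download.file.

```r
# Hypothetical helper: download `url` to `destfile` only if the file is missing.
# Returns TRUE (invisibly) when a download happened, FALSE otherwise.
fetch_once <- function(url, destfile, downloader = download.file) {
  if (file.exists(destfile)) {
    return(invisible(FALSE))
  }
  downloader(url = url, destfile = destfile)
  invisible(TRUE)
}

# Usage with the certificate from this step (left commented so that the
# sketch itself makes no network request):
# fetch_once("http://curl.haxx.se/ca/cacert.pem", "cacert.pem")
```

The downloader argument exists only so the behaviour can be checked without network access.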

Step 4. Setting up the OAuth URLs for Twitter

# Set constant requestURL
requestURL <- "https://api.twitter.com/oauth/request_token"
# Set constant accessURL
accessURL <- "https://api.twitter.com/oauth/access_token"
# Set constant authURL
authURL <- "https://api.twitter.com/oauth/authorize"

Step 5. Authorization for the Twitter account

First assign the four credentials from your Twitter developer application to variables.

· In the consumerKey field, paste the Consumer Key you got for your Twitter developer application.

consumerKey <- "xxxx"

· In the consumerSecret field, paste the Consumer Secret you got for your Twitter developer application.

consumerSecret <- "xxxx"

· In the accessToken field, paste the Access Token you got for your Twitter developer application.

accessToken <- "xxxx"

· In the accessTokenSecret field, paste the Access Token Secret you got for your Twitter developer application.

accessTokenSecret <- "xxxx"

Then pass them to setup_twitter_oauth to authorize the R session.

setup_twitter_oauth(consumerKey,
                    consumerSecret,
                    accessToken,
                    accessTokenSecret)
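Pasting the keys straight into a script makes it easy to share them by accident. One possible alternative, sketched here with base R only, is to read them from environment variables. The function read_twitter_keys and the four TWITTER_* variable names are hypothetical choices for this example, not something Twitter or the twitteR package prescribes.

```r
# Hypothetical helper: read the four credentials from environment variables
# and stop with a clear message if any of them is unset or empty.
read_twitter_keys <- function(vars = c(
                                consumerKey       = "TWITTER_CONSUMER_KEY",
                                consumerSecret    = "TWITTER_CONSUMER_SECRET",
                                accessToken       = "TWITTER_ACCESS_TOKEN",
                                accessTokenSecret = "TWITTER_ACCESS_TOKEN_SECRET")) {
  values <- Sys.getenv(unname(vars), unset = NA)
  names(values) <- names(vars)
  absent <- is.na(values) | values == ""
  if (any(absent)) {
    stop("Missing environment variables: ",
         paste(vars[absent], collapse = ", "))
  }
  as.list(values)
}

# Usage (commented out because it needs real credentials and twitteR loaded):
# keys <- read_twitter_keys()
# setup_twitter_oauth(keys$consumerKey, keys$consumerSecret,
#                     keys$accessToken, keys$accessTokenSecret)
```

The variables could be set in the shell or in a user-level .Renviron file, so that the script itself never contains the secrets.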

Step 6. Extract Tweets using R

Now we are ready to extract tweets from Twitter. We set two variables: one for the search string, which could be a hashtag or user mention, and one for the number of tweets we want to extract for analysis.

tweets <- searchTwitter(searchString, n = numberOfTweets, lang = NULL)

Where,

searchString : Search query to issue to Twitter. Use “+” to separate query terms.

n : The maximum number of tweets to return.

lang : If not NULL, restricts tweets to the given language, given by an ISO 639-1 code.

For more information about searchTwitter, just type ?searchTwitter in the R console.
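Since query terms are separated by “+”, a search string with several terms can be assembled from a character vector. The following is a small sketch in base R. The helper build_query and the second term "design" are made up for illustration.

```r
# Hypothetical helper: join search terms with "+" after trimming whitespace
# and dropping empty entries.
build_query <- function(terms) {
  terms <- trimws(terms)
  paste(terms[nzchar(terms)], collapse = "+")
}

searchString <- build_query(c("#instagramlogo", "design"))
```

The resulting searchString can then be passed as the first argument of searchTwitter.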

Here I will extract tweets about Instagram.

What is Instagram?

Instagram is an online mobile photo-sharing, video-sharing, and social networking service that enables its users to take pictures and videos, and share them either publicly or privately on the app, as well as through a variety of other social networking platforms, such as Facebook, Twitter, Tumblr, and Flickr.

Why did I decide to extract Instagram tweets?

On May 11, 2016, Instagram changed its logo, leading to a significant amount of discussion on social media. It updated both its icon and its app design. Inspired by the previous app icon, the new one represents a simpler camera, and the rainbow lives on in gradient form.

Opinions about the change ran the gamut, and many people on social media freaked out about Instagram’s new logo.

So I decided to extract people’s reactions, prepare a word cloud of the words, and perform sentiment analysis on that data.

The hashtag trending on Twitter was #instagramlogo.

insta <- searchTwitter("#instagramlogo", n = 3000, lang = "en")

Check the number of tweets returned

length(insta)

[1] 3000

Reading out the first 3 tweets with head(insta, 3)

[[1]]
[1] "insomniacbeast: RT @Iamshonali: Whoever approved the new #instagramlogo,your sense of design is almost hurtful.It's like turning a gourmet restaurant to a…"

[[2]]
[1] "DaaruDesi: No.1 tip to app developers on how to keep their social media app popular?\n\n- Don't sell out to Facebook\n\n#instagramlogo"

[[3]]
[1] "Poeseeeeeee: RT @lovecolorbar: Catch the trend of Colors now.\n#LoveColorBar #InstagramUpdate #InstagramLogo https://t.co/SMThck35D3"
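As a first look at the data, we can count which hashtags appear in the tweet text. The sketch below uses base R on copies of the three tweets printed above (the first one cut at the point where the printed output is truncated). The function name count_hashtags is hypothetical. For the live results you would first need the text of each tweet as a character vector, for example from the text column of twListToDF(insta); that step is assumed here and not shown.

```r
# Hypothetical helper: extract hashtags, ignore case, and count them.
count_hashtags <- function(texts) {
  tags <- unlist(regmatches(texts, gregexpr("#[A-Za-z0-9_]+", texts)))
  sort(table(tolower(tags)), decreasing = TRUE)
}

# Copies of the three tweets shown above (the first is truncated in the output).
sample_tweets <- c(
  "insomniacbeast: RT @Iamshonali: Whoever approved the new #instagramlogo,your sense of design is almost hurtful.",
  "DaaruDesi: No.1 tip to app developers on how to keep their social media app popular?\n\n- Don't sell out to Facebook\n\n#instagramlogo",
  "Poeseeeeeee: RT @lovecolorbar: Catch the trend of Colors now.\n#LoveColorBar #InstagramUpdate #InstagramLogo https://t.co/SMThck35D3"
)

counts <- count_hashtags(sample_tweets)
```

Lower-casing the tags means #InstagramLogo and #instagramlogo are counted as the same hashtag.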

Extract tweets from any Twitter profile (let’s extract tweets from Barack Obama’s Twitter handle)

tweet <- userTimeline("@BarackObama", n = 100)

To get tweets from your home timeline

homeTimeline(n = 15)

To get the tweets in which you were mentioned

mentions(n = 15)

Summary

In this article we learned how to authenticate with Twitter in order to extract tweets. Analyzing tweets can be useful from a business perspective: companies can gather reviews of new or existing products from their customer base.

This analysis becomes much more powerful when coupled with visualizations. In the next article we will learn how to create a word cloud from the tweets.

Happy Learning :)