Harvesting Data from Twitter and Saving It as a .csv File

Sameer Sitoula
Published in Analytics Vidhya
5 min read, Sep 4, 2020

Introduction

Social media has become an important part of people’s lives, and many people find it hard not to check or follow it. In this work, we first create a Twitter account by signing up at www.twitter.com/signup. After that, we create a Twitter developer account at www.apps.twitter.com and enter all the required details. Once the developer account exists, we create an app on it; the app can be given any name the user wishes. After the app is created, we open the token and secret key dashboard inside the developer account. Here we get the signature keys: the API key, API secret, access token, and access token secret. We use these signature keys to scrape data from Twitter, using hashtags to search for the data [1].

Literature Survey

  1. Gathering information from online social networks is a fundamental step in many data science fields, allowing researchers to work with different and more detailed datasets [2]. Although a large proportion of the scientific community uses the Twitter streaming API for collecting data, a limitation arises when queries exceed the allowed rate intervals and time ranges [2].
  2. The rise and development of the Internet and the World Wide Web have provided a worldwide platform for sharing information and collaborating in trusted relationships. Because they are so easy to access, they have spread through our lives to the point where users can access and share information anywhere, at any time [3].
  3. Social media dominates the world of marketing promotion these days, which makes it an ideal platform for marketing promotion in any field. One of the most significant aspects of social media marketing is the possibility of two-way communication [4].
  4. Today, social networks form an important communication platform. People in geographically distant locations can communicate with each other via various social networks [5].

Analytical Model

The design diagram for the analytical model shows how the connection between the Jupyter notebook and Twitter is made.

The flow diagram shows the model in more detail: it traces the process of saving data once the connection between the Jupyter notebook and Twitter is established.

To make that connection, the tweepy library is first imported in the Jupyter notebook and the essential code is written. The signature keys are also given as input.
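As a rough illustration of this step, the sketch below loads the four signature keys and hands them to tweepy. It is only a minimal sketch, not the exact code used for this article: the environment variable names and the helper names `load_credentials` and `connect` are hypothetical, and it assumes tweepy is installed and that the keys belong to an app with access to the Twitter API.

```python
import os

# Hypothetical environment variable names for the four signature keys.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Read the four signature keys from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def connect(creds):
    """Build an authenticated tweepy API client from the signature keys."""
    import tweepy  # third-party; imported here so the helper above works without it

    auth = tweepy.OAuthHandler(
        creds["TWITTER_API_KEY"], creds["TWITTER_API_SECRET"]
    )
    auth.set_access_token(
        creds["TWITTER_ACCESS_TOKEN"], creds["TWITTER_ACCESS_TOKEN_SECRET"]
    )
    # wait_on_rate_limit makes tweepy pause instead of failing when the
    # rate limit is reached.
    return tweepy.API(auth, wait_on_rate_limit=True)


# Example use inside a notebook cell:
# api = connect(load_credentials())
```

Reading the keys from the environment rather than typing them into a notebook cell keeps them out of any notebook that is later shared.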

Implementation

To implement this work, we carried out the following steps:

i. Create a Twitter account.

ii. Create a developer account for Twitter.

iii. Create an app on the developer account.

iv. Save the signature keys: API key, API secret, access token, and access token secret.

v. Start Jupyter Notebook.

vi. Create a Python workspace in Jupyter Notebook.

vii. Import the required libraries.

viii. Write the required code.

ix. Give the signature keys as input.

x. Provide the hashtags to scrape data for.

The screenshots below illustrate these steps:

Creating a Developer Account

Confirmation from Twitter

Creating an App on the Developer Account

Required Signature Keys

The coding part is shown below:
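A minimal sketch of what such code could look like follows, assuming an authenticated tweepy `API` object named `api` already exists. The column names, the helper names (`tweet_to_row`, `save_rows`, `harvest`), and the default file name `tweets.csv` are illustrative choices, not the original notebook code. The search method is looked up by name because tweepy 4.x calls it `search_tweets` while tweepy 3.x calls it `search`.

```python
import csv

# Illustrative column names, mirroring the fields discussed in this article.
FIELDS = ["created_at", "user", "text", "hashtags", "retweets"]


def tweet_to_row(tweet):
    """Flatten one tweepy Status object into a dict of plain values."""
    tags = [h["text"] for h in tweet.entities.get("hashtags", [])]
    # With tweet_mode="extended" the text is in full_text; otherwise in text.
    text = getattr(tweet, "full_text", None) or getattr(tweet, "text", "")
    return {
        "created_at": str(tweet.created_at),
        "user": tweet.user.screen_name,
        "text": text.replace("\n", " "),
        "hashtags": " ".join(tags),
        "retweets": tweet.retweet_count,
    }


def save_rows(rows, path):
    """Write the flattened tweets to a .csv file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def harvest(api, hashtag, limit=100, path="tweets.csv"):
    """Search for a hashtag and save up to `limit` tweets to `path`."""
    import tweepy  # third-party; only needed for the live search

    # tweepy 4.x: api.search_tweets, tweepy 3.x: api.search
    search = getattr(api, "search_tweets", None) or api.search
    cursor = tweepy.Cursor(search, q=hashtag, lang="en", tweet_mode="extended")
    rows = [tweet_to_row(tweet) for tweet in cursor.items(limit)]
    save_rows(rows, path)
    return len(rows)


# Example use inside a notebook cell (the hashtag is only an example):
# harvest(api, "#datascience", limit=200, path="tweets.csv")
```

Keeping the flattening and the file writing in separate functions means the CSV layout can be checked without calling Twitter at all.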

Applications

This work has various applications. Some of them are listed below:

i. Marketing and finance.

ii. Academic purposes.

iii. Finding the sentiments associated with the tweets.

iv. Research purposes.

Outcomes and Results

The steps above gave very good results. The data was successfully scraped from Twitter, and I was able to build the required fields for it, such as the tweet text, the account that tweeted, the hashtags used, and the number of retweets. This data is saved as a .csv file and can be used for marketing, finance, research, and academic purposes. It can also be used to find out what people are talking about in their tweets and the sentiments associated with those tweets. In a nutshell, the data scraped from Twitter has several good applications.
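As a small example of putting the saved file to use, the sketch below reads a .csv file back and counts the most frequent hashtags. It assumes the file has a `hashtags` column holding space-separated tags; that layout and the function name `summarize` are hypothetical and would need to match however the file was actually written.

```python
import csv
from collections import Counter


def summarize(path, top=5):
    """Return the number of saved tweets and the most common hashtags."""
    counts = Counter()
    total = 0
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            total += 1
            # Hashtags are compared case-insensitively.
            counts.update(tag.lower() for tag in row["hashtags"].split())
    return total, counts.most_common(top)


# Example use inside a notebook cell:
# total, top_tags = summarize("tweets.csv")
```

A summary like this is a quick first check that the harvest captured the topics that were searched for before moving on to sentiment analysis.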

References

[1] Octoparse Jerry, “Twitter Scraping, Text Mining and Sentiment Analysis,” https://hackernoon.com/twitter-scraping-text-mining-and-sentiment-analysis-using-python-b95e792a4d64, p. 10, 24 Apr. 2019.

[2] A. Hernandez Suarez, G. Sanchez Perez, V. Sanchez, “A Web Scraping Methodology for Bypassing Twitter API,” https://arxiv.org/pdf/1803.09875.pdf, Madrid, 2018.

[3] M. B. Pooja Wadhwa, “Social Networks Analysis: Trends, Techniques and Future Prospects,” IEEE, Delhi, 2014.

[4] R. M. Suresh M, “Application of Social Media as a Marketing Promotion Tool-A Review,” Amrita School of Business, Coimbatore, 2017.

[5] K. K. K. P. Madhura Kaple, “Viral Marketing for Smart Cities: Influencers in Social Network Communities,” 2017 IEEE Third International Conference on Big Data Computing Service and Applications, San Jose, CA, 2017.
