Accessing the Twitter API with Tweepy

Svideloc
Published in Analytics Vidhya
4 min read · Feb 14, 2020

Twitter collects 12 terabytes of data per day, or roughly 4.3 petabytes per year. That includes 456,000 tweets per minute. Not all of this data is available to the public, but a surprising amount is. There are a few different ways to access it; the one I found simplest is the Twitter API, accessed through the Python package Tweepy.

Through this package, user data or specific tweet data can be pulled into Python for analysis. This post walks you through the steps needed to access the Twitter API.

1. Apply for Access to the Twitter API & Get Started

Prior to using the Twitter API, you will need to be granted access. Follow these steps.

  1. Create a Twitter account (if you don’t already have one).
  2. Head to the developer Twitter site and apply for a developer account. It took Twitter about a day and a half to approve my application.
  3. Once approved, you will have to set up your own Dev Environment, and then you will be able to get started with accessing the API.

2. Get Tokens & Secret Keys/Access them in Python

In order to access the API, you will need 4 different tokens/secret keys which you can find in the “Keys and tokens” tab in your developer environment.

Let’s take a look at this in code by importing the necessary libraries. I save my tokens in JSON files on my computer in order to keep them hidden.

import json

# Define a function to open the json
def get_keys(path):
    with open(path) as f:
        return json.load(f)

# Using the function to open and load all keys in that file
access_token_get = get_keys("/home/twitter_access_token.json")
access_secret_get = get_keys("/home/twitter_secret_access_token.json")
consumer_key_get = get_keys("/home/twitter_api_consumer_key.json")
consumer_secret_get = get_keys("/home/twitter_api_secret_consumer_key.json")

# Setting tokens/keys as variables
ACCESS_TOKEN = list(access_token_get.values())[0]
ACCESS_SECRET = list(access_secret_get.values())[0]
CONSUMER_KEY = list(consumer_key_get.values())[0]
CONSUMER_SECRET = list(consumer_secret_get.values())[0]

Now each of the tokens/keys is stored in a variable. We are ready to start using Tweepy to get some data.
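If you would rather keep all four values in a single file, the loading step could look something like the sketch below. This is not the layout used above: the single-file format, the key names (`consumer_key` and so on) and the helper `load_credentials` are all assumptions made for illustration.

```python
import json

# Hypothetical key names; the post stores each value in its own JSON file instead.
REQUIRED_KEYS = ("consumer_key", "consumer_secret", "access_token", "access_secret")


def load_credentials(path):
    """Load all four credentials from one JSON file and check none are missing."""
    with open(path) as f:
        creds = json.load(f)
    # Collect any key that is absent or empty so the error names all of them at once
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise ValueError("Missing credentials: " + ", ".join(missing))
    return {key: creds[key] for key in REQUIRED_KEYS}
```

Calling `load_credentials` on such a file returns a dict, so `creds["consumer_key"]` would take the place of `CONSUMER_KEY`. Failing early on a missing value tends to be easier to debug than an authentication error from the API later on.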

3. Use Tweepy to Grab Data

Tweepy is a free Python library that provides access to the methods of Twitter's RESTful API. I would strongly recommend reading the documentation, as different methods return different types of data, and depending on your needs one may suit you better than another. For today’s purposes, I am going to demonstrate getting a specific user’s data using the “get_user” method. Specifically, let’s look at Donald Trump's Twitter account for fun.

#Tweepy Package
import tweepy
#Connecting to the API
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)
#Accessing the user's account information
user = api.get_user(screen_name = 'realDonaldTrump')
print(user)

And in just a few lines of code, you have quite a bit of user data. The printed object holds the many fields that the ‘get_user’ method returns for the account. Now let’s parse through a few parts of the data. Tweepy makes it really easy to grab specific fields, which are exposed as attributes on the user object. Let’s look at a few:

print('Total # of Statuses by this User:', user.statuses_count)
print("User's Bio:", user.description)
print("Date Account Created:", user.created_at)
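If you want several of these fields at once, one option is to collect them into a plain dictionary. The sketch below is only an illustration: `summarize_user` is a hypothetical helper, not part of Tweepy, and the `SimpleNamespace` object is a stand-in with made-up values so the example runs without API access. With a real connection you would pass in the `user` object from above.

```python
from types import SimpleNamespace  # only needed for the offline stand-in below


def summarize_user(user, fields=("screen_name", "statuses_count", "description", "created_at")):
    """Collect selected attributes of a user object into a plain dict.

    Attributes the object does not have are stored as None.
    """
    return {field: getattr(user, field, None) for field in fields}


# Offline stand-in with made-up values (no created_at on purpose)
fake_user = SimpleNamespace(screen_name="example", statuses_count=42, description="An example bio")
summary = summarize_user(fake_user)
print(summary)
```

Using `getattr` with a default keeps the helper from failing if a field is absent on a given object.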

Great, but now let’s look at how to get the user's actual tweet data using the ‘user_timeline’ method. You just need two arguments for this, the user and the number of tweets you want. Let’s grab Trump’s 10 most recent tweets:

user_tweets = api.user_timeline(screen_name = 'realDonaldTrump', count = 10)
#Print the 5th tweet (index 4, since Python lists start at 0)
print('Tweet 5:', user_tweets[4].text)

There you have it. Now you should be able to access Twitter user data as well as specific tweets from users. To get this data into a form that is easy to analyze, you will have to loop over the results, but overall this should help you at least gain access to Twitter data using the package Tweepy.
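The loops mentioned above might look something like the following sketch, which turns a list of tweets into a list of dictionaries that is easy to hand to an analysis tool. `tweets_to_rows` is a hypothetical helper, and the `SimpleNamespace` tweets are made-up stand-ins so the example runs offline. With a live connection you would pass in `user_tweets` instead.

```python
from types import SimpleNamespace  # only needed for the offline stand-ins below


def tweets_to_rows(tweets):
    """Turn an iterable of tweet-like objects into a list of plain dicts."""
    rows = []
    for tweet in tweets:
        rows.append({
            "created_at": getattr(tweet, "created_at", None),
            "text": getattr(tweet, "text", ""),
            "retweet_count": getattr(tweet, "retweet_count", 0),
        })
    return rows


# Made-up stand-ins for the objects returned by user_timeline
fake_tweets = [
    SimpleNamespace(created_at="2020-02-14", text="First example tweet", retweet_count=3),
    SimpleNamespace(text="Second example tweet"),
]
rows = tweets_to_rows(fake_tweets)
for row in rows:
    print(row["created_at"], row["retweet_count"], row["text"])
```

A list of dictionaries like this can be passed straight to most tabular tools, or written out with the standard `csv` module.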
