A Simple Guide to Scrape Tweets Using Python

This step-by-step guide will teach you how to easily scrape tweets from Twitter’s API.

Photo Credit: B. Nonnecke

The spread of mis- and disinformation through social media poses significant risks to public safety and democracy. Understanding how these campaigns are initiated and spread is critical to stemming their negative effects. Twitter has created an API, an application programming interface, that allows users to pull tweets and other data from Twitter’s servers for research.

At the CITRIS Policy Lab, headquartered at UC Berkeley, we use Twitter data to research the effects of automated accounts in spreading disinformation, harassment, and political divisiveness on contentious political issues.

If you’re interested in studying Twitter, this step-by-step guide will lead you through the process of scraping tweets and other relevant data from Twitter’s API. The instructions in this guide are tailored for Mac users, but will also be helpful for PC users.

Step 1. Apply for a Twitter Developer License & Obtain Credentials

You’ll need to have a Twitter developer license in order to scrape tweets. You can apply here. It may take a day or two to get approval.

Once you’re approved, log into your account at developer.twitter.com and go to Apps under your name in the top right-hand corner of the page. Click Create an App, also in the top right-hand corner, and fill out the form to create your app.

After you create your app, go to Keys and Tokens in the menu. Your API Key and API Secret Key will already be listed. Click Generate to get your Access Token and Access Token Secret. Do not share these with others. They are made for your use only.

Accessing your Twitter Access Keys and Tokens.

Copy your API Key, API Secret Key, Access Token, and Access Token Secret. You will use these in your Python code.
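Because these four values should stay private, one option is to keep them out of the script itself and read them from environment variables instead. The sketch below is only one way to do that; the variable names (TWITTER_API_KEY and so on) and the load_credentials helper are hypothetical and not part of the original code for this guide.

```python
import os

# Hypothetical environment variable names; choose any names you like,
# as long as they match what you export in your Terminal session.
CREDENTIAL_VARS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET_KEY",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in CREDENTIAL_VARS}
```

With this approach you would set each variable in the Terminal (for example, export TWITTER_API_KEY="your key here") before running your script, so the keys never appear in a file you might share.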

Step 2. Install Python

This tutorial uses scripts written in the Python programming language, so you’ll need to have Python installed.

If you use a Mac or Linux, you may already have Python installed, but it may not be the right version. To see which version you have, open the Terminal application, type python -V, and press Enter. If that reports Python 2 or says the command is not found, try python3 -V instead. On a Mac, you can open the Terminal by pressing Command + Space and typing Terminal into the search bar. If you have Python 3.0.0 or higher, you should be okay. Otherwise, go to python.org/downloads and follow the instructions to download and install Python 3.8.2 for Mac.

Windows does not come with Python pre-installed. You can download and install Python 3.8.2 at python.org/downloads.
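The same version check can be made from inside Python, which is handy if you are unsure which interpreter a script is actually using. This is a minimal sketch; the is_supported helper and its minimum of (3, 0) are illustrative and simply mirror the Python 3 requirement stated above.

```python
import sys


def is_supported(version_info=None, minimum=(3, 0)):
    """Return True if the interpreter is at least the minimum version."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum


print("Running Python", sys.version.split()[0])
print("Meets the guide's Python 3 requirement:", is_supported())
```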

Step 3. Install Sublime Text or Another Code Editor

If you don’t already have a preferred code editor, I recommend you download and install Sublime Text at sublimetext.com. Code editors are often easier to work with than simple text editors because they do “syntax highlighting” and have other helpful features. You can use a plain text editor like Gedit or TextEdit if you prefer.

Step 4. Create a Folder & Save Python File in Folder

Create a folder for this project. For example, you can name your folder “ScrapeTweets.”

Open Sublime Text and create a new file (File → New File). Make sure the file you create in Sublime Text is using the Python syntax. You can check this by going to View → Syntax → Python. Save the file as .py inside the “ScrapeTweets” folder.

Step 5. Open Terminal & Set Up Virtual Environment

Right click on the “ScrapeTweets” folder you created and select New Terminal at Folder. To right click on a Mac, press control while clicking on the folder.

You can check that you’re in the right location within your Terminal window by typing pwd, which means “print working directory.” The last element of the printed path should be “ScrapeTweets” or whatever name you gave your folder.

To set up a virtual environment, type python3 -m venv venv in the Terminal. Press Enter.

Type source venv/bin/activate in the Terminal. Press Enter. You have now activated your virtual environment. It should now say (venv) on the left side of your Terminal window.

Type pip3 install tweepy in the Terminal. Press Enter.

Type pip3 install pandas in the Terminal. Press Enter.

DO NOT CLOSE YOUR TERMINAL WINDOW.
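If you want to confirm that the setup worked before moving on, a short script along these lines can report whether the virtual environment is active and whether both libraries can be found. It is a sketch rather than part of the original tutorial code, and the helper names in_virtualenv and missing_packages are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter is running inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names=("tweepy", "pandas")):
    """Return the subset of package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Virtual environment active:", in_virtualenv())
print("Packages still to install:", missing_packages() or "none")
```

If a package is listed as still to install, run the matching pip3 install command again in the same Terminal window.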

Step 6. Python

Copy the Python code shared via GitHub into the Sublime Text file you saved inside the “ScrapeTweets” folder. This is the same folder you opened your Terminal from.

This is how your Python code should appear in Sublime Text.

Paste your Twitter keys and tokens into the Sublime Text file. Save your file.
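The GitHub code itself is not reproduced here, so the following is only a rough sketch of how the four credentials are typically handed to Tweepy, not the script this guide links to. The make_api function name is hypothetical, and the Tweepy calls shown (OAuthHandler, set_access_token, API) may be named differently in the Tweepy version you install.

```python
def make_api(api_key, api_secret_key, access_token, access_token_secret):
    """Build an authenticated Tweepy client from the four credentials."""
    # Imported inside the function so this file can be loaded
    # even before Tweepy has been installed.
    import tweepy

    auth = tweepy.OAuthHandler(api_key, api_secret_key)
    auth.set_access_token(access_token, access_token_secret)
    # wait_on_rate_limit asks Tweepy to pause, rather than fail,
    # when Twitter's rate limits are reached.
    return tweepy.API(auth, wait_on_rate_limit=True)
```

Each argument corresponds to one of the values you copied from the Keys and Tokens page in Step 1.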

You’ll need to update the file save path and file name at the bottom of the code in your Sublime Text file.

You can identify the appropriate file save path to enter by typing pwd into the Terminal and pressing Enter. Copy and paste the file save path that was generated and add a name for the file you will create into your Sublime Text file.
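As an alternative to copying the path by hand, Python can build it for you from the folder the script is run in. This is a small sketch under that assumption; output_path is a hypothetical helper and "tweets.csv" is just an example file name.

```python
from pathlib import Path


def output_path(file_name, folder=None):
    """Return the full path of file_name inside folder (default: current directory)."""
    folder = Path.cwd() if folder is None else Path(folder)
    return folder / file_name


# "tweets.csv" is a hypothetical file name; use whatever name you prefer.
print(output_path("tweets.csv"))
```

Run from the “ScrapeTweets” folder, this prints the same path that pwd shows, followed by the file name.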

Don’t forget to update your file save path and file name.

SAVE YOUR FILE in Sublime Text. You must save your Python file after every change in order for the updates to run through the Terminal.

Step 7. Scraping Tweets

Go to your Terminal. Type python filename.py. Press Enter. Don’t forget that you need to replace “filename” with the file name you gave your Sublime Text file.

Your code should run and create a new CSV file in your folder with the name you gave it. The CSV file should contain data for the 7 fields specified in the code: (1) created_at, (2) tweet_id, (3) tweet_text, (4) screen_name, (5) name, (6) account_creation_date, and (7) urls.
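To make the seven fields more concrete, here is a sketch of how one tweet might be turned into one CSV row. It is not the code from GitHub: it uses Python's built-in csv module instead of Pandas, the function names are hypothetical, and the attribute names on the tweet object (full_text, entities, user.screen_name, and so on) are assumed to match what Tweepy returns. A stand-in tweet built with SimpleNamespace is used so the example runs without contacting Twitter.

```python
import csv
import io
from types import SimpleNamespace

FIELDS = ["created_at", "tweet_id", "tweet_text", "screen_name",
          "name", "account_creation_date", "urls"]


def tweet_to_row(tweet):
    """Map one tweet object to a dict keyed by the seven CSV fields."""
    # full_text is assumed to exist only in extended mode; fall back to text.
    text = getattr(tweet, "full_text", None) or getattr(tweet, "text", "")
    urls = [u["expanded_url"] for u in tweet.entities.get("urls", [])]
    return {
        "created_at": tweet.created_at,
        "tweet_id": tweet.id,
        "tweet_text": text,
        "screen_name": tweet.user.screen_name,
        "name": tweet.user.name,
        "account_creation_date": tweet.user.created_at,
        "urls": " ".join(urls),
    }


def write_rows(rows, handle):
    """Write rows to an open text handle as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# A made-up tweet, shaped like the objects the sketch expects.
sample_user = SimpleNamespace(screen_name="example_user", name="Example User",
                              created_at="2010-01-01 00:00:00")
sample_tweet = SimpleNamespace(
    created_at="2020-04-01 12:00:00", id=1, full_text="Hello, world",
    user=sample_user,
    entities={"urls": [{"expanded_url": "https://example.com"}]},
)

buffer = io.StringIO()
write_rows([tweet_to_row(sample_tweet)], buffer)
print(buffer.getvalue())
```

Opening your own CSV file in a spreadsheet program should show the same seven column headers in the first row.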

Woot, Woot! You Did It!

Libby VanderPloeg

Terminal Cheat Sheet & Extras

To stop something running: Ctrl + C

To see your current directory: pwd

To set up a virtual environment: python3 -m venv venv then source venv/bin/activate

To install Tweepy: pip3 install tweepy
Tweepy is a Python library for accessing the Twitter API.

To install Pandas: pip3 install pandas
Pandas is a software library written for Python for data manipulation and analysis.

To see a list of the files in your directory: ls

To autocomplete a file name: Type the first letter of the file name and press tab

To repeat the last command you ran: Press the up arrow key

CITRISPolicyLab

TECHNOLOGY POLICY RESEARCH & ENGAGEMENT IN THE INTEREST OF…
