Who’s Tweeting from the Oval Office?

Greg Rafferty
5 min read · Feb 17, 2018

Building a Twitter bot

Look, my Twitter bot!

This is Part 4 of a 4-part series. Check out the whole series! And be sure to follow @whosintheoval on Twitter to see who is actually tweeting on Donald Trump’s account!

  1. Who’s tweeting from the Oval Office?
  2. Choosing features
  3. How did the different models do?
  4. Let’s get this bot live on Twitter!

Who’s tweeting from the Oval Office?

I’ve built a Twitter bot, @whosintheoval, which retweets each of Donald Trump’s tweets and offers a prediction for whether the tweet was written by Trump himself or by one of his aides. Go ahead and read the first post in this series if you’re curious about the genesis of this little project, or read on to learn how I built the Twitter bot!

Model deployment

The best data science models in the world are worth nothing if they can’t be packaged in a format others can interpret. In this post, I’ll show how I deployed my model via a bot on Twitter. There are plenty of tutorials online covering the machine learning models I used, which is why I didn’t go into much detail about them in my previous posts. However, when I started building this bot, I found little clear information online about how to build a Twitter bot, so this post will be more technical than the previous ones and will include the code you’ll need to get a bot up and running.

For this tutorial, we’ll create a bot that watches @realDonaldTrump for any tweets, and as soon as something is posted the bot will ask @whosintheoval who posted it, Trump himself or one of his aides.

Prerequisites

  1. A Twitter account (go to https://twitter.com/signup to create one)
  2. Python
  3. Tweepy, a useful library for working with Twitter (pip install tweepy)

Get API access from Twitter

The first thing you’ll need to do if you’re building a Twitter bot is to get access to Twitter’s API. Visit apps.twitter.com and log in with whichever Twitter account the bot will be posting from. Fill out the form and check all the necessary check boxes. Once you’re in, visit the “Keys and Access tokens” tab to generate a new access token, which you’ll need to authenticate your application.

Storing credentials

You should never share these private keys, so it’s best to keep them out of any code you’ll post publicly. I created a folder called .env in the root folder of my project, and for this tutorial I’ll assume you do the same. Create a new file in that folder called twitter_credentials.json and save your keys and access tokens in it as JSON, using the values from your app’s dashboard on Twitter.
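As a rough sketch, the credentials file could look like the JSON text below. I’ve wrapped it in a Python string only so the format can be checked with json.loads before saving it. The four key names are an assumption made for this sketch (any names work as long as your loading code reads the same ones), and the ALL CAPS values are placeholders to replace with your own.

```python
import json

# Contents for .env/twitter_credentials.json. The four key names are an
# assumption of this sketch; the ALL CAPS values are placeholders to replace.
CREDENTIALS_TEMPLATE = """
{
    "consumer_key": "CONSUMER_KEY",
    "consumer_secret": "CONSUMER_SECRET",
    "access_token": "ACCESS_TOKEN",
    "access_token_secret": "ACCESS_TOKEN_SECRET"
}
"""

# Parsing the text confirms it is valid JSON before you save it to disk.
credentials = json.loads(CREDENTIALS_TEMPLATE)
print(sorted(credentials))
```

If you keep your project in version control, it’s also worth excluding the .env folder from it so the real keys never get committed.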

Twitter provides a REST API for downloading data, but it’s not your best choice for real-time data. You’ll hit the rate limits very quickly if you’re constantly checking for new tweets. So for this bot, we’ll use the streaming API.

Initial setup

Now create a twitterbot.py file in your project folder (the same folder that contains the .env folder we just created). Open twitterbot.py in your favorite text editor and import Tweepy and json. Tweepy, of course, is what we’ll use to interact with Twitter, and json will allow us to read in those secret keys and access tokens. We’ll also import sleep so we can pause our bot temporarily if we hit Twitter’s rate limit.

OAuth authentication

Now let’s load our credentials and set up Tweepy to authenticate and connect to Twitter.
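Here’s a minimal sketch of that step. It assumes the credentials file uses the key names consumer_key, consumer_secret, access_token and access_token_secret (my assumption, so adjust them to match your file), and the helper names load_credentials and build_api are made up for illustration. Tweepy is imported inside build_api only so that the loader can be tried without Tweepy installed.

```python
import json


def load_credentials(path=".env/twitter_credentials.json"):
    """Read the keys and tokens from the JSON file in the .env folder."""
    with open(path) as f:
        return json.load(f)


def build_api(creds):
    """Authenticate with Twitter and return (auth, api).

    The dictionary keys are an assumption of this sketch, not a Tweepy
    requirement; they only need to match the names in your JSON file.
    """
    import tweepy  # lazy import: load_credentials works without Tweepy

    auth = tweepy.OAuthHandler(creds["consumer_key"], creds["consumer_secret"])
    auth.set_access_token(creds["access_token"], creds["access_token_secret"])
    return auth, tweepy.API(auth)
```

In twitterbot.py you would call auth, api = build_api(load_credentials()) once at startup. The auth object is used to open the stream, and api is used to post tweets.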

Identify the user to watch

For the next step, we need to know the Twitter ID of the user we’ll be monitoring, in this case @realDonaldTrump. Gettwitterid.com is a simple site that does one thing, which I think is obvious enough from its URL. Type in a Twitter username, and it outputs the corresponding User ID. For ‘realDonaldTrump’, this is 25073877. Let’s assign that to a variable in our code (as a string!).
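In code, that assignment is a single line. The short check after it is only there to illustrate why the string form matters: Twitter’s tweet data also carries IDs as strings (for example in the id_str field), so keeping ours as a string avoids type conversions later.

```python
# Twitter user ID for @realDonaldTrump, stored as a string on purpose so it
# can be compared directly with string IDs such as user.id_str.
realDonaldTrump = "25073877"

# An integer with the same digits does not compare equal to the string form.
print(realDonaldTrump == "25073877", realDonaldTrump == 25073877)
```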

Streaming with Tweepy

Tweepy has a super useful class for us called StreamListener. We’ll inherit from it and redefine the on_status function to perform our desired action. Let’s call our new class TrumpStreamListener because later on, when we start the streaming process, we’ll instruct it to watch the account we’ve specified in the realDonaldTrump variable above. While we’re at it, let’s also redefine the on_error function, which lets us act when Twitter returns an error. In this case, we’ll watch for error 420, which means we’ve hit the rate limit. Every time your bot hits the rate limit, the amount of time before Twitter will let you back in increases exponentially, so we want to pause upon a 420 and attempt to reconnect in a minute, instead of hammering Twitter constantly.
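The rate-limit handling could be sketched like this. To keep the example runnable without a live connection, I’ve put on_error in a small plain class (RateLimitMixin, a name invented for this sketch) rather than directly on a StreamListener subclass. It assumes Tweepy 3.x, where StreamListener exists and returning False from on_error disconnects the stream.

```python
from time import sleep


class RateLimitMixin:
    """on_error override, kept free of Tweepy so it can be tried offline."""

    pause_seconds = 60  # wait a minute before letting the bot reconnect

    def on_error(self, status_code):
        if status_code == 420:
            # 420 means we've hit the rate limit: back off instead of
            # hammering Twitter with immediate reconnection attempts.
            print("Rate limited. Pausing before reconnecting.")
            sleep(self.pause_seconds)
        # Returning False disconnects the stream (Tweepy 3.x behaviour);
        # a restart loop elsewhere in the program can then reconnect.
        return False


# With Tweepy 3.x installed, the listener would combine this with the
# library's base class, for example:
#   class TrumpStreamListener(RateLimitMixin, tweepy.StreamListener): ...
```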

Let’s walk through that on_status function. When we start streaming, we’ll instruct the bot to watch the account specified by the realDonaldTrump variable, but that also catches any retweets or mentions of him. We only want posts from Trump, so we need an if clause at the beginning of the function. The tweet we post will include a link to Trump’s tweet, so we build a url variable from the tweet data captured by the stream listener and passed in as the status variable. Next, we’ll compose the actual tweet, which will be “Who tweeted this, @whosintheoval? Trump or an aide?” followed by a link to Trump’s original tweet.

Finally, we’ll use Tweepy’s update_status function, which will post that tweet to our feed.
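The walkthrough above can be sketched as three small functions. The names tweet_url, compose_tweet and handle_status are hypothetical helpers of mine; inside the listener, on_status(self, status) would simply call handle_status(api, status). The sketch assumes status is a Tweepy status object exposing user.id_str, user.screen_name and id_str, and that api is the authenticated Tweepy API object.

```python
def tweet_url(status):
    """Build the public link to a tweet from the streamed status data."""
    return f"https://twitter.com/{status.user.screen_name}/status/{status.id_str}"


def compose_tweet(status):
    """Text of the tweet the bot posts: the question plus the link."""
    return f"Who tweeted this, @whosintheoval? Trump or an aide? {tweet_url(status)}"


def handle_status(api, status, watched_user_id="25073877"):
    """Post a tweet only when the status was written by the watched account."""
    # The stream also delivers retweets and mentions of the account,
    # so skip anything whose author is not the watched user.
    if status.user.id_str != watched_user_id:
        return None
    text = compose_tweet(status)
    api.update_status(status=text)  # posts to the bot's own feed
    return text
```

Keeping the author check and the text composition outside the listener makes both easy to try with stand-in objects before pointing the bot at the live stream.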

For my Twitter bot predicting the authors of @realDonaldTrump tweets, instead of calling api.update_status immediately, I defined a new post_tweet function which opened up a saved pickle file of my machine learning model, called the .predict and .predict_proba methods, then composed and posted the tweet. If you want to perform something a bit more complicated than posting a scripted tweet, this is where you’d code in that logic. You can find my complete code on my GitHub if you want to go into more detail on this.
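To give a feel for what that kind of logic might look like, here is a loose sketch rather than my actual post_tweet function. It assumes a scikit-learn-style classifier saved with pickle, a features row that has already been prepared for the model, and a labelling where class 0 means an aide and class 1 means Trump. The function names, the label order and the tweet wording are all invented for illustration.

```python
import pickle


def load_model(path):
    """Load a previously pickled classifier from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def compose_prediction(model, features, tweet_url, labels=("an aide", "Trump")):
    """Turn the model's output for one tweet into the text to post.

    Assumes model.predict returns class indices into `labels` and
    model.predict_proba returns one row of class probabilities per input.
    """
    predicted = int(model.predict([features])[0])
    # Probability of the most likely class for this single row.
    confidence = max(model.predict_proba([features])[0])
    return f"I'm {confidence:.0%} sure this was written by {labels[predicted]}. {tweet_url}"
```

The returned text would then be passed to api.update_status in place of the scripted tweet.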

Start the stream

Now we need to define a function which ensures that the stream doesn’t die in case of an error or a momentary loss of internet connectivity. This function will automatically restart the stream if it’s interrupted for any reason. It also instructs the StreamListener object to follow the account defined by the variable realDonaldTrump, 25073877.
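A restart loop along those lines might look like the sketch below. The make_stream, retry_delay and max_attempts parameters are my own additions for illustration: make_stream is any zero-argument callable that returns a fresh stream object with a filter(follow=[...]) method (with Tweepy 3.x, something like lambda: tweepy.Stream(auth, listener)), and max_attempts exists only so the loop can be tried without running forever.

```python
import time


def start_stream(make_stream, user_id="25073877", retry_delay=5, max_attempts=None):
    """Keep the stream running, reconnecting after errors or dropped connections.

    Leave max_attempts as None to run indefinitely, which is what a
    deployed bot wants; a number is only useful for trying the loop out.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            # filter() blocks for as long as the stream is healthy and
            # follows only the account whose ID we pass in.
            make_stream().filter(follow=[user_id])
        except Exception as exc:
            print(f"Stream interrupted ({exc!r}); reconnecting.")
        # Short pause so a persistent failure doesn't become a tight loop.
        time.sleep(retry_delay)
    return attempts
```

Building a new stream object on every pass means a connection that died in an odd state is never reused.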

Kick things off

Finally, let’s set things in motion! These last commands will instantiate the class and call the start_stream function to turn the bot on.

And lastly, combined in twitterbot.py, all those snippets make up the full program.

There you have it! If you run that program, it will continuously monitor Twitter for any activity regarding @realDonaldTrump. If that activity is a post by Trump’s account, then your bot will post a tweet to your own account asking the @whosintheoval account if Trump himself or an aide posted it.

Who Made This?

I’m Greg Rafferty, a data scientist in the Bay Area. You can check out the code for this project on my GitHub and see what else I’ve been up to on my LinkedIn.
