Fun with Twitter Bots: Part-1

Creating a twitter app, storing tweets, and analyzing tweets: https://github.com/kimoyerr/twitterbot

Krishna Yerramsetty
Feb 18, 2019

I’ve been interested in Natural Language Processing (NLP) techniques for a while now. I finally decided to take the plunge and get my hands dirty. In this part, I will create a twitter bot to automatically retweet based on some search criteria. In the subsequent parts, I will store these tweets and perform (hopefully) interesting analyses.

To get started, I relied heavily on several excellent blog posts on this subject:

I tried to borrow what worked for me from these posts, and create my own version of a twitter bot.

The above is a simplified depiction of the workflow. In the rest of this post, I will add more details on how I got the above workflow to work. I will not go into all the gory details, but I will point out things that were counter-intuitive to me, and hopefully help others get through those stumbling blocks quicker.

Setting up your Twitter app

If you do not have a twitter account, this is where you would create one. I’ve been late to the twitter game, and it’s an interesting medium. I agree with Matt Cutts :)

“When you’ve got 5 minutes to fill, Twitter is a great way to fill 35 minutes.”

Once you have your account, go to https://developer.twitter.com/en/apps and create a new app. Twitter will ask you to fill in some details. You are also required to add in a website for the app. I used https://dashboard.heroku.com/apps/ because that is where my app will be hosted. Make sure to also describe how you will use this app.

Get your Twitter app’s API keys and access tokens

There are two kinds of “passwords” you need to connect to your Twitter app. One is the Consumer API key (and its secret key), and the other is the access token (and access token secret). This Stack Overflow answer briefly explains why you need both to be able to read tweets and write tweets using your app:

Please do not make your keys and tokens public, and keep them out of your git repository if you intend to make the repo public.

Setting up Python scripts for automated retweeting

Here I will walk through the steps for writing a simple script that, every few minutes, retweets (from my personal account) one recent public tweet found by a keyword search.

Install Tweepy

I used the easy-to-use tweepy library to get public tweets, search them, and then retweet. To install tweepy in a conda environment, follow these steps:

conda create -n twitterbot
# To activate this environment
conda activate twitterbot
# Install tweepy into the active environment
conda install -c conda-forge tweepy
# To deactivate this environment
conda deactivate

Python script for running the bot on your local environment

Next, create a simple script to

  1. Print the 20 recent tweets from my timeline, and
  2. Search for tweets with a specific keyword and then retweet one of these tweets from my timeline.

Here is the code I used to do this. Please fill in your own app’s keys and tokens to run this, and remember not to publish your keys and tokens 🔐🔐🔐

# Other Libs
import tweepy
# Fill in your app's keys and tokens here (never commit real values)
API_KEY = "your_app's_consumer_api_key"
API_SECRET_KEY = "your_app's_consumer_api_secret_key"
ACCESS_TOKEN = "your_app's_access_token"
ACCESS_TOKEN_SECRET = "your_app's_access_token_secret"
# This is the meat of the script that drives the twitterbot
auth = tweepy.OAuthHandler(API_KEY, API_SECRET_KEY)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
# Print the 20 most recent tweets from the home timeline
public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)
# Search for the keyword and retweet one matching tweet.
# Note: written for Tweepy 3.x. Tweepy 4.x renames api.search to
# api.search_tweets and TweepError to TweepyException.
search = "crispr"
numberOfTweets = 1
for tweet in tweepy.Cursor(api.search, search).items(numberOfTweets):
    try:
        tweet.retweet()
        print('Retweeted the tweet')
    except tweepy.TweepError as e:
        print(e.reason)
    except StopIteration:
        break
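The script above runs once and exits, while the goal was to retweet every few minutes. Here is a minimal sketch of one way to repeat it. The names run_periodically and retweet_once are hypothetical helpers I made up for illustration (they are not part of tweepy), and the five-minute interval mentioned below is an assumption.

```python
import time


def run_periodically(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing interval_seconds between calls.

    max_runs=None loops until the process is stopped; a number stops after
    that many calls. Returns how many times job() was called.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as error:
            # Keep the bot alive if a single run fails
            print(f"Run failed: {error}")
        runs += 1
        # Pause only if another run is coming
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


def retweet_once():
    # Placeholder: the search-and-retweet loop from the script would go here
    print("Would search and retweet one tweet here")
```

Calling run_periodically(retweet_once, interval_seconds=300) at the end of the script would then retweet roughly every five minutes until the process is stopped. The sleep argument exists only so the loop can be tried out without actually waiting.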

If you are curious to know what “crispr” is, check this video by Jennifer Doudna, one of the inventors of this awesome technology:

Now, you can run the above script locally on your computer or you can run it on a remote server. There are several options to run your apps remotely, and these solutions are generally referred to as Platform as a Service (PaaS) solutions. Good examples of this category are Heroku, Google App Engine, and AWS Elastic Beanstalk. They provide a simplified framework to get your app up and running without you having to worry about setting up the infrastructure, operating system, and other middleware.

Running your app on Heroku

I used Heroku here since I hadn’t used it before, and so far it has been quite an easy experience.

Heroku set up

Create a Heroku account and choose a free account. Next, create an app and name it whatever you want. Now you have a basic app ready to be deployed. Heroku also comes with a command line interface (CLI) that you can use to connect to and control your Heroku app from your local computer. The first thing to do once you install the Heroku CLI is to log in: run heroku login to log in through a browser, or heroku login -i to log in from the command line.

Deploy your app on Heroku

Now that we have a new Heroku app, go to your Heroku app’s deployment menu. It should look something like the one below for my app, which Heroku decided to name warm-atoll-45971 👏 👏 👏

You will notice that there are several deployment methods for you to choose from. I decided to go with GitHub instead of using Heroku’s integrated remote git option. Note the very last line in the above screenshot, which indicates that the Heroku app automatically deploys from the master branch of my repo. So, it is constantly monitoring my GitHub repo and updating the Heroku app as soon as it detects changes in the repo’s master branch. Next, push the simple script we created before to GitHub, and make sure to have a requirements file (pip freeze > requirements.txt) in your repo to let Heroku know what libraries to install to run your app remotely.

Also, create another file named Procfile in your repo that lists the commands to be executed by your Heroku app on startup. You can find all the files I used in my repo. From the Heroku article on Procfiles: “A Procfile declares its process types on individual lines, each with the following format:

<process type>: <command>
  • <process type> is an alphanumeric name for your command, such as web, worker, urgentworker, clock, and so on.
  • <command> indicates the command that every dyno of the process type should execute on startup, such as rake jobs:work.”

I tried both worker: python bot.py and web: python bot.py. On my first try, I could only get the worker option to work. A likely reason is that Heroku expects a web process to bind to the port it assigns, which this script never does.

Set up Heroku’s environment

One last thing needs to be done before you have your app running. You need to let your Heroku app know about the API keys and access tokens for accessing your twitter app. The best way to do this is to use Heroku’s environment variables. Do not include your keys in your Python script, which is open to anyone who has access to your GitHub repo. To add environment variables to your Heroku app, open the app’s Settings tab and choose the Config Vars option.
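Heroku exposes Config Vars to the running app as ordinary environment variables, so the script can read them at startup instead of hard-coding them. Below is a minimal sketch, assuming the Config Vars are named after the variables in the script (API_KEY, API_SECRET_KEY, ACCESS_TOKEN, ACCESS_TOKEN_SECRET). Those names are my assumption, so use whatever names you entered in the Config Vars panel. load_credentials is a hypothetical helper, not a tweepy or Heroku function.

```python
import os

# Assumed Config Var names; change them to match your Heroku settings
REQUIRED_VARS = ("API_KEY", "API_SECRET_KEY", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_credentials(env=os.environ):
    """Return the four Twitter credentials from env, or raise if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a confusing auth error later
        raise RuntimeError("Missing config vars: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

In the bot script, you could then call creds = load_credentials() and pass creds["API_KEY"] and the other values to tweepy in place of the hard-coded strings. The same code works locally if you export the four variables in your shell first.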

Now that we have all the pieces in place, start the Heroku app by running heroku ps:scale worker=1 -a your-heroku-app-name. This will start running the Python script from your GitHub repo and start retweeting based on the keyword search you specified within that script. To stop the Heroku app at any time, run heroku ps:stop worker -a your-heroku-app-name.

That’s it! Check out Part-2 of this post, where I use Tweepy and MongoDB to search and store tweets to a database. The full codebase can be found at https://github.com/kimoyerr/twitterbot

https://www.pinterest.com/pin/283797213988948094/

Krishna Yerramsetty

Data Scientist with over 7 years of experience. Too many things to learn and experience, too little time :)