(Ab)Using Social Media APIs — Using Python for Privacy’s Sake

Nick Gottschlich
6 min read · Jul 28, 2019

This blog post is based on my talk at the Austin Python Meetup, which you can find on my website: http://nickpgott.com/talks.

A few bad tweets from a decade ago can ruin your life. Don’t believe me? Ask Kevin Hart or James Gunn.

A few clues in your (supposedly anonymous) reddit profile can expose your real-life identity. Recently made some political comments on the site? You might be targeted.

Should it be this way? Twitter’s motto is “What’s happening?” not “What happened 10 years ago, that you deserve to be shamed for?”

So I built a tool that lets you clean up your reddit and twitter accounts. You can set up this tool to do this daily and automatically. This lets you ensure that only a slice of time exists for your social media accounts, one that you can control. You can find this tool at socialamnesia.com.

In this blog post I will run through some of the methods I used to build Social Amnesia. If you’re looking to write some code in Python that involves storing state between sessions, manipulating reddit/twitter APIs, or crafting a simple GUI, hopefully you’ll get some use from this blog post. The tools I used include:

  • Shelve
  • PRAW
  • Tweepy
  • Tkinter

Maintaining State in Python using Shelve

If you are going to build a program that deals with any sort of user configuration, you are going to want to save the state of the user’s session. In the case of Social Amnesia, I needed to save things like the profile the user is logged in as, how far back in time the user wanted to keep items, or whether the user wanted to keep tweets that reached a certain number of “likes”. To do this, I used a module from the Python standard library called shelve.

Shelve allows you to save Python objects to a file on the machine. That file can then be read when the application is loaded to retrieve the previous configuration. Like so:
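A minimal sketch of that pattern, assuming a hypothetical shelf file called `reddit_state` and illustrative keys (`user`, `hours_to_keep`, `max_score`) that are not necessarily the ones Social Amnesia uses:

```python
import os
import shelve
import tempfile


def save_settings(path, **settings):
    """Store each keyword argument in the shelf at `path`."""
    # shelve.open creates the file if it does not exist yet.
    with shelve.open(path) as reddit_state:
        for key, value in settings.items():
            reddit_state[key] = value  # used like a normal dictionary
        reddit_state.sync()  # write pending changes to disk right away


def load_settings(path):
    """Return everything stored in the shelf as a plain dict."""
    with shelve.open(path) as reddit_state:
        return dict(reddit_state)


# Demo in a throwaway directory so no real configuration is touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "reddit_state")
    save_settings(path, user="example_user", hours_to_keep=24, max_score=100)
    restored = load_settings(path)
```

In a real app the shelf would live in a fixed location (for example the user’s home directory) so that the second run finds what the first run saved.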

shelve.open takes care of loading the file for you (it even creates it if one doesn’t already exist!). Then it’s as simple as manipulating the shelved object the same way you would a dictionary in Python. The .sync method writes any pending changes back to the file on disk, so the file matches what you have in memory.

Note: You can’t just do print(shelved_object); you get back the shelf object’s default representation rather than its contents. You’ll need to loop through its keys, like so:

for key in shelved_object:
    print(key, shelved_object[key])

Shelve is not a full replacement for a backend database, and it is fully local to the machine (unless you decide to somehow read and upload the shelved file to the cloud somewhere). Keep this in mind if you’re saving anything that you want to make sure does not get lost!

Use PRAW — the Python Reddit API Wrapper

Now that we have the user’s configuration stored, we want to actually do something with it! This is where PRAW comes in. PRAW gives you a nice and easy way to access the reddit API programmatically. We can abuse this to mass delete reddit posts.

First, set up a reddit app on your account so you can access the API. The official reddit docs teach you how to do this here.

Now you should have a client id and client secret you can use. Set up your reddit access in Python like so:
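A hedged sketch of that setup, assuming a “script” type reddit app that logs in with a username and password. The function name, the user agent string, and the `reddit_factory` parameter (a hook for testing without network access) are all hypothetical, and this sketch stores only the username in `reddit_state`:

```python
def log_in_to_reddit(reddit_state, client_id, client_secret,
                     username, password, reddit_factory=None):
    """Create an authenticated PRAW client and remember who logged in.

    `reddit_state` is any dict-like object, such as an open shelf.
    """
    if reddit_factory is None:
        import praw  # third-party: pip install praw
        reddit_factory = praw.Reddit

    reddit = reddit_factory(
        client_id=client_id,
        client_secret=client_secret,
        user_agent="privacy cleanup sketch by u/" + username,
        username=username,
        password=password,
    )
    # PRAW is lazy, so ask for the logged-in account now.
    # Bad credentials then fail here instead of halfway through a deletion run.
    reddit.user.me()

    # If reddit_state is a shelf, this survives restarts of the app.
    reddit_state["user"] = username
    return reddit
```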

Notice how we saved the reddit user in reddit_state? If you set up reddit_state as a shelved dictionary, you can save the user’s login across sessions of your app.

Now let’s start causing some damage. This code will gather up reddit items and then delete them, if they pass certain parameters (which can be stored in that shelved state):
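A minimal sketch of that idea, not Social Amnesia’s actual code. It assumes each item exposes the PRAW comment attributes `created_utc`, `score`, and `id` plus the `edit` and `delete` methods; the function names, the replacement text, and the `hours_to_keep` and `max_score` parameters are illustrative:

```python
import time

REPLACEMENT_TEXT = "[deleted]"  # hypothetical text to overwrite comments with


def should_delete(item, cutoff_timestamp, max_score):
    """Delete only items that are old enough and not popular enough to keep."""
    return item.created_utc < cutoff_timestamp and item.score <= max_score


def delete_old_comments(comments, hours_to_keep, max_score, now=None):
    """Overwrite, then delete, every comment that fails the keep rules."""
    now = time.time() if now is None else now
    cutoff_timestamp = now - hours_to_keep * 3600
    deleted_ids = []
    for comment in comments:
        if should_delete(comment, cutoff_timestamp, max_score):
            comment.edit(REPLACEMENT_TEXT)  # overwrite the body first
            comment.delete()
            deleted_ids.append(comment.id)
    return deleted_ids
```

With PRAW, the `comments` argument could be something like `reddit.redditor(reddit_state["user"]).comments.new(limit=None)`, and the two thresholds could come straight out of the shelved state.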

Boom! Those comments are gone. The reason we edited them first is that the reddit admins claim they only store the most recent version of any edited item.

Use Tweepy — twitter access with Python

On the twitter side of things, we will use Tweepy. Tweepy is much like PRAW: it is a simple wrapper around the twitter API for use in Python.

To use Tweepy, you will need to set up an app in the twitter developer dashboard. Once that’s done, start up a twitter instance:
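A minimal sketch of that setup, assuming the four credentials have already been obtained (the interactive authorization step is left out) and using the Tweepy 3.x class names `OAuthHandler` and `API`. The function name and the `tweepy_module` parameter, a hook for testing without network access, are hypothetical:

```python
def make_twitter_api(consumer_key, consumer_secret,
                     access_token, access_token_secret, tweepy_module=None):
    """Return a Tweepy API object authorized to act as the user."""
    if tweepy_module is None:
        import tweepy as tweepy_module  # third-party: pip install tweepy

    # The consumer key and secret identify your app...
    auth = tweepy_module.OAuthHandler(consumer_key, consumer_secret)
    # ...and the access token and secret identify the user who approved it.
    auth.set_access_token(access_token, access_token_secret)
    return tweepy_module.API(auth)
```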

Note that it works mostly the same as reddit. Twitter is so kind as to deal with the whole 2FA stuff for you in their OAuthHandler, but reddit can be a bit more difficult (stay tuned for another blog post where I dig into that dumpster fire).

Time to gather up the user’s tweets!
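One way to sketch that, assuming `api` is an authenticated `tweepy.API` object (`gather_tweets` is an illustrative name, not necessarily what Social Amnesia calls it):

```python
def gather_tweets(api):
    """Collect the user's timeline by paging backwards 200 tweets at a time."""
    tweets = []
    batch = api.user_timeline(count=200)  # newest tweets come first
    while batch:
        tweets.extend(batch)
        # Ask only for tweets older than the oldest one seen so far.
        oldest_id = batch[-1].id - 1
        batch = api.user_timeline(count=200, max_id=oldest_id)
    return tweets
```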

Note the loop that we use here. We can only retrieve 200 tweets per API call, so in order to get more, we have to continually repeat the twitter API call until we run out of “new” tweets.

Then we delete the tweets in much the same way we handled the reddit items:
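A sketch of that step, assuming each tweet exposes the Tweepy status attributes `id`, `created_at`, and `favorite_count`, and that `api.destroy_status` deletes a tweet by id. The function names and the `hours_to_keep` and `likes_to_keep` parameters are illustrative:

```python
import datetime

UTC = datetime.timezone.utc


def as_utc(moment):
    """Treat naive datetimes as UTC so naive and aware values compare cleanly."""
    if moment.tzinfo is None:
        return moment.replace(tzinfo=UTC)
    return moment.astimezone(UTC)


def delete_old_tweets(api, tweets, hours_to_keep, likes_to_keep, now=None):
    """Delete tweets older than the cutoff unless they earned enough likes."""
    now = as_utc(now) if now is not None else datetime.datetime.now(UTC)
    cutoff = now - datetime.timedelta(hours=hours_to_keep)
    deleted_ids = []
    for tweet in tweets:
        too_old = as_utc(tweet.created_at) < cutoff
        popular = tweet.favorite_count >= likes_to_keep
        if too_old and not popular:
            api.destroy_status(tweet.id)
            deleted_ids.append(tweet.id)
    return deleted_ids
```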

Using Tkinter to create the GUI to tie it all together

So now you have processes that can store the user’s state across sessions, and ways to retrieve and then erase the user’s items on twitter and reddit. The last thing you need to do is give the user a GUI so that they can control these processes without having to deal with code.

For Social Amnesia I used Tkinter, the GUI toolkit that ships with Python. Tkinter is a pretty old tool, but it will do for simple GUIs. Here’s an example of how to set it up using the .grid method of layout:
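A small illustrative layout rather than Social Amnesia’s real one: the widget names, labels, and the `build_app` function are assumptions made for this sketch.

```python
import tkinter as tk


def build_app():
    """Build a two-frame window laid out entirely with .grid."""
    root = tk.Tk()
    root.title("Grid layout sketch")

    # Each frame is placed in the window's grid...
    login_frame = tk.Frame(root, padx=10, pady=10)
    login_frame.grid(row=0, column=0, sticky="w")

    # ...and each widget is placed in its own frame's grid.
    tk.Label(login_frame, text="Username:").grid(row=0, column=0, sticky="e")
    tk.Entry(login_frame).grid(row=0, column=1)
    tk.Label(login_frame, text="Password:").grid(row=1, column=0, sticky="e")
    tk.Entry(login_frame, show="*").grid(row=1, column=1)

    actions_frame = tk.Frame(root, padx=10, pady=10)
    actions_frame.grid(row=1, column=0, sticky="w")
    tk.Button(actions_frame, text="Log in").grid(row=0, column=0)
    tk.Button(actions_frame, text="Quit", command=root.destroy).grid(row=0, column=1)

    return root
```

Calling `build_app().mainloop()` opens the window and hands control to Tkinter’s event loop.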

This code will create an app on your computer that looks like this:

Almost all of Social Amnesia was laid out in this way. Build frames, put components in them (buttons, input fields, etc.), lay them out in a grid format, and lay those frames out in another grid format. The final product can look something like this:

That’s pretty much the whole thing! The last thing to cover is how to make buttons actually do things. We will use “lambda” functions for this. A Python lambda function is a small anonymous function. In our case, it wraps the action we want to bind to a button, so the action only runs when the button is pressed:
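A sketch of that binding, assuming `reddit_state` is the dict-like shelved state. The function names and the `max_score` key are illustrative:

```python
import tkinter as tk


def set_reddit_max_score(reddit_state, raw_value):
    """Store the score threshold typed by the user; ignore non-numeric input."""
    try:
        reddit_state["max_score"] = int(raw_value)
    except ValueError:
        pass  # keep the previous value if the input is not a whole number


def add_max_score_controls(parent, reddit_state):
    """Add an entry box and a button that saves the entry's value."""
    entry = tk.Entry(parent)
    entry.grid(row=0, column=0)
    button = tk.Button(
        parent,
        text="Set Max Score",
        # The lambda wraps the call, so it runs on each click.
        # command=set_reddit_max_score(reddit_state, entry.get()) would
        # instead run once, immediately, while the window is being built.
        command=lambda: set_reddit_max_score(reddit_state, entry.get()),
    )
    button.grid(row=0, column=1)
    return entry, button
```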

Now, that set max score button will set the max score in the reddit (shelved) state. If we did not use the lambda and instead just called the function, it would run as soon as the app started up instead of when the button is pressed.

This parameter can then be read in when PRAW goes through all of the user’s comments and decides what to delete (so if the user sets the max score to “100”, we won’t delete any comment that has over 100 upvotes).

Setting up a scheduler

Lastly, I wanted to make sure that whoever uses Social Amnesia can run it as consistently as possible. It’s not useful if you delete your tweets after they’ve been posted on the front page of the New York Times. You need to be proactive.

I built a scheduler that will automatically run the app each day. It still asks the user for confirmation before deleting anything, but it ensures the user will keep deleting their old social media content even if they don’t open the app and manually run it each day.
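A minimal sketch of such a scheduler, assuming `root` is the Tkinter window (anything with an `after(ms, callback)` method works) and `state` is the dict-like shelved state. The names `make_daily_check`, `run_deletion`, and `last_run_date` are hypothetical, and the confirmation prompt is left to `run_deletion`:

```python
import datetime

CHECK_INTERVAL_MS = 60_000  # re-check once a minute


def make_daily_check(root, state, run_deletion, hour_of_day,
                     now=datetime.datetime.now):
    """Return a function that runs `run_deletion` at most once per day."""

    def check():
        current = now()
        today = current.date().isoformat()
        right_hour = current.hour == hour_of_day
        already_ran = state.get("last_run_date") == today
        if right_hour and not already_ran:
            run_deletion()
            state["last_run_date"] = today
        # after() only queues the next call on the event loop,
        # so the call stack does not grow over time.
        root.after(CHECK_INTERVAL_MS, check)

    return check
```

Calling the returned function once before `root.mainloop()` starts the cycle; from then on it keeps rescheduling itself.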

Note the root.after call. It schedules the function to run again 60 seconds later, so the function effectively calls itself once a minute. It will do this all day, and if it confirms that it’s the right hour of day and that it hasn’t already run, it will run the deletion script.

And that’s it! Hopefully this helps you if you’re looking to make your own app to manipulate content on reddit/twitter using Python. Cheers!

Nick Gottschlich

I made https://github.com/Nick-Gottschlich/Social-Amnesia. I work for Procore, building construction management software to modernize the jobsite.