Some fun with the Twitter API, some light functional programming riffing, and thoughts on toolbuilding

I’m an active user of Twitter, but often my reach (rants) exceeds my grasp (140 characters at a time), and I end up trying to figure out how to thread while also trying to write a coherent set of tweets. So I decided it’d be a fun project to take an existing block of text and use Twitter’s API to thread that longer-than-140-character rant automatically, without the psychological barrier of knowing I need to start a new tweet.

It’s a pretty straightforward script (if it seems unoptimized, that is partially intentional, for reasons that will become clear in a moment).

However, there are two reasons I’m writing this:

  1. I know a lot of people who would love to just be able to write something, and then have Twitter do the heavy lifting.
  2. It was fun to write (I’ve been developing for more than a decade, but only in the last year have I had to use Python in my job, and I’ve become a very big fan; bringing some functional principles to how I program, it turns out, was very easy to do in Python).

So, the script itself takes in either a quoted string passed on the command line:

python rant.py -s "Always remember in the jungle there’s a lot of they in there, after you overcome they, you will make it to paradise. We don’t see them, we will never see them. The key is to enjoy life, because they don’t want you to enjoy life. I promise you, they don’t want you to jetski, they don’t want you to smile. In life there will be road blocks but we will over come it. Every chance I get, I water the plants, Lion!"

or a file path. It slices up the text and, as each tweet gets posted, the API response includes the ID of the tweet you just posted, which is what the next reply gets attached to, thus creating the thread format we’re all so fond of. This is pretty simple but, true to some basic functional principles, we’re breaking this seemingly single task into two tasks.

You have the actual function that posts the tweet:

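import json
import urllib  # Python 2; on Python 3, urlencode lives in urllib.parse

def tweet(segment_text, reply=""):
    # URL-encode the tweet text for the query string.
    data = {'status': '%s' % (segment_text)}
    tweet_data = urllib.urlencode(data)
    if reply == "":
        # First tweet in the thread: there is nothing to reply to.
        segment = oauth_req("https://api.twitter.com/1.1/statuses/update.json?%s" % (tweet_data), ACCESS_KEY, ACCESS_SECRET, segment_text)
    else:
        # Later tweets: attach this one as a reply to the previous tweet's ID.
        rep_raw = {'in_reply_to_status_id': '%s' % (reply)}
        rep_data = urllib.urlencode(rep_raw)
        segment = oauth_req("https://api.twitter.com/1.1/statuses/update.json?%s&%s" % (tweet_data, rep_data), ACCESS_KEY, ACCESS_SECRET, segment_text)
    # Hand back the new tweet's ID so the next segment can reply to it.
    return json.loads(segment)['id']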

and the function that preps that function by calling the split function to create the 140-character chunks (basically a manifest of your tweets) and handling the response to queue up which tweet the next one will respond to:

def itr_text(input):
    username = str(accountInfo())
    tweet_ind = 0
    print "Tweeting: %s" % (input)
    if len(input) >= 140:
        segments = split(input, 140)
        for seg in segments:
            if len(seg) >= 140:
                segment_text = seg
                if tweet_ind == 0:
                    # First segment starts the thread.
                    tweet_id = tweet(segment_text)
                    str_id = tweet_id
                    print "Thread: https://twitter.com/%s/status/%s" % (username, str_id)
                else:
                    # Full-length segment replying to the previous tweet.
                    tweet_id = tweet(segment_text, str_id)
                    str_id = tweet_id
                    print "Replying to thread, reply: https://twitter.com/%s/status/%s" % (username, str_id)
            else:
                # A segment shorter than 140 characters is the tail of the rant.
                segment_text = seg
                tweet_id = tweet(segment_text, str_id)
                str_id = tweet_id
                print "Final reply to thread: https://twitter.com/%s/status/%s" % (username, str_id)
            tweet_ind = tweet_ind + 1
    else:
        return "No need. Entire length of tweet less than 140 characters."
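
The split function isn’t shown here, so take this as a rough guess at its shape rather than the original. The `len(seg) >= 140` check above suggests fixed-width chunks, where every segment except (usually) the last is exactly 140 characters long. The name `split_fixed` and its `size` parameter are assumptions made up for this sketch, which is written for Python 3:

```python
def split_fixed(text, size=140):
    """Cut text into consecutive slices of at most `size` characters."""
    if size < 1:
        raise ValueError("size must be at least 1")
    # Step through the string `size` characters at a time; the final slice
    # is simply whatever is left over, so it may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]


segments = split_fixed("x" * 300)
print([len(seg) for seg in segments])  # [140, 140, 20]
```

Fixed-width slicing will happily cut a word in half. A friendlier version would break on whitespace, but then most segments come out shorter than 140 characters, and the length check in itr_text would need a different way to spot the last segment.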

The question I’d ask myself at this point is this: the tweet function seems simple and pretty stateless (it just executes on input data and exits), while the second function is a little more involved. Immediately, a few things stick out as ways to make it more functional and leave less work for the itr_text function to do itself, so that it evaluates the output of smaller functions as expressions passed into it instead.
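
To make that stateless quality a bit more concrete, the URL-building half of tweet can be pulled out as a pure function: same input, same output, no network involved. This is only a sketch in Python 3 (where urlencode lives in urllib.parse). The names `build_update_url` and `UPDATE_URL` are invented for the example; the endpoint and the two parameter names are the ones the tweet function above already uses.

```python
from urllib.parse import urlencode

UPDATE_URL = "https://api.twitter.com/1.1/statuses/update.json"


def build_update_url(segment_text, reply_to=None):
    """Return the request URL for one tweet, optionally as a reply."""
    params = {"status": segment_text}
    if reply_to is not None:
        # Only replies carry the ID of the tweet they respond to.
        params["in_reply_to_status_id"] = str(reply_to)
    return "%s?%s" % (UPDATE_URL, urlencode(params))


print(build_update_url("water the plants"))
print(build_update_url("Lion!", reply_to=42))
```

Because nothing here touches the network or any outside state, it can be checked on its own before a single tweet goes out.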

Because the itr_text function is actually performing more than one task (it’s preparing tweets, but also…tweeting them, so it’s producing its own data before operating on it, rather than just ingesting it as an expression, which makes the behavior less predictable), and is far less abstracted than it could be, let’s break out these tasks into separate functions:

def export(username, tweet_ind, segments):
    for seg in segments:
        if len(seg) >= 140:
            segment_text = seg
            if tweet_ind == 0:
                # First segment starts the thread.
                tweet_id = tweet(segment_text)
                str_id = tweet_id
                print "Thread: https://twitter.com/%s/status/%s" % (username, str_id)
            else:
                # Full-length segment replying to the previous tweet.
                tweet_id = tweet(segment_text, str_id)
                str_id = tweet_id
                print "Replying to thread, reply: https://twitter.com/%s/status/%s" % (username, str_id)
        else:
            # A segment shorter than 140 characters is the tail of the rant.
            segment_text = seg
            tweet_id = tweet(segment_text, str_id)
            str_id = tweet_id
            print "Final reply to thread: https://twitter.com/%s/status/%s" % (username, str_id)
        tweet_ind = tweet_ind + 1

def itr_text(input):
    username = str(accountInfo())
    tweet_ind = 0
    print "Tweeting: %s" % (input)
    if len(input) >= 140:
        segments = split(input, 140)
        export(username, tweet_ind, segments)
    else:
        return "No need. Entire length of tweet less than 140 characters."

Now, the export function actually does the tweeting, and ingests the data provided by the itr_text function, and neither has to (for the most part) produce the data that the other is working on, meaning that future invocations can be made without having to script a lot of this over again.
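
Pushing the same idea one step further, the reply chain itself can be written as a fold over the segments, with the function that actually posts passed in as an argument. This is a sketch in Python 3, not the script’s real code: `thread` and `fake_post` are hypothetical names, and `fake_post` just invents IDs so the example runs without touching Twitter.

```python
from functools import reduce


def thread(segments, post):
    """Post each segment as a reply to the previous one; return the list of IDs."""
    def step(ids, segment):
        # The first tweet has nothing to reply to; later ones reply to the last ID.
        reply_to = ids[-1] if ids else None
        return ids + [post(segment, reply_to)]
    return reduce(step, segments, [])


def fake_post(text, reply_to=None):
    """Hypothetical stand-in for the real API call: invents an ID, posts nothing."""
    return 1 if reply_to is None else reply_to + 1


ids = thread(["first chunk", "second chunk", "third chunk"], fake_post)
print(ids)  # [1, 2, 3]
```

Swapping `fake_post` for a function that really calls the API changes nothing about `thread`, which is the kind of reuse described above.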

This script is, by no means, a perfect (or even particularly good; you’ll notice there’s no error handling, nothing to manage exceptions or retries, even) example of the paradigm, but tasks like these are, potentially, very good use cases for where the paradigm can really shine, especially for beginners seeking to solve practical, quotidian problems in a more abstracted, less involved way that yields predictable results; value goes in, expressed result comes out.
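
On the missing retries specifically, one way to bolt them on without touching the tweeting code is a small wrapper that takes the action as a value. This is a minimal sketch in Python 3 under my own assumptions: `with_retries`, its parameters, and the simple growing delay are all illustrative, and a real version would catch only the errors the API client actually raises rather than every Exception.

```python
import time


def with_retries(action, attempts=3, delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or `attempts` tries are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            # Out of tries: let the last error propagate to the caller.
            if attempt == attempts:
                raise
            # Otherwise wait a little longer each time before trying again.
            sleep(delay * attempt)
```

With something like this in place, a call such as `with_retries(lambda: tweet(segment_text, str_id))` would wrap the existing function unchanged.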

An example of a better practice might be a function like this:

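def ssum(S):
    pw_l = list_constructor(S)
    # Check for an empty list first; indexing [0] on it would raise IndexError.
    if not pw_l:
        return False
    # Length of the longest item in the list.
    longest_len = len(sorted(pw_l, key=len, reverse=True)[0])
    if -1 <= longest_len <= 200:
        return longest_len
    else:
        return False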

The `ssum` function, for example, ingests one piece of data that it does not, itself, operate on. Acting as a main program, it hands that data off to two helpers: list_constructor, which creates a list from an input (for example, splitting a string on a delimiter symbol so that each piece lands in its own index), and a sort that orders the list items by length and keeps only the length of the longest one. The ssum function, then, performs only one task containing two questions. Is the list absent all qualifying strings (as defined by the constructor function)? If not: is the value of longest_len within the accepted range (at most 200)? If either check fails, it returns a default value.
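
Since list_constructor isn’t shown, here is one guess at how the pieces might fit together, written for Python 3. The comma delimiter, the `longest_length` helper, and the `MAX_LEN` constant are all assumptions for the sake of a runnable sketch; only the overall shape and the 200 ceiling come from the function above.

```python
MAX_LEN = 200  # upper bound taken from the ssum example above


def list_constructor(text, delimiter=","):
    """Hypothetical constructor: split on a delimiter, drop empty pieces."""
    return [piece for piece in text.split(delimiter) if piece]


def longest_length(items):
    """Length of the longest item, or 0 for an empty list."""
    return max((len(item) for item in items), default=0)


def ssum(text):
    """Return the longest piece's length, or False if there is none or it is too long."""
    items = list_constructor(text)
    if not items:
        return False
    longest = longest_length(items)
    return longest if longest <= MAX_LEN else False


print(ssum("ab,cde,f"))  # 3
```

Each helper can be tried on its own, so when the final answer looks wrong it’s easy to tell which piece produced it.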

The point of the above is that these are all more predictable, more abstracted operations than if I simply wrote a script to perform these steps, and I know exactly what piece is doing what.

With that in mind, I want to end this with a few thoughts about tool building, no matter how critical (or in the case of this tool, entirely trivial) its use may be:

The problem with automating solutions to problems is that the impulse is always there to solve the problem immediately ahead of you and not to consider the longest view available to you. That’s often how tools can become obstacles instead of being built effectively: a poor but effective practice becomes quotidian, and then things go wrong once, demonstrating the inherent flaw in that line of thinking.

I’ve written a little bit about this before, and my point is mostly that, when you build tools for others, or even when you build for yourself, treating the work as an evolving one that requires active maintenance (which it should, as long as it remains in active use) is imperative if you want the tool to improve and scale with your team and as a solution.

You learn from mistakes how to improve it (or whether it needs to remain in use at all!) and what safeguards may prove useful (or mission critical) in the future. This is one excellent way to prevent fallout from things that are inevitable but easily challenged, like knowledge hoarding: a perfect example of how something well-intentioned (a Subject Matter Expert-based system, for example) can get out of control and become inaccessible (literally only one person knows how something works, and the tools just don’t speak for themselves).