What’s a Bot, Anyways?

Josh Begleiter
Published in Tales of Tech
5 min read · Feb 12, 2019

I think I’ve had a lot of experience with bots. I remember talking to ELIZA on my neighbor’s Apple Macintosh. I’ve had services hit by DDoS attacks from botnets and probed by bot vulnerability scanners, and I’ve written a web crawler. I’ve programmed mobile robots at the Centre for Intelligent Machines at McGill. When I worked in advertising, bots came in droves to load ads, click ads, and view ad videos. I won an innovation award at DataXu for creating a plan of action to deter bots at every level of the advertising ecosystem. At PCH my team was responsible for detecting, deterring, and eliminating bots and malicious advertising.

The word bot means many things, and its definition in popular culture has evolved, accelerated by the United States political landscape of the last two to three years. Before that, the word “bot” probably evoked imagery of car-manufacturing robots, or perhaps poorly written internet chat bots. Sometimes these bots react to their environment via sensors, sometimes they react to our voices or our text, and sometimes they are made in human imitation.

Today the word bot has taken on a new meaning, commonly associated with social media, and whether we’re talking about Facebook, Twitter, YouTube, or StumbleUpon, we’ve all seen the news. I hear the word bot at least once whenever I’m on a social media platform. I see articles about Russian bots that give helpful (but inaccurate and absolute) rules for identifying Twitter bots. I also see articles detailing how Twitter ended up identifying real people as bots, which should tell you that no absolute rule, or even a consistent set of heuristics, achieves (or perhaps even approaches) 100% accuracy. The U.S. Congress is even getting involved, which should tell you just how widespread these political propaganda bots are. For expediency I’ll skip over the who and the why, and go straight to the how. I’ll mostly focus on Twitter, where sources are notoriously difficult to cite and nuance is disgracefully difficult to achieve due to limitations on message size.

It also happens to be the social media platform that I think has the least issue with a “bubble,” meaning that the promoted content and accounts don’t have as much impact on viewership as on other platforms. This creates a perfect environment where a thought war can be easily waged, where the barriers between ideologues and their followers are at their thinnest.

A Bot is Born

Bots in social media are always composed of a front-end and a back-end, if you will. The latter is the underlying programming/automation: the automated comments, likes, and follows on your Instagram posts and accounts; automated retweets and likes on Twitter; automated likes and reshares on Facebook; automated thumbs up, thumbs down, and subscriptions on YouTube. You get the idea: if it’s something that only requires a button click (or two), it is practical and easy to fully automate.
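To see just how little logic a single-click action needs, here is a minimal sketch in Python. The `perform_action` stub is entirely hypothetical — a real bot would call a platform API or drive a headless browser there — and the randomized sleep is the classic trick for looking human:

```python
import random
import time

def perform_action(action, target_id):
    # Hypothetical stub: a real bot would hit a platform API or
    # drive a headless browser here. We just record the action.
    return {"action": action, "target": target_id}

def run_bot(target_ids, actions=("like", "retweet")):
    """Apply every single-click action to every target post."""
    log = []
    for target in target_ids:
        for action in actions:
            log.append(perform_action(action, target))
            # Real bots wait a randomized interval between actions
            # to mimic a human; disabled so this sketch runs instantly.
            # time.sleep(random.uniform(1, 30))
    return log

history = run_bot(["post-1", "post-2", "post-3"])
# Three posts x two actions = six automated interactions.
```

That is the whole back-end: a loop, a stub, and a sleep. Everything hard about a bot lives elsewhere.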

The second piece of a bot is the human component. Someone needs to create an account for the bot, which means they need an email address, and perhaps a phone number. This can be scripted (automated), but in more sophisticated examples it is at least partially manual. The more detail, the more unique, the more an online account appears to be backed by a living, breathing human being. Conversely, accounts that are empty, or have what appear to be randomly generated names, bios, etc., are easy to spot as bots, particularly when they draw attention to themselves through out-of-context automated messages and actions.

The Bot Uprising

OK, so bots aren’t going to be taking over the world any time soon, even if it sometimes feels that way. What I mean by uprising is the promotion of bot content by social media platforms, which in turn validates it as “good” content for the masses. Bot creators have developed simple formulae (varying by platform) for doing this.

  • Create multiple (many) accounts, allow them to age (or take over older accounts).
  • Buy or use existing fake accounts to create a purportedly large following.
  • Follow the right crowd, ensure a healthy overlap with your target audience.
  • Set up automated retweets (or whatever) for key accounts, including all of your own accounts.
  • Post tons of regurgitated content.
  • Wait for a handful of your creations to rise to the top.
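The steps above can be sketched as a toy simulation. Everything here — the account objects, the mutual-follow graph, the naive retweet tally — is invented for illustration; real platforms weigh far more signals, but the arithmetic of manufactured engagement is the point:

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    name: str
    followers: set = field(default_factory=set)
    retweets: int = 0

def build_botnet(n):
    """Steps 1-3: create many accounts and have them all follow
    one another, manufacturing a purportedly large following."""
    bots = [Account(f"bot{i}") for i in range(n)]
    for a in bots:
        for b in bots:
            if a is not b:
                a.followers.add(b.name)
    return bots

def amplify(bots, posts_per_bot):
    """Steps 4-5: every bot 'retweets' every other bot's posts."""
    for author in bots:
        for booster in bots:
            if booster is not author:
                author.retweets += posts_per_bot
    return sum(bot.retweets for bot in bots)

bots = build_botnet(50)
total = amplify(bots, posts_per_bot=10)
# 50 accounts x 49 boosters x 10 posts = 24,500 manufactured retweets
```

Fifty cheap accounts posting ten recycled items each yields tens of thousands of engagement events — which is exactly the signal a ranking algorithm is built to reward.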

It works because they’re manufacturing the things a social media platform looks for, namely authority and engagement. This resembles the days when people would “Google bomb” Google search results. In fact, many social media ranking algorithms imitate how Google ranks search results.

Spot the Bot

Spotting a bot is easier said than done; heck, Twitter and Facebook struggle to distinguish bots from humans, let alone the U.S. media and the average Joe. You might be tempted to think of this as a cold war, as with many of these situations, but it’s not: social media platforms have little incentive to clean up bots. Bots add to user numbers, they drive engagement (divisive as it may be), and in some cases they even view ads, which is the bread and butter of these platforms. So how can you spot a bot, or at least learn to suspect that an account is being used solely to manipulate the world around it? Can it be automated? These guys seem to think so, but I tested their tool against four accounts that display similar behaviors and only two were flagged, so I’m not convinced it’s ready for prime time (though you should try it, as false positives are far rarer than false negatives).

In my opinion that’s not what we should be doing as consumers of these services. It doesn’t matter whether the account in question is fully automated, has a human at the helm, or is an actual person; what matters is whether the intent of the account is to unfairly influence public opinion, to influence your opinion without merit. When you take a step back and look at it from this perspective, it becomes much clearer.

  • These accounts have an absurd amount of content. I’m talking about hundreds of thousands of retweets, or likes, or shares, and it all revolves around a singular perspective.
  • The content these accounts post ranges from partial truths to outright lies. A lie containing a kernel of truth is the easiest to sell, of course.
  • They won’t cite reputable sources. Just the opposite, they will attempt to make reputable sources appear disreputable, untrustworthy.
  • They rarely if ever reply to their own content. The information is out there doing its intended work. As with the second point, they know the content won’t hold up to a fact-check, so why bother arguing about it?
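The four signals above can be phrased as a rough scoring function. The thresholds and field names below are arbitrary assumptions of mine, not a vetted detector — the point is only that intent-focused signals are concrete enough to check:

```python
def suspicion_score(account):
    """Score 0-4 from the four intent-focused signals above.
    `account` is a dict of hypothetical, illustrative stats."""
    score = 0
    # 1. Huge volume of content, all on a single perspective.
    if account["total_posts"] > 100_000 and account["distinct_topics"] <= 2:
        score += 1
    # 2. Content skews from partial truths to outright lies.
    if account["fact_check_failure_rate"] > 0.5:
        score += 1
    # 3. Never cites reputable sources (attacks them instead).
    if account["reputable_citations"] == 0:
        score += 1
    # 4. Broadcasts, but almost never replies to its own content.
    if account["reply_rate"] < 0.01:
        score += 1
    return score

firehose = {"total_posts": 250_000, "distinct_topics": 1,
            "fact_check_failure_rate": 0.8,
            "reputable_citations": 0, "reply_rate": 0.0}
# suspicion_score(firehose) -> 4: every signal fires
```

A faceted human account — varied topics, real sources, actual conversations — scores near zero, which is precisely the contrast the next paragraph draws.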

This boils down to a simple fact that I’ve alluded to already: real people are not just one thing, representing and showing that one thing to the world. Real people are faceted, whole individuals.

Josh Begleiter

Senior Manager, Engineering @ Salesforce, Full Stack Developer, Tech Writer, Husband, Father