Truth in an age of truthiness: when bot-fueled PsyOps meet internet spam
When artificial amplification becomes itself artificially amplified through the presence of spammers and opportunists, the cost to power for those who game the system in just the right way can be incredibly small.
I’m a recent ex-academic. Last spring, as evidence of possible Russian attempts to influence the US presidential election was starting to come out, one of my digital studies students asked me a great question: did I really think all the news about Russia, social media bots, propaganda, and the rest was connected? After all, one big Russian conspiracy to steal the US election — that’s pretty far-fetched.
My answer to her was this:
No, I don’t think there’s a single conspiracy behind it all. Rather, there are several conspiracies that, for a time at least, had overlapping goals and tactics.
What we were starting to see then, and what we’ve seen even more clearly since, is that whether or not there was formal collusion between Russian operatives and the Trump campaign or his allies, there were foreign attempts to influence the US election results. There were also new, digital campaign strategies that took advantage of the affordances and limitations of popular social media platforms — to varying degrees of ethical acceptability. There were digitally inclined true believers, including both moderates and extremists, who sought to persuade those in their social networks to vote a certain way.
And there were opportunists, grifters really, who took advantage of the money-making possibilities in what was likely the most click-baity American election to date.
The biggest problem here is not simply that people were making money off our insatiable appetite for Trump-centered or Clinton-centered conspiracy theories. It’s not even that there were foreign influence operations polluting our social platforms. The biggest problem, as I see it, is that when there are multiple operations on the same platforms, with similar goals, and similar tactics, the math behind the algorithmic news feeds causes each of these operations to magnify the effects of the others.
Think of junk mail or email spam. Five spamming or advertising operations means five times the spam messages. But in an algorithmically determined information stream, where the popularity of certain content boosts the frequency with which similar content appears in people’s news feeds, the effects of five simultaneous information operations can be anywhere from nil to several orders of magnitude more disinformation in our feeds. And we know that in 2016, the effects were not nil. Even though traditional news media consumption went up in 2016, a number of disinformation sources with significantly lower budgets and significantly lower prior audiences (not to mention significantly lower standards of truth) frequently outperformed the mainstream outlets in the latter days of the presidential election. This is, in part, because there were multiple simultaneous operations, each of which knew better than the Washington Post how to game the system, and each of which magnified the effects of the others when they were operating in the same algorithmic space.
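To make that contrast concrete, here is a toy sketch and nothing more. The engagement rates, boost factor, and exposure rule below are invented for illustration; no platform publishes its actual ranking math. The point is simply that additive spam scales linearly, while popularity-feedback amplification can range from a fizzle to an explosion.

```python
def additive_spam(operations, messages_per_op):
    """Junk-mail model: five spam operations simply means five times the messages."""
    return operations * messages_per_op

def feedback_model(operations, messages_per_op, engagement_rate, boost, rounds=10):
    """Toy popularity-feedback model (invented numbers, not any platform's real
    ranking): each round, exposures produce engagements, and engagements earn
    the content additional feed slots in the next round."""
    exposures = operations * messages_per_op   # initial seeding by the operations
    total = exposures
    for _ in range(rounds):
        engagements = exposures * engagement_rate
        exposures = engagements * boost        # engagement buys more exposure
        total += exposures
    return round(total)

print("additive junk mail:", additive_spam(5, 100))                                   # 500
print("weak feedback:     ", feedback_model(5, 100, engagement_rate=0.02, boost=5))   # barely more than additive
print("strong feedback:   ", feedback_model(5, 100, engagement_rate=0.05, boost=30))  # orders of magnitude more
```

The only difference between "nil" and "explosive" in that sketch is whether engagement feeds back above or below the break-even point, which is exactly why multiple operations working the same algorithmic space can multiply one another's reach.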
Almost all of the operations I’ve mentioned so far have been the subject of substantial investigation since 2016 — Bannon and Breitbart, Russia’s Internet Research Agency, bot-driven American extremists, and modern, digital electioneering. But one has received precious little attention: the ideology-agnostic opportunists. These aren’t skinheads, or terrorists, or Russian spies. There were no secret meetings between them and any campaign officials. They weren’t meme masters organizing on 4chan and the seedier subreddits. They often don’t even work in groups, as far as we know. These are Macedonian teenagers, Colombian loners, anyone, really, who is looking to make a quick buck tricking users into clicking on the wrong link, distributing good-old-fashioned malware, or maybe mining user data via a disposable “news” site.
But this isn’t traditional “clicker, beware!” internet spam. These operators generate clicks by capitalizing on polarizing web content, and their methods amplify the presence of that content. Most importantly, by amplifying that polarizing content, much of which is disinformation and misinformation, to garner clicks, they degrade our information landscape whether anyone clicks on the links or not. And while platforms like Twitter have gotten better in the past year at detecting and suspending these kinds of accounts, the nature of their operations is such that by the time the accounts are detected and deleted, the damage has often already been done.
Over the next few minutes, I’m going to walk you through a few examples of how this works, and explain why this is such an important issue for all of us who are concerned about information integrity.
The first example comes from last spring. My research colleague, Bill Fitzgerald, and I compiled a list of the domains most distinctive of right-leaning and left-leaning hashtags on Twitter. In the process, we discovered a number of clearly fake “news” domains polluting hashtags like #resist, #trumprussia, #blacklivesmatter, #pizzagate, #americafirst, and #maga. One of the top fake/malware sites in this tweet corpus is theatlantic.ga. This domain appeared 2,651 times in our corpus of about 600,000 tweets, more than TruthFeed (2,384 times), Breitbart (2,036 times), or the New York Times (1,816 times). (Note that theatlantic.ga, TruthFeed, and Breitbart all outperformed the New York Times on these hashtags.) In fact, only Twitter (89,218 times), YouTube (12,268 times), and a couple of link shorteners appeared more often than theatlantic.ga.
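For those curious about the mechanics, the tally itself is simple to reproduce. The sketch below assumes the collected tweets are stored one JSON object per line in a hypothetical file called hashtag_tweets.jsonl, with expanded URLs under entities.urls, the structure Twitter’s APIs returned at the time; your own collection pipeline may differ.

```python
import json
from collections import Counter
from urllib.parse import urlparse

domain_counts = Counter()

# hashtag_tweets.jsonl is a hypothetical filename: one tweet object per line,
# with URL entities as returned by Twitter's standard APIs at the time.
with open("hashtag_tweets.jsonl") as f:
    for line in f:
        tweet = json.loads(line)
        for url in tweet.get("entities", {}).get("urls", []):
            expanded = url.get("expanded_url") or url.get("url")
            if expanded:
                domain = urlparse(expanded).netloc.lower().removeprefix("www.")
                domain_counts[domain] += 1

# The most frequently shared domains across the corpus.
for domain, count in domain_counts.most_common(20):
    print(f"{count:7d}  {domain}")
```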
What is theatlantic.ga? The domain has since expired. But if you visited the domain directly a few months ago, it redirected you to adf.ly, a site whose motto is “Get paid to share your links on the Internet!” adf.ly (whose logo is a bee?!) offers services that help people sell advertising on their sites, “Pay for real visitors on your website,” and track aspects of the identity and behavior of those visitors.
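While the domain was live, the redirect was easy to verify: request the URL and follow the redirect chain to see where it lands. A minimal sketch using the requests library follows; the domain list is illustrative, and the check will fail today now that theatlantic.ga has expired.

```python
import requests

suspect_domains = ["http://theatlantic.ga/"]  # illustrative; the domain has since expired

for url in suspect_domains:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=10)
    except requests.RequestException as err:
        print(f"{url} -> unreachable ({err})")
        continue
    # resp.history holds each intermediate redirect; resp.url is the final landing page
    hops = [r.url for r in resp.history] + [resp.url]
    print(" -> ".join(hops))
```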
They also offer an API (application programming interface) that makes it easy to automate the process of generating and sharing links. The way it seems to work is similar to other link shorteners (like bit.ly), but instead of just making a long URL short and tracking clicks, it embeds ads along the way. Users who click on an adf.ly link see a brief ad on their way to the page they intended to go to, without seeing any additional ads on the actual page.
Businesses like adf.ly mean that it’s easy to advertise on your site with minimal technical knowledge. But it’s also easy to advertise on other people’s sites by sharing links to them via social media, and thus it’s possible to make money off of ads on other people’s sites without even having your own website. In fact, one of the user testimonials on their site says, “If you want make easy money, AdF.ly is the best website.” All you need is a Twitter account (or a bot, or an army of bots), and if you have an audience for those Twitter bots, you have people who will click on the links you provide, which will in turn generate “advertising” revenue for you, the bot owner. (Not, of course, the site owner.)
This sounds like a pretty vanilla internet problem — people using technology to make money off of other people’s content without putting the work in themselves. (And to be fair, adf.ly can be used to make money off of your own content. Especially if your content is hosted on a free platform that does not allow external ads.) The problem is the way services like this are used by some. In our dataset, we saw many tweets like this one.
On the left you can see the fake tweet, and on the right its source. What happened here is that The Federalist Papers, a mainstream if heavily partisan site, tweeted a link to their long-form article, along with a picture and a provocative title to encourage clicks. The ad network scraped that content and republished it from this “Hope to Trump” account (and a number of others), keeping the provocative title and the image but substituting their own link — built with adf.ly, to generate a few cents of advertising income.
Note the hashtag. The Federalist Papers has its own follower base, but Hope to Trump did not. In fact, Hope to Trump turned out to be a rather short-lived bot, suspended by Twitter within a week of my discovering it. Rather than do the work of building up a social-media following, these ad networks rely on hashtags to get their bot-driven content in front of users. In our dataset, we saw a number of extremist and hate-based hashtags being used, along with the more typically partisan ones, to get their ads in front of users without spending money on sponsoring tweets. This allows them to avoid the work of designing and building a website, paying for domain registration and hosting, editing their marketing materials, and even paying for ad placement. There is literally no overhead cost to this form of advertising, only a person’s time to put it together, which is likely dominated by finding new content sources and cycling in new bots when existing ones get blocked by Twitter. And since all the creative work is done by the people whose content they steal, most of what they have to do can be easily automated. It really is “a way to make easy money”.
But it’s not the money-making scheme itself that impacts our information environment; once users click the link, most of these schemes take them out of the political “news” arena entirely. When it comes to information integrity, what matters is the content of the tweets. This pair of tweets looks a lot like the kind of coordination we see in disinformation bot networks — groups of automated Twitter accounts sharing the same content simultaneously to different follower groups or hashtags.
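One way to surface this copy-and-substitute pattern in a collection like ours is to strip the links out of each tweet and then look for identical remainders posted by different accounts. A rough sketch, using the same hypothetical one-tweet-per-line file as before:

```python
import json
import re
from collections import defaultdict

URL_PATTERN = re.compile(r"https?://\S+")

def strip_links(text):
    """Drop URLs so tweets that differ only in their substituted link still match."""
    return URL_PATTERN.sub("", text).strip().lower()

copies = defaultdict(set)  # normalized tweet text -> accounts that posted it

with open("hashtag_tweets.jsonl") as f:
    for line in f:
        tweet = json.loads(line)
        key = strip_links(tweet.get("full_text") or tweet.get("text", ""))
        if key:
            copies[key].add(tweet["user"]["screen_name"])

# Text pushed verbatim by several distinct accounts, each with its own link,
# looks like the Federalist Papers / "Hope to Trump" pattern above.
for text, accounts in sorted(copies.items(), key=lambda kv: -len(kv[1]))[:10]:
    if len(accounts) >= 3:
        print(f"{len(accounts)} accounts: {text[:80]}")
```

Whether or not anyone clicks, the result is the same content, artificially amplified across hashtags and follower groups.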
This artificially amplified content can be incredibly dangerous, particularly when that content is polarizing, heavily biased, or deliberately misleading. Both Mike Caulfield and I have written about the effects of casual scrolling on social media. When we see the same claim repeated often enough, especially when our guard is down, we slowly become more predisposed to believe it. In an age of information abundance and constantly breaking news, “truthy lies” function almost identically to truth. The sum of all this is that the most polarizing claims are often the most repeated by the ad-bots, and as we see them in aggregate over time, they become more plausible to our unconscious minds. Further, once a friend, relative, or colleague we trust shares that same claim — perhaps from the original site the spammers stole content from — we will be that much more likely to believe it. Thus, a cheap money-making scheme gives the more polarizing sites more psychological power, and that lays the foundation for more purposeful — and successful — propaganda campaigns.
Let’s fast-forward to this spring. After the shooting in Parkland, Florida, students organized a March for Our Lives in Washington, D.C., and elsewhere. I collected over 3 million tweets related to the March for Our Lives from the two weeks preceding the march to two days after the march.
During that time, the top two accounts were RaulOrozco (2,766 MFOL tweets) and ClassicDeepCuts (1,792 MFOL tweets).
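Finding the most active accounts in a collection like this is straightforward; here is a sketch, again assuming a hypothetical one-tweet-per-line file, this time mfol_tweets.jsonl.

```python
import json
from collections import Counter

account_counts = Counter()

# mfol_tweets.jsonl is hypothetical: the collected March for Our Lives tweets,
# one JSON object per line.
with open("mfol_tweets.jsonl") as f:
    for line in f:
        tweet = json.loads(line)
        account_counts[tweet["user"]["screen_name"]] += 1

for screen_name, count in account_counts.most_common(10):
    print(f"{count:6d}  @{screen_name}")
```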
The RaulOrozco account (represented by a Debbie Gibson headshot from the 80s) is still active, with almost 100,000 tweets to date. Most of these tweets are promoting YouTube videos of animals up for adoption, with a long list of hashtags in each tweet to attempt to gain visibility. Here are some examples.
ClassicDeepCuts takes a different tack.
Instead of repeating the same hashtags with different content, the account’s 335,000 tweets are dominated by a single image, with each tweet varying, to some extent, the included hashtags.
While the account has the appearance of an activist account, its name and the website linked from its bio suggest that it is using activist theming to attract visitors to classicdeepcuts.com, in the hopes of generating revenue from music fans.
I also uncovered a network of accounts in the March for Our Lives tweets, claiming to be from Bangladesh, that used the march (and many other events before and since) to push their graphic design business. While some of these tweets seem targeted at protesters perhaps in need of quality signage for the rally, most of these tweets are nothing more than hashtag scattershot: attempts to use trending topics to gain free advertising for an online business.
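A crude heuristic catches most of these hashtag-scattershot accounts: rank high-volume authors by how many hashtags they average per tweet. A sketch, with the same hypothetical input file and an arbitrary 50-tweet floor:

```python
import json
from collections import defaultdict

hashtag_totals = defaultdict(int)
tweet_totals = defaultdict(int)

with open("mfol_tweets.jsonl") as f:
    for line in f:
        tweet = json.loads(line)
        name = tweet["user"]["screen_name"]
        tweet_totals[name] += 1
        hashtag_totals[name] += len(tweet.get("entities", {}).get("hashtags", []))

# High-volume accounts that average many hashtags per tweet are good candidates
# for the hashtag-scattershot pattern described above.
candidates = [
    (hashtag_totals[name] / tweet_totals[name], tweet_totals[name], name)
    for name in tweet_totals
    if tweet_totals[name] >= 50  # arbitrary floor to ignore casual users
]
for avg_tags, n_tweets, name in sorted(candidates, reverse=True)[:10]:
    print(f"@{name}: {n_tweets} tweets, {avg_tags:.1f} hashtags per tweet")
```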
Why does any of this matter? As long as we’ve had networked communication — social media, email, telephones, even postal mail — we’ve had unwanted, unsolicited messaging. Surely these accounts are fooling few people, and most of them aren’t even spreading disinformation.
But there’s something different about social-media spam. Just as stolen and reproduced content can boost the topics embedded in the stolen content, unique content with a co-opted hashtag or trending phrase can boost that hashtag or phrase algorithmically. While Twitter has taken steps to prevent a trend being started by bots alone, if a topic or hashtag begins to rise in popularity organically, bot-driven amplification does run the risk of artificially boosting its popularity. And once it crosses the “trending” threshold, its reach is instantly magnified many times over. Further, if artificial amplification continues after it reaches “trending” status, it will likely stay on the trend list longer, again increasing the audience.
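Again, a toy sketch makes the threshold dynamic visible. Twitter’s actual trending criteria are not public, so the threshold, volumes, and multiplier below are invented purely for illustration.

```python
def simulate_hashtag(organic_per_hour, bot_per_hour, hours=12,
                     trending_threshold=5000, trending_boost=1.3):
    """Toy model: once hourly volume crosses an (invented) trending threshold,
    the extra visibility multiplies subsequent organic use."""
    volumes = []
    organic = organic_per_hour
    for _ in range(hours):
        volume = organic + bot_per_hour
        if volume >= trending_threshold:
            organic *= trending_boost  # trend-list exposure drives more organic tweets
        volumes.append(round(volume))
    return volumes

print("organic only:  ", simulate_hashtag(organic_per_hour=4000, bot_per_hour=0))
print("organic + bots:", simulate_hashtag(organic_per_hour=4000, bot_per_hour=1500))
```

The bot traffic in that sketch never needs to be large; it only needs to be enough to tip an organically rising topic over the line and keep it there.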
Perhaps more important, though, if a campaign is already being boosted artificially, the presence of spam bots in and around that campaign will not only amplify the content of that campaign, it will amplify the effects of the other amplifiers. This not only pollutes public discourse, it also makes it easier under the right circumstances for propaganda to spread.
Harold D. Lasswell wrote that the function of propaganda is to reduce the material cost of power. On a social-media platform, that cost-reduction comes in many forms. By their very existence, the platforms already reduce both the labor and the capital required to access both information and an audience. Automated accounts further reduce the cost of power, for those who know how to game the algorithm and evade detection long enough to carry out a campaign.
But when artificial amplification becomes itself artificially amplified through the presence of spammers and opportunists, the cost to power for those who game the system in just the right way can be incredibly small. For those of us studying the digital information landscape, whether we seek to understand it or to effect positive change in it, it is essential that we understand all of the ways in which messages can be amplified — and the effects those methods can have on each other when they overlap.
The promise of the internet, and Web 2.0, and social media, was democratization — a leveling of the playing field that gave everyone equal access to an audience. But information over-abundance has altered the way we engage with information on these platforms. We’ve moved from an information economy to an attention economy. Those who thrive on social media in 2018 are the ones who can manage attention — their own, and that of others. And for those who really know how to game that system, the democratic, many-to-many landscape can look a lot like one-to-many broadcast media. And once that happens, previous generations’ disinformation methods can transition to the new media with relative ease.
If there is a grand Russian conspiracy, I think that’s it. By gaming the system in just the right way, social media can be turned into something that looks a lot like a one-to-many broadcast platform, and can be manipulated in similar ways. Except this one has a global reach, and minimal FCC oversight. But again, it’s not about the Russians. We live in an information landscape where multiple forces are warring for our attention, where the platforms are designed to foster addictive consumption, and where casual information engagement, not critical thought, is the norm.
If we want to reclaim the integrity of this space, and of public discourse in general, we need to start by acknowledging the forces at work, as well as the affordances and limitations of the platforms we use. Only then can we begin to grasp the fullness of the propaganda problem, and the profitability of disinformation, misinformation, and good old-fashioned spam. Whether our goal is better platforms, better policies, or greater awareness of just what conspiracies we are (and aren’t) dealing with, that’s where it begins.
Originally delivered at the American Association of Public Opinion Research 2018 Annual Meeting and published at pushpullfork.com on May 17, 2018.