#DemDebates Data: Hashtag Hijacking, Antivax Disinfo, Cyborgs and Godmen
Less than a year out from the 2020 presidential election, campaign season is well underway in the United States — both offline and online. In an effort to understand social media discussion around the election, our team at the Digital Intelligence Lab at Institute for the Future monitored and analyzed discussion on Twitter surrounding both the October and September Democratic debates. This analysis was also featured in the Washington Post.
Our analysis revealed several interesting instances of astroturfing and disinformation — these include cyborg activity (accounts that mix human and bot activity) promoting certain candidates, hashtag hijacking to spread antivax disinformation, and a botnet alleging the innocence of a jailed “godman” in the state of Haryana, India.
This data set also revealed abuse of the Twitter Ads Composer API to promote antivax disinformation during the October debate. While coordinated manipulation of social media to promote antivax disinformation is not a new phenomenon, this particular tactic appears to be a new form of platform abuse for disseminating disinformation.
The TLDR — Top-line Insights
- Antivax disinformation was promoted through the Twitter Ads API during the October debate alongside #DemDebates hashtags. These tweets were subsequently promoted hundreds of times by bots and hyperactive, suspicious users. Many of these tweets (166/361, or 46%) came from bot accounts. Forty-nine of the retweeting users (22%) average over 100 tweets per day. The fact that the original 52 antivax posts came through the Twitter Ads API and occurred within one hour suggests that this disinformation may have been disseminated automatically.
- A coordinated botnet of 505 users surfed on the #DemDebates hashtag, attempting to allege the innocence of a jailed “godman” in Haryana, India. Rampal Singh, referred to by followers as “Sant Rampal Ji,” is a former engineer turned “godman” (guru) in India who was jailed in 2014 for contempt of court and several alleged murders. A coordinated botnet used hashtags and account names of participants relevant to the debate to promote websites linked to Singh’s ashram. These websites praise Singh’s holiness and allege his innocence — the botnet regularly tries to co-opt trending topics to promote Singh and his websites. Eleven users in this set average over 500 tweets per day; one account, @krishna25383438, posts an average of 1,429 tweets per day.
- The most highly cited Facebook URL in the October data was a conspiratorial post from an “alternative” media outlet. This post promotes several disinformation theories, including smearing Joe Biden and Victoria Nuland, a former senior State Department employee and frequent target of Russian disinformation. The post itself also appears to have been artificially promoted on Facebook — with over 200 shares from only 3 users.
- On average, 10.7% of the users tweeting any given hashtag during the October debate showed signs of automation. This figure is the mean share of bot users across the 31,724 hashtags collected in our October data set.
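The per-hashtag average in the last bullet can be sketched as follows. This is a minimal illustration rather than the lab's actual pipeline: the record layout (`user`, `hashtags`, `bot_score`) and the sample values are hypothetical, and the only detail taken from this post is the convention of treating a Botometer score above 50% as a bot.

```python
from collections import defaultdict

BOT_THRESHOLD = 0.5  # convention from this post: a score above 50% counts as a bot


def mean_bot_share(tweets):
    """Average, across hashtags, of the share of distinct users flagged as bots."""
    users_by_tag = defaultdict(dict)  # hashtag -> {user: bot_score}
    for tweet in tweets:
        for tag in tweet["hashtags"]:
            users_by_tag[tag.lower()][tweet["user"]] = tweet["bot_score"]

    shares = []
    for scores in users_by_tag.values():
        bots = sum(1 for score in scores.values() if score > BOT_THRESHOLD)
        shares.append(bots / len(scores))
    # Unweighted mean: every hashtag counts equally, however few users used it.
    return sum(shares) / len(shares) if shares else 0.0


# Hypothetical records; the field names are assumptions, not a real API schema.
sample = [
    {"user": "a", "hashtags": ["DemDebate"], "bot_score": 0.9},
    {"user": "b", "hashtags": ["DemDebate"], "bot_score": 0.1},
    {"user": "c", "hashtags": ["DemDebate", "YangGang"], "bot_score": 0.2},
    {"user": "d", "hashtags": ["YangGang"], "bot_score": 0.3},
]
print(mean_bot_share(sample))
```

Because the mean is unweighted, a hashtag used by two accounts influences the result as much as one used by thousands, which is worth keeping in mind when reading the 10.7% figure.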
Methodology and Data Set
Our lab streamed live Twitter data during the debates in both September and October. This included a 48-hour stream of the Sept. 12th debate and a 72-hour stream during the Oct. 15th debate. For full details on our methodology and data sets, see the Methodology and Data Set Appendix at the end of the post.
Vaccines Don’t Work — Buy a Shirt & Hat Today!
One of the more nefarious campaigns in the data was a hashtag hijacking campaign that promoted anti-vaccination disinformation using hashtags relating to the October debates. The core of these tweets came from a single user, @45HammerTime, who used the debate to promote a website that sells antivax merchandise. The goal of the campaign appeared to be earning profit by attracting attention from users watching the #DemDebates hashtag.
During peak hours of the debate on October 15th, this user posted antivax disinformation alongside Democratic debate hashtags. Using the Twitter Ads Composer API, this user posted 52 antivax tweets in one hour — they were subsequently retweeted hundreds of times by bot accounts¹ and hyperactive users² who average over 100 tweets per day.
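One way to surface a burst like this in a tweet archive is a sliding-window count over a single account's timestamps. The sketch below is illustrative only: the timestamps are synthetic, the function name is hypothetical, and the only figure taken from the post is 52 tweets within one hour.

```python
from datetime import datetime, timedelta


def max_tweets_in_window(timestamps, window=timedelta(hours=1)):
    """Return the largest number of tweets that fall within any span of `window`."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end, current in enumerate(times):
        # Advance the left edge until the span from times[start] to current fits.
        while current - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Synthetic timestamps: 52 tweets one minute apart, plus one stray tweet later.
first_tweet = datetime(2019, 10, 15, 20, 0)
burst = [first_tweet + timedelta(minutes=i) for i in range(52)]
stray = [first_tweet + timedelta(hours=5)]
print(max_tweets_in_window(burst + stray))  # 52
```

Running this per account and per tweet client would separate a short, dense burst from an account that is merely active all day.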
Researchers including Erin Gallagher, Jillian C. York, and Brian Krebs have extensively documented instances of hashtag hijacking in the past. This term refers to repurposing a political hashtag with nefarious intent. The goal of these campaigns is often to undermine the original use of the hashtag. In Mexico and Syria, protest hashtags have been flooded with irrelevant content to undermine their use as an organizing tool. Similarly, trolls have previously hijacked hashtags for human rights conferences to flood them with harassment and disinformation. Previous research has also highlighted abuse of the Twitter Ads interface to spread unlabeled political ads during the EU Parliamentary elections in 2019.
Twitter first took action against antivax disinformation on its platform in 2019, following other tech platforms that had done the same. While the approach aims to direct users to scientific studies showing the proven benefit of vaccines, it stops short of removing all antivax disinformation from the platform.
Holy Hijacking — Indian botnet co-opts trending hashtags to whitewash jailed religious leader
Other users coordinated attempts to hijack the #DemDebate4 hashtag to promote a jailed religious leader from India. A large botnet of 505 users spread disinformation completely unrelated to the US debates in praise of a jailed and discredited religious leader, Rampal Singh.
Singh is a former engineer who labeled himself a “godman” and gained a large following in India’s northern state of Haryana in 1999. He subsequently founded the Satlok Ashram, where he used tricks to appear godlike to his followers, and several deaths occurred. The BBC has extensively documented his history, including his conviction on multiple charges of murder and contempt of court in 2014.
This botnet regularly co-opts trending topics to promote Singh and his websites. Eleven users in this set average over 500 tweets per day; one account, @krishna25383438, posts an average of over 1,400 tweets per day. Several groups of accounts in this set also posted identical tweets using the #DemDebate and #DemDebate4 hashtags. These posts are identically worded, original tweets (not retweets) coming from different accounts (1, 2, 3, 4; 1, 2). This is also a tactic that DigIntel recently observed the Chinese government using.
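Identically worded original tweets from different accounts can be found by grouping on normalized text. The following is a minimal sketch under assumed field names (`user`, `text`, `is_retweet`) with made-up sample data; it is not the method used to produce the findings above.

```python
from collections import defaultdict


def find_copied_tweets(tweets, min_accounts=2):
    """Map each original-tweet text to the accounts that posted it verbatim."""
    accounts_by_text = defaultdict(set)
    for tweet in tweets:
        if tweet["is_retweet"]:
            continue  # retweets repeat text by design, so only originals count
        # Collapse whitespace and case so trivial variations still match.
        normalized = " ".join(tweet["text"].lower().split())
        accounts_by_text[normalized].add(tweet["user"])
    return {
        text: users
        for text, users in accounts_by_text.items()
        if len(users) >= min_accounts
    }


# Hypothetical sample; account names and texts are invented for illustration.
sample = [
    {"user": "acct_1", "text": "Must read  #DemDebate4", "is_retweet": False},
    {"user": "acct_2", "text": "must read #DemDebate4", "is_retweet": False},
    {"user": "acct_3", "text": "must read #DemDebate4", "is_retweet": True},
    {"user": "acct_4", "text": "Watching the debate tonight", "is_retweet": False},
]
print(find_copied_tweets(sample))
```

Exact matching after normalization is deliberately strict; near-duplicates with small edits would need a fuzzier comparison.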
Previous analyses from Graphika indicate that the botnet is a densely interconnected and isolated network in the Indian Twittersphere that nearly exclusively tweets religious praise of Rampal Singh. Tweets from DigIntel’s October debate data set corroborate these previous analyses. These fake accounts frequently post the links jaguratrampalji.com and supremegod.org, two websites promoting Singh’s ashram. According to the BBC, this ashram dismisses over 40 cases of murder and contempt of court against Singh as “false and fabricated”.
While online disinformation is often thought of as the province of nation-state actors, we can increasingly expect to see it used by private citizens for the purposes of PR or whitewashing past crimes, as we see here with the Rampal Singh botnet. This set of artificial users illustrates a relatively new disinformation strategy: private actors strategically deploying messaging campaigns online as a form of reputation management. Targeted hacking of journalists investigating private individuals in the public interest is also an outgrowth of the same impulse. Ahead of breaking Pulitzer Prize-winning stories detailing allegations of sexual assault against Harvey Weinstein, The New Yorker’s Ronan Farrow was notably hacked and stalked by Israeli firm Black Cube.
#YangGang Cyborg Activity
While bots are an astroturfing tactic that has been covered extensively in past years, they are not the only form of automated agent online. Cyborgs are accounts that are partially automated — sometimes run by a human and sometimes by a computer program, or even run by both at the same time. As tech platforms grow more aggressive in removing bots from their platform, bot builders and operators refine evasion tactics to avoid suspension. Cyborg accounts are one such evasion tactic. One user in our dataset illustrated this particularly well.
@McYangin, a recently created anonymous account that mostly promotes Democratic presidential candidate Andrew Yang, showed cyborg activity during the month of September. This user automated retweets of text mentioning Andrew Yang or the hashtag #YangGang. Most retweets from this account in September come from a custom tweet client named “Yang RT Bot”.
A tweet client (also referred to as tweet “source”) is simply the place a user sends a Tweet from — an iPhone Twitter app, Android Twitter app, or web browser are all possible places from which a user can post a tweet. A user can also use a third-party client, such as TweetDeck, to send tweets. Yet another option is for a user to write their own software to post tweets for them — this software can be customized to only tweet under certain conditions, such as scanning Twitter for mentions of Andrew Yang and retweeting any that are found.
This is indeed exactly what was seen in the case of @McYangin, a user who was highly active during the September debates. Original tweets from this user tended to come from a browser, as indicated by the “Twitter Web App” client in the metadata for these tweets. This most likely indicates³ that the owner of the account was composing and sending tweets from a web browser on occasion.
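A per-client breakdown of an account's tweets makes this kind of split visible. The sketch below assumes each tweet record already carries a plain client name in a `source` field and an `is_retweet` flag; both field names and the sample data are hypothetical. The two client names are the ones mentioned in this post.

```python
from collections import Counter

# Assumption: clients treated as hand-operated. Only this one is named in the post.
MANUAL_CLIENTS = {"Twitter Web App"}


def client_breakdown(tweets):
    """Count tweets per (client, kind), where kind is 'original' or 'retweet'."""
    counts = Counter()
    for tweet in tweets:
        kind = "retweet" if tweet["is_retweet"] else "original"
        counts[(tweet["source"], kind)] += 1
    return counts


def mixes_manual_and_custom(tweets):
    """Crude cyborg signal: the account posts from both a manual and another client."""
    sources = {tweet["source"] for tweet in tweets}
    return bool(sources & MANUAL_CLIENTS) and bool(sources - MANUAL_CLIENTS)


# Hypothetical sample for a single account.
sample = [
    {"source": "Yang RT Bot", "is_retweet": True},
    {"source": "Yang RT Bot", "is_retweet": True},
    {"source": "Yang RT Bot", "is_retweet": True},
    {"source": "Twitter Web App", "is_retweet": False},
]
print(client_breakdown(sample))
print(mixes_manual_and_custom(sample))  # True
```

This check is only a starting point: it would also flag a person who alternates between a browser and a legitimate third-party client such as TweetDeck, so any hit still needs manual review.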
While the idea of automating candidate retweets may seem innocuous, the traffic a single client can generate can have an overall effect on driving conversation or trends. While the YangGang RT Bot client appears to only have been active in September, this bot retweeted items relating to Andrew Yang over 2,600 times that month. These tweets have generated over 15,000 impressions⁴. Only months away from the beginning of the Democratic primaries, the impact of this amount of traffic and impressions is non-negligible.
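The impressions figure uses the definition given in the footnotes of this post: a tweet's likes plus its retweets. A minimal sketch of that sum, with hypothetical field names and invented counts:

```python
def impressions(tweet):
    """Impressions as defined in this post: likes plus retweets."""
    return tweet["likes"] + tweet["retweets"]


def total_impressions(tweets):
    """Sum impressions over a collection of tweets."""
    return sum(impressions(tweet) for tweet in tweets)


# Invented counts, for illustration only.
sample = [
    {"likes": 4, "retweets": 2},
    {"likes": 0, "retweets": 1},
    {"likes": 10, "retweets": 3},
]
print(total_impressions(sample))  # 20
```

Note that this is an engagement-based proxy; it does not measure how many users actually saw a tweet.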
Transparent Political Bots and Hyperactivity
Not all bots in our data set were attempting to imitate humans. Many users in our set were “transparent” bots: accounts that overtly state that they are automated. Most of these accounts use custom tweet clients (i.e. bot software) to automate tweets relating to a particular candidate. One retweets mentions of any Democratic candidate.
@bot420political: This unsophisticated bot tweets from its own custom client called “PoliticalBot420” and calls for Biden to drop out of the presidential race. #kag2020, #americafirst, #dropoutbiden are among its most frequent hashtags. It was only active in September and posted 21 tweets during the debate that are nearly all identical: “#DropOutJoe #DropOutBiden #NeverBiden #Biden2020”. The account posted 298 tweets in the month of September, several of them in the same second.
Many of these transparent bots also figured among the most hyperactive users in our October dataset. Previous researchers, including researchers at the Oxford Internet Institute and disinformation expert Ben Nimmo, have noted that accounts averaging over 50 tweets per day are often automated or artificial users. In this data set, we’ve set a higher threshold for suspicious activity, defining a hyperactive account as one that posts over 100 tweets during the streaming period and averages over 100 tweets per day over the account’s lifetime. There are 301 accounts in our data set meeting these criteria. Two of these accounts, @ShirleyRinguet5 and @alllibertynews, even average over 1,000 tweets per day.
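The two-part hyperactivity test described above can be sketched like this. The function names, argument names, and sample account are hypothetical; only the two thresholds of 100 come from this post.

```python
from datetime import datetime, timezone

THRESHOLD = 100  # cutoff used in this post, for both conditions


def lifetime_daily_average(total_tweets, created_at, as_of):
    """Average tweets per day since account creation (at least one day)."""
    days = max((as_of - created_at).total_seconds() / 86400, 1.0)
    return total_tweets / days


def is_hyperactive(total_tweets, created_at, tweets_in_stream, as_of):
    """True if the account exceeds both the stream and the lifetime thresholds."""
    return (
        tweets_in_stream > THRESHOLD
        and lifetime_daily_average(total_tweets, created_at, as_of) > THRESHOLD
    )


# Hypothetical account that is exactly ten days old at the reference time.
as_of = datetime(2019, 10, 17, tzinfo=timezone.utc)
created = datetime(2019, 10, 7, tzinfo=timezone.utc)
print(lifetime_daily_average(14_290, created, as_of))  # 1429.0
print(is_hyperactive(14_290, created, 150, as_of))     # True
print(is_hyperactive(14_290, created, 80, as_of))      # False
```

Requiring both conditions filters out accounts with a high lifetime average that barely took part in the debate conversation, as well as long-dormant accounts that were briefly active during the stream.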
Conclusions and Takeaways
Conversation around the Democratic debates on Twitter continues to show signs of astroturfing. Cyborgs and bots were both observed in the data DigIntel collected around the September and October debates. Disinformation was also present. Twitter’s own ads client was abused to spread antivax disinformation in tweets during the October Democratic debate, which were retweeted hundreds of times by suspicious, hyperactive accounts. Bots also hijacked the Democratic debates hashtag to allege the dubious innocence of a repudiated religious leader in India. This botnet also highlights that disinformation may be increasingly deployed in the future by private actors as a form of reputation management.
Perhaps most importantly, the presence of bots, both transparent and not, promoting candidates on Twitter drives home the increasingly democratized and normalized nature of political automation. Until regulation states otherwise, we can expect to see more astroturfing and artificial promotion of candidates online — often from real users who want to support a candidate they are excited about. This behavior may seem innocuous, but its danger lies in its ability to “manufacture consensus” and influence other users’ perception of organic discussion of politics. This is, for the time being, simply politics as usual. What is new in 2020 is the fact that grassroots supporters themselves may be driving this traffic.
Methodology and Data Set Appendix
Third Democratic Debate, September 12: We streamed tweets relevant to the third Democratic debate in September for 48 hours, from midnight on Sept. 12 to 11:59 pm on Sept. 13. This data set consisted of 3,008,958 tweets from 695,094 users. Our stream collected tweets that contained any of the hashtags or mentions in the following query:
Fourth Democratic Debate, October 15th: We streamed data relating to the fourth Democratic debate for 72 hours, from midnight on October 14th to 11:59 pm on October 16th. The dataset consisted of 2,855,505 tweets from 667,950 users. The query we streamed included mentions of the handles of debate participants, hashtags of participants’ names, and hashtags relating to the debate itself:
#DemDebate, #DemDebate4, @TulsiGabbard, @TulsiPress, @TomSteyer, @SenSanders, @BernieSanders, @JoeBiden, @SenWarren, @ewarren, @KamalaHarris, @SenKamalaHarris, @PeteButtigieg, @AndrewYang, @CoryBooker, @SenBooker, @BetoORourke, @RepBetoORourke, @JulianCastro, #BernieSanders, #JoeBiden, #ElizabethWarren, #TulsiGabbard, #KamalaHarris, #PeteButtigieg, #AndrewYang, #CoryBooker, #BetoO’Rourke, #JulianCastro, #AmyKlobuchar, #Tulsi, #TomSteyer
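As a rough offline approximation of how a query like this selects tweets, the sketch below checks whether a tweet's text contains any tracked hashtag or mention. It is not the streaming API's own matching logic; the function names and sample texts are hypothetical, and only a few terms from the query above are used.

```python
import re

# Matches hashtags and mentions made of word characters only.
TOKEN = re.compile(r"[#@]\w+")


def parse_query(query):
    """Split a comma-separated query into a set of lowercase terms."""
    return {term.strip().lower() for term in query.split(",") if term.strip()}


def matches_query(text, terms):
    """True if the tweet text contains any tracked hashtag or mention."""
    tokens = {token.lower() for token in TOKEN.findall(text)}
    return not tokens.isdisjoint(terms)


terms = parse_query("#DemDebate,#DemDebate4,@TulsiGabbard, @TulsiPress, @AndrewYang")
print(matches_query("Big night for @andrewyang #YangGang", terms))  # True
print(matches_query("Nothing to see here #DemDebates", terms))      # False
```

Terms containing punctuation, such as an apostrophe, would never match with this simple pattern and would need separate handling.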
¹ We’ve used the academic convention of considering “bots” to be those accounts which are classified with a greater than 50% chance of being a bot in Botometer, an open-source bot classification tool.
² We’ve defined hyperactive accounts as those averaging over 100 tweets per day since their creation date.
³ While bots can be programmed to write and compose tweets from a browser, this is a much rarer occurrence on Twitter than bots that operate through the API. It takes significantly more technical skill to deploy a bot that operates outside the API. This and other investigative data led us to conclude that original tweets from @McYangin were likely written by a human.
⁴ To calculate a tweet’s impressions, we sum its number of likes and retweets.