Can Elections Be — Bot?
🗣The megaphonic power of social networks
Towards the end of the recent election, there seemed to be “bots” everywhere; I decided to take an in-depth look at these activities on Twitter during the last week of the 2016 election with IBM’s Watson Analytics. As with my other work, I took a network analysis approach rather than a content-based one, to get a sense of the structural impact these prolific accounts have on the amplification of certain voices on Twitter.
This analysis is based on the “pro-Trump” hashtag list from Philip Howard’s political bots work at the Oxford Internet Institute. I split the hashtags into two groups of ten due to Watson’s collection limits, and then filtered the dataset to include only the top 20 percent of accounts by highest number of total tweets. I then used this data, consisting of close to 50,000 tweets, to generate the following hashtag “activity networks.” Most of the data in this study is from the 1st to 8th of November 2016, which corresponds to the final week of the 2016 election.
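The filtering step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Watson Analytics workflow itself: the record layout (`user`, `total_tweets`), the function name, and the sample handles are all hypothetical.

```python
import math

def top_accounts_by_volume(tweets, fraction=0.20):
    """Keep only tweets from the top `fraction` of accounts by lifetime tweet total.

    `tweets` is a list of dicts with hypothetical keys "user" and
    "total_tweets" (the account's lifetime total when the tweet was posted).
    """
    # Use the largest lifetime total observed for each account.
    totals = {}
    for t in tweets:
        totals[t["user"]] = max(totals.get(t["user"], 0), t["total_tweets"])

    # Rank accounts from most to least prolific and keep the top slice.
    ranked = sorted(totals, key=totals.get, reverse=True)
    keep_n = max(1, math.ceil(len(ranked) * fraction))
    keep = set(ranked[:keep_n])

    return [t for t in tweets if t["user"] in keep]

# Made-up sample: five accounts, so the top 20 percent is a single account.
sample = [
    {"user": "acct_a", "total_tweets": 2_400_000},
    {"user": "acct_a", "total_tweets": 2_400_010},
    {"user": "acct_b", "total_tweets": 300_000},
    {"user": "acct_c", "total_tweets": 1_200},
    {"user": "acct_d", "total_tweets": 450},
    {"user": "acct_e", "total_tweets": 90},
]
filtered = top_accounts_by_volume(sample)
```

Ranking by lifetime tweet total rather than by tweets inside the sample is a deliberate choice here: it surfaces accounts that are prolific overall, even if only a few of their posts landed in the collected data.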
Group 1: #benghazi, #AmericaFirst, #CrookedHillary, #DrainTheSwamp, #lockherup, #PodestaEmails, #projectveritas, #riggedelection, #wakeupamerica, #maga3x
This exploratory network analysis is meant to:
- Find the most active suspected automated and “semi-automated” accounts across “pro-Trump” political hashtags the last week of the election; and
- Get a sense of how these political “bot-like” accounts engage, as well as visualize how they might be connected to certain topics.
This data represents the bulk of tweets for the less active “Pro-Trump” hashtags such as #wakeupamerica and #projectveritas, and a fair amount of tweets for the more popular hashtags such as #Trump and #draintheswamp. Sample size was not a limiting factor here, because the worst offenders tweet so often that they are likely to be found in almost any sample during the last week of the election. No account in this analysis had fewer than 192,000 tweets at the time of data collection (12 Dec 2016), and the highest volume account had close to 2.5 million tweets.
The breakdown in the tree chart above shows the top 20% of individual accounts by total tweets for each hashtag. For the first group, #draintheswamp, #podestaemails, and #crookedhillary (circled) appear to be the hashtags with the highest number of potential “automated” or “semi-automated” accounts, though #wakeupamerica appears to have a prolific poster in Cary888888 (the large blue block in the graph above).
The following hashtag “activity networks” involve accounts with hundreds of thousands of tweets posting during the last few days of the 2016 election. When accounts post tens of thousands of tweets over the course of a few weeks, it’s hard to understand how they could not be automated in some form or another. After my first round of analysis, I ran the accounts in this study through Truthy’s BotOrNot, and also collected data related to the frequency and source of their posts. The results strongly suggest that the majority of accounts in this study are likely to be at least partially automated.
Each instance of an account name in the hashtag “co-activity” graphs (see example below) represents a single tweet; the number displayed to the right of each user handle is the total number of tweets for that account at the time the tweet was posted.
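One way to think about the structure behind these graphs is as a weighted edge list between accounts and hashtags, where each tweet adds one to the weight of an account-to-hashtag edge. The sketch below is my own minimal illustration, assuming hypothetical tweet records with `user` and `hashtags` fields; it is not how Watson Analytics builds its charts.

```python
from collections import Counter, defaultdict

def activity_network(tweets):
    """Count tweets per (account, hashtag) pair; each tweet counts once per hashtag."""
    edges = Counter()
    for t in tweets:
        # Lowercase and de-duplicate so "#MAGA #maga" in one tweet counts once.
        for tag in {h.lower() for h in t["hashtags"]}:
            edges[(t["user"], tag)] += 1
    return edges

def hashtag_spread(edges):
    """Summarize, per account, how many hashtags it touches and how concentrated it is."""
    per_account = defaultdict(Counter)
    for (user, tag), n in edges.items():
        per_account[user][tag] += n

    summary = {}
    for user, tags in per_account.items():
        total = sum(tags.values())
        top_tag, top_n = tags.most_common(1)[0]
        summary[user] = {
            "hashtags": len(tags),         # number of distinct hashtags used
            "top_tag": top_tag,            # most-used hashtag
            "top_share": top_n / total,    # share of activity on that hashtag
        }
    return summary

# Made-up sample: one account spread across tags, one focused on a single tag.
sample = [
    {"user": "acct_a", "hashtags": ["#MAGA"]},
    {"user": "acct_a", "hashtags": ["#tcot"]},
    {"user": "acct_a", "hashtags": ["#Trump"]},
    {"user": "acct_b", "hashtags": ["#Trump"]},
    {"user": "acct_b", "hashtags": ["#trump", "#Trump"]},
    {"user": "acct_b", "hashtags": ["#Trump"]},
]
edges = activity_network(sample)
spread = hashtag_spread(edges)
```

An account with a high `hashtags` count and a low `top_share` is spreading its activity across the conversation, while a `top_share` near 1.0 indicates an account concentrated on a single hashtag.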
From the evidence shown above, these “highly active” accounts appear to differ in posting strategy: some are active across a variety of popular “pro-Trump” hashtags (e.g., gerfingerpoken and UThornsrawk), while others appear to be more concentrated on individual hashtags (e.g., paparcura+#Trump). The following graphs show isolated views for the most “active” accounts in the first group of “pro-Trump” hashtags: UThornsrawk, JVER1, and ImmoralReport.
Interestingly, I found that UTHornsRawk had been singled out in a 2015 Guardian article about a group of Twitter accounts that spread rumors and photoshopped “twin tower” logos faulting the Clinton administration’s foreign policies for the 9/11 WTC attacks:
The UTHornsrawk account tweets most of the day and night, averaging around 400–500 tweets per day, six days a week:
Other prolific accounts in the first group include JVER1 and ImmoralReport:
Group 2: #MAGA, #MakeAmericaGreatAgain, #maga3x, #Trump, #Trump2016, #TrumpPence16, #Trumptrain, #VoterFraud, #votetrump, and #tcot
The breakdown in the tree chart above shows the top 20% of individual accounts by total tweets for each hashtag. For the second group, #maga, #tcot, and #trump (circled) appear to be the hashtags with the highest number of potential “automated” or “semi-automated” accounts. Again, three hashtags appear to have a prolific poster in Cary888888 (the large blue blocks in the graph above).
Group 2: Most visible accounts across hashtags by total tweet volume
The amrightnow account is another interesting case of potential automation, as the account handle was included in an original (non-retweet) post by Trump in November 2015.
Even more so than UTHornsrawk in the first group, the amrightnow account appears to post non-stop: 24 hours a day, seven days a week.
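A rough way to quantify “non-stop” posting is to check how many of the 168 hour-of-week slots (7 days × 24 hours) contain at least one tweet. The sketch below assumes a list of tweet timestamps already converted to a single timezone; the function name and the generated sample data are hypothetical, not the data from this study.

```python
from datetime import datetime, timedelta

def posting_coverage(timestamps):
    """Measure how much of the week an account's tweets cover.

    Returns the number of distinct weekdays, the number of distinct
    (weekday, hour) slots, and the fraction of all 168 weekly slots used.
    """
    slots = {(ts.weekday(), ts.hour) for ts in timestamps}
    days = {ts.weekday() for ts in timestamps}
    return {
        "active_days": len(days),
        "active_hour_slots": len(slots),
        "slot_coverage": len(slots) / 168,
    }

# Made-up account A: one tweet every hour for a full week.
start = datetime(2016, 11, 1, 0, 0)
round_the_clock = [start + timedelta(hours=h) for h in range(168)]

# Made-up account B: tweets only 09:00-16:59, six days out of seven.
daytime_only = [
    datetime(2016, 11, day, hour)
    for day in range(1, 7)
    for hour in range(9, 17)
]

always_on = posting_coverage(round_the_clock)
human_like = posting_coverage(daytime_only)
```

A person, or a team sharing an account, tends to leave gaps for sleep and days off, so coverage close to 1.0 sustained over many weeks is a useful signal of automation, though not proof of it on its own.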
Other highly active “bot-like” accounts in the second hashtag group include 1othAmendment and paparcura.
IBM Watson Analytics for Social Media
I ran the two Twitter accounts noted above through IBM’s specialty social analytics “listening” tool (not the standard Watson) across a number of election topics. While you can go back in time with this program, each query is limited to a total of 25,000 archived records, called total documents. This application is typically used for PR and brand awareness, but is valuable to this study as it can pinpoint themes and determine “share of voice” for actors in the news within a specific date range.
The graph above shows the thematic breakdown of topics of posts originating from two of the highest activity Twitter accounts for this study.
The themes suggest that the accounts seem to be somewhat focused content-wise on attacking Hillary, promoting Trump, pushing for military involvement in Syria, and getting out the vote. Last but not least, here are the top co-occurring “secondary” hashtags found in the Group 1 and Group 2 data:
One downside to focusing on hashtags is that while these “pro-Trump” (and “anti-Hillary”) hashtags were part of the election-related conversation, Twitter is a small part of the overall news sphere. At the same time, these accounts are sustaining a level of activity and influence that is well beyond the capability of any single person.
These highly active #Election2016 Twitter accounts appear to still be active, and are tweeting thousands of politically-themed posts per day. Not all that exciting, but definitely notable. In the networked politics of the future, the deployment of advanced automation strategies will become standard fare for campaigns seeking to shape public sentiment.