Honduras: Network visualizations of JOHbots
After I published my last blog about fake Honduran accounts I was contacted by several people saying I should look at all the other bots in Honduran Twitter. I was already aware of previous work about networks of fake accounts that Hondurans call “JOHbots” — named for the President of Honduras, Juan Orlando Hernández.
I captured 17,413 tweets mentioning the handle of Juan Orlando Hernández, @JuanOrlandoH, between December 25 and December 29, 2017 and found networks of obviously fake Twitter accounts. These fake accounts are very primitive bots. They are clearly fake and for the most part exist only to retweet the President of Honduras.
7,608 tweets from the dataset of 17,413 tweets — or 44% of the dataset — were sent using TweetDeck.
Because there were so many automated accounts in this dataset, I manually separated the tweets sent using TweetDeck from the tweets sent using other common sources (Android, Web Client, iPhone etc) and made 3 user-to-hashtag graphs:
1. Full network (tweets from all sources)
2. Network of tweets sent from TweetDeck only
3. Network of tweets sent from all other sources
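The separation above was done by hand, but it can also be sketched in a few lines of Python. This is only an illustration, not the method used for the graphs in this post. It assumes each tweet has already been reduced to a plain source label such as "TweetDeck" (the raw API field is usually an HTML link), and the sample rows, field names and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; the real dataset had 17,413 tweets.
tweets = [
    {"user": "account_a", "source": "TweetDeck", "hashtags": ["Honduras"]},
    {"user": "account_b", "source": "Twitter for Android", "hashtags": ["Honduras", "JOH"]},
    {"user": "account_c", "source": "TweetDeck", "hashtags": []},
    {"user": "account_d", "source": "Twitter Web Client", "hashtags": ["JOH"]},
]

def split_by_source(rows, source_name="TweetDeck"):
    """Return (matching, others) based on each tweet's source label."""
    matching = [r for r in rows if r["source"] == source_name]
    others = [r for r in rows if r["source"] != source_name]
    return matching, others

def user_hashtag_edges(rows):
    """Count (user, hashtag) pairs; each pair becomes one weighted edge."""
    edges = Counter()
    for r in rows:
        for tag in r["hashtags"]:
            edges[(r["user"], tag.lower())] += 1
    return edges

tweetdeck, other = split_by_source(tweets)
share = len(tweetdeck) / len(tweets)
print(f"TweetDeck share of sample: {share:.0%}")

# One edge list per graph: full network, TweetDeck only, all other sources.
full_edges = user_hashtag_edges(tweets)
tweetdeck_edges = user_hashtag_edges(tweetdeck)
other_edges = user_hashtag_edges(other)
```

Each edge list could then be written to a CSV file and imported into Gephi as a weighted edge table.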
Full user-to-hashtag network for 17,413 tweets:
Hashtag network of tweets sent from TweetDeck only (red) vs. hashtag network of tweets sent from all other sources (blue):
Note: it’s possible there are more fake accounts in the blue network. I found some accounts there that looked like generic profiles, only retweeted other accounts and weren’t using TweetDeck.
I found several clusters of automated retweet networks consisting of obviously fake accounts.
The following network of fake profiles is retweeting @JuanOrlandoH as well as the account of @Zuleyma_Zablah, @HildaHernandezA (Juan Orlando Hernandez’s sister who was recently killed in a helicopter crash), an account called @VidaMejorHN (“Better Life HN”) and several Honduran media outlets.
The accounts also retweet each other’s tweets, most of which are glowing replies to JuanOrlandoH’s tweets. All accounts in this cluster that I’m calling “Team Ladies” were created on either June 10 or June 17, 2015. They have similar cover photos and account stats and all use avatars of attractive women. They also tweet on the same schedule.
One of the accounts, @mariacelestekafaty1, was created on June 10 with the first batch but never tweeted. Perhaps something went wrong with that account because a second account was created on June 17, 2015 — mariacelestekafaty2. The second “mariacelestekafaty” is the active account in this network and the first iteration has remained dormant since its creation.
If you only view one account in your browser or on your phone, it might not be clear that it’s fake, but when you view the timelines side by side the automation is obvious.
Zooming in to the user-to-user network above, in the lower left corner (dark blue) is a cluster of Honduran media outlets. Team Ladies is located in this media cluster.
Zooming in to the dark blue area on the pre-render screen in Gephi shows how Team Ladies is connected to the JuanOrlandoH account and also embedded in the cluster of media outlets. They appear in this cluster because they are retweeting the media outlets and, for reasons I don’t understand, they also mention several media outlets in replies to tweets.
Below are a few examples of tweets from Team Ladies that mention several media outlets. The first 3 tweets are replying to this tweet from VidaMejorHN which does not mention any media outlets but for some reason the fake accounts add the handles of several media outlets in their replies.
The second group of 3 tweets are replying to this tweet from JuanOrlandoH which also does not mention any media outlets but again, the fake accounts add those handles to the end of their replies. I don’t understand why they do that or what it accomplishes but that’s why they appear to be embedded in the media cluster.
As I was wrapping up Team Ladies, I found some fake men that were created in the same batch. Otto Flores, Erick Amaya and Daniel Guevara were all created on June 17, 2015 and all use TweetDeck. These fake accounts follow each other so you can click on the followers for any fake account and find more fake accounts, like a hall of mirrors.
Team Ladies (and men) tweet on nearly identical schedules — from 9 to 5 Monday through Friday with the weekends off. I took screenshots of each of their schedules from Luca Hammer’s account analysis tool and lined them up to compare the schedules.
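The schedule comparison above relied on screenshots from an account analysis tool. As a rough alternative, a schedule can be summarized directly from tweet timestamps. The sketch below is a minimal example under stated assumptions: the timestamps are hypothetical, they are assumed to already be in local Honduran time, and the "office hours" window of 9 to 5 on weekdays is taken from the pattern described above.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps for one account (local time assumed).
timestamps = [
    "2017-12-26 09:15:00",  # Tuesday
    "2017-12-26 13:40:00",  # Tuesday
    "2017-12-27 16:05:00",  # Wednesday
    "2017-12-28 10:30:00",  # Thursday
    "2017-12-30 22:00:00",  # Saturday
]

def schedule_profile(stamps):
    """Count tweets per (weekday, hour) slot; Monday is weekday 0."""
    by_slot = Counter()
    for s in stamps:
        t = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
        by_slot[(t.weekday(), t.hour)] += 1
    return by_slot

def office_hours_share(profile):
    """Fraction of tweets sent Monday to Friday between 09:00 and 16:59."""
    total = sum(profile.values())
    inside = sum(
        n for (day, hour), n in profile.items()
        if day < 5 and 9 <= hour < 17
    )
    return inside / total if total else 0.0

profile = schedule_profile(timestamps)
share = office_hours_share(profile)
```

Accounts whose share is close to 1.0, and whose profiles are nearly identical to each other, would be candidates for the kind of side-by-side check described above.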
Since I had the tweets sent from TweetDeck already separated, I made a user-to-user network with just those 7,608 TweetDeck tweets. The clusters of Team Ladies (red) and the fake men (purple) stand out.
Zooming in on the red cluster in the top left corner I found Team Ladies along with several Honduran media outlets.
The smaller purple cluster on the bottom are the fake men, together with VidaMejorHN, Zuleyma_Zablah (whose Twitter bio says “Advertiser, Research and Marketing. Coordinator of Presidential Television Programs”) and a few other media outlets.
Another batch of accounts, created from December 6–8, 2017, is also using TweetDeck to boost tweets, mostly from the account of @JuanOrlandoH. They also retweeted several tweets that congratulated @JuanOrlandoH on winning the recent election, such as the tweets from the Secretaries of Foreign Relations of Mexico and Brazil.
Similar to the first batch of fake ladies, “Team Santos” shows obvious signs of automation that may not be apparent when viewing just one account at a time, but comparing their timelines side-by-side shows the pattern.
Team Santos was created on December 6 and December 8, 2017 so they’re baby bots and the Santos batch is still growing. I found three brand new Santos accounts that were created on December 27, 2017.
All are using TweetDeck with the exception of two accounts whose tweets show iPad and Android as their sources. The Rodr_Sant account (below) registered an iPad as the source of his tweets when I first found him on December 27. When I checked his account again on December 29, the source of his last tweet had changed to TweetDeck. So that was an interesting discovery: apparently some of these fake accounts tweet from common sources like Android or iPad as well as TweetDeck.
Team Santos is not grouped in a cluster like Team Ladies. Instead they are scattered throughout the dark blue area along with thousands of other accounts that are retweeting JuanOrlandoH using TweetDeck.
Team Santos has a similar tweet schedule, although not identical like Team Ladies. The Santos accounts tweet seven days a week and rest from 1 or 2am until 9 or 11am daily. I only pulled schedules for 6 members of the Santos family because this blog is getting long but I’ll add a list of accounts at the end so others can check them.
If I filter the TweetDeck-only network by edge weight 3 (removing edges that connect only once or twice to JuanOrlandoH), there are still a significant number of heavy edges coming from accounts retweeting the President of Honduras. Aside from a few obvious clusters that stand out in the network below, I think the uniformity of the rings of accounts around the main JuanOrlandoH node is interesting.
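The edge weight filter was applied in Gephi, but the same idea can be sketched outside of it. In this minimal example the edge weight is assumed to be the number of times one account retweeted another, and the retweet records and the acct_ handles are hypothetical.

```python
from collections import Counter

# Hypothetical retweet records: (retweeting account, retweeted account).
retweets = [
    ("acct_1", "JuanOrlandoH"), ("acct_1", "JuanOrlandoH"), ("acct_1", "JuanOrlandoH"),
    ("acct_2", "JuanOrlandoH"),
    ("acct_3", "JuanOrlandoH"), ("acct_3", "JuanOrlandoH"),
    ("acct_4", "VidaMejorHN"), ("acct_4", "VidaMejorHN"),
    ("acct_4", "VidaMejorHN"), ("acct_4", "VidaMejorHN"),
]

# Repeated (source, target) pairs collapse into one weighted edge.
edge_weights = Counter(retweets)

def filter_edges(weights, min_weight=3):
    """Keep only edges that occur at least min_weight times."""
    return {edge: w for edge, w in weights.items() if w >= min_weight}

heavy = filter_edges(edge_weights)
for (source, target), weight in sorted(heavy.items()):
    print(source, "->", target, weight)
```

With a threshold of 3, accounts that retweeted a target only once or twice drop out and only the repeat retweeters remain.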
What is also notable is the fake accounts that I looked at are not tweeting excessively. Each account in Team Ladies only tweets 12 to 14 times per day and Team Santos averages around 6 tweets per day, so the filters that I’ve used previously to find accounts that tweet excessively in other hashtag studies don’t work in the case of these fake Honduran accounts.
Just as I was about to finish this blog I found another cluster of obvious fake accounts. “Team Rivera” is made up of 50 accounts that were created mostly on December 7–8, 2017.
Team Rivera is very basic like the other clusters of accounts I looked at so far. Their names follow an obvious pattern. It’s as if someone went through an alphabetical list of names and tacked on “Rivera” to all of them. What’s more insidious in my opinion are the fake accounts that are created to look like average citizens.
I found the Rivera accounts while experimenting with the dataset of 7,608 TweetDeck-only tweets. The repetition and automation in the tweets using TweetDeck is obvious in the raw data from the Twitter API.
Below is what Team Rivera looks like in the TweetDeck-only data. They have no locations in their bios. The repetition is obvious, but I didn’t find Team Rivera by looking through this raw data.
Experimental graph: mapping retweet networks
Normally I make network graphs with users & mentions or users & hashtags as the nodes and edges. I suspected there were more clusters of fake accounts but wasn’t sure how to easily find them all without having to aimlessly click through the followers of fake accounts. I wanted to see if there was a way I could use Gephi to visualize the TweetDeck accounts and make them sort themselves out into their respective “teams.”
After some trial and error, I used the TweetDeck accounts as edges and the time & date of their tweets as the nodes. Since the accounts were retweeting the same tweets at the exact same times, the fakes clustered around the tweets. Each cluster contains several tweets and several accounts, so the node labels are somewhat messy but here’s the final result of 7,608 TweetDeck tweets sorted by the “teams” that tweeted them. There are 81 clusters.
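One way to read this experiment is as a two-mode graph in which accounts and tweet timestamps are both nodes, and an account is linked to every timestamp at which it tweeted. The sketch below follows that reading and groups accounts by connected components. That is a simpler technique than the layout and coloring used in Gephi, so it would not necessarily reproduce the 81 clusters. The rows and handles are hypothetical.

```python
from collections import defaultdict

# Hypothetical (account, retweet timestamp) rows.
rows = [
    ("rivera_a", "2017-12-27 14:00:05"),
    ("rivera_b", "2017-12-27 14:00:05"),
    ("rivera_a", "2017-12-27 15:30:10"),
    ("rivera_c", "2017-12-27 15:30:10"),
    ("ladies_a", "2017-12-28 09:00:00"),
    ("ladies_b", "2017-12-28 09:00:00"),
    ("loner", "2017-12-28 11:11:11"),
]

def find_teams(pairs, min_size=2):
    """Group accounts that are linked through shared tweet timestamps."""
    # Build an undirected two-mode graph: account nodes <-> timestamp nodes.
    neighbors = defaultdict(set)
    for account, stamp in pairs:
        a, t = ("account", account), ("time", stamp)
        neighbors[a].add(t)
        neighbors[t].add(a)

    seen, teams = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first walk collects one connected component.
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        accounts = sorted(name for kind, name in component if kind == "account")
        if len(accounts) >= min_size:
            teams.append(accounts)
    return sorted(teams)

teams = find_teams(rows)
```

In the sample, rivera_a ties two timestamps together, so all three Rivera accounts land in one group even though rivera_b and rivera_c never tweeted at the same second. An account that shares no timestamp with anyone is dropped.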
Zooming in to the upper right corner there’s a purple cluster that’s easier to distinguish what it is since it’s only one tweet. The time and date of the tweet is the node in the center and the accounts surrounding it all retweeted that tweet at the same time.
The mint green cluster contains 5 members of Team Ladies.
Team Rivera is this magenta cluster.
I found Team Rivera while looking through these clusters. They stood out right away because they’re all named the same. Here’s what the Riveras look like in the pre-render screen of Gephi.
Focusing on another tweet in the same cluster shows how the Rivera accounts are connected to both tweets.
I didn’t go through every cluster but they all seem to fit the same pattern, like the accounts ending with random strings of numbers in this orange cluster.
This pink cluster also has some obvious patterns in the user handles.
This yellow cluster looks like “Team 001.”
More of Team Ladies & men are in this red cluster.
Here’s the larger green cluster in the center of the network. Again, the user handles follow a pattern.
I called these fake accounts “primitive” because they are simple and follow similar patterns, as if created from a template, and they mostly just retweet other people’s tweets.
From what I’ve observed so far in Honduran Twitter networks, a batch of fake accounts is created in 1–2 days, then the fake accounts all follow each other, and then they all retweet the same tweets at the same times. This is very similar to the fake accounts in the previous study I did on Honduran Twitter accounts that tweeted a smear campaign against Berta Caceres. It’s as if they don’t care that these accounts look really fake.
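The batch pattern suggests a simple check: group accounts by creation date and look for dates on which several accounts appeared at once. The sketch below is a minimal illustration. The handles are hypothetical, the dates echo the batch dates mentioned in this post, and the threshold of two accounts per date is an arbitrary assumption that would need to be much higher on real data.

```python
from collections import defaultdict

# Hypothetical handles paired with account creation dates (YYYY-MM-DD).
accounts = [
    ("ladies_1", "2015-06-10"),
    ("ladies_2", "2015-06-10"),
    ("ladies_3", "2015-06-17"),
    ("men_1", "2015-06-17"),
    ("santos_1", "2017-12-06"),
    ("santos_2", "2017-12-06"),
    ("regular_user", "2011-03-02"),
]

def creation_batches(rows, min_accounts=2):
    """Return creation dates shared by at least min_accounts accounts."""
    by_date = defaultdict(list)
    for handle, created in rows:
        by_date[created].append(handle)
    return {
        date: sorted(handles)
        for date, handles in by_date.items()
        if len(handles) >= min_accounts
    }

batches = creation_batches(accounts)
for date in sorted(batches):
    print(date, batches[date])
```

A shared creation date is not proof of anything by itself, since many real accounts are created every day. It is only useful as a first filter before checking follow relationships and retweet timing.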
Of the 17,413 tweets in the original dataset I worked with for this blog, 7,608 were sent by accounts using TweetDeck. Some of those 7,608 tweets were sent by the same accounts, so I made a list of all the TweetDeck accounts, removed the duplicates and ended up with a total of 2,142 accounts using TweetDeck in Honduran Twitter. This total is just from 4 days’ worth of tweets. I haven’t checked all of them and I’m sure there are more, but I checked many random accounts in the list and every single one showed obvious signs of automation.
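Removing duplicates from the account list is a small step that is easy to sketch. The example below assumes a list with one handle per TweetDeck tweet. The handles are hypothetical, and they are lowercased before counting because Twitter handles are not case sensitive.

```python
from collections import Counter

# Hypothetical list with one handle per TweetDeck tweet.
tweetdeck_users = ["acct_1", "Acct_1", "acct_2", "acct_3", "acct_2"]

# Counting lowercased handles removes duplicates and keeps tweet totals.
counts = Counter(handle.lower() for handle in tweetdeck_users)
unique_accounts = sorted(counts)

print(len(tweetdeck_users), "tweets from", len(unique_accounts), "accounts")
```

Keeping the per-account counts alongside the unique list also makes it easy to draw a random sample of accounts for manual checking.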
Most Twitter networks contain a small percentage of automation from 3rd party apps or programs like TweetDeck. That’s normal. But I’ve never seen a network where nearly half of the tweets in the dataset (44%) were from TweetDeck. It’s an outrageous amount of automation.
All network graphs in this blog were created with Gephi using the force-directed layout algorithms OpenOrd and then Force Atlas 2.