Democracy Hacked

A Massive, Pro-Le Pen Disinformation Campaign Hits Twitter, 4chan, and the Mainstream Media

Kris Shaffer
Data for Democracy


By Kris Shaffer, CE Carey, and Ben Starling.

Twitter users have started to notice something strange in the lead-up to this Sunday’s presidential election in France. Hashtags like #SortonsMacron (Get Macron out), #RejoignezMarine (Join Marine [Le Pen]), #Macrongate, and #Bayrougate have at various times begun to trend on Twitter. The Digital Forensic Research Lab (DFRLab) has reported that so far, these hashtag trends have remained confined to relatively isolated groups of Twitter accounts, including actual individuals, sockpuppets, and bots. However, over the past few days, we have been monitoring activity on a variety of platforms and are finding that the disinformation efforts are ramping up, on Twitter and 4chan in particular, and starting to enter the mainstream media, as well as Wednesday evening’s presidential debate. This is a significant, coordinated online effort to sway the election in favor of the far-right candidate Marine Le Pen, much like the efforts we observed in the US presidential election and the Brexit vote in the UK.


Over the past 24 hours, we have streamed and analyzed over 1 million tweets relating to the French election. (Search terms include multiple spellings of both Le Pen’s and Macron’s surnames, and the hashtags #LeDebat2017, #2017LeDebat, and #Presidentielle2017.) Here’s what we found. (Though please note that out of respect for individuals’ privacy, we will not be publishing personally identifiable information of any specific Twitter users.)

Who’s tweeting?

In our million-tweet database, we found a number of highly prolific accounts. 5% of users accounted for a full 40% of the tweets. The most prolific account tweeted 1668 times in the roughly 24 hours of data, which is more than one (re)tweet per minute, all day with no sleep. While that’s humanly possible for a highly motivated tweeter, we manually watched this and several other high-volume accounts during the day yesterday. For several of them, the tweets were coming through in bursts too fast for an individual to keep up with, suggesting automation rather than a highly active human. These accounts also feature other traits consistent with automated bots: overly patriotic banner images, profile pictures stolen from elsewhere on the web, a recent join date combined with a high volume of tweets, etc. It’s clear, then, that a significant amount of content, particularly among the highest-volume accounts, is coming from bots.

Few of these high-volume accounts are tweeting original content, however. Almost all of them are either retweeting content from “catalyst” accounts, replying to tweets of others with a single, repeated message (usually a meme or a picture with text), or quoting tweets of others and appending a single, repeated message to each.
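One way to surface accounts that append the same message to every reply or quote is to measure how much of an account's output is a single repeated text. The sketch below is an illustration only: it assumes tweets are available as `(account, text)` pairs, and the helper names `normalize` and `repeat_ratio` are hypothetical. It strips @mentions and URLs before comparing, so that the same message sent to different users still counts as a repeat.

```python
from collections import Counter, defaultdict
import re


def normalize(text):
    """Lowercase, drop URLs and @mentions, and collapse whitespace."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return " ".join(text.split())


def repeat_ratio(tweets):
    """Map each account to the share of its tweets that repeat its most common message."""
    by_user = defaultdict(list)
    for user, text in tweets:
        by_user[user].append(normalize(text))
    return {
        user: Counter(texts).most_common(1)[0][1] / len(texts)
        for user, texts in by_user.items()
    }


# Toy data (hypothetical accounts and messages).
tweets = [
    ("acct_a", "@x Vote now! http://t.co/1"),
    ("acct_a", "@y vote now! http://t.co/2"),
    ("acct_a", "@z Vote  now!"),
    ("acct_b", "Watching the debate"),
    ("acct_b", "That was a long night"),
]
print(repeat_ratio(tweets))
```

A ratio near 1.0 combined with a high tweet volume is only a signal worth manual review, not proof of automation.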

What are they tweeting?

Perhaps surprisingly after the US presidential election and the Brexit vote (in which the bots were heavily in favor of the right-wing options), we encountered both pro-Le Pen and pro-Macron bots. We also encountered a get-out-the-vote bot that seemed to favor neither candidate.

However, in spite of there being bots on both sides, we also found evidence of a significant disinformation campaign aimed at discrediting Macron and encouraging votes for Le Pen. In the following graphic, we can see the bigrams (two-word phrases, after common words like a, an, the; la, le, les; etc. have been removed) that are most distinctive of tweets that mention one candidate or the other.

Most characteristic bigrams in over one million tweets from May 4–5.
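For readers who want to try a similar comparison, here is a minimal sketch of ranking bigrams by how distinctive they are of one group of tweets versus another. The scoring used here, a smoothed log-odds ratio, is an assumption chosen for illustration; the measure behind the graphic is not described in this post. The stopword list contains only the examples named above, and `bigrams` and `distinctive_bigrams` are hypothetical names.

```python
from collections import Counter
import math
import re

# Only the example stopwords named in the text; a real list would be longer.
STOPWORDS = {"a", "an", "the", "la", "le", "les"}


def bigrams(text):
    """Tokenize, drop stopwords, and return adjacent word pairs."""
    tokens = [t for t in re.findall(r"[#\w']+", text.lower()) if t not in STOPWORDS]
    return list(zip(tokens, tokens[1:]))


def distinctive_bigrams(tweets_a, tweets_b, top=3):
    """Rank bigrams by smoothed log-odds of appearing in group A rather than group B."""
    count_a = Counter(bg for t in tweets_a for bg in bigrams(t))
    count_b = Counter(bg for t in tweets_b for bg in bigrams(t))
    total_a, total_b = sum(count_a.values()), sum(count_b.values())
    vocab = set(count_a) | set(count_b)
    scores = {}
    for bg in vocab:
        # Add-one smoothing keeps bigrams seen in only one group finite.
        p_a = (count_a[bg] + 1) / (total_a + len(vocab))
        p_b = (count_b[bg] + 1) / (total_b + len(vocab))
        scores[bg] = math.log(p_a / p_b)
    return sorted(scores, key=scores.get, reverse=True)[:top]


# Toy data (invented tweets, not from the dataset).
group_a = ["macron tax evasion", "macron tax evasion claims"]
group_b = ["le pen egg thrown", "pen egg incident"]
print(distinctive_bigrams(group_a, group_b, top=2))
```

Swapping the two arguments gives the phrases most distinctive of the other group.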

There are several findings to note in this analysis. First, there is significant English-language content in tweets about this election. The election is international news, but the prominence of English is a new development. Over the first few hours of tweets, the most distinctive phrases in tweets about Le Pen contained significant English and Spanish, but Macron’s most distinctive phrases were almost exclusively in French (see graphic below).

Most characteristic bigrams from over 400,000 tweets from May 4.

It’s also worth noting that the tweets are generally dominated by French-language content. By our rough estimates, about 80% of the tweets are written in French and just 20% in English. That the most distinctive content for each candidate is nonetheless disproportionately in English is suggestive. Without further, more detailed analysis, it’s hard to say for sure, but it’s possible that English-language tweeters are at least trying to drive the narrative for each candidate, or introduce new narratives. Whether or not they will be successful is another matter, but after Trump/Clinton and Brexit, we don’t take anything for granted anymore, and we recommend taking every potential attempt at disinformation or narrative control seriously.

There has also been a significant shift in the content dominating these tweets. Last night, tweets about Macron focused on Barack Obama’s endorsement of his candidacy, while tweets about Le Pen seemed dominated by her being struck with an egg at a campaign event. This morning (US East Coast time), however, tweets about Macron have shifted significantly, now focusing on his supposed corruption, including tax evasion via an account in the Cayman Islands.

This corruption charge, dubbed #Macrongate by some, is a deliberate disinformation campaign. It has spread through multiple social media channels, amplified by bots and “influencers”, and has even made it onto the debate stage and now into the French courts. Just today it was compounded by an alleged dump of emails from the Macron campaign, reported in international mainstream media. The dump was too big for the media to ignore, but it came too late for anyone to examine it closely for content or authenticity: it appeared minutes before the campaign blackout began, after which point candidates are not permitted to respond until the polls open.


When we saw 4chan and pol (the “politically incorrect” board on 4chan) pop up in the list of most distinctive phrases in tweets mentioning Le Pen, it grabbed our attention. We’d already seen references to Le Pen rising from relative obscurity on 4chan in April to the top 10 bigrams over the past few days. 4chan played an instrumental role in Trump’s election: its users produced and audience-tested many anti-Clinton memes and fed the ones with the best responses into the r/The_Donald subreddit, which at the time was being monitored by the Trump campaign for material to recirculate in more mainstream social media channels.

We looked specifically at the 2000+ tweets in our archive mentioning 4chan. While that’s a drop in the bucket on Twitter, these tweets turned out to be the key to a big controversy. According to BuzzFeed writer Ryan Broderick, 4chan users have been propagating a (now debunked) claim that Macron is using an offshore bank account in the Cayman Islands to evade French taxes. Following up on this story, Broderick also found evidence that Reddit users were purposefully repeating identical phrases about this conspiracy theory in order to “Google bomb”: to feed false, verbatim content into sites Google mines for its search engine algorithm, in the hope of influencing the phrases that Google uses to autocomplete searches beginning with “Macron”.

Though 4chan is full of confusion over the authenticity of the claim, or whether its authenticity is even relevant, the story propagated there and on Reddit has spread across other social media channels and into mainstream French media. Marine Le Pen referenced it in the debate, accusing Macron of using a tax haven, and as a result #Bahamas was a trending hashtag on Twitter during the debate. The effect has been that Macron has had to answer for this claim in interviews, much like Barack Obama had to answer questions about his birth certificate and Hillary Clinton about her email server. Even though the claim is false, it detracts from Macron’s ability to stay on message. In a close election, even if the majority of voters believe the story to be false, the fact that it affects his campaign message may be enough to tip the scales towards his opponent.

Enter the bots

As the #Macrongate “story” entered public discourse in France, pro-Le Pen Twitter bots have taken to amplifying it. For example, one disinformation post about this story, “Are Emmanuel Macron’s Tax Evasion Documents Real?” (on GotNews, not linked from this post for ethical reasons) was the top non-YouTube URL shared on 4chan during the period under investigation. In our Twitter dataset, it was shared 1186 times by 1066 unique accounts. Upon manual inspection, the top sharers of this article demonstrate multiple bot-like traits (see above). What’s more telling is that 306 of the 1066 accounts that tweeted or retweeted the GotNews article (roughly 29%) are in the top 1% of accounts when ranked by tweet volume. In other words, the most active bots are disproportionately sharing links to disinformation. While the more generic “Get Macron out” and “Join Marine” campaigns analyzed by DFRLab ultimately proved unsuccessful, it seems the tax evasion disinformation campaign has gained traction, and right-wing botnets are seeking to capitalize on it, keeping the false story in public discourse as much as possible leading into Sunday’s election.
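To make "disproportionately" concrete: if the sharers of a link were spread evenly across all accounts, about 1% of them would fall in the top 1% by volume. The sketch below compares the observed share with that baseline. It is an illustration under stated assumptions, not our analysis code: it assumes a list of accounts already ranked from most to least active, and `overrepresentation` and the sample handles are hypothetical names.

```python
def overrepresentation(sharers, ranked_accounts, top_fraction=0.01):
    """Return (share of sharers in the top tier, ratio of that share to the tier's size)."""
    cutoff = max(1, round(len(ranked_accounts) * top_fraction))
    top_tier = set(ranked_accounts[:cutoff])
    unique_sharers = set(sharers)
    share = len(top_tier & unique_sharers) / len(unique_sharers)
    return share, share / top_fraction


# The figures reported above: 306 of 1066 sharing accounts are in the top 1%.
observed = 306 / 1066
print(f"{observed:.1%}")           # 28.7%
print(round(observed / 0.01, 1))   # 28.7 times the even-spread baseline

# Toy data (hypothetical handles): 200 accounts ranked by volume, 4 sharers.
ranked = [f"u{i}" for i in range(200)]
print(overrepresentation(["u0", "u1", "u50", "u150"], ranked))
```

The ratio is a rough descriptive measure; it says nothing by itself about whether any individual account is automated.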

Narrative control

It’s becoming clear that the goal here is not the truth, nor even believability. (Remember #pizzagate?) The goal of these disinformation networks is to control the narrative. Keep the ball on the other end of the field and take lots of shots, even bad ones. Control the ball, keep your opponent on the defensive, and they can’t work their own offense. And who knows, they might slip up and let the game-winning shot through. That’s the playbook of these disinformation networks.

So what do we do? Demand the truth. Ignore the latest unconfirmed email dump from WikiLeaks. Learn what a bot looks like. Check any unsubstantiated claim with multiple reliable sources. Better yet, treat those unsubstantiated claims with scorn to begin with. And go on the offensive. Push the truth. Repeat it over and over. Share it as aggressively as the bots spread their disinformation. Demand high standards from journalists, and when they fail to meet them, find something else to read. And demand that the platforms that enable the spread of disinformation and hate online be held accountable.

The web is ours. Democracy is ours. It’s time we took them both back.

Header image by Pedro Kümmel.



Kris Shaffer
Data for Democracy

Data Scientist. Computational musicologist. Digital media specialist. Developer. Author. University of Mary Washington, Hybrid Pedagogy.