#TrumpWon? trend vs. reality

a deep dive into the underlying data

Gilad Lotan
i ❤ data
Sep 29, 2016

Why is everyone so obsessed with this hashtag and the fact that it was in Twitter’s trending topics list the morning after the first presidential debate? Perhaps the competitive nature of a presidential debate, the fact that there’s supposed to be a “winner”, means that we’re reading into any available data point. Maybe it’s the nature of this specific election cycle, where facts seem to have become subjective as people in online echo chambers consume what they want to believe.

Or maybe because Trump gleefully announced his status as #1 on Twitter trends, as if implying this means he won the debate.

Later in the afternoon things got much more interesting, as rumors spread about a potential Russian connection, specifically pointing to the image below. The map suggests that the #TrumpWon trending topic had started in St. Petersburg, and its source (@DustinGiebel) claimed that he grabbed it off a service called TrendsMap.

The image was shared over 14,000 times and was later debunked by a number of sources, including the Washington Post.

https://twitter.com/DustinGiebel/status/780814613021548544

Last year

and I published a lengthy analysis on Media Hacking — highlighting ways in which players are finding smarter and more elaborate ways to gain attention. We described the importance of understanding one’s position within a network, which helps filter out noise, and identify an underlying agenda that may be driving communities of interest. Over the past year we’ve been building a new network-based data product — Scale Model — which has recently launched out of . I decided to use it to figure out what happened with this hashtag.

Reality is Subjective, Data is Truth

In order to make sense of this event, I started off with two datasets. The first is a list of locations where the hashtag #TrumpWon trended, sampled from Twitter’s REST API every 5 minutes (raw data here, timestamps in UTC). The second is a list of the first 100k tweets posted to the hashtag.

For starters, the hashtag never shows up as trending in any Russian city. When we use a dispersion plot to view the data, it is clear the trend began in Baltimore and Detroit, but very quickly jumped to ‘Worldwide’ status, and pretty much stayed there for a few hours, as it began to spread across Australia and the UK.

Dispersion plot of earliest locations where #TrumpWon had trended.
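The ordering behind a plot like this can be recovered from the sampled trend lists. Below is a minimal sketch, assuming each poll is stored as a (UTC timestamp, location, set of trending hashtags) row. The row layout, the `first_trend_times` helper and the sample rows are made up for illustration and are not the raw data itself.

```python
from datetime import datetime

# Made-up poll results: (UTC time of the poll, location, hashtags trending there).
samples = [
    ("2016-09-27T10:00:00", "Baltimore", {"#TrumpWon", "#debatenight"}),
    ("2016-09-27T10:00:00", "Detroit", {"#debatenight"}),
    ("2016-09-27T10:05:00", "Detroit", {"#TrumpWon"}),
    ("2016-09-27T10:55:00", "Worldwide", {"#TrumpWon"}),
    ("2016-09-27T11:00:00", "Baltimore", {"#TrumpWon"}),
]


def first_trend_times(samples, hashtag):
    """Return {location: earliest poll time at which `hashtag` was trending}."""
    first_seen = {}
    for stamp, location, trending in samples:
        when = datetime.fromisoformat(stamp)
        if hashtag not in trending:
            continue
        if location not in first_seen or when < first_seen[location]:
            first_seen[location] = when
    return first_seen


first_seen = first_trend_times(samples, "#TrumpWon")

# Locations ordered by when the hashtag first trended there.
ordered = sorted(first_seen, key=first_seen.get)
for location in ordered:
    print(first_seen[location].isoformat(), location)
```

Sorting locations by first appearance is what makes a city-to-city spread, or a sudden jump to `Worldwide`, visible at a glance.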

The x-axis represents 10-minute time intervals, where the first was taken at 6am Eastern time. The hashtag reaches the `Worldwide` trending topics list by 7am Eastern time. From studying Twitter’s trending topics algorithm in the past, it was understood that both novelty and acceleration were important factors in attaining a high trend score. Additionally, Twitter takes into account one’s IP address when calculating trends. The `Worldwide` category is a form of aggregate, looking at novel and rapidly accelerating terms and phrases being used by users across all regions.

Typically you see a trend jump from one city to another before reaching country-wide or world-wide status. This was not the case here. There was a group of highly organized users who all posted the exact same message at around the same time, from (seemingly) different geographic locations. This exact same message was published by thousands of accounts, likely all over the world, a few hours after the debate had ended (between 3 and 5 am UTC).
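One way to surface this kind of coordinated posting is to group tweets by their text and look for messages sent by many distinct accounts in a short span. The sketch below is only an illustration of that idea. The tweet tuples, the `coordinated_messages` helper and its thresholds are assumptions, not the method actually used for this analysis.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up tweets: (account, UTC timestamp, text).
tweets = [
    ("user_a", "2016-09-27T03:10:00", "Everyone tweet #TrumpWon now!"),
    ("user_b", "2016-09-27T03:12:00", "everyone tweet  #TrumpWon now!"),
    ("user_c", "2016-09-27T03:40:00", "Everyone tweet #TrumpWon now!"),
    ("user_d", "2016-09-27T04:05:00", "What a debate. #TrumpWon"),
    ("user_a", "2016-09-27T03:11:00", "Everyone tweet #TrumpWon now!"),
]


def normalize(text):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(text.lower().split())


def coordinated_messages(tweets, min_accounts=3, window=timedelta(hours=2)):
    """Flag texts posted by at least `min_accounts` distinct accounts,
    where all posts of that text fall within `window` of each other."""
    by_text = defaultdict(list)
    for account, stamp, text in tweets:
        by_text[normalize(text)].append((datetime.fromisoformat(stamp), account))

    flagged = {}
    for text, posts in by_text.items():
        times = [when for when, _ in posts]
        accounts = {account for _, account in posts}
        if len(accounts) >= min_accounts and max(times) - min(times) <= window:
            flagged[text] = sorted(accounts)
    return flagged


flagged = coordinated_messages(tweets)
for text, accounts in flagged.items():
    print(len(accounts), "accounts posted:", text)
```

Counting distinct accounts rather than raw tweets keeps one very active user from looking like a crowd.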

This is the piece of content that was shared:

https://twitter.com/Trump_Videos/status/780616703478726657

By the time folks got up on the East Coast, Twitter timelines were filled with these prompts to tweet #TrumpWon, likely generating enough acceleration for the hashtag to reach worldwide trend status.

Who are these people?

I used the list of users who were first to post to the hashtag as an input to build a community on Scale Model. Below we see a network graph of these users and their connections (who follows whom). The larger a user’s node, the more important they are within the context of this group.

Network Visualization of top user segments from the initial users who posted to the hashtag in an organized fashion.
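How Scale Model scores importance is not spelled out here. As a rough stand-in, the sketch below ranks users by in-degree, meaning how many other members of the group follow them. The follow edges and the `importance_by_in_degree` helper are made up for illustration.

```python
from collections import Counter

# Made-up follow edges: (follower, followed), restricted to users in the group.
follows = [
    ("ann", "hub"), ("bob", "hub"), ("cat", "hub"),
    ("hub", "ann"), ("bob", "ann"), ("cat", "bob"),
]


def importance_by_in_degree(follows):
    """Count, for each user, how many other group members follow them."""
    users = {user for edge in follows for user in edge}
    in_degree = Counter(followed for _, followed in follows)
    # Counter returns 0 for users nobody in the group follows.
    return {user: in_degree[user] for user in users}


importance = importance_by_in_degree(follows)

# Most-followed users first; names break ties so the order is stable.
ranked = sorted(importance, key=lambda user: (-importance[user], user))
for user in ranked:
    print(user, importance[user])
```

Because only follows inside the group are counted, a celebrity with millions of outside followers would not necessarily rank high, which matches the idea of importance "within the context of this group".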

Each color represents a different cohort: places in the graph where there’s much more density of connections. Each user cohort is then labeled based on unique yet important words and phrases associated with the users in the group. In this case, we see different flavors of Trump supporters: #MAGA (Make America Great Again), Trump, #Trump2016 and #TGDN (a tag used by conservatives to identify each other as a mechanism to boost follower numbers). Many of these accounts have words such as God, America, Family, Proud, Wife, Mother, Father and Veteran in their bios.
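Labeling a cohort by "unique yet important" terms can be approximated with a simple score: how common a term is inside the cohort, multiplied by how much of its overall usage comes from that cohort. This is a sketch of the general idea, not Scale Model's actual labeling. The bios, the cohort assignment and the scoring formula are all assumptions.

```python
from collections import Counter

# Made-up bios, grouped by an assumed cohort assignment.
cohort_bios = {
    "cohort_1": ["#MAGA proud veteran", "#MAGA wife and mother", "proud #MAGA"],
    "cohort_2": ["#TGDN proud father", "#TGDN follow back", "veteran #TGDN"],
}


def term_counts(bios):
    """Count how many bios mention each term (a term counts once per bio)."""
    counts = Counter()
    for bio in bios:
        counts.update(set(bio.lower().split()))
    return counts


def cohort_labels(cohort_bios, top_n=1):
    """Pick terms that are common inside a cohort and rare outside it."""
    per_cohort = {name: term_counts(bios) for name, bios in cohort_bios.items()}
    overall = Counter()
    for counts in per_cohort.values():
        overall.update(counts)

    labels = {}
    for name, counts in per_cohort.items():
        size = len(cohort_bios[name])
        # importance: share of the cohort's bios that contain the term
        # uniqueness: share of all bios containing the term that are in this cohort
        score = {term: (n / size) * (n / overall[term]) for term, n in counts.items()}
        labels[name] = sorted(score, key=lambda term: (-score[term], term))[:top_n]
    return labels


labels = cohort_labels(cohort_bios)
print(labels)
```

A word like "proud" appears in both cohorts, so it scores lower than a tag that only one cohort uses.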

The five clusters of Twitter users who made #TrumpWon a trending topic (viz: )

A quick plot of the distribution of sources for the tweets shows a decent split between platforms, making it less likely for this to be automated / bot activity.
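The numbers behind such a plot are just a tally of the client each tweet was posted from. A minimal sketch, assuming every collected tweet carries a source label, follows. The sample values and the `source_shares` helper are made up, and the top share is only a rough signal, not proof either way.

```python
from collections import Counter

# Made-up client labels, one per tweet.
sources = [
    "Twitter for iPhone", "Twitter for Android", "Twitter Web Client",
    "Twitter for iPhone", "Twitter for Android", "Twitter for iPad",
    "Twitter for iPhone", "Twitter Web Client",
]


def source_shares(sources):
    """Return {source: fraction of tweets}, most common source first."""
    counts = Counter(sources)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.most_common()}


shares = source_shares(sources)
top_share = max(shares.values())

for source, share in shares.items():
    print(f"{source}: {share:.1%}")
```

If a single client, especially an unfamiliar one, accounted for nearly all the tweets, automation would be a more plausible explanation than it is with a spread like this.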

Finally we can take a look at the actions these users are taking: what content they are sharing, and how active they are over time (notice the massive spike, in blue, when the hashtag was shared by many of them).

The view above gives us the ability to effectively get into this group’s mindset — see the world through their lens. If we immerse ourselves in this world of Trump supporters, we learn the following:

  • Trump apparently has a 4 point lead over Clinton.
  • Hillary apparently cheated in the debate by sending hand signals to Holt (notice how active the comments are on this page).
  • And Alicia Machado, former Miss Universe who came up during the debate, was “apparently” accused of threatening to kill a judge and being an accomplice to a murder in Venezuela (dailymail.co.uk).

I’ll stop here, but I think you get the point.

What’s clear is that there’s dense connectivity and a clear understanding of how information flows across participants. Users in these groups not only follow each other at significantly higher rates than the general Twitter user, but also clearly know who is a hub: who has the ability to accelerate the flow of information.
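"Follow each other at higher rates" can be made concrete as the density of the follow graph: the fraction of possible follower-to-followed pairs inside a group that actually exist. The sketch below compares a made-up tight group with a made-up random sample. The `follow_density` helper and both datasets are hypothetical.

```python
def follow_density(members, follows):
    """Fraction of possible directed follow pairs inside `members` that exist."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    # Keep only follows where both ends are in the group, ignoring self-follows.
    internal = {
        (follower, followed)
        for follower, followed in follows
        if follower in members and followed in members and follower != followed
    }
    return len(internal) / (n * (n - 1))


# Made-up data: a tightly knit group and a random sample of the same size.
group = ["ann", "bob", "cat", "hub"]
group_follows = [
    ("ann", "hub"), ("bob", "hub"), ("cat", "hub"),
    ("hub", "ann"), ("bob", "ann"), ("cat", "bob"),
]
sample = ["dan", "eve", "fay", "gus"]
sample_follows = [("dan", "eve"), ("eve", "zed")]  # "zed" is outside the sample

group_density = follow_density(group, group_follows)
sample_density = follow_density(sample, sample_follows)
print(group_density, sample_density)
```

Dividing by n * (n - 1) treats follows as directed, so a mutual follow counts as two connections.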

There’s also continuous reinforcement of the dominant hubs within the network:

And actively surfacing content to spread:

On the other hand

Let’s take a look at the account that tried to spread the false information about Russia’s involvement. We can do this by seeding a Scale Model community with the followers of @DustinGiebel.

We are presented with a parallel world of propaganda — this time anti-Trump.

In this case, user cohorts are labeled #NeverTrump and #ImWithHer, and top shared images and links include:

  • Trump apparently didn’t pay for $100k worth of pianos. (WaPo)
  • A company controlled by him apparently conducted secret business in Cuba during Castro’s presidency. (MSNBC)
  • A post by a self-proclaimed “Bernie Bro”-turned Hillary supporter (aplus.com)
  • And Trump denying that he said he didn’t pay taxes, right next to him making that statement in the debate (CNN)

As you can tell, very different vibe.

What’s still unclear is who exactly photoshopped that image to make it seem like there was a Trump-Russia connection, and what else they have up their sleeves. What we’re seeing with this hashtag is a highly organized group of interconnected accounts, dedicated to making their agenda as visible as possible.

Trending topics are helpful as they cut across information silos, gaining significant levels of attention from people who would otherwise never see your content.

The winner in this quest for attention and framing reaps huge rewards.

On the other hand, we’re seeing how false information can spread like wildfire, especially when there are enough people invested in making it true.

We have a few weeks to go. Let’s see where this madness takes us!

Questions, Thoughts? Find me on Twitter — @gilgul


Head of Data Science & Analytics @BuzzFeed, Adjunct Professor @NYU