Trump versus Biden: what Twitter data reveals about the supporters of presidential candidates

Biden supporters are twice as likely to follow Starbucks or Netflix, while Trump's crowd would rather take a sip of Black Rifle Coffee. What can we tell about each candidate's supporters based on Twitter data?

Jitka Baslova
Emplifi
9 min read · Jul 27, 2020

We explored Twitter users who actively expressed their opinions in the Twitter debate about the two main US presidential candidates, and uncovered the affinities and Twitter behavior of the different opinion groups.

Methodology and sample

We’ve looked into US Twitter profiles that most recently mentioned Donald Trump or Joe Biden in their Tweets, and on whose mentions the Socialbakers algorithms detected positive or negative sentiment.

When gathering the sample data, we excluded mentions that contained more than one @handle, spam messages, messages with undetected sentiment, profiles of media or brands, and profiles that didn't state a US location. Profiles following fewer than 20 or more than 5,000 profiles were excluded too.
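A minimal sketch of these exclusion rules, assuming each mention arrives as a record with the fields shown. The `Mention` class, its field names, the `is_spam` flag, and the whitespace-based handle count are all hypothetical stand-ins; the article does not describe how spam or media/brand profiles were actually detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mention:
    """One candidate mention plus the author fields the filters need (hypothetical schema)."""
    text: str
    sentiment: Optional[str]       # "positive", "negative", "neutral", or None if undetected
    is_spam: bool                  # assumed to come from an upstream spam check
    author_is_media_or_brand: bool
    author_country: Optional[str]  # None when the profile states no location
    author_friends_count: int      # number of profiles the author follows


def count_handles(text: str) -> int:
    """Count @handles in a tweet using a simple whitespace split."""
    return sum(1 for token in text.split() if token.startswith("@") and len(token) > 1)


def keep_mention(m: Mention) -> bool:
    """Return True if the mention passes every exclusion rule listed above."""
    if count_handles(m.text) > 1:
        return False
    if m.is_spam or m.author_is_media_or_brand:
        return False
    if m.sentiment not in ("positive", "negative"):
        return False
    if m.author_country != "US":
        return False
    return 20 <= m.author_friends_count <= 5000
```

Keeping the rules in one predicate makes it easy to apply them with `filter(keep_mention, mentions)` and to adjust a single threshold later.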

After cleaning the data, we got 15k profiles mentioning each candidate (30k in total) and narrowed the sample to 20k random unique Twitter profiles. We gathered lists of all profiles they follow (Twitter API calls them “friends”) and linked these lists to the Socialbakers database. The database includes in-house data enrichments, such as profile categories, which allowed us to look at the sample data from a broader perspective.

To get the groups of supporters of each candidate, we used k-means, a common clustering algorithm, with the help of tf-idf and t-SNE.

t-SNE visualization of users who mentioned one of the candidates

First, we created a matrix of all profiles (~ 1.5M) found in the gathered lists of friends and applied binary tf-idf to detect distinguishing profiles across the sample.

For computational reasons, we ignored profiles followed by more than 50% or fewer than 2.5% of the Twitter accounts in our user sample. In the end, only 3,047 profiles from the lists of friends remained, and we worked with those.
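The binary tf-idf step with the 2.5% and 50% cut-offs can be sketched as follows. This is an illustration under assumptions, not the original pipeline: the function name, the input shape (a dict of user to followed handles), the smoothed idf formula, and the L2 normalization are our choices here, since the text does not specify them.

```python
import math
from collections import Counter


def binary_tfidf(friends_by_user, min_share=0.025, max_share=0.5):
    """Weight each followed profile by how rare it is across the user sample.

    "Binary" means a followed profile counts once per user (followed or not),
    so only the idf part varies between profiles.
    """
    n_users = len(friends_by_user)

    # Document frequency: how many sampled users follow each profile.
    df = Counter()
    for friends in friends_by_user.values():
        df.update(set(friends))

    # Drop profiles followed by too few or too many users in the sample.
    kept = {p for p, count in df.items() if min_share <= count / n_users <= max_share}

    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {p: math.log((1 + n_users) / (1 + df[p])) + 1.0 for p in kept}

    vectors = {}
    for user, friends in friends_by_user.items():
        weights = {p: idf[p] for p in set(friends) if p in kept}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        # L2-normalize so users who follow many profiles do not dominate distances.
        vectors[user] = {p: w / norm for p, w in weights.items()} if norm else {}
    return vectors, sorted(kept)
```

The vectors are stored sparsely (only non-zero entries), which matters when the candidate vocabulary starts at roughly 1.5M profiles.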

To see if there are any clusters in the data, t-SNE was applied. It revealed that while there were user dots clearly grouped into a few clusters, there were also dots spread across the visualization.

Subsequently, we performed k-means on the tf-idf matrix and colored the t-SNE output by the k-means cluster labels. It showed that the k-means provided sufficient results, but we might need to run it multiple times on the biggest cluster (spread dots) to get more complex results.
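For readers unfamiliar with k-means, here is a from-scratch sketch of the basic (Lloyd's) algorithm on dense vectors. The text only names "k-means", so the random initialization, the iteration cap, and the fixed seed are assumptions; on a real tf-idf matrix a library implementation would be the more likely choice.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    # Start from k distinct points picked at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)

    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels

        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy example with two obvious groups (made-up coordinates).
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(points, k=2)
```

The returned labels can be used to color the points of a 2D projection, and the function can be rerun on just the members of one label to split a large, diffuse cluster further.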

Interests and affinities behind the dots

t-SNE visualization colored according to k-means labels

The clustering algorithm revealed three clusters. Two of them clearly defined strong supporters of each candidate (green for Trump, blue for Biden). The third one (pink) at first didn't seem to lean towards either candidate, even though its dots appeared to be located closer to Biden's cluster.

However, we further explored the pink cluster and found Bernie Sanders’ supporters, Democratic-leaning profiles with weaker affiliation to Joe Biden and other Democrats, and a group of profiles that didn't show any particular political affinities nor strong interest in politics.

Interestingly, k-means showed that the sentiment of mentions themselves wasn’t a very good indicator for determining each candidate’s supporters. Overall, the conversation is rather negative on both sides (neutral mentions were excluded during the sample collection).

The mentions were gathered during the period between March 1 and May 29 and arranged in descending order, prioritizing recent ones.

Trump's vs. Biden's supporters

Both Trump's and Biden's core supporters show a stronger interest in society than the other profiles in the sample. Trump's base has a weaker affinity towards celebrities and a stronger one towards media and community pages (the community category includes, among others, unofficial pages of political parties and religious pages).

Both groups are highly interested in politics on Twitter. While Biden's fans tend to follow more NGOs, Trump's fans follow more government-owned profiles.

Biden's base in general follows more writers (the category includes journalists) and actors. Trump's fans tend to follow sports stars, singers, and broadcast stars instead.

Both Trump's and Biden's supporters are far more active on Twitter than users with weaker political affiliation. Both groups follow and are followed by more profiles, and they publish more tweets. However, Trump's fans are significantly newer to Twitter; they joined approximately 4.5 years ago, which is roughly the time of Trump's first campaign. All values in the chart are medians.
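Per-cluster medians like those in the chart can be computed as in the sketch below. The field names (`cluster`, `followers`, `account_age_years`) and the toy values are made up for illustration and are not taken from the study's data.

```python
from collections import defaultdict
from statistics import median


def cluster_medians(profiles, metrics):
    """Return {cluster: {metric: median value}} for the given profile records."""
    grouped = defaultdict(lambda: defaultdict(list))
    for profile in profiles:
        for metric in metrics:
            grouped[profile["cluster"]][metric].append(profile[metric])
    return {
        cluster: {metric: median(values) for metric, values in by_metric.items()}
        for cluster, by_metric in grouped.items()
    }


# Toy records (invented numbers, for illustration only).
profiles = [
    {"cluster": "trump", "followers": 100, "account_age_years": 4.0},
    {"cluster": "trump", "followers": 300, "account_age_years": 5.0},
    {"cluster": "trump", "followers": 200, "account_age_years": 4.5},
    {"cluster": "biden", "followers": 150, "account_age_years": 9.0},
    {"cluster": "biden", "followers": 250, "account_age_years": 7.0},
]
summary = cluster_medians(profiles, ["followers", "account_age_years"])
```

Medians are a sensible choice here because follower and tweet counts are heavily skewed, so a few very large accounts would distort a mean.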

After analyzing the words in the profile descriptions of each group, we confirmed that the clusters contained strong supporters of each candidate.

Biden's supporters (on the left) include in their descriptions the anti-Trump hashtag “resist” and other pro-Democratic keywords, such as bluewave, liberal, biden, and voteblue. The “biden” keyword is somewhat overshadowed by the “trump” keyword, possibly because these profiles position themselves as “anti-Trump” rather than pro-Biden. Trump's supporters describe themselves as conservatives and patriots, and often include #MAGA or #KAG.

Profiles dividing the groups the most

Trump's supporters show a clear interest in the current administration (@whitehouse), while Biden's supporters still follow profiles of Obama's administration (@vp44, @obamawhitehouse) and, for instance, are more interested in the WHO, which was criticized by Trump during the COVID-19 pandemic.

Interestingly, more of Joe Biden's supporters follow Adam Schiff or Barack Obama than Joe Biden's own profile. Trump's supporters show a stronger affinity to their candidate: nearly 97.5% of them follow the @realdonaldtrump account.
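One simple way to rank the profiles that divide two groups is to compare the share of each group that follows a given handle. The sketch below uses the absolute gap between the two shares as the ranking criterion; that criterion, the function names, and the toy friend lists are assumptions, as the exact measure behind the chart is not stated.

```python
def follow_share(friends_by_user, members, handle):
    """Fraction of the given group members who follow `handle`."""
    members = list(members)
    if not members:
        return 0.0
    hits = sum(1 for user in members if handle in friends_by_user.get(user, ()))
    return hits / len(members)


def most_dividing(friends_by_user, group_a, group_b, handles, top_n=10):
    """Rank handles by the absolute gap in follow share between two groups."""
    rows = []
    for handle in handles:
        share_a = follow_share(friends_by_user, group_a, handle)
        share_b = follow_share(friends_by_user, group_b, handle)
        rows.append((handle, share_a, share_b))
    rows.sort(key=lambda row: abs(row[1] - row[2]), reverse=True)
    return rows[:top_n]


# Toy friend lists (invented users, for illustration only).
friends = {
    "t1": {"whitehouse", "nascar"},
    "t2": {"whitehouse"},
    "b1": {"vp44", "who"},
    "b2": {"vp44", "whitehouse"},
}
top = most_dividing(friends, ["t1", "t2"], ["b1", "b2"], ["whitehouse", "vp44", "nascar"], top_n=1)
```

Dividing one share by the other instead of subtracting gives the "twice as likely" style of comparison, but it becomes unstable when either share is close to zero.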

The clustering algorithm revealed not only differences among the profile topics, political parties and politicians the supporters usually follow, but also proved that their diverse opinions are reinforced by highly different lists of specific media outlets from which the supporters gather information. Their sources include TV shows, celebrities, or even NGOs, with only a minor flow of information coming in from the opposite side.

The biggest differences in the media followed by the two groups concern the Trump-promoting One America News and Breitbart News on one side, and left-leaning news outlets such as MSNBC, The Washington Post, or The New York Times, which Joe Biden's supporters follow more often, on the other.

When it comes to NGOs, celebrities, and other segments, the political polarization projects into the differences shown in the following chart.

There are some differences among the sports teams and organizations the supporters follow, even though their affinity towards sports isn't particularly strong. Biden's supporters follow women's soccer, while Trump's are more likely to follow NASCAR or, not surprisingly as Texas is traditionally a Republican-leaning state, the Dallas Cowboys.

Although brands are not commonly followed by any of the analyzed groups, there are still some tendencies we can see.

For example, Biden's supporters are twice as likely to follow Starbucks or Netflix. Only about 1% of Trump's supporters in the sample follow Nike, compared to almost 5% of Biden's supporters, even though Trump's fans seem to be slightly more interested in sports (as seen previously). This may be the effect of the 2019 dispute in which Trump urged a boycott of the brand. Not surprisingly, the most followed brand among Trump supporters is The Trump Organization.

Less opinionated profiles

As stated at the beginning, we also explored the blurry pink cluster of the initial k-means result, as the visualization suggested there might be at least one clear cluster among the people who didn't show any strong affiliation with either Trump or Biden.

We performed a second round of k-means on the remaining profiles which led to the discovery of supporters of the former Democratic candidate Bernie Sanders (red).

The blue cluster gathers Democratic-leaning voters, and the yellow cluster groups people who are generally not interested in politics on Twitter (and they prefer sports and celebrities) and don’t show clear affiliation with any party.

When it comes to the activity of Twitter users who are not Trump's or Biden's supporters, the least active is the yellow “Unknown Preference” cluster (if we can call the spread dots a cluster). In contrast, Bernie Sanders' supporters turned out to be the most active, although they don't reach the numbers of Trump's or Biden's supporters.

The word cloud made of words in the profile descriptions of Sanders’ supporters confirms the affinity towards Sanders. Bernie, socialist, progressive, or #NotMeUs are among the most distinguishing keywords (TF-IDF method).
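A rough sketch of how distinguishing keywords can be scored with TF-IDF: each cluster's profile descriptions are treated as one document, and words are weighted by how often they occur in that cluster and how few other clusters use them. The tokenizer, the smoothed idf formula, and the toy descriptions are assumptions made for this illustration.

```python
import math
import re
from collections import Counter


def distinguishing_keywords(descriptions_by_cluster, top_n=5):
    """Return the top_n highest TF-IDF words (or hashtags) for each cluster."""
    # Term counts per cluster; lowercase everything and keep a leading '#'.
    counts = {
        cluster: Counter(
            token
            for text in texts
            for token in re.findall(r"#?\w+", text.lower())
        )
        for cluster, texts in descriptions_by_cluster.items()
    }
    n_clusters = len(counts)

    # Document frequency: in how many clusters each word appears.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())

    result = {}
    for cluster, term_counts in counts.items():
        scores = {
            word: n * (math.log((1 + n_clusters) / (1 + df[word])) + 1.0)
            for word, n in term_counts.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda word: (-scores[word], word))
        result[cluster] = ranked[:top_n]
    return result


# Toy descriptions (invented, for illustration only).
descriptions = {
    "sanders": ["Bernie 2020 #NotMeUs", "progressive bernie fan"],
    "other": ["sports fan", "music and sports"],
}
keywords = distinguishing_keywords(descriptions, top_n=4)
```

Words shared by every cluster, such as "fan" in the toy data, receive the lowest idf weight and drop out of the top list, which is what makes the remaining keywords distinguishing.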

As seen from the clustering visualization, the two remaining clusters are a bit blurry. Even though we later discovered some distinguishing features, the word clouds don’t provide much information.

On the left we see words from the Democratic-leaning voters cluster while on the right there are keywords coming from people who don’t seem to be interested in politics as much as other profiles in the sample.

Among Bernie Sanders' supporters, we don't see a strong affiliation with the winning Democratic candidate Joe Biden. However, among the 10 most followed politicians we can see another candidate, Elizabeth Warren.

It's interesting that the most followed politician among Sanders' supporters is not Sanders himself but Alexandria Ocasio-Cortez, a strong supporter of Sanders.

On the other hand, among the Democratic-leaning profiles, we see Joe Biden on top together with other famous Democrats. The list gathers current or former Democratic presidential candidates. However, the affinity towards Democrats is not to be taken for granted as the 10th most followed politician is Donald Trump.

The last political ranking shows people who can't be affiliated with any of the political parties or candidates. The visualization of the clusters showed that the remaining yellow profiles are spread across the chart, meaning they don't share many common features.

Their overall interest in politicians is weaker (smaller percentages) than in the other clusters. When looking at the politicians they follow, the top two are Donald Trump and Barack Obama. As these are among the most followed Twitter profiles overall, we can't draw any conclusions from that.

However, compared to the Democratic-leaning profiles, we can spot that the difference between the percentage of users following Donald Trump and Barack Obama is radically smaller here and in favor of Trump.

Summary

Based on public data from Twitter, we defined two main groups of users: supporters of Donald Trump and supporters of Joe Biden. We analyzed the data and discovered that Trump's supporters show a strong affinity towards their candidate, as almost all of them follow the @realDonaldTrump Twitter profile and many follow profiles of the cabinet. The most popular brand among Trump's supporters is The Trump Organization.

Supporters of Joe Biden are slightly less connected to their candidate on Twitter: 85.25% of them follow @joebiden, but even more of them follow Barack Obama (93.24%) or Adam Schiff (86.28%). We also discovered other potential voters of the Democratic candidate who haven't shown as strong a connection to Joe Biden.

Based on the data we worked with, supporters of Bernie Sanders, who haven't yet found their way to either of the nominees, might be crucial for the election results. Only 17.85% of Bernie Sanders' supporters follow Joe Biden, while 27.55% follow Donald Trump.

Disclaimer

All the data we worked with (mentions of the candidates and information about the profiles that posted them) is publicly available via the Twitter API. We didn't use any personal information and we didn't provide the list of profiles to any third party.
