When social trust is the attack
How bad actors exploited social media algorithms to erode users’ confidence in democratic institutions
On January 6th, 2021, an armed and angry mob stormed the U.S. Capitol. Poll data gathered shortly after the insurrection indicated that millions of Americans believed that election fraud had occurred on a massive scale, resulting in the theft of the election. How could so many people be led to believe something so clearly false, when there was so much scrutiny and oversight of the election process?
While we can’t draw a bright line from one false statement or one bad actor to the attack on the U.S. Capitol, what is undeniable is that in the months leading up to it, we all saw steady attacks on our sense of shared reality about the 2020 election–our common understanding of what is real and what is not.
We’ve heard a lot about the role of disinformation–the intentional sharing of harmful messages–in the U.S. Capitol attack, but what do we know about the systems that helped these messages reach and influence people? I believe this was a social engineering attack intended to spread false beliefs online, which I call a “user social trust” attack. We see clear signs that U.S. voters were subject to this attack, and I think it may have contributed to so many people being deceived during the 2020 election.
In the cybersecurity field, social engineering attacks are ones where an attacker seeks to exploit users themselves via networked systems. You are probably already familiar with social engineering attacks via email, such as phishing scams. Research has repeatedly shown that attacks such as phishing can be more effective when executed through social media instead of email. Like other forms of social engineering, user social trust attacks rely on human vulnerabilities to succeed. And as with other social engineering attacks, the content promotion algorithms used by social media platforms can amplify them.
Over the course of the last year or so, actors seeking to delegitimize the election process aimed to degrade users’ trust in mail-in voting and the election as a whole. They did this by appealing to users’ trust in their own social networks, and then promoting false beliefs in ways large and small. Once attackers are inside the walls of a user’s social trust network, they weaponize harmful false beliefs to convince the user to take a specific action. Many of these actors appear to have political goals, but they can also have commercial ones–scams and other assaults on people’s financial security.
The goal of this piece is to describe and propose a model for evaluating this type of social engineering attack. I’m going to do this with examples that illustrate different facets of user social trust attacks, some of which promote false beliefs about mail-in ballot fraud, and some of which promote other misleading or harmful content. I believe that many social media platforms are almost completely unprotected from these types of attacks on their users. If that is correct, and given that we all witnessed an incredibly successful attack of this kind in January, more are certain to follow.
User social trust attacks, defined
At the core of all user social trust attacks is an attempt by the attacker to harmfully mislead the user. This can manifest in a few ways–either by promoting a harmful false belief or by misleading the user about who the attacker is in order to claim the mantle of affinity. The other necessary piece of this attack is an appeal to the user that is rooted in the user’s trust in their own social network. When I use the term “social network,” I am referring to the communities of people, either in-person or online, that users belong to, not exclusively those on social media platforms. Social networks are often centered around some shared marker of identity, such as ethnicity, geography, or profession.
What are “harmful beliefs”?
Sometimes, harmful messages actively promote false information, such as the false claim that “Vaccines give your kids autism.” This is a statement that can be disproven by evidence; in other words, it is a falsifiable claim. However, just as problematic are more nebulous statements promoting false beliefs that cannot be disproved by evidence, such as “I just don’t trust vaccines!”
Subtle messaging like this also takes advantage of a quirk in human psychology. Rather than providing a claim that will be logically evaluated, the user is invited to relate to the feelings of the author. The user who sees the statement is more likely to identify with the emotions, positive or negative, that come with the message because it comes from someone claiming a common identity. In either case, the message is leading the user to the false belief that vaccines are dangerous.
Here’s another example: During the election period, many advertisers were promoting the false belief that mail-in voting was rife with fraud. This idea is simply not true, but these ads succeeded in seriously degrading trust in the election.
There were also many ads promoting the false belief that Democrats were getting ready to start a civil war.
Sometimes, the attack is about misleading users by posing as a member of a community and then sharing a message about what they say other community members believe. While these types of attacks usually attract less attention than campaigns based on outrageous, false content, they can be just as harmful to users.
Platforms cannot rely on automated systems to identify which beliefs are false. We need humans to make these judgments through a well-articulated and transparent evaluation process that then informs platform policy. For the foreseeable future, deciding which beliefs actually qualify as “harmful” is going to be a manual task, perhaps best left to platform oversight boards or other bodies that have the capacity to listen and weigh the competing interests of different user groups. However, articulating models for evaluation will help provide the information that humans need to make these judgments, by providing a means to screen large bodies of content and flag potentially harmful messages.
Appeals to social trust
So how do attackers get users to accept false beliefs? In short, they hijack users’ trust in social networks or institutions. They will also often narrowly target users within a particular social network because an attack designed for one group is ineffective or obvious when it is seen by out-group members.
Attackers seeking to exploit users’ trust in social networks or institutions will usually claim to be members of those groups. Often, they do this by misleading the user about who they are, but not always–sometimes they really are authentic members of the group. Different ways to claim common group identity include:
- Setting up a Facebook group claiming an identity that the organizer may or may not really share;
- Paying an influencer who has credibility within a particular network, without disclosing that payment;
- At an extreme, creating a new social group and identity for users, such as QAnon has done.
What all of these cases have in common is that the attackers gain a user’s trust through a claimed shared trait, or through group membership, as opposed to, for example, an appeal to authority or to expert knowledge. Attackers are exploiting a weakness in human psychology, an effect that social psychologists call in-group favoritism. This describes how users will be more accepting of a message if it comes from someone they think is “like them.” This is particularly true if the message is somehow connected to that shared identity. This is also the core tactic of “affinity scams,” a type of offline con that targets members of tightly-knit or insular communities.
Gathering Together was a Facebook page targeted at, and claiming to be run by, African-American women. Union Patriots, another page, presented itself as a totally separate entity, for and by union members. In reality, both were part of what appeared to be a centrally controlled network of pages active in the lead-up to the 2018 midterm elections. These pages were used to build up a following centered around different identities. They then promoted political messages to these groups in bulk, without disclosing who was paying for the messages. I never did find out who was running them! This is a classic example of an advertiser falsely claiming to be a member of a particular group for the purpose of gaining the trust of the user.
This ad, versions of which also ran on TV, appears on a page run by a super PAC, America First Action, which supports conservative candidates. FactCheck.org described this campaign as “misleading.” The man in the ad is labeled as “Shawn, Union Man, Democrat,” but America First Action doesn’t provide any evidence that this is a real person. This ad is clearly aimed at union members in Pennsylvania. By putting its words in the mouth of a “Union Man, Democrat,” the advertiser hopes the message will be subjected to less scrutiny.
Very often, if a piece of harmful content is narrowly tailored to the tastes of a single social network, its problematic or false nature is extremely obvious to people who are not in that group. But if an attacker narrowly directs the content so that it is only or primarily seen by members of that group, such scrutiny can be avoided. This is why micro-targeting can be such a powerful tool for spreading harmful content. It both delivers an information attack to the users who are most vulnerable to it and helps the attack avoid detection by others who would expose it. Micro-targeting is often seen working in tandem with a claim that the source of the content is a member of the group being targeted.
The above 2020 ad by Donald Trump was targeted at married, suburban women in Maine. CNN fact-checkers described the ad as “dishonest.” It was part of a campaign by Trump to tell a particular group that their specific communities were threatened by Joe Biden. This claim and the messaging of this ad were designed to scare a suburban, married, female audience, and the ad’s targeting parameters were designed so that only women who fit that demographic were meant to see it.
This QAnon Facebook page, which has since been taken down, was using what I call “content targeting.” It uses QAnon iconography, such as Pepe the Frog, and textual references, such as “Cue” for “Q,” which mean something to a QAnon audience but are not recognizable to a mainstream audience.
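To make “content targeting” concrete, here is a minimal sketch of how a screening system might surface posts that use in-group signifiers. The lexicon below is a small hypothetical example, not a real platform wordlist; in practice, these lexicons would come from human analysts studying each community.

```python
# Sketch: surface posts that use in-group signifiers ("content targeting").
# The signifier lexicon below is hypothetical and illustrative only.

SIGNIFIER_LEXICONS = {
    "qanon": {"wwg1wga", "cue", "the storm", "great awakening"},
}

def content_targeting_signals(text: str) -> dict[str, set[str]]:
    """Return, per community lexicon, the signifiers found in the text."""
    lowered = text.lower()
    hits: dict[str, set[str]] = {}
    for group, terms in SIGNIFIER_LEXICONS.items():
        # Naive substring matching; a real system would need tokenization
        # to avoid false hits (e.g., "cue" inside "rescue").
        found = {t for t in terms if t in lowered}
        if found:
            hits[group] = found
    return hits
```

A post like “Trust the plan. Cue sent us. WWG1WGA!” would match two signifiers in the hypothetical QAnon lexicon, while mainstream text would match none–mirroring the point above that these references are legible only to the in-group.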
Understanding these vulnerabilities helps us better describe the “attack surface” that needs to be defended. Platforms that allow outsiders to target content by social or identity group open themselves up to these attacks. Facebook ads and groups are the obvious cases here, but there are less intuitive ones, such as smaller platforms that cater to a modest number of non-overlapping groups of users. Platforms such as Parler or Gab were set up explicitly for this purpose and are well known, but others like Nextdoor fall into this category as well.
It’s beyond the scope of this piece to cover how to mitigate these attacks in detail. But the features of these attacks do suggest options for a process for screening problematic content. For example, because we know that attackers seek to exploit users’ trust in other members of their social groups, we can subject content from anonymous or unverified users to additional scrutiny. Because we know that micro-targeting can be abused to deliver tailored attacks to closed social groups, we can limit the use of ad micro-targeting for certain types of content. Most importantly, once a harmful belief has been identified, content promoting that belief can be prohibited.
As long as the conversation remains solely about the content of messages, rather than the behavior of attackers, information attacks against users will continue to proliferate. The specific dystopian, reality-detached beliefs that QAnon adherents hold are symptoms, not a cause, of our current disinformation pandemic. The actors behind these groups exploited the fears of millions of people. If instead of believing that Bill Gates was putting nano-chips in the COVID-19 vaccine, they believed Dolly Parton was summoning Satan via 5G towers, there would be no practical difference. So if we are going to combat these attacks, we must focus on the patterns of action of attackers, rather than merely the fictions they promote.
The good news is that by analyzing social media in this way–as potential information attacks against users themselves–it is possible to create systemic solutions that filter problematic content for further review. Just as sequencing a virus’s genome is the first step before formulating vaccine candidates, describing the features of a social trust attack lets us develop effective tools to screen for and mitigate it. To be clear, such a systematic approach will not be easy, or cheap. Platforms will fight any approach that changes the way business is done. But the status quo isn’t sustainable, and everyone knows that. One way or another, social media companies are going to be held responsible for the content they host and promote. They need to take steps that will actually work to reduce the flow of online disinformation, and that requires a focus on systems rather than on content.
Thanks to Damon McCoy, assistant professor of Computer Science and Engineering at the New York University Tandon School of Engineering, who contributed his expertise to this piece.