Finding misinformation with ‘rumor cues’

First Draft
First Draft Footnotes
5 min read · Feb 25, 2021

First Draft’s head of impact and policy, Tommy Shane, explores how keywords related to rumor can help us understand and respond to dangerous activities online.

If you’re a reporter, getting your queries right really matters.

In 2015, Daniel Victor of The New York Times was searching for witnesses to an incident on a plane involving a female passenger and a Hasidic Jewish man who didn’t feel comfortable sitting next to her. Victor found that querying for “hasidic” and “flight” on social media brought up a lot of people talking about the incident, but not people who were actually there.

But then he discovered something. There were three words that could identify genuine eyewitness accounts: “me,” “my” and “I.”

“Most people relating a personal experience — [also known as] good sources — will use [them],” Victor explained. “Most people observing from afar — aka, useless sources — won’t.”

For anyone researching social media, skillful query design is critical. Get it wrong and you won’t find what you’re looking for. Get it right and you can discover surprising things that others are missing.

Rumor cues

In this post, we introduce “rumor cues” to describe an approach to query design that, like first-person pronouns, can be a powerful but overlooked entry point into online conversation.

The term builds on insights from research into how rumors spread, and is designed to help reporters and researchers find truth-seeking behaviors online that contain, or are vulnerable to, misinformation.

To explore them, let’s look at a rumor that spread in the early phase of the pandemic, claiming Washington state would go into lockdown.

A screengrab of a rumor that circulated in March 2020. Source: Kate Starbird

In this post there are no hashtags, no conspiracy watchwords, no dog whistles. There isn’t even the word “covid” or “coronavirus.” So how would you find it?

One answer is the word “grapevine.” Another is “heard.” Both introduce the rumor by referring to sources of information.

These kinds of verbal cues typically accompany rumors; researchers have found that others do too, such as “apparently,” “reportedly,” “really?” and “is this true?”

What these words have in common is that they relate to truth-seeking — discussing, evidencing, persuading or questioning what’s true. Like the first-person pronouns “me,” “my” and “I,” words related to truth-seeking — which we’re calling “rumor cues” — can help to monitor misinformation.

Tracking rumor cues is especially important at this particular moment, when networked rumoring can drive life-threatening misinformation, and when routine searches for information online are being weaponized by conspiracy theorists. Both are major social vulnerabilities connected to truth-seeking that we need to identify and better understand.

In this post, we show how reporters and researchers can use rumor cues for three purposes: identifying rumor in real time, newsgathering with shadow queries and understanding the rhetoric of conspiracy theorists.

Identifying rumor

One of the great challenges in tackling misinformation is identifying rumors before they spread. This can help to address data deficits: demand for information — often expressed through rumors or unanswered questions — that is not met with adequate supply, creating a vacuum for misinformation.

Crisis researchers have discovered useful insights to help with this. Zhe Zhao and her colleagues, for example, found that expressions such as “Is this true?” “Really?” and “What?” were common in online rumors following the Boston Marathon bombings; similarly, Kate Starbird and her team found that the words “apparently,” “reportedly,” “alleged” and other forms of “expressed uncertainty” were commonly associated with rumors.

Other examples might include what researchers call “non-specific authority references,” such as “experts” or “doctors,” which are more associated with misinformation. Linguistics research also indicates that terms like “disguised,” “hiding,” “show,” “exposes” and “uncovers” feature prominently in discussions of disinformation.

Combining words like “covid” or “lockdown” with rumor cues can help us to sift through enormous numbers of posts to find the rumor in the haystack, and support interventions from credible voices.
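
As a minimal sketch of this kind of filtering — the word lists below are illustrative, drawn only from the examples in this post, not a vetted lexicon — a candidate rumor can be flagged when a post contains both a topic keyword and a rumor cue:

```python
import re

# Illustrative word lists drawn from the examples above; a real
# monitoring setup would use a much larger, vetted lexicon.
TOPIC_KEYWORDS = {"covid", "coronavirus", "lockdown"}
RUMOR_CUES = {"heard", "grapevine", "apparently", "reportedly",
              "alleged", "really?", "is this true"}

def tokenize(text):
    """Lowercase a post and split it into rough word tokens."""
    return re.findall(r"[a-z?']+", text.lower())

def is_candidate_rumor(post):
    """Flag a post that mentions the topic AND carries a rumor cue."""
    lowered = post.lower()
    tokens = set(tokenize(post))
    has_topic = bool(TOPIC_KEYWORDS & tokens)
    # Single-word cues match on tokens; multi-word cues on substrings.
    has_cue = bool(RUMOR_CUES & tokens) or any(
        phrase in lowered for phrase in RUMOR_CUES if " " in phrase)
    return has_topic and has_cue

posts = [
    "Heard through the grapevine that Washington is going into lockdown",
    "Official covid case counts updated today",
]
flagged = [p for p in posts if is_candidate_rumor(p)]
# Only the first post is flagged: it pairs "lockdown" with "heard"/"grapevine".
```

A production system would, of course, tune these lists and thresholds against labeled data; the point here is only the intersection of topic terms with truth-seeking language.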

Newsgathering with shadow queries

Rumor cues can also be used for news monitoring with shadow queries: slightly inflected search queries, such as “coronavirus truth” instead of “coronavirus facts.” Topic keywords like “covid” can also be combined with rumor cues such as “won’t cover this” or “mainstream view,” and with words such as “hidden,” “suppressing” and “concealed,” to locate unverified counter-narratives.

Shadow queries like these can reveal distinct truth-seeking networks. Working with researchers at King’s College London, we found that two seemingly similar queries, #covidfacts and #covidtruth, uncovered two very different hashtag networks: #covidfacts was linked to fact-checking hashtags such as #factchecking, #factsmatter and #debunking, while #covidtruth was linked with conspiracy hashtags like #dontbelievethehype, #covidhoax19 and #wakeup. The minor inflection of the query with different rumor cues revealed two separate networks that both sought to establish the truth, but in very different ways and with very different claims.
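
Generating shadow variants is mechanically simple: cross a set of topic keywords with a set of rumor cues. A sketch, with illustrative cue lists:

```python
from itertools import product

# Illustrative sets; swapping one cue for another produces "shadow"
# variants of the same base query.
TOPICS = ["covid", "coronavirus"]
CUES = ["facts", "truth", "hoax", "exposed"]

def shadow_queries(topics, cues, as_hashtag=False):
    """Cross topic keywords with rumor cues to build query variants."""
    sep = "" if as_hashtag else " "
    prefix = "#" if as_hashtag else ""
    return [f"{prefix}{t}{sep}{c}" for t, c in product(topics, cues)]

queries = shadow_queries(TOPICS, CUES, as_hashtag=True)
# → ['#covidfacts', '#covidtruth', '#covidhoax', '#covidexposed', ...]
```

Each variant can then be run through a platform’s search or a tool like CrowdTangle, and the resulting networks compared side by side.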

Analyzing conspiracy rhetoric

Another case for rumor cues is the exploration of conspiracy rhetoric. Researchers have warned for some time that the search for accurate information online is being weaponized by conspiracy theorists in ways that are frightfully difficult to counteract.

To understand the rhetoric that powers this manipulation, I worked with a team of researchers to examine 600,000 conspiracy-related Instagram posts by filtering for rumor cues.

A word tree visualizing the most common words before and after the epistemic keywords “trust your”; First Draft and Digital Methods Initiative analysis, created with Jason Davies’ Word Tree.

We looked for words such as “truth,” “research,” “evidence” and “trust.” One phrase was particularly common: “trust your.” We found that it was often followed by bodily references — trust your “gut,” “body,” “eyes,” “heart,” “immune system,” “instincts” and “intuition” — indicating a private, embodied form of knowledge, quite a different authority than experts.
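
The “trust your” pattern can be surfaced with a simple collocation count: find every occurrence of the phrase and tally the word that follows it. A sketch — the sample posts below are invented for illustration, not drawn from the Instagram dataset:

```python
import re
from collections import Counter

def words_after(phrase, posts):
    """Count the word immediately following `phrase` in each post."""
    pattern = re.compile(re.escape(phrase) + r"\s+(\w+)", re.IGNORECASE)
    counts = Counter()
    for post in posts:
        counts.update(m.group(1).lower() for m in pattern.finditer(post))
    return counts

# Invented sample posts, for illustration only.
sample = [
    "Do your own research and trust your gut.",
    "Trust your instincts, not the experts.",
    "trust your gut on this one",
]
top = words_after("trust your", sample).most_common(2)
# → [('gut', 2), ('instincts', 1)]
```

At scale, the same counts feed directly into a word tree visualization like the one above.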

Another rumor cue we explored was the Bible reference, often used to evidence and support narratives. We found that Revelation 13:16–17, which makes reference to “the mark of the beast,” was particularly common. According to BibleRef.com, a possible knowledge source for Christians, this may refer to “implanted computer chips or other technology,” echoing a major conspiracy narrative about vaccines.

These alternative authorities to institutional expertise — the body and the Bible — can be uncovered and explored with rumor cues, and can yield insights about how to create effective counter-messaging.

What we mean by ‘rumor cues’

Rumor cues, which we also describe with the more technical term “epistemic keywords,” are any words that can be used to query online spaces or datasets for truth-seeking behaviors, such as rumor or conspiracy theory. These might be words or phrases related to knowing, discussing, evidencing, persuading or questioning what’s true. More precisely, they are queryable traces of epistemic activity in online spaces.

Rumor cues are not a silver bullet. But they can be another tool in your kit when looking for misinformation online.

Thanks to the research team at the Digital Methods Initiative Winter School 2021 for their support in testing out this concept. Thanks also to students at King’s College London, led by Jonathan Gray, for their experimentation with epistemic keywords in relation to Covid-19 conspiracy theories in winter 2020 as part of an engaged, research-led teaching project, with input and support from researchers on the infodemic project and the Public Data Lab.
