First Draft’s head of policy and impact, Tommy Shane, explores how our technology affects what misinformation we do and don’t see.
“The ocean twilight zone is a layer of water that stretches around the globe … below the ocean surface, just beyond the reach of sunlight … the twilight zone is cold and its light is dim, but with flashes of bioluminescence — light produced by living organisms. The region teems with life.”
There’s been a lot of debate recently about “Facebook’s Top 10,” a Twitter account that lists “the top-performing link posts by U.S. Facebook pages in the last 24 hours,” managed by The New York Times’ Kevin Roose.
Given that conservative pages tend to dominate the results, the lists have been used to argue that Facebook is biased in favor of conservatives. Facebook, in turn, has pushed back, arguing that engagement doesn’t equal reach.
Irrespective of this argument, “Facebook’s Top 10” points to wider issues about what we see and don’t see in misinformation research. These issues go beyond what data we can access and which metrics we look at.
How do analytics dashboards shape what we see online? What if, by focusing on posts with the greatest engagement, we are missing the things bubbling underneath? Could we be looking in the wrong places and missing real harm, simply because our tools make some things harder to investigate and study?
These were questions our researchers grappled with in our recent research into vaccine misinformation.
To find a solution, we took inspiration from the history of marine biology.
The twilight zone
In 2004, marine biologist Richard Pyle delivered a talk about his research into the ocean “twilight zone.” Pyle discovered that researchers had been focusing on the very top layer of the ocean’s depths. “[We] know a lot about that part up near the top. The reason we know so much about it is scuba divers can very easily go down there and access it.”
The problem was that scuba divers can only go down about 200 feet. Biologists were well aware of this, and so used submersible vehicles to go deeper. But this created another problem, as Pyle explains: “If you’re going to spend $30,000 a day to use one of these things and it’s capable of going 2,000 feet … you’re going to go way down deep.”
What Pyle discovered was a middle “twilight zone” — so named because of the limited sunlight that pierces to that level — that researchers had neglected because it was easier to look at the surface, and more enticing to go down deep.
This twilight zone, once registered, became a huge source of discovery for ocean biologists, at one point yielding seven new species for every hour researchers spent in that region.
Misinformation’s twilight zone
There are a number of lessons here for social media research. We tend to study the accounts with the largest number of followers, the ones responsible for huge engagement metrics. We see network graphs of trending hashtags, dumps of scraped social media data shared by researchers trying to look for evidence of “coordinated inauthentic activity.”
Or we see qualitative researchers lurking in private Facebook Groups, Discord servers or 8kun boards, trying to spot disinformation campaigns before they make their way onto more popular social media platforms.
Both approaches are valuable, but neither is sufficient for understanding the ecosystem as a whole.
The ocean’s twilight zone is, first and foremost, a reminder that our understanding of misinformation online is severely lacking because of limited data: platforms deny access; ethical guidelines prevent researchers from entering or reporting on certain spaces online.
But more importantly, this maritime comparison is a reminder that our technology can draw us toward seeing some things and not others. CrowdTangle and Twitter’s API are not passive databases that we access, but products with affordances that influence our activity. Some features exist, others do not, and this affects what we see.
And critically, the interests of platforms are baked into not just the data they share, but the features they allow for querying it. For example, on CrowdTangle you cannot filter for labeled or fact-checked posts.
Beyond hard limitations such as these, we also need to consider friction — where accessing certain metrics or items is simply made much harder than others. This includes ranked lists that draw us toward the most engaging posts and away from those in the middle zone.
The problem of feature bias has been raised before. Richard Rogers, a key figure in the development of digital methods, has observed that social media platforms can lead researchers to focus on “vanity metrics” such as engagement scores, rather than “voice, concern, commitment, positioning and alignment.”
But more work is needed to surface feature biases, because we might be missing a critical part of the picture without realizing it.
Applying this to our research
Engaging with the concept of the twilight zone led our researchers Rory Smith and Seb Cubbon to take two critical methodological decisions in their research into vaccine misinformation.
The first, and most fundamental, was to focus on how narratives were evolving and competing rather than on highly engaged posts. The unit of analysis in analytics dashboards is the individual post, but narratives are far more powerful than individual pieces of misinformation: they shape how people think, and they cannot simply be debunked.
They also chose to exclude posts from verified accounts as a way of accessing “the middle” of social media activity. The most engaged-with posts were generally from official, often pro-vaccine accounts, such as professional media outlets. Filtering out verified accounts cut through this noise and surfaced more of the anti-vaccine discourse bubbling underneath.
But this was only feasible because there was a feature to filter out verified accounts; otherwise, it would have been very costly to manually exclude them at scale. The filter illustrates our dependence on not just data, but features, and how this affects what we do and don’t see.
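To make the idea concrete, here is a minimal sketch of that “middle zone” filtering step, assuming a hypothetical list of post records exported from a tool like CrowdTangle. The field names (`verified`, `engagement`) and the sample accounts are illustrative assumptions, not a real platform schema.

```python
# Hypothetical export of post records; field names are illustrative.
posts = [
    {"account": "major_outlet", "verified": True, "engagement": 120_000},
    {"account": "local_page", "verified": False, "engagement": 850},
    {"account": "hobby_group", "verified": False, "engagement": 40},
]

# Exclude verified accounts to look past the high-profile surface layer.
unverified = [p for p in posts if not p["verified"]]

# Rank what remains by engagement; the "twilight zone" is the activity
# bubbling underneath the top results, not just the most engaged post.
middle_zone = sorted(unverified, key=lambda p: p["engagement"], reverse=True)

for p in middle_zone:
    print(p["account"], p["engagement"])
```

The point of the sketch is how cheap this is when the tool exposes a verified flag as a filterable field; without that feature, the same exclusion would require manually checking each account.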
In the end, searching for the twilight zone is not a fixed process or location, but a reminder and an endeavor: to think outside the logic of analytics dashboards, and, where we can, look for the neglected parts of the ecosystem.