2 Critical Misunderstandings Halting The Prevention of Fake News

Stephen Jefferson
Oct 22, 2018 · 11 min read

Up until now, I’ve purposely avoided getting into the weeds of fake news. It seemed to me a temporary nuisance compared to journalism’s wider challenges with sustainability. However, after seeing millions of dollars spent to fight fake news, and thousands of people dedicating time to fact-checking, while the nuisance continues to dominate major headlines, it seems to me that something is being missed.

I took time over the past year to investigate fake news more closely — and by more closely, I mean more universally. By looking into root causes rather than just outcomes, I identified some of the universal problems it relates to. I’ve found that the majority of current efforts and perspectives focus on resolving the outcomes of fake news rather than preventing it at its root, so I’d like to share my findings here to help others gain a more thorough understanding of why fake news continues to be alive and well.

Before I jump into the details, I want to illustrate my perspective of how efforts have approached fake news, how they’re problematic, and the proven strategies that are literally at our fingertips.

Illustrating the problem: Let’s go to lunch

Imagine you arrive at a deli for lunch, walk up to the chef’s counter, and order a sandwich. They make it, package it, and hand over your desired sandwich with a friendly farewell.

You then walk to the cashier. They ring you up, but you hesitate before opening your wallet and ask: “How many calories are in this sandwich?”

  • How comfortable would you feel asking them?
    It might feel awkward, with low expectations…
  • What type of response would you expect?
    Maybe one that’s uncertain, surprised, confused…
  • And, how reliable do you think their response would be?
    Probably not too accurate or trustworthy…

The reliability of the cashier’s answer depends on the chef, who is accountable for sharing information about the sandwich’s ingredients and overall nutrition. It would be foolish to rely on anyone besides the producer for that type of information. Let’s call this The Cashier Problem.

Relying on the cashier to know the nutrition of your sandwich is the same as relying on a platform to know the trustworthiness of a news article — the same goes for any party other than the content’s publisher. Their knowledge is limited. Sure, the cashier can try to provide general information about the sandwich from its name or ingredients, but this doesn’t necessarily answer questions about its nutrition.

Now, let’s back up a bit… Imagine asking your question to the chef or, better yet, the farmer, both of whom helped produce the sandwich. This would feel more trustworthy, right?

In the 1960s, the food industry saw this exact scenario playing out. Deli cashiers and other culinary workers received questions and complaints from consumers who needed to know more about the inner workings of packaged foods. Eventually, the frequency, redundancy, and depth (sugars, calories) of these questions became unsustainable for any deli to satisfy. The industry ended up advocating for a new method to accurately measure these insights and make them immediately accessible to the consumer.

“…as increasing numbers of processed foods came into the marketplace, consumers requested information that would help them understand the products they purchased (WHC, 1970). In response to this dilemma, a recommendation of the 1969 White House Conference on Food, Nutrition, and Health was that FDA consider developing a system for identifying the nutritional qualities of food.”

Source: Institute of Medicine, 2010

This advocacy helped form the regulation behind what we now commonly know as the Nutrition Facts Label required on packaged foods. As the Label gained adoption, consumers could more easily understand the nutrition (or lack thereof) of their food, questioning less and trusting more. Of course, some people put typos, sales gimmicks, or fraudulent details on the Label, which we could call ‘false information’, but this became less of an occurrence as new requirements were enforced. The Label gave industry stakeholders a standard way to communicate which details about a food item were true and which were false, making information accessible in a universal language.

With this story, we begin to see a resemblance between the problem the food industry encountered and the fake news problem seen today in journalism. The common universal problem: a lack of accessible information for determining truth from falsehood. There could be other sub-problems tied to this, but I’ve found it continues to circle back to the main universal problem of inaccessible information.

Differences between solutions

What’s most insightful about identifying the universal problem is that we can now learn from other instances where it has occurred and consider their attempted and proven solutions. Other industries have certainly encountered this problem before, producing the Drug Facts Label for medicine and hygiene products and the Care Label for garments and other fabrics. Many other resources have been developed over time to help consumers easily get answers to their questions, or to help distributors more clearly understand what they serve consumers. These solutions transfer insightful knowledge from the producer to the consumer while informing distributors and other intermediaries. That knowledge gives everyone a better understanding of an item’s details, coming directly from the source, that isn’t conveniently available otherwise.

If the food industry were to have taken the approach that many fake news efforts are taking today, we would have likely seen an influx of funds and time put towards other prospective solutions:

  • Manual Fact-Checking: Hiring cashier assistants that would run your sandwich back to the chef or look in a recipe book to get its nutrition facts. Very nice of them to do this, but it’s very time-consuming.
  • Or, Automated Fact-Checking: Developing scanners to read nutrition levels from the conveyor belt. That would be cool to see, just as we’re hopeful for AI, but likely unreliable given the ever-changing diversity of items at a grocery store.

These can be seen as top-down approaches to the problem, rather than bottom-up approaches. They could certainly help alleviate the problem at that given time but the actual problem would continue to emerge. A case pointed out by Poynter this past July states “Fact-checkers have debunked this fake news site 80 times. It’s still publishing on Facebook.” And a recent publication by MIT Technology Review comes to the same conclusion stating “Even the best AI for spotting fake news is still terrible.”

The mission that led these other industries to their final solutions was to prevent the problem from happening altogether, which required a bottom-up approach. When was the last time you encountered a nutrition question that couldn’t be answered by a Nutrition Facts Label? The problem is a rare occurrence today. What I believe needs to happen in order to prevent the spread of fake news altogether is to identify and implement its bottom-up approach. From learning how current fake news efforts are positioned and where people’s attention and perspectives are currently focused, I see two things holding us back from this preventative approach:

  1. The wrong people are being held accountable: (i.e. The Cashier Problem) Before platforms can reliably identify and filter fake news, there are tasks required of publishers that are currently not being done correctly.
  2. Facts aren’t being made accessible: Facts must not only be accessible and legible for humans to understand but also for algorithms.

Let’s dive into these to better understand where they currently stand and potential strategies to overcome them.

Misunderstanding #1: Accountability

This past May, I attended a conference at Bloomberg in New York City titled Machines and Media, whose panels’ expertise spanned from editorial to technology and whose discussions primarily revolved around AI in journalism. However, when questions turned to fake news and trust, there seemed to be a tunnel vision that continuously put 100% of the blame on platforms and their algorithms. From my experience at this conference, as well as from other sources and research studied this past year, I consistently see a lack of good reasoning behind this answer on accountability. It has become too instinctive for us to blame platforms. We can’t just stop at the outcomes seen on platforms; we must look deeper for their root cause.

This reminded me about The Cashier Problem and the fact that platforms and their algorithms (the cashier) depend on information provided by the content’s publisher in order to do their job. If factual data is not easily accessible for algorithms, they will continue to be incapable of reliably filtering fake content.

There are practices and tools that hold publishers accountable to provide factual information in an article, but a majority of those efforts only focus on making it legible to humans, not algorithms. This is the second misunderstanding that I’ll get to in the next section.

The main process I’d like to get across here is that there are multiple steps of accountability, each with its own actor who is accountable at that step:

  1. The content’s publisher is initially accountable for providing the information with defined facts that they hold to be true.
  2. The content’s distributor or any intermediary is then accountable for filtering or relaying the content based on the information provided and the needs of its consumers.
  3. The content’s consumer is then accountable for being media literate enough to understand the content served to them and the outcomes of engaging with it.

Each step brings new tasks and an actor accountable for those tasks. The reliability of truth for a piece of content can be identified by starting at Step #1 and moving forward as tasks are proven complete. Each actor builds on the work of the previous one, so the success of each step depends on the step before it. If a publisher fails to properly define facts, the distributor is at risk of spreading false information, and the consumer is at risk of receiving it. In this case, the distributor is not at fault; the publisher is. Alternatively, if a publisher has properly defined facts according to standard and the distributor does not filter or communicate them effectively to a consumer, it is the distributor’s fault and not that of the publisher or consumer. Understanding this process allows us to identify the earliest cause of a content’s potential falsehood and reasonably hold the right actor accountable for completing their tasks.

Today, the process is at Step 1. The publisher is to be held accountable for not just checking facts and including them in the written article but, most importantly, properly defining them according to standard digital requirements. Let’s go more into detail about this…

Misunderstanding #2: Defining facts

This has been one of the more challenging points to get across this past year. When I bring up the problem that “facts aren’t being properly defined”, I usually get a response like “But I thought they were…” It seems there are mistaken assumptions about what the correct way is and what it isn’t. I don’t want to blame publishers for not knowing the correct way, because I’ve also spoken with developers who aren’t aware of the standard practice either.

Currently, facts on most news websites are defined with a link, footnote, or plain-text citation, which satisfies a human reader but is insufficient for algorithms. No matter how many links or footnotes are added to an article, their legibility for algorithms remains the same: slim to none. Only when that information is coded into semantic metadata can algorithms begin to reliably filter, sort, and present it to consumers (i.e. Step 2).

I’ll clarify this by explaining that the quality of an article has two sides: editorial and technical. The publisher’s mission should be to make an article legible for a human (editorial) and legible for a computer (technical). Favoring one over the other lowers the performance of the neglected side. Unfortunately, despite today’s digital-first economy, technical quality continues to be ignored or ill-prepared for in journalism, which has opened up a wide variety of issues, not just fake news.

In order to improve the technical quality of news, publishers need to learn and adopt the metadata standard for defining facts: ClaimReview by schema.org. Schema.org itself has been around since 2011, yet it is still extremely rare to find websites using the ClaimReview markup today.
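To make this concrete, here is a minimal sketch, in Python, of the kind of JSON-LD object a publisher could embed in an article to define a fact-check with ClaimReview. The field names follow the schema.org vocabulary, but the URLs, names, and claim text are invented for illustration; a real implementation should follow the schema.org documentation for the current required properties.

```python
import json

# A minimal ClaimReview object using schema.org vocabulary.
# All URLs, names, and claim text below are invented for illustration.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "url": "https://example-news.org/fact-checks/mayor-budget",   # page hosting the fact-check
    "claimReviewed": "The mayor cut the library budget by 40%.",  # the claim being checked
    "itemReviewed": {
        "@type": "Claim",
        "author": {"@type": "Person", "name": "Jane Doe"},        # who made the claim
        "datePublished": "2018-09-30",
    },
    "author": {"@type": "Organization", "name": "Example News"},  # who did the checking
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 2,          # where the claim falls on the declared scale
        "bestRating": 5,
        "worstRating": 1,
        "alternateName": "Mostly false",
    },
}

# A publisher would embed this in the article's HTML inside a
# <script type="application/ld+json"> ... </script> tag.
print(json.dumps(claim_review, indent=2))
```

The point is that every detail a human reader would pull from a footnote (who said it, who checked it, what the verdict was) is here expressed as named fields an algorithm can read directly.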

There are 3 reasons why I believe adoption has been slow:

  1. The schema takes technical understanding and expertise to implement. In 2017, The Trust Project proposed improving the legibility of Schema.org’s technical language. Only recently have new tools, such as a WordPress plugin and Fact Check Tools by Google, become available to guide people in using the schema on their websites.
  2. The schema has only recently been adopted on search engines like Google and Bing and is surprisingly still under private development at Facebook. It’s a chicken-and-egg problem — publishers likely won’t adopt it until they have a good reason, so major platforms must take that first step.
  3. The schema needs more encouragement and proofs of concept. Google and Bing have shown how Fact Check Labels now display in their search results, but more effort is needed to determine whether these labels improve trust or engagement and whether their algorithms prioritize results differently. There are browser plugins that crowdsource encouragement from consumers, but they are new and have yet to make a large impact.

Even once publishers begin to implement this standard and platforms begin filtering and presenting that information, there will likely be revisions to make it more reliable. This will take time. For example, there’s currently not much guidance on how to insert the most appropriate values for ClaimReview; most fields are open-ended. If you need to define a fact about a police report, you can simply type its source. In the future, as platforms refine their algorithms to automatically validate facts, ClaimReviews about police reports will likely require direct links to police department database records.
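As a hypothetical illustration of the platform side (Step 2), a distributor’s pipeline might first check that a ClaimReview carries the fields its filtering depends on before trusting it. This is my own sketch, not any actual platform’s algorithm; the specific checks, including requiring a link to the claim’s original appearance, are assumptions about what a stricter future validator might enforce.

```python
def validate_claim_review(obj):
    """Return a list of problems found in a ClaimReview dict (hypothetical checks)."""
    problems = []
    if obj.get("@type") != "ClaimReview":
        problems.append("not a ClaimReview")
    if not obj.get("claimReviewed"):
        problems.append("missing claimReviewed text")
    rating = obj.get("reviewRating") or {}
    value = rating.get("ratingValue")
    if value is None:
        problems.append("missing reviewRating.ratingValue")
    elif not (rating.get("worstRating", 1) <= value <= rating.get("bestRating", 5)):
        problems.append("ratingValue outside its declared scale")
    # A stricter future validator might also demand a link back to where the
    # claim originally appeared (e.g. a police department database record).
    if not obj.get("itemReviewed", {}).get("appearance"):
        problems.append("no link to the claim's original appearance")
    return problems
```

A distributor could then relay content whose ClaimReview validates cleanly and flag (or down-rank) content whose markup is missing or malformed, which is exactly the filtering responsibility described in Step 2 above.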

Further Conclusions

What we’ve learned here is that making truthful information accessible in a universal language helps prevent problems in producer-distributor-consumer relationships. There is already a standard for websites to define facts and make that information accessible to humans and algorithms, but it’s not well understood or widely adopted. Many people in journalism are not looking deep enough to understand this responsibility and, therefore, are wasting time holding the wrong people accountable.

Overcoming the two misunderstandings of Accountability and Fact-Defining is critical to moving forward in a more reasonable, goal-oriented direction to prevent fake news. In the majority of discussions I’ve encountered, many people are still unsure what to do next. I hope this outline breaks the situation down more clearly, giving more defined roles and responsibilities to the people and organizations currently involved.

I’ve found that the universal problems and other industry solutions I wrote about are not widely known, but I hope they can be shared, studied, and considered by more people in journalism in the near future. I believe there is much to learn from past instances of similar problems, allowing journalism to avoid repeating other industries’ mistakes and to create bottom-up solutions faster.

If this made sense to you in any way, please pass it along to a colleague you work or talk with. The more we can share our knowledge and perspectives with others, the better decisions we can all make.


Working for local journalism and communities. Founder of Bloom (www.bloom.li), 2016 Tow-Knight Fellow. Washington DC