Facebook, Google and False News

The approach Facebook and Google are taking to False News, how they differ, how it could be improved and how publishers can help

Published in glitch digital, Apr 7, 2017

Facebook’s VP of News Feed Adam Mosseri sharing insights about the News Feed with Jeff Jarvis at the International Journalism Festival in the Sala Dei Notari, Perugia, Italy

This week at the International Journalism Festival in Italy we’ve heard from Facebook about what they are doing to tackle what they are calling False News in their News Feed.

Terminology

False News is a term Facebook have adopted that’s intended to do a better job than the more familiar term “Fake News” of distinguishing between articles that contain unintentional mistakes or minor factual inaccuracies, “misleading content” and “false content”.

What Facebook are doing

The system Facebook described at a panel on Thursday morning works by escalation. Articles that receive multiple reports from Facebook users are highlighted to a team within Facebook, who attempt to judge whether the site in question appears worthy of escalation to an external team they are collaborating with, who then conduct an independent review.

If the external reviewers find the page to contain disputed facts, a message indicating the link contains disputed content (and, in more recent iterations, who disputes it) will appear under links to that page on Facebook.

Additionally, anyone attempting to post a link to a page that has been contested will be shown a message warning them it is disputed and asked if they are still sure they want to share it.

Facebook are also introducing tips to help people spot false news at the top of the News Feed.

What Google are doing

Today, we’ve also seen Google do a wider roll out of their system for flagging pages that have been fact checked by selected organisations.

Both systems surface warnings to readers in a similar way, by displaying them alongside links on Facebook pages or in Google search results, but how each system works behind the scenes varies, and there are some obvious gaps which haven’t been picked up on in the sessions in which they’ve been discussed.

The differences

Facebook’s system relies on human escalation, first by readers to staff internally and from them to external reviewers and existing fact checkers, who then flag if it’s disputed by their trusted sources like Snopes and Associated Press.

Google’s system similarly relies on authoritative independent sources, like PolitiFact and the Washington Post, but it is driven by a structured data schema defined on schema.org which anyone can implement on their site.

The format, officially called the ClaimReview schema, is currently in draft but is an open specification in use today, and follows the same format used by search engines and other software to describe things like Events, Quick Answers and News Articles on webpages in ways that can be understood by machines.

An example of the ClaimReview Schema in JSON-LD

The schema defines a format for markup that can be embedded in articles to describe, in a machine readable format, details such as the URL (or URLs) being disputed, what exactly they are disputing about it, who is disputing it, how they rate it, what sort of scale they use for rating, etc.
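As a rough illustration of those fields, here is a minimal Python sketch that builds a ClaimReview description and wraps it in the JSON-LD script tag a publisher might embed in a fact check article. The property names follow the draft on schema.org as I understand it; every URL, organisation name and rating value below is a made-up placeholder, and the to_script_tag helper is hypothetical.

```python
import json

# Every value here is a placeholder for illustration, not real data.
claim_review = {
    "@context": "https://schema.org",
    "@type": "ClaimReview",
    "datePublished": "2017-04-07",
    # URL of the fact check article itself (hypothetical)
    "url": "https://factchecker.example/reviews/123",
    # What exactly is being disputed
    "claimReviewed": "An example claim being checked",
    # The page being disputed (hypothetical)
    "itemReviewed": {
        "@type": "CreativeWork",
        "url": "https://news.example/disputed-story",
        "author": {"@type": "Organization", "name": "Example News"},
    },
    # Who is disputing it
    "author": {"@type": "Organization", "name": "Example Fact Checker"},
    # How they rate it, and the scale they use
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": 1,
        "worstRating": 1,
        "bestRating": 5,
        "alternateName": "False",
    },
}


def to_script_tag(data):
    """Serialise a dict as a JSON-LD script tag for embedding in a page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_script_tag(claim_review))
```

Note that the rating scale travels with the review, which is what lets each publication define its own.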

To understand how using the ClaimReview schema works, it’s worth noting that anyone is able to dispute something (AP, BBC, CNN, FOX, RT or Breitbart…) and each publication is free to set its own score, define its own scale and cite its own sources.

With ClaimReview, which sites to trust is ultimately an editorial decision for platforms on which links are shared. Platforms can cherry pick “trusted sources” from well known organisations their readers trust, aggregate ratings from a list of verification sources, use metrics to determine which sources are “trustworthy” or they can allow readers to select which sources they trust.
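To make one of those options concrete, here is a minimal sketch of aggregating ratings from a list of trusted sources. Because each publication defines its own scale, the sketch first maps every rating onto a common 0 to 1 range. The source names, the review data and the assumption that a higher rating means "more accurate" are all mine, for illustration only.

```python
def normalised_score(rating):
    """Map a rating onto 0..1 using the scale declared by its publisher.

    Assumes worstRating and bestRating are present and different.
    """
    worst = rating["worstRating"]
    best = rating["bestRating"]
    return (rating["ratingValue"] - worst) / (best - worst)


def aggregate(reviews, trusted_sources):
    """Average the normalised ratings from sources the platform trusts.

    Returns None when no trusted source has reviewed the claim.
    """
    scores = [
        normalised_score(review["reviewRating"])
        for review in reviews
        if review["author"] in trusted_sources
    ]
    if not scores:
        return None
    return sum(scores) / len(scores)


# Hypothetical reviews of the same claim, each on a different scale.
reviews = [
    {"author": "Checker A",
     "reviewRating": {"ratingValue": 1, "worstRating": 1, "bestRating": 5}},
    {"author": "Checker B",
     "reviewRating": {"ratingValue": 5, "worstRating": 0, "bestRating": 10}},
    {"author": "Checker C",
     "reviewRating": {"ratingValue": 5, "worstRating": 1, "bestRating": 5}},
]

# The editorial decision lives in this set, not in the schema.
print(aggregate(reviews, trusted_sources={"Checker A", "Checker B"}))
```

Swapping the set of trusted sources for one chosen by the reader would implement the last option without touching the rest of the code.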

Automation can catch stories before they go viral

By proactively checking the URLs of disputed articles as soon as they are disputed elsewhere (before they are flagged by readers, reviewed by gatekeepers at Facebook and passed on for external review) the reaction time for highlighting disputed stories could be cut dramatically, reducing their impact and the amount of visibility they get.
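A minimal sketch of what that proactive check could look like, assuming the platform holds an index keyed by the URL of each disputed page. The DisputeIndex class, the normalisation rules (dropping utm_ tracking parameters, fragments and trailing slashes) and all the URLs are hypothetical choices of mine, not a description of how Facebook or Google work.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_url(url):
    """Reduce trivially different forms of a URL to one lookup key."""
    parts = urlsplit(url)
    # Drop tracking parameters so shared links still match the original.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query)
        if not key.lower().startswith("utm_")
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path,
         urlencode(sorted(query)), "")
    )


class DisputeIndex:
    """In-memory index from a disputed URL to the reviews disputing it."""

    def __init__(self):
        self._by_url = {}

    def add(self, disputed_url, review_url, source):
        key = normalise_url(disputed_url)
        self._by_url.setdefault(key, []).append(
            {"source": source, "review": review_url}
        )

    def lookup(self, shared_url):
        """Return every known review for a URL someone is about to share."""
        return self._by_url.get(normalise_url(shared_url), [])


index = DisputeIndex()
index.add(
    "http://news.example/story",
    "https://factchecker.example/reviews/123",
    "Example Fact Checker",
)

# A link shared with tracking parameters still hits the index.
print(index.lookup("HTTP://News.Example/story/?utm_source=social#top"))
```

The lookup happens at the moment a link is shared, so no reader report is needed before a warning can be shown.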

If you have an archive of the text of disputed articles and reviews, it’s also possible to go one step further and automatically detect minor variations and iterations of articles as they are copied and morphed across new sites. That would stop debunked stories from re-surfacing before they have time to grow, and would automatically flag new stories that appear similar to previous False News stories for immediate review, catching them early before they go viral.
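One simple way to sketch this kind of matching is to compare overlapping word sequences ("shingles") between a new article and each archived one, using Jaccard similarity. This is just one possible technique that I am using for illustration; the shingle size, the 0.5 threshold and the example texts are arbitrary assumptions, and a real system would need something that scales better than comparing against every archived article.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences in a text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Share of shingles two texts have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar(new_text, archive, threshold=0.5):
    """List archived articles that resemble new_text, most similar first."""
    new_shingles = shingles(new_text)
    matches = []
    for url, text in archive.items():
        score = jaccard(new_shingles, shingles(text))
        if score >= threshold:
            matches.append((url, score))
    return sorted(matches, key=lambda match: match[1], reverse=True)


# Hypothetical archive of previously debunked article text, keyed by URL.
archive = {
    "http://news.example/story":
        "The mayor secretly banned all bicycles from the city centre last week",
}

# A lightly reworded copy on a new site is flagged for review.
copy = "The mayor secretly banned all bicycles from the city centre yesterday"
print(find_similar(copy, archive))
```

Anything returned here would only be a candidate for immediate human review, not an automatic verdict.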

Unfortunately there isn’t currently a multi-sourced, publicly accessible index of disputed (and disputing) pages to help make this possible.

This means Facebook would need to create their own index or rely on someone like Google to make their index available (either publicly to everyone or directly to Facebook under a private arrangement).

News publishers can help

Publishers could help by providing dedicated RSS feeds of “ClaimReviews” and by providing ways of searching them by URL and by text (in a way that they could be checked by both people and machines), which would facilitate the automated checking of URLs with trusted sources on other websites and on other social sharing platforms.

An open index would be extremely helpful

Alternatively, instead of Google or Facebook running a platform to track fake news, someone could take on the role of maintaining an open, public index as a service, to make it easier for any platform or reader to instantly check any URL shared on their platform against multiple sites.

This would need funding to set up, but it wouldn’t be a particularly expensive operation; the cost would be roughly comparable to the budget for a small to medium-sized Google Digital News Initiative project.

I’m not suggesting automation is a magic bullet for combating the spread of false news stories, but pooling resources this way is a lot more efficient, and it is much faster, more effective and less error-prone than relying on a manual review process for escalation.