Eight brief thoughts about fake news & truth

Dec 2, 2016

“There’s no such thing, unfortunately, anymore, as facts.”

CNN Political Commentator & Trump supporter Scottie Nell Hughes

In the innocent early days of 2016 I thought the planet’s greatest threat was climate change, but now I’m not so sure. Our understanding of what’s true shapes not only our response to climate change but our relationship with almost everything: between people and the state, between humans and our history, and with our struggle to get to this point.

We can only imagine how quickly the lessons of history could be unthreaded and forgotten when all information is equally-valued noise, where a sophisticated-sounding argument is always available to counter-punch any fact, and where that argument wins because it’s better presented or pithier. “I read it on Facebook from someone I respect, and I can’t trust big corporate media, so this version of reality is at least as likely to be true” has weight. It’s not just social now, but search: Google ‘race baiting’ and the search engine highlights, as a definitive definition, an entry from the somewhat subtly-titled Conservapedia.

MIT’s Dan Schultz, creator of Truth Goggles, explained to me that a key finding from his PhD thesis and his exploration of web-driven fact-checking was that “people want to consume information that confirms their sense of their identity”. In other words, we’re biased in favour of information that supports how we see ourselves; some of the problems here are hard-wired.

But we can still make life easier for people. The Guardian doesn’t publish Onion stories within its pages in the same typeface and claim they are news, yet social platforms present all sources with the same design. Trust is inferred from the person who shares, not the creator of the source material, and never the platform. This has created a market for identity-reinforcing misinformation, a social problem with democracy-undermining potential. Where does it end up? Why invade a country when you can misinform its citizens and install whoever you want as leader?

I’m only just getting to know this area (my reading is limited, and I would be very grateful for any links to things to read in the comments), but for the time being these are some current thoughts.

1. This is not a new problem

Mel Gibson’s Braveheart tells its audience the English took Scottish women by force on their wedding night; in the commentary to the film Gibson admits that was actually Roman practice, but he’d thrown it in to make the English more villainous. I’ve chatted with a number of Scottish Nationalists who were still seething over this bit of false history. The Protocols of the Elders of Zion is still cited by some as proof of a secret Jewish conspiracy to control the world, despite having been established as fraudulent in 1921, long before the Nazis used it in their campaign.

The Sun and the Daily Mail described Jeremy Corbyn as dancing at the Remembrance Sunday war memorial; the un-cropped photo shows he was walking with an old war veteran. Politicians have become so associated with focus-group-honed spin that the least-spinnable politicians gain mass appeal, and yet they can still get spun by the press.

So we shouldn’t pretend this is something that just emerged with the web. If anything, the Internet could offer a chance to correct a human history of misinformation.

2. The web was always designed to be a structured data space

The text on this page is largely meaningless to a computer unless it is skilled at natural language processing. Tim Berners-Lee always intended the web to manage information meaningfully, through structured data, though it wasn’t until 2001 that the full proposal for a Semantic Web was published. By adding meaning to pages of HTML, software could potentially check sources and post corrections. If the statement ‘£1 is worth $1.21 with today’s exchange rate’ is annotated correctly, a user-agent/browser can recognise what it means and correct the data to the latest figure (if wanted).
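As a rough illustration of the exchange-rate example, here is a minimal Python sketch of what a user-agent could do with an annotated statement. The `data-fact`, `data-base`, `data-quote` and `data-rate` attribute names are invented for this example (they are not part of any standard), and the "latest" rate is a made-up figure standing in for a trusted live feed.

```python
from html.parser import HTMLParser

# Hypothetical markup: the data-* attribute names are assumptions for this sketch.
PAGE = (
    '<p><span data-fact="exchange-rate" data-base="GBP" '
    'data-quote="USD" data-rate="1.21">'
    "£1 is worth $1.21 with today’s exchange rate</span></p>"
)


class RateFinder(HTMLParser):
    """Collects every exchange-rate annotation found in a page."""

    def __init__(self):
        super().__init__()
        self.claims = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("data-fact") == "exchange-rate":
            self.claims.append({
                "base": attributes["data-base"],
                "quote": attributes["data-quote"],
                "rate": float(attributes["data-rate"]),
            })


def find_rate_claims(html):
    """Return the machine-readable rate claims embedded in the HTML."""
    finder = RateFinder()
    finder.feed(html)
    return finder.claims


def check_claim(claim, latest_rates, tolerance=0.005):
    """Compare a claimed rate with a latest figure supplied by the caller."""
    latest = latest_rates[(claim["base"], claim["quote"])]
    stale = abs(latest - claim["rate"]) > tolerance
    return {"claimed": claim["rate"], "latest": latest, "stale": stale}


claims = find_rate_claims(PAGE)
# Made-up figure standing in for a trusted, up-to-date rates source.
report = check_claim(claims[0], {("GBP", "USD"): 1.27})
print(report)
```

The point is only that once the number is labelled, software no longer has to guess what ‘$1.21’ refers to; whether the browser then rewrites the figure or just flags it is a separate design decision.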

To get to this new web, the W3C wanted to clean up markup and make an orderly transition to XML via XHTML. But the collaborative and creative possibilities of Web 2, empowering people without HTML skills to upload and share, became the web’s next phase, adding a bunch more bugs to those of Web One.

3. Is global fact-checking just waiting for the schemas to be agreed?

A commonly agreed syntax for marking up data would allow all the fact-checking organisations, software and initiatives to ensure their work was compatible. This is obviously harder than it sounds, given the differences between organisations and approaches, but it seems some kind of standardisation may be needed to support collaboration.

With XML, when you add structured data to a page (i.e. a list of people’s addresses, in such a way that a mapping application could place them correctly on a map), you link to a schema which defines the terms of the relevant metadata. For instance, a page of videos might reference the Dublin Core Schema, which has terms for copyright holder, director or running length, and which a machine can interpret accurately (in other words, it turns a text file into a database with extractable meaning). As fact-checking is about adding data to existing data, be it a specific file or a general statement (‘apples are citrus fruits’), a metadata schema could be a starting point.
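To make the schema idea concrete, here is a small Python sketch that reads a fact-check record mixing two vocabularies. The `dc` namespace is the real Dublin Core element set; the `fc` namespace and its `claim`, `verdict` and `checkedBy` terms are hypothetical, invented here to show what an agreed fact-checking schema might look like, as are the example publisher and checker names.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",  # Dublin Core element set
    "fc": "http://example.org/factcheck#",     # hypothetical fact-check schema
}

RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:fc="http://example.org/factcheck#">
  <dc:title>Fruit facts</dc:title>
  <dc:creator>Example Blog</dc:creator>
  <fc:claim>apples are citrus fruits</fc:claim>
  <fc:verdict>false</fc:verdict>
  <fc:checkedBy>Example Fact Checkers</fc:checkedBy>
</record>"""


def read_record(xml_text):
    """Turn the XML record into a plain dictionary of named fields."""
    root = ET.fromstring(xml_text)

    def text_of(path):
        node = root.find(path, NAMESPACES)
        return node.text if node is not None else None

    return {
        "title": text_of("dc:title"),
        "creator": text_of("dc:creator"),
        "claim": text_of("fc:claim"),
        "verdict": text_of("fc:verdict"),
        "checked_by": text_of("fc:checkedBy"),
    }


record = read_record(RECORD)
print(record)
```

Because every term is tied to a namespace, two organisations that publish records this way could read each other’s output without agreeing on anything beyond the schema itself.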

4. Tread carefully: the risks here are great

Everyone I’ve spoken to involved in this space says the same thing. Truth is affected by the biases of the reader and of those involved in its dissemination. Very little can be declared as absolute fact, merely the best explanation we know to date.

As soon as machines are using metadata to rank and filter news that’s shared and search-indexed, there’s a huge risk of misrepresenting and censoring news unintentionally. There’s an even bigger risk of the system itself being used as a tool for silencing. There’s no point developing a solution to let big platforms better identify false news if it simply creates new methods for people wanting to distort truth (I wrote some more on that here).

5. Basic & automated annotation

The simplest level of annotation would be a boolean yes/no: has this meme/stat/statement been verified? So at present everything online would be a no. Further annotation that could be somewhat automated could include categorisation: is this a satire site like the Onion, a company news page, a personal blog, a private news company, etc.? Where is the original source? Does the original publisher follow an editorial code of conduct, do they publish corrections, do they have a news desk phone number? This stuff could be drafted as a public index.
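A minimal Python sketch of such an index entry might look like the following. The field names, the category list and the tiny `KNOWN_SITES` lookup table are all assumptions made for illustration, not an existing index or standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
from urllib.parse import urlparse


class SourceType(Enum):
    SATIRE = "satire"
    COMPANY_NEWS = "company news page"
    PERSONAL_BLOG = "personal blog"
    NEWS_COMPANY = "private news company"
    UNKNOWN = "unknown"


@dataclass
class Annotation:
    url: str
    verified: bool = False  # everything starts as "no"
    source_type: SourceType = SourceType.UNKNOWN
    original_source: Optional[str] = None
    has_editorial_code: Optional[bool] = None    # None means "not yet known"
    publishes_corrections: Optional[bool] = None


# Hypothetical public index mapping hostnames to a category.
KNOWN_SITES = {"theonion.com": SourceType.SATIRE}


def annotate(url):
    """Build a default annotation, filling in the category when the host is known."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return Annotation(url=url, source_type=KNOWN_SITES.get(host, SourceType.UNKNOWN))


onion = annotate("https://www.theonion.com/some-story")
other = annotate("https://example.org/post")
print(onion)
print(other)
```

Note that the automated step here only fills in the category; `verified` stays false until a human fact-checker changes it.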

6. Traffic lights

Quite a few people I’ve spoken with propose traffic lights. At the simplest level, we want to be able to add a clear red flag if something is definitely false, and a green one if it is definitely true (a correctly attributed quote, for instance, could be green). And then the vast majority would be in a grey amber zone, drifting between the two extremes.
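In code, the traffic-light idea is close to a three-valued verdict. The sketch below is one possible Python reading of it, assuming a verdict is stored as true, false or "not established" (`None`); the names are illustrative only.

```python
from collections import Counter
from enum import Enum
from typing import Optional


class Light(Enum):
    RED = "red"      # definitely false
    AMBER = "amber"  # not established either way
    GREEN = "green"  # definitely true


def traffic_light(verdict: Optional[bool]) -> Light:
    """Map a fact-check verdict to a light; None means no firm verdict."""
    if verdict is True:
        return Light.GREEN
    if verdict is False:
        return Light.RED
    return Light.AMBER


# Illustrative verdicts: one confirmed quote, one debunked claim, three unknowns.
verdicts = [True, False, None, None, None]
tally = Counter(traffic_light(v) for v in verdicts)
print(tally)
```

Defaulting to amber rather than red or green reflects the point that most statements sit somewhere between the two extremes.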

7. Trust ranking?

Is there a way to rate those shades of amber without giving false certainty or doubt? I’m not sure. FullFact, for instance, doesn’t give trust ratings or categories for different qualities, and their reasoning makes sense: it’s nearly impossible to do so consistently and accurately. Maybe traffic lights are enough.

There is, though, the problem of an amber ranking wrongly implying the equivalence of two very different kinds of uncertainty: scientists may conclude that this is the best hypothesis we have to date, while many who write in public (looking hard at myself) may use flourish or inference around statements to make our point.

But how would you create a fair scale? If it’s done through any form of voting (80% think this statement is wrong) you’ve demonstrated opinion, not truth, yet Wikipedia edit discussions often seem (from the outside) subject to majority rule. It’s hard to imagine anything working better than human editors, strict guidelines and some kind of transparent appeal process.

8. Constant, iterative improvement & feedback

Any system is going to be flawed. The process for challenging or tagging content or claims has to be open enough to be fair, but closed enough to be accurate and not a troll/lobby-firm attractor.

FullFact train volunteers and pass each fact-check through a rigorous process in which multiple editors OK and sign off on it. They were commended by both the Leave and Remain campaigns for their fact-checking services, a point of great pride to them. Greater automation of any such system creates a huge number of potential consequences.

To conclude

Zuckerberg might not have to pay journalists or be accountable for his algorithm’s editorial decisions, but Facebook (and Twitter and the rest) are news organisations, in that they present a selection of news. He may have a fiduciary duty to his shareholders to deny this, because the costs involved would be much higher, but it’s increasingly hard to deny that a platform replacing many people’s direct interaction with TV and print news media, by providing them with ‘news’, is a news organisation.

And if people log in and see only fascist, extremist, far-right or far-left lies, then Facebook, for that user, is an extremist propaganda newspaper. If they get no other news, they could easily be radicalised through these lies. In print form, in many countries, it would be banned or monitored, and of course countless repressive regimes have used news they can’t control as an excuse to crack down on all social media. Anyway, it’s not an easy problem to fix, yet not trying to fix it doesn’t seem an option unless we want to rapid-rewind to less-informed, more ignorant times.