IMAGE: Lorelyn Medina — 123RF

Putting the trash out where it belongs

Among the challenges facing journalism these days is the abundance of pages based on clickbait, sensationalism, yellow journalism, and plain disinformation. In response, Facebook has just announced that its content recommendation algorithm will start burying news from pages it classifies as low quality.

Junk pages have grown rapidly, fueled largely by algorithms such as Google’s: what was once an interesting algorithm based on external links and modeled on the academic world’s citation index, in use for several decades, became, thanks to the popularization of social networks, a metric that paid more attention to what people were reading than to the quality of what they were reading. Worse still, likes, shares, retweets, etc., which users clicked without thinking much about the implications, turned a metric that measured relevance into something that gave more weight to information that surprised, attracted attention or intrigued.

How a relevance metric goes wrong by moving to a supposedly more democratic indicator-picking mechanism is a topic I find fascinating: in the late 1990s, when Google began to gain popularity, the metrics it collected were links to pages with keywords (the anchor text), weighted in turn by the number of incoming links to the linking page, along with a few correction factors. For Larry Page, using an algorithm he was familiar with was a great idea: the citation index told him, as it would any doctoral student, whether an academic paper was relevant based on how often it appeared in the references of other papers. The system worked: not only did it produce a convincing relevance metric, but it also took control away from the sites themselves, which made it harder to manipulate than previous attempts. With the popularization of social networks, Google detected that mechanisms like the Facebook Like or Twitter trending topics could signal relevance more quickly than its own metrics, and began to base its algorithm more and more on these types of parameters. As a result of privileging these metrics, pages filled up with Likes, Shares and retweets, followed by more and more pages focused solely on generating those metrics. To all intents and purposes, a misconception of relevance.
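The link-based idea described above can be illustrated with a few lines of Python. This is only a minimal sketch of a citation-style ranking, not Google's actual algorithm: the function name `link_rank`, the `toy_web` graph and the damping value of 0.85 are assumptions chosen for illustration, and the anchor-text and correction factors mentioned above are left out.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by incoming links, weighting each link by the score
    of the page it comes from (a simplified, citation-style ranking).

    links: dict mapping each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the same score for every page.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "a", "b" and "d" all link to "c".
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = link_rank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, "c" ends up on top because three pages point to it, and "a" outranks "b" and "d" because its only incoming link comes from the highly ranked "c". The point of the sketch is that the score depends on what other pages do, not on what a page says about itself, which is what made the original approach harder to game than counting clicks or likes.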

Over time, Google seems to have been correcting this mechanism: the weight of social parameters has been decreasing, and the company seems to be evolving toward something like KBT (Knowledge-Based Trust), which we have talked about on other occasions, to determine a metric of a page’s quality and, consequently, of its relevance. In short: pages focused solely on clickbait cannot be privileged in terms of relevance.

The big problem with junk pages, however, is their heterogeneity: they are difficult to define, somewhat like obscenity in the famous opinion of a US judge who, trying to explain whether or not a piece of content was obscene, said: “I know it when I see it.” Facebook’s move is clearly an attempt to determine precisely that: whether a page is garbage or not, based on a definition set out as “pages containing little substantive content, covered with annoying advertisements, overly impactful or malicious.” And if it really is garbage, to avoid giving it visibility through its algorithms.

Junk pages are the web equivalent of sensationalist journalism and tabloids: they have their followers, and in some countries they generate significant business. On the Internet, the phenomenon has expanded to unsustainable limits. Measures such as those taken by Facebook or Google are fundamental to curbing its growth and improving the quality of the web. But the final responsibility is ours, because although we are free to share what we wish, we should also think about whether we want to be associated with such content.

Media that took the short-term approach and used these tactics to raise their visibility face a hard time, but these measures have to be taken, and the trash put where it belongs.

