It’s a “Google-drenched society”, but we still suffer from an information drought

Google should help the public better understand news sources

Eni Mustafaraj
6 min read · Dec 19, 2018
The First Thanksgiving 1621. By J.L.G. Ferris, painted in 1932. This image is part of an assignment in historical thinking by the Stanford History Education Group (SHEG).

There is much to learn from Sam Wineburg’s latest book, Why Learn History (When It’s Already on Your Phone). In one chapter, he discusses how Howard Zinn’s magnum opus, A People’s History of the United States, may undermine the teaching of history by neglecting to expose the challenging process of reconciling contradictory sources.

Wineburg is interested in finding ways to familiarize students with the methods, which he calls “intellectual moves”, that historians use to reason about sources and evidence. In one anecdote, Wineburg shares how one student, Ramon, feels like a cheater after glancing at the sourcing information at the end of a document before reading its content in its entirety. In Wineburg’s words:

Rather than celebrating sourcing as an astute act of historical reading, here was a future history teacher who considered it illegitimate to examine a document’s attribution before analyzing its contents.

Sourcing, as this short video from Wineburg’s Stanford History Education Group explains, is about asking when a document was written, who wrote it, and why, before engaging with the document. It is a crucial step in assessing the credibility of information. However, Wineburg’s research suggests that young people struggle with this task, dispelling the myth of “digital natives”. In general, most users (young and old) adopt a time-saving cognitive shortcut: they rely on the information Google displays at the top of its search results page, as this aptly named paper, “In Google We Trust”, explains. Yet while Google makes it easy for users to find content, it does little to convey information about the sources that produce content on the web.

Take, for example, Top stories in Google search results, which I have been auditing recently. Whenever you search for people or events in the news, Google shows a panel with Top stories as the first result (Figure 1). These stories come from various news outlets, some of them well known and others not. Let us consider the three sources for the Top stories in Figure 1.

Figure 1: Screenshot of Top stories for “michelle obama”. This result was captured on Oct 8, 2018, at 4:08 PM EST. Three stories are shown, but one can scroll horizontally to see up to ten stories.

Depending on where you live (in the U.S. or Britain), you might recognize abc7NY and The Independent, and if not, googling for them leads you to their Wikipedia pages. However, there is no Wikipedia page for BYU-I Scroll.¹ As our previous research has shown, only about 38% of news publishers in the U.S. have Wikipedia entries that show as Knowledge Panels in Google.

How can a user verify that BYU-I Scroll is a legitimate news source rather than an impostor website, like the many fake news sites from 2016? Here I want to turn this question around. Why should every Google user have to do this verification? Isn’t it Google’s job to make sure that only reliable sources are shown in Top stories, especially after the debacle of including news from 4Chan in Top stories? Google hasn’t shared details about how sources for Top stories are selected, but for Google News (which also shows news stories from thousands of publishers from around the world), there is a long verification process that takes 1–3 weeks and involves human judgment. We can assume that as part of this verification, Google is collecting some kind of information about these news sources.

Google should make some of this information easily accessible to users, either as part of the Top stories card or by making the names of the sources links that lead to verification information shared on a dedicated Google page. Users shouldn’t have to be burdened with verifying news sources that Google has already verified.

Why should Google share information about news publishers in Top stories?

  1. The large number of unfamiliar sources displayed in Top stories. Google shouldn’t direct users to unfamiliar sources without first providing some basic information about them (e.g., full name, location, category). As described in this post, I’ve collected several thousand Top stories panels. Here is a list of news publishers (more than one thousand) that I found. The majority of these names are unfamiliar to me and, I suspect, to most other readers as well. Our analysis indicated that the sources displayed in the 1st and 2nd positions of Top stories are usually major national news outlets, but the 3rd position is often a local or niche source. It seems like Google is trying to encourage us to burst the “filter bubble” by presenting us with local sources (like BYU-I Scroll, a student newspaper in Idaho).
  2. Inform the public about news sources and news production. When searching Google for news, users are directed to the websites of various types of news organizations: national broadcast TV stations, their local affiliates, radio stations, print newspapers, personal blogs, magazines, watchdog organizations, etc. Providing the source’s category builds awareness of the rich and varied landscape of news production. Not all sources have the same journalistic standards or the same news gathering means. By making their category explicit, Google could help the public think more deeply about how news is produced and the nature of these stories.²
  3. Avoid intentional or unintentional confusion through name variations. The list of sources I shared contains many duplicates (highlighted in shades of gray), that is, the same website is shown in Top stories with different names. Sometimes the differences are very minor, maybe an extra space or letter (The Hill and TheHill), or a longer name is used that better captures the identity of the source (LEX18 Lexington KY News versus LEX18.com), but sometimes the purpose of the different names is neither clear nor beneficial. Some examples are shown in Figure 2. The Tennessean has a news story about midterms, but instead of its name, the card shows “Policing the USA”, which is not a news organization that exists. Similarly, there is a story about Jeff Sessions from ABC 33/40, a TV station in Alabama, that apparently calls itself “Alabama’s News Leader”, which is obviously not its name. Finally, there is CBN (Christian Broadcasting Network), which appears in our dataset with five different names (lines 181–185 in this document). It is puzzling that a news organization needs to use five different variations of its name.
Figure 2: Three Top stories cards (from different queries), which show sources that use different names to sign their stories.
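
For readers who want to look for such name variations in their own data, here is a minimal sketch of one way to do it: group the names shown on the cards by the website domain of the story they link to, and keep the domains that appear under more than one name. This is not necessarily how our list was produced. The record format, the function names, and the placeholder `.example` domains are all assumptions made for illustration; only the display names come from the examples above.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical records: (name shown on the Top stories card, story URL).
# The ".example" domains are placeholders, not the publishers' real addresses.
records = [
    ("The Hill", "https://thehill.example/story-1"),
    ("TheHill", "https://www.thehill.example/story-2"),
    ("LEX18 Lexington KY News", "https://lex18.example/a"),
    ("LEX18.com", "https://lex18.example/b"),
    ("abc7NY", "https://abc7ny.example/c"),
]


def domain_of(url):
    """Return the host of a URL, lowercased and without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def name_variants(records):
    """Map each domain to its sorted display names, keeping only
    domains that appear under more than one name."""
    names = defaultdict(set)
    for name, url in records:
        names[domain_of(url)].add(name)
    return {d: sorted(n) for d, n in names.items() if len(n) > 1}


variants = name_variants(records)
for domain, names in sorted(variants.items()):
    print(domain, names)
```

Grouping by domain rather than by the name string is a deliberate choice in this sketch: it catches cases like “Alabama’s News Leader”, where the displayed name shares no text with the publisher’s usual name.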

There are more reasons than the ones I’ve listed here, but these strike me as sufficiently important to warrant more contextual information from Google.

In Conclusion

Google is continuously indexing the content of several thousand news sources worldwide, which is why it is able to show “fresh” news on its Top stories panel, often only a few minutes old. If Google “trusts” these sources, it should share some reasons for this trust with its audience. Since the fake news crisis of 2016, Google has been investing hundreds of millions of dollars in strengthening journalism. However, in this blog post, I’m advocating for a feasible intervention in Google’s interface: Google, turn Top stories into an educational tool! Don’t only use it to show news headlines, but also to provide additional information about the sources that created the news. This would be a way to demonstrate by example the importance of considering the source alongside the content, because news doesn’t come from nowhere. Wineburg’s book³ invites us all to learn to think like historians, by always considering the source of information. Google can and should help with this task!

Footnotes

[1] BYU might be a recognizable acronym to many Americans. However, to me, an immigrant, it was unfamiliar.

[2] Google adds the word “Satire” or “Fact-check” to articles in these categories.

[3] The phrase “Google-drenched society” is borrowed from Wineburg’s book, see page 3.

Acknowledgement

I’m grateful to my research collaborator Emma Lurie for insightful discussions to improve this blog post.

Eni Mustafaraj

Data and Web Science | Wellesley College | Immigrant | Feminist