The sum of all knowledge? Oral citations on Wikipedia

Lucie-Aimée Kaffee
Sep 17, 2019

Transferring oral knowledge has a long tradition in large parts of the world. CC-BY-SA

Wikipedia is a widely used resource for people speaking a variety of languages. It aims to be the sum of all human knowledge. However, some languages are much better supported than others, and what counts as a valid source to cite is heavily biased. To become a hub of all human knowledge, Wikipedia has to consider the different ways in which knowledge is captured.

In many communities, most history and knowledge is passed on orally. Especially in the Global South, written resources about local knowledge are scarce. And even written sources can be heavily biased by the history of how they were produced, e.g., by Western anthropologists.

At Wikimania 2019, Felix Nateray, Kimberli and I co-chaired a session on the problems with the current forms of capturing knowledge and on which sources we trust on Wikipedia. In this blog post, I summarize the bias of the current sources in Wikipedia, in particular with regard to oral citations. The ideas captured here are based on the session at Wikimania, where we got to discuss the topic with a diverse group of Wikipedia editors. An unstructured summary of the session is captured in this etherpad.

Graham et al. 2014

Wikipedia’s Citation Bias

The problem of bias in Wikipedia has been covered from a variety of angles: there is a large conversation about the gender gap, both among the editors and in the coverage of articles. But there is also a growing awareness of the content bias in Wikipedia with regard to the representation of different parts of the world. Generally, the Global South is much less covered than other parts of the world. For example, there are more articles about Japan than about the entire Middle East, even on Arabic Wikipedia. (See Graham et al.’s work for more information on the content bias w.r.t. geographic coverage.)

Many factors contribute to the information poverty that exists for large parts of the world, which is reflected in the amount of information available to speakers of different languages. In this blog post, we are focusing on citations in Wikipedia. The way we cite largely shapes how knowledge is represented in Wikipedia. Wikipedia’s guidelines state that:

If no reliable sources can be found on a topic, Wikipedia should not have an article on it.

While this makes sense for an encyclopedia in the style of Wikipedia, the question of what counts as a reliable source can be a very political one. In many places in the world, knowledge has an oral rather than a written tradition. The form in which knowledge is captured differs between countries and cultures. If a culture’s knowledge is not in written form (and often also not published in what the Western understanding considers a reliable source), it is currently not possible to include it on Wikipedia.

Books published in 2005 by country
UK: 161,000 books / 60 million people
South Africa: 6,100 books / 48 million people
India: 97,000 books / 1,100 million people
Source: https://meta.wikimedia.org/wiki/Research:Oral_Citations
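
To make the 2005 figures above easier to compare, here is a minimal sketch that normalizes them to books published per million people. The numbers are copied from the list above; the names `books_2005` and `books_per_million` are only illustrative and do not come from the cited research.

```python
# Figures copied from the 2005 list above:
# country -> (books published, population in millions).
books_2005 = {
    "UK": (161_000, 60),
    "South Africa": (6_100, 48),
    "India": (97_000, 1_100),
}


def books_per_million(books: int, population_millions: float) -> float:
    """Return the number of books published per million inhabitants."""
    return books / population_millions


# Normalize every country's output by its population.
rates = {
    country: books_per_million(books, population)
    for country, (books, population) in books_2005.items()
}

for country, rate in rates.items():
    print(f"{country}: {rate:,.0f} books per million people")
```

On these figures, the UK published roughly 2,683 books per million people, South Africa about 127 and India about 88, so the UK produced around 30 times as many books per person as India.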

Titles produced per million inhabitants 2014–2015, for which figures were available. Image: https://www.internationalpublishers.org/images/reports/Annual_Report_2016/IPA_Annual_Report_2015-2016_interactive.pdf

Wikipedia is one of the most accessed websites in the world to learn about a topic. Therefore, it is an ideal platform to bridge the gap between oral and written knowledge.

Oral Knowledge on Wikipedia

Capturing oral knowledge can be a challenging task. When thinking of non-written knowledge, the first solution that comes to mind is to record the people whose history you want to capture. However, putting this knowledge into context can be more complex than just recording someone talking about a topic they are knowledgeable about.

In our discussion, a Wikipedian brought up the example of travelling to Macedonian villages and recording the names of important buildings, such as churches, there. Recording this type of knowledge is challenging, as there is no technical tool to capture common knowledge shared in a community.

Further, when recording oral knowledge, one has to consider the context. Just as written knowledge is often read in its context, such as the time at which it was written, oral knowledge is shared within a context, and factors such as the community or the language can be very important for conveying it in its entirety.

Currently, no form of oral knowledge is captured in Wikipedia, because under the guidelines sources have to be published in written form. This leads to the content bias discussed above, which favours countries that publish in the Western form of knowledge capture.

There have been attempts to capture this knowledge anyway. In the above example of Macedonian churches, the attendee explained that they take pictures of church walls in case there are inscriptions of names and dates that can be used as references.

Another example is a project Felix is involved in: animating oral history, since videos can be used as sources. In the research around the project, people were interviewed and their knowledge was combined in an animated video about the history of a tribe now residing in Ghana. As a video can be used as a source, this is a first step towards bridging the gap between oral and Wikipedia-citable information.

This leads to an overarching problem: currently, the big Wikipedia language versions determine what counts as an acceptable reference. Wikipedians from under-resourced languages reported that they are told to go to their own, smaller Wikipedia if they want to use other forms of references. This leads to less exchange and to the communities becoming estranged. It also means that English Wikipedia does not include the knowledge of large parts of the world, as it builds a wall around its language version.

Communities that rely largely on oral knowledge cannot find their knowledge represented in the way Wikipedia is currently written.

This leads to two future directions towards the inclusion of oral knowledge and citations on Wikipedia:

1. Technical implementations of capturing oral knowledge are needed

2. A stronger exchange between the Wikipedia communities is needed, so that inclusion of oral knowledge can be accepted in all Wikipedias

Oral knowledge adds another dimension to Wikipedia, one that is important to capture in pursuit of the goal of creating a world in which every single person on the planet is given free access to the sum of all human knowledge.

Further Reading

Movie “People are Knowledge” as a result of the research project People are Knowledge (2011): https://vimeo.com/26469276

People are Knowledge project of 2011: https://meta.wikimedia.org/wiki/Research:Oral_Citations

(and media created in the project) https://commons.wikimedia.org/wiki/Category:Oral_Citations_Project

English Wikipedia’s definition of reliable sources: https://en.wikipedia.org/wiki/Wikipedia:Reliable_sources

https://en.wikipedia.org/wiki/Wikipedia:Verifiability#Reliable_sources

Economic Times India on including oral citations (2011): https://economictimes.indiatimes.com/tech/internet/oral-citations-to-be-part-of-wikipedia-entries/articleshow/9728638.cms?from=mdr

Geographic content bias in Wikipedia: https://www.vox.com/2014/9/14/6140145/wikipedia-geography

[Graham et al. 2014] Mark Graham, Bernie Hogan, Ralph K. Straumann, and Ahmed Medhat (2014). Uneven geographies of user-generated information: Patterns of increasing informational poverty. Annals of the Association of American Geographers, 104(4), 746–764.
