WIKIPEDIA: GENDER BIAS

According to Wikipedia

Are We Boosting Wikipedia’s Biases with ChatGPT?

OpenSexism
4 min read · Jul 20, 2023
[Image: an unbalanced scale beside a stack of books]

For the past two years, I’ve tracked progress — or rather, the absence of progress — towards correcting Wikipedia’s structural gender biases. The dearth of links to women’s biographies throughout the site is a known problem, documented most recently by Sandra González-Bailón and Isabelle Langrock in their excellent paper, which won a Research Award from the Wikimedia Foundation earlier this year. They note:

“Inequalities within the structural properties of Wikipedia — the infobox and the hyperlink network — can have profound effects beyond the platform.” Gendered inequities “can have large effects for information-seeking behavior across a range of digital platforms and devices.”

Profound effects beyond the platform. And yet, progress to remedy this bias has been slow to nonexistent. So, when the Wikimedia Foundation churned out a ChatGPT plug-in in what appears to be a matter of months, I was stunned. The resources for developing and deploying experimental solutions to perceived problems, as well as the muscle to publicize them, are in place; they are simply reserved for potentially spreading and amplifying Wikipedia's biases rather than fixing them.

Recent studies have revealed many contexts in which ChatGPT has amplified bias — whether it is generating sexist and racist performance reviews, portraying humans with racial stereotypes, generating sexist song lyrics, or producing medical diagnoses that stereotype certain races, ethnicities, and gender identities. OpenAI, the company behind the technology, has also been criticized for its lack of transparency, a quality that seems not to have dampened Wikimedia's enthusiasm for moving forward with the plug-in project.

Here’s what the Wikimedia Foundation said about the plug-in just last week:

“We have developed an experimental Wikipedia plugin that allows ChatGPT to search for and summarize the most up-to-date information on Wikipedia in answer to general knowledge queries. Importantly, the plugin also allows us to specify the attribution that should be included for this content and provide links to Wikipedia articles for further reading.”
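
It is worth being concrete about what "search for and summarize" involves. The sketch below is my own rough illustration, not the Foundation's plug-in code: it uses only the public MediaWiki Action API to find a relevant article, pull a plain-text extract, and return it alongside an attribution link. Whatever biases live in the retrieved article travel straight through a pipeline like this to the model.

import requests

API = "https://en.wikipedia.org/w/api.php"

def answer_with_attribution(query):
    """Search Wikipedia, fetch a plain-text extract, and return it with a source link.
    Hypothetical sketch only; this is not the Wikimedia Foundation's plug-in code."""
    # 1. Find the most relevant article for the query.
    search = requests.get(API, params={
        "action": "query", "list": "search",
        "srsearch": query, "srlimit": 1, "format": "json",
    }).json()
    hits = search["query"]["search"]
    if not hits:
        return None
    title = hits[0]["title"]

    # 2. Pull a plain-text intro extract to hand to the language model.
    pages = requests.get(API, params={
        "action": "query", "prop": "extracts",
        "exintro": 1, "explaintext": 1,
        "titles": title, "format": "json",
    }).json()["query"]["pages"]
    extract = next(iter(pages.values())).get("extract", "")

    # 3. Return the text together with the attribution link the plug-in
    #    is meant to surface alongside every answer.
    return {
        "summary": extract,
        "attribution": "https://en.wikipedia.org/wiki/" + title.replace(" ", "_"),
    }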

The Future Audiences annual plan reveals that the current plug-in functionality is just a start. Ideas for future features include returning “a list of relevant sources from the article” and exposing “citation links within Wikipedia content.” I did not see “analyze Wikipedia’s cited sources and return an estimate of gender bias,” but that transparency and functionality are much needed.

Few studies examine the gender biases in Wikipedia citations, though Zheng et al. recently found that “publications by women are cited less by Wikipedia than expected, and publications by women are less likely to be cited than those by men,” and scholars, such as Samuel Baltz, have noted profound gender imbalances in the citations used in Wikipedia’s political science articles.

If the future plans for Wikipedia’s ChatGPT plug-in include returning links to cited sources, great care must be taken that those sources represent a diversity of thinkers. There is currently no system in place to validate this prerequisite for an ethical system. Monitoring and understanding bias should be a priority in the current experiments as well.
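
The “estimate of gender bias” feature I would like to see could begin as something quite modest. The sketch below is hypothetical, not an existing tool: it pulls an article's wikitext through the MediaWiki API, collects author first names from its citation templates, and tallies them with a caller-supplied infer_gender function. Name-based inference is itself crude and error-prone, which is exactly why this kind of estimate needs careful design, documentation, and human review.

import re
from collections import Counter
import requests

API = "https://en.wikipedia.org/w/api.php"

def cited_first_names(title):
    """Collect |first=, |first1=, ... author fields from an article's citation templates."""
    resp = requests.get(API, params={
        "action": "parse", "page": title,
        "prop": "wikitext", "format": "json",
    }).json()
    wikitext = resp["parse"]["wikitext"]["*"]
    return [name.strip() for name in re.findall(r"\|\s*first\d*\s*=\s*([^|}]+)", wikitext)]

def citation_gender_estimate(title, infer_gender):
    """Tally cited authors by whatever categories the caller-supplied
    infer_gender(first_name) function returns, e.g. 'woman', 'man', 'unknown'.
    infer_gender is a placeholder; no such helper ships with the plug-in."""
    return Counter(infer_gender(name) for name in cited_first_names(title))

Even a rough tally like this, surfaced next to the plug-in's answers, would give readers some signal about whose work informs what they are reading.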

Recent studies indicate that algorithms can play a substantial role in Wikipedia’s content gaps. The current push to experiment with ChatGPT does not take the threat of a negative impact seriously enough.

Read more:

“Unethical tech is rarely the result of one big decision but a series of small ones made over time.” — Dawn Duhaney

Works Cited:

Baltz, Samuel. “Reducing Bias in Wikipedia’s Coverage of Political Scientists.” PS: Political Science & Politics 55, no. 2 (2022): 439–444.

Brennan, Kate. “ChatGPT and the Hidden Bias of Language Models.” The Story Exchange (2023).

Cheng, Myra, Esin Durmus, and Dan Jurafsky. “Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models.” arXiv preprint arXiv:2305.18189 (2023).

Gertner, Jon. “Wikipedia’s Moment of Truth.” The New York Times (2023).

Houtti, Mo, Isaac Johnson, and Loren Terveen. “Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms.” arXiv preprint arXiv:2307.08669 (2023).

Langrock, Isabelle, and Sandra González-Bailón. “The Gender Divide in Wikipedia: Quantifying and Assessing the Impact of Two Feminist Interventions.” Journal of Communication 72, no. 3 (2022): 297–321.

Mc Gowran, Leigh. “OpenAI criticised for lack of transparency around GPT-4.” Silicon Republic (2023).

Pinchuk, Maryana. “Exploring paths for the future of free knowledge: New Wikipedia ChatGPT plugin, leveraging rich media social apps, and other experiments.” diff.wikimedia.org (2023).

Snyder, Kieran. “We asked ChatGPT to write performance reviews and they are wildly sexist (and racist).” Fast Company (2023).

“Wikimedia Foundation Annual Plan/2023–2024/Draft/Future Audiences.” https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/Future_Audiences#FA2.2_Conversational_AI.

Zack, Travis, Eric Lehman, Mirac Suzgun, Jorge A. Rodriguez, Leo Anthony Celi, Judy Gichoya, Dan Jurafsky et al. “Coding Inequity: Assessing GPT-4’s Potential for Perpetuating Racial and Gender Biases in Healthcare.” medRxiv (2023): 2023–07.

Zheng, Xiang, Jiajing Chen, Erjia Yan, and Chaoqun Ni. “Gender and country biases in Wikipedia citations to scholarly publications.” Journal of the Association for Information Science and Technology 74, no. 2 (2023): 219–233.
