
The Wednesday Index: One Year of Gender Diversity Data Visualized

A Longitudinal Study of Inequity and Inertia in Wikipedia’s Linked Citations

OpenSexism
Nov 22, 2022
Gender diversity of links to biographies in the Wednesday Index, a set of 26 Wikipedia pages monitored over the course of one year. Links to men's biographies represent over 90% of the total.

For the past year, every Wednesday, I’ve used PAC’s Wikidata tool to measure the gender diversity in the biographies linked from a set of 26 Wikipedia pages — from ‘Reality’ to ‘Universe’, ‘Science’ to ‘Justice’ — to get a sense for both the extent of citation bias on Wikipedia and how quickly it changes.
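
As a rough illustration of what such a weekly measurement involves, here is a minimal sketch of how one might tally the genders of biographies linked from a single article using the public MediaWiki and Wikidata APIs. This is not PAC's actual tool; the function names and overall approach are my own assumptions, and it only checks items marked as humans (P31 = Q5) with a "sex or gender" (P21) statement.

```python
import requests
from collections import Counter

WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"
WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def linked_wikidata_items(title):
    """Return the Wikidata QIDs of every article linked from `title`."""
    qids = set()
    params = {
        "action": "query", "format": "json", "titles": title,
        "generator": "links", "gpllimit": "max", "gplnamespace": 0,
        "prop": "pageprops", "ppprop": "wikibase_item",
    }
    while True:
        data = requests.get(WIKIPEDIA_API, params=params).json()
        for page in data.get("query", {}).get("pages", {}).values():
            qid = page.get("pageprops", {}).get("wikibase_item")
            if qid:
                qids.add(qid)
        if "continue" not in data:
            return list(qids)
        params.update(data["continue"])  # follow API pagination

def gender_counts(qids):
    """Count P21 (sex or gender) values for linked items that are humans (P31 = Q5)."""
    counts = Counter()
    for i in range(0, len(qids), 50):  # wbgetentities accepts up to 50 ids per call
        data = requests.get(WIKIDATA_API, params={
            "action": "wbgetentities", "format": "json",
            "ids": "|".join(qids[i:i + 50]), "props": "claims",
        }).json()
        for entity in data.get("entities", {}).values():
            claims = entity.get("claims", {})
            instance_of = {
                c["mainsnak"]["datavalue"]["value"]["id"]
                for c in claims.get("P31", [])
                if c["mainsnak"].get("datavalue")
            }
            if "Q5" not in instance_of:  # not a biography; skip
                continue
            for c in claims.get("P21", []):
                if c["mainsnak"].get("datavalue"):
                    counts[c["mainsnak"]["datavalue"]["value"]["id"]] += 1
    return counts

if __name__ == "__main__":
    counts = gender_counts(linked_wikidata_items("Human body"))
    total = sum(counts.values()) or 1
    for qid, n in counts.most_common():
        # Q6581097 = male, Q6581072 = female (other gender QIDs may appear too)
        print(f"{qid}: {n} ({n / total:.0%})")
```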

Now, to coincide with the one-year anniversary of this project, PAC has developed a handy tool to visualize the data at both the index and page level. Looking at the visualizations, one can see:

  • The very low share of links to biographies of anyone other than men.
  • The impact David Palfrey had when, in July 2022, he chose to spearhead work with Women in Red on correcting link bias across the Wednesday Index's 26 pages.
  • That between July 2022 and now, the overall share of links to women’s biographies has remained higher than it was before Palfrey’s intervention.

For an example of how Palfrey's work had a positive and (so far) lasting impact, see "Human Body", which linked to no women's biographies before he began his work and has retained the additions since.

Gender diversity of links to biographies on the Human Body Wikipedia article, monitored over the course of one year. Until July 2022, all links were to men's biographies.

Link interventions on other pages were not as stable. Here, for example, is the visualization for the Knowledge page:

Gender diversity of links to biographies on the Knowledge Wikipedia article, monitored over the course of one year.

Overall, however, the share of links to women’s biographies rose and remained higher after the intervention.

Bump chart showing the evolution of gender share over time for the 26 pages in the Wednesday Index.

Wikipedia's link bias is a known structural bias, and one that is particularly concerning because it is amplified elsewhere. As Langrock and González-Bailón write, "Inequalities within the structural properties of Wikipedia — the infobox and the hyperlink network — can have profound effects beyond the platform." Gendered inequities "can have large effects for information-seeking behavior across a range of digital platforms and devices."

Palfrey’s intervention in the Wednesday Index has had a lasting positive impact, but men still command the lion’s share — over 90 percent — of links from this set of pages.

Note that link bias is not limited to the Wednesday Index's 26 pages. In August 2022, for example, Palfrey found that links to women's biographies account for only 7 percent of the links to humans on Wikipedia's 100 Level 2 Vital articles (which include pages such as Technology, Astronomy, and Business).

PAC's tool is one of the few that help measure and make visible citation bias, of which link bias is just one flavor. In a recent study, Zheng et al. "show that publications by women are cited less by Wikipedia than expected, and publications by women are less likely to be cited than those by men." This bias is profound; it is not something an individual can fix alone, nor should it have to be.

Wikipedia is a sociotechnological system, and this system can and should evolve to facilitate progress towards equity.

Working Towards Equity

Some ideas for work that can help raise awareness of citation bias and galvanize change at a system-wide level include:

  • A citation-diversity talk-page template, suggested by Palfrey and based on PAC's gender diversity tool. Such a template can make gender imbalance immediately visible to editors, who can then choose to act on it. A citation diversity template could be particularly effective on high-profile articles, such as featured articles and pages nominated for featured status. Here, for example, is the gender diversity result for the article featured as I'm writing this, Niandra LaDes and Usually Just a T-Shirt:
  • Citation diversity statements. Zurn et al. present this thoughtful approach to raising awareness and transparently reporting diversity in citation lists. This approach can be adapted, promoted, and used by Wikipedia editors who choose to make a public commitment to diversifying reference lists.
  • Developing well-documented and easy-to-use dashboards and tools to monitor and make visible progress towards addressing link and citation bias. Existing dashboards, such as Humaniki, have been incredibly effective at both raising awareness of bias and tracking progress towards addressing it. One idea, for example, is to expand PAC's tool to generate a report for an arbitrary set of pages or categories, so that edit-a-thon participants and other interested users can better see and respond to citation bias (a rough sketch of such a report appears after this list). Examples of tools from outside Wikipedia that help quantify and make visible citation bias include the Journal of Cognitive Neuroscience's gender citation balance index, which Nature reports uses "software to track how closely gender proportions in reference lists match rates of authorship in the journal," and cleanBib, which analyzes "the predicted gender and race of first and last authors in reference lists of manuscripts in progress."
  • Public-facing indicators that alert Wikipedia readers to the extent of gender bias in links and citations, as part of — or in addition to — current quality indicators that are under discussion.
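
The "report for an arbitrary set of pages" idea above could be prototyped by wrapping the helpers from the earlier sketch (linked_wikidata_items and gender_counts) in a small aggregation function. This is only an illustration of the shape such a report might take, not a feature of PAC's tool, and the page titles below are examples rather than a real worklist.

```python
from collections import Counter

def diversity_report(titles):
    """Aggregate linked-biography gender counts over a set of article titles."""
    totals = Counter()
    for title in titles:
        totals.update(gender_counts(linked_wikidata_items(title)))
    grand_total = sum(totals.values()) or 1
    return {qid: count / grand_total for qid, count in totals.items()}

# Example worklist; any set of pages (or a category's members) could be passed in.
print(diversity_report(["Reality", "Universe", "Science", "Justice"]))
```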

Wikipedia's biases are amplifying inequity, and inaction is not neutral; it is a choice to perpetuate an unjust status quo. Individuals can have an impact, but for that impact to reach scale, the system must support their efforts. Because inequity and inertia should never walk hand in hand.

Works Cited

Kwon, Diana. “The rise of citational justice: how scholars are making references fairer.” Nature 603, no. 7902 (2022): 568–571.

Langrock, Isabelle, and Sandra González-Bailón. “The Gender Divide in Wikipedia: Quantifying and Assessing the Impact of Two Feminist Interventions.” Journal of Communication 72, no. 3 (2022): 297–321.

Zheng, Xiang, Jiajing Chen, Erjia Yan, and Chaoqun Ni. “Gender and country biases in Wikipedia citations to scholarly publications.” Journal of the Association for Information Science and Technology (2022).

Zurn, Perry, Erin G. Teich, Samantha C. Simon, Jason Z. Kim, and Dani S. Bassett. “Supporting academic equity in physics through citation diversity.” Communications Physics 5, no. 1 (2022): 1–5.
