The Roles Bots Play in Wikipedia
This post summarizes the CSCW 2019 paper “The Roles Bots Play in Wikipedia” by Lei (Nico) Zheng, Christopher M. Albano, Neev M. Vora, Feng Mai, and Jeffrey V. Nickerson.
Did you know that over 10% of the edits on English Wikipedia are made by bots? Wikipedia bots run automated tasks such as fixing invalid links, correcting typos, and updating page infoboxes; they may also calculate statistics and maintain articles in a WikiProject; and they may interact with human editors and with each other. Bots also contribute to other online communities, including Discord, GitHub, Slack, and Facebook. The Wikipedia community has a long history of using bots to assist collective knowledge production. The first Wikipedia bot appeared in October 2002 to add and maintain U.S. county and city articles. Today there are 1,601 registered bots on the English version of Wikipedia, and their edits account for over 10% of all edits. Moreover, in Wikidata, the Wikipedia community’s document database, the proportion of edits made by bots has reached 88%. Clearly, bots are now an integral part of knowledge production.
Bots are tools, but their autonomy sets them apart from many other tools. Once deployed, bots quickly amplify human effort in both speed and scale. For example, ClueBot NG, the community’s anti-vandalism bot, makes thousands of edits every day and on average reverts a malicious edit within 30 seconds of it being made. This significantly reduces the community’s time-to-revert and protects articles from vandalism. However, as previous studies have shown, some automated tools designed for quality control have inadvertently decreased the retention rate of newcomers: edits from newcomers are sometimes flagged as subpar and are therefore more likely to be reverted. As the use of bots becomes more prevalent, their effects warrant further study.
In a new study to be presented at CSCW this November, we propose a nine-category taxonomy of bots based on their functions in English Wikipedia. We then build a multi-class classifier to assign each of the 1,601 bots to a role based on its description and editing behavior. We study the activities of bots in different roles: their edit frequency, the spaces they work in, and how their roles evolve. We further investigate how interacting with various bots affects the retention rate of newcomers. We find that different types of bots, even bots within the same role category, can have different effects on newcomers’ retention. Based on this analysis, we suggest ways to improve the community’s bot governance procedures.
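To give a feel for what classifying bots by their descriptions involves, here is a minimal sketch of a bag-of-words nearest-centroid classifier in pure Python. The training descriptions, role labels, and queries below are invented for illustration; the paper’s classifier uses the actual bot descriptions and editing behavior, not this toy data.

```python
from collections import Counter
import math

# Toy training data: short bot descriptions labeled with role categories.
# These examples are illustrative, not drawn from the paper's dataset.
TRAIN = [
    ("fixes broken links and file references in articles", "Fixer"),
    ("repairs double redirects and corrects typos", "Fixer"),
    ("updates statistics and maintenance pages", "Clerk"),
    ("documents user status and project statistics", "Clerk"),
    ("greets newcomers and suggests articles to edit", "Advisor"),
    ("welcomes new editors with personalized tips", "Advisor"),
]

def vectorize(text):
    """Turn a description into a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Build one centroid (summed bag-of-words) per role category.
centroids = {}
for text, role in TRAIN:
    centroids.setdefault(role, Counter()).update(vectorize(text))

def classify(description):
    """Assign the role whose centroid is most similar to the description."""
    vec = vectorize(description)
    return max(centroids, key=lambda role: cosine(vec, centroids[role]))

print(classify("this bot fixes dead links in article pages"))       # → Fixer
print(classify("welcomes newcomers and recommends pages to edit"))  # → Advisor
```

A real pipeline would use richer features (e.g., TF-IDF weights plus edit-behavior signals) and a trained model, but the core idea is the same: map each bot’s textual self-description into a feature space and pick the closest role.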
The Roles of Bots
We identify nine roles, along with their associated bot functions, using a combination of network analysis and text mining. The following picture shows a network in which bot functions within the same role category are shown in the same color. Beyond well-known functions such as counter-vandalism and article generation, we find other distinct bot functions and roles: for example, Fixer bots fix links, content, or files in article pages; Clerk bots calculate and update statistical information, document user status, and update maintenance pages; Advisor bots greet newcomers and provide personalized suggestions for how they can contribute. We also find that bots in different roles edit different areas of Wikipedia. The Fixer, Generator, and Connector bots mainly take care of article content pages, while the Tagger, Clerk, and Archiver bots maintain both content pages and community pages. The Advisor, Protector, and Notifier bots are more user-oriented than their peers.
The Evolution of Bot Roles
We then look at how the bot system in Wikipedia has evolved. The following picture shows the number of active bots and bot edits by role from January 2003 to December 2018. We find that both the number of active bots and the number of bot edits increased between 2003 and 2013. The sharp decline in 2013 was likely caused by a community consensus to move inter-language links (links that refer to the same article in different language versions of Wikipedia) over to Wikidata. As a result, dozens of Connector bots that had been created to maintain inter-language links reduced their edits and gradually became inactive. After 2013, the number of active bots decreased slowly, but the number of bot edits soon climbed back up. This pattern suggests that bots are being consolidated in the English Wikipedia. Consolidation can happen both within and across role categories. Within-category consolidation happens when superior or broader-scope bots take over the jobs performed by weaker or narrower-scope ones. For example, over 50% of Protector bots became inactive after the launch of ClueBot NG in 2011; ClueBot NG is so fast and productive that it leaves little work for other bots performing similar functions. Cross-category consolidation happens when bot owners decide to make their bots handle multiple functions, or when bots take over tasks that were originally performed by now-inactive bots.
Roles and the Survival Rates of Wikipedia Newcomers
In addition, we consider the consequences of bot-human interactions. Specifically, we look at how bots serving different roles affect the retention rate of 10,000 randomly sampled newcomers. We find that interacting with Advisor bots has a significant positive effect, whereas interacting with Protector bots has a significant negative effect. Moreover, we find that bots within the same role category can have different effects on newcomers. For example, two Advisor bots interacted with the newcomers in our sample: HostBot interacts with more editors than SuggestBot, but the latter has a larger positive effect than the former. Surprisingly, among the three Protector bots, only interaction with ClueBot NG shows a significant negative effect. Newcomers seem not to mind the bot that signs their comments (SineBot) and, counterintuitively, are positively influenced by the bot that reverts links violating Wikipedia’s copyright policy (XLinkBot). The different reactions to these Protector bots may stem from different verbal traits in the messages the bots leave. Compared with ClueBot NG, XLinkBot’s messages are longer, friendlier, and more informative. More generally, this kind of within-category comparison can help the community build a better bot governance system that evaluates the impact of individual bots on specific outcomes, in this case their influence on the survival of newcomers. Bot designers might learn to model the characteristics of successful bots, while the overall community might be able to identify and redesign bots that, if deployed widely, would be likely to have negative effects.
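The basic shape of a within-category comparison can be sketched as a per-bot retention calculation. The records and rates below are entirely made up for illustration, and the paper’s analysis uses statistical models on 10,000 sampled newcomers rather than raw proportions like these.

```python
# Each record pairs the bot a newcomer interacted with and a boolean for
# whether that newcomer was still active at the end of an observation window.
# All values here are invented for illustration only.
newcomers = [
    ("HostBot", True), ("HostBot", False), ("HostBot", True),
    ("SuggestBot", True), ("SuggestBot", True),
    ("ClueBot NG", False), ("ClueBot NG", False), ("ClueBot NG", True),
]

def retention_rate(bot):
    """Fraction of newcomers who stayed active after interacting with `bot`."""
    group = [active for b, active in newcomers if b == bot]
    return sum(group) / len(group)

for bot in ("HostBot", "SuggestBot", "ClueBot NG"):
    print(f"{bot}: {retention_rate(bot):.2f}")
```

A real evaluation would also need controls for confounders (newcomers who get reverted by ClueBot NG may differ systematically from those greeted by HostBot), which is why the paper relies on regression-style analysis rather than direct rate comparisons.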
Bots are playing an increasingly important role in online communities. Together, humans and bots form an ecosystem in which they adapt to and learn from each other. Humans develop bots, argue for their approval, and maintain them. They also monitor bots’ activities, merge similar bots, split complex bots, and turn off malfunctioning bots. Our work is an early step toward understanding the functions and functional categories of bots. The resulting taxonomy can be used to study other issues, including how bots affect human work, how their functionality evolves over time, and how to build a better bot governance system. To catalyze future research, we have open-sourced our code for analyzing Wikipedia bots.
Both this blog post and the paper it describes are collaborative works authored by Lei (Nico) Zheng, Christopher Albano, Neev Vora, Feng Mai, and Jeffrey Nickerson. For more details, please check our full paper. The work will be presented in Austin, Texas at the ACM Conference on Computer-supported Cooperative Work and Social Computing (CSCW’19) in November 2019. The work was supported by the National Science Foundation under grants 1909803, 1717473, 1442840, and 1422066. If you have questions or comments about this study, email the lead author at lzheng9 [at] stevens [dot] edu.
Lei (Nico) Zheng, Christopher M. Albano, Neev M. Vora, Feng Mai, and Jeffrey V. Nickerson. 2019. The Roles Bots Play in Wikipedia. Proceedings of the ACM: Human-Computer Interaction, Volume 3, Issue CSCW, Article 215 (November 2019), 20 pages. https://doi.org/10.1145/3359317