Designing Word Filter Tools for Creator-led Comment Moderation

Shagun Jhaver
Mar 10, 2022

This blog post summarizes a paper on how we designed FilterBuddy, a system for content creators to enhance their moderation capabilities, and the lessons we learned for anti-abuse tool design through that process. This paper will be presented at the 2022 ACM CHI Conference on Human Factors in Computing Systems.

CONTENT WARNING: This blog post contains some offensive keywords.

Content creators on platforms like YouTube, TikTok and Twitch not only have to write, produce, and edit their content, but they must also cultivate and engage with their audiences to distinguish themselves from ever-growing competition. Creators often invest a great deal of time and emotional labor to remove unwanted comments from their viewers and address stalking, harassment and online threats. As channels grow larger and attract more comments, it becomes increasingly difficult for creators to review every comment, and some automation becomes necessary. In light of this, creators have frequently requested more powerful moderation tools and resources that can alleviate the pressures surrounding content moderation.

In this paper, we investigate the design of moderation tools aimed at content creators to understand how they can better address creators’ needs, especially the needs of creators from marginalized groups who face disproportionate and targeted abuse. We focus our needfinding and design exploration on word filter tools, one of the most common types of moderation tools. These tools let creators configure a list of phrases so that comments containing any of those phrases are automatically removed or held for review. Through needfinding interviews with 19 creators, we examine their experiences with receiving unwanted comments and their everyday practices and strategies for using existing word filter tools to reduce their moderation workload and stress. We found that creators were frustrated overall with the rudimentary features provided by existing moderation tools on platforms such as YouTube (see Figure 1). Creators described difficulties with building up a set of useful filters from scratch, organizing a growing list of filters, and auditing what their filters were actually catching.
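To make the basic mechanism concrete, here is a minimal sketch of a word filter as described above: a list of phrases, each mapped to an action, applied to incoming comments. It is not the implementation used by any platform or by FilterBuddy. The names `PhraseFilter`, `contains_phrase`, and `moderate`, the action labels, the whole-word, case-insensitive matching rule, and the placeholder phrases are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PhraseFilter:
    """One configured phrase and what to do when it matches (hypothetical type)."""
    phrase: str
    action: str  # assumed labels: "remove" or "hold_for_review"


def contains_phrase(phrase: str, comment: str) -> bool:
    # Assumption: match case-insensitively and only on whole words,
    # so that "spamword" does not fire inside "spamwords".
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, comment, flags=re.IGNORECASE) is not None


def moderate(comment: str, filters: list[PhraseFilter]) -> str:
    """Return the action of the first matching filter, or "publish" if none match."""
    for f in filters:
        if contains_phrase(f.phrase, comment):
            return f.action
    return "publish"


# Placeholder phrases stand in for real configured keywords.
example_filters = [
    PhraseFilter("spamword", "remove"),
    PhraseFilter("free giveaway", "hold_for_review"),
]
print(moderate("Claim your FREE giveaway now", example_filters))
```

Even in this toy form, the limitation creators described is visible: the list itself gives no feedback about which comments each phrase is actually catching.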

Figure 1: Implementations of word filters on four popular platforms — Facebook, Reddit, Twitch, and YouTube. In each case, the shown interfaces are the only information sources available for viewing and configuring word filters; no accompanying feedback or visualization mechanisms exist to assist configuration or track how the filters are performing.
Figure 2: A screenshot of FilterBuddy’s category page showing (A) a sidebar with links to the overview page, each configured category page, and the Add new category page; (B) a chart showing the number of comments caught by each category phrase in the past month; (C) a paginated table previewing all comments caught by the category; (D) a table of phrases configured in the category with options to include/exclude spelling variants and determine action on match for each phrase; and (E) a section to add new phrases in the category. Note that we limit the number of table rows we show in all the figures for brevity.

To address these needs, we present FilterBuddy, a system built for YouTube creators that augments word filter tools with features to support better authoring, maintenance, and auditing of word filters. Users can connect the system to their YouTube channel and use it instead of the native word filter tool on YouTube to moderate their comments. Features of FilterBuddy include but are not limited to:

  1. Interactive previews of what different filters would capture during the authoring process (see Fig. 3, 2-E).
  2. The ability to build from existing filters created by others (see Fig. 4).
  3. Organization of filters into spelling variants (Fig. 2-D) and higher-level categories (Fig. 2-A).
  4. Time-series graphs (Fig. 2-B, 5) and tables (Fig. 2-C, 5) to understand which comments are caught by which filters over time.
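
As a rough illustration of how spelling variants and per-phrase counts over time could fit together, the sketch below groups phrases into a category, optionally expands each phrase to look-alike spellings, and tallies catches per day. The post does not describe how FilterBuddy generates spelling variants, so the `LOOKALIKES` substitution table, the function names, and the `(date, text)` comment format are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical look-alike table; the real tool's variant rules are not given in the post.
LOOKALIKES = {"a": "a@4", "e": "e3", "i": "i1!", "o": "o0", "s": "s$5"}


def exact_pattern(phrase: str) -> re.Pattern:
    return re.compile(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", re.IGNORECASE)


def variant_pattern(phrase: str) -> re.Pattern:
    # Each character becomes a class of look-alikes that may repeat,
    # so "spamword" also matches "sp4mw0rd" or "spaaamword".
    parts = ["[" + re.escape(LOOKALIKES.get(ch, ch)) + "]+" for ch in phrase.lower()]
    return re.compile(r"(?<!\w)" + "".join(parts) + r"(?!\w)", re.IGNORECASE)


def count_catches(category: dict[str, bool], comments: list[tuple[str, str]]) -> Counter:
    """Count caught comments per (date, phrase).

    `category` maps each phrase to whether spelling variants are included.
    `comments` is a list of (date, text) pairs.
    """
    patterns = {
        phrase: variant_pattern(phrase) if with_variants else exact_pattern(phrase)
        for phrase, with_variants in category.items()
    }
    counts: Counter = Counter()
    for date, text in comments:
        for phrase, pattern in patterns.items():
            if pattern.search(text):
                counts[(date, phrase)] += 1
    return counts
```

The resulting counter, keyed by date and phrase, is the kind of data a per-phrase chart or time-series graph could be drawn from.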

FilterBuddy is not just a research tool; it is a system designed to be actively used by YouTube creators to enhance their moderation capabilities.

Figure 3: ‘Add New Phrase’ section on the FilterBuddy Category Page. As the user types in a phrase, the comments caught by that phrase are auto-populated. Comments not already caught by any configured phrases have a yellow background so that they are easily distinguished.
Figure 4: FilterBuddy’s ‘Add a New Category’ Page. Users can either create an empty category, import one of the built-in categories, or clone a category shared by another user to quickly set up their configurations. We show here the details of the built-in ‘Homophobia’ category selected in the dropdown.
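
The preview behavior described for Figure 3 can be sketched in a few lines: given a candidate phrase, list the comments it would catch and flag those that no already-configured phrase catches (the ones shown with a yellow background). This is only an illustration under assumed matching rules; `contains` and `preview` are hypothetical names, not FilterBuddy functions.

```python
import re


def contains(phrase: str, comment: str) -> bool:
    # Assumed rule: case-insensitive, whole-word match.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, comment, flags=re.IGNORECASE) is not None


def preview(candidate: str, existing_phrases: list[str], comments: list[str]) -> list[tuple[str, bool]]:
    """Return (comment, is_newly_caught) for each comment the candidate would catch."""
    if not candidate.strip():
        return []  # nothing typed yet, so nothing to preview
    rows = []
    for comment in comments:
        if not contains(candidate, comment):
            continue
        already_caught = any(contains(p, comment) for p in existing_phrases)
        rows.append((comment, not already_caught))
    return rows
```

Running `preview` on each keystroke against recent comments would give a creator immediate feedback on whether a new phrase adds coverage or only duplicates existing filters.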

We conducted an exploratory qualitative user study of FilterBuddy with 8 YouTube creators, who interacted with the tool loaded with their own YouTube comments while providing feedback. We found that participants appreciated greater automation but resisted replacing the tool’s rule-based configurations with solely ML-based approaches, and they requested additional defensive mechanisms to reduce incorrect removals. They also considered the ability to share their filters and build on filters created by others to be a powerful means of reducing toxic content across YouTube. In addition, we were surprised to find that participants devised many use cases for the tool that go beyond moderating undesirable posts, such as finding collaborators and filtering for positive feedback. Overall, participants felt that the designs explored in FilterBuddy would empower content creators, especially those belonging to marginalized groups, to efficiently address their content moderation needs and to better understand their audiences.

Figure 5: A screenshot of FilterBuddy’s Home Page. It shows (a) a time-series graph for number of caught comments aggregated at the category level and (b) a paginated table of all comments posted on the user’s channel.

Our design exploration shows that creators are deeply invested in retaining control over their moderation operations; we reflect on how sensible design defaults can help creators achieve this goal while minimizing the manual effort of setting up granular filters. We also highlight that trust-sensitive collective governance mechanisms will be required to resolve the tension creators face between wanting to build on other creators’ word filters and preferring to keep their own configurations private to avoid exploitation by bad actors. Finally, we had limited development resources to implement FilterBuddy, yet we found that creators see its features as highly desirable. In light of this, we call upon platforms to step up their efforts and investments in developing tools that improve creators’ working conditions.

For more details about our methods, design goals, system design and evaluation, design implications, and what platforms, third-party organizations, advocacy groups, and policymakers can learn from our research, please check out our full paper that will be published in CHI 2022. For questions and comments about the work, please drop an email to Shagun Jhaver at shagun.jhaver [at] rutgers [dot] edu.


Shagun Jhaver, Quanze Chen, Detlef Knauss, and Amy Zhang. 2022. Designing Word Filter Tools for Creator-led Comment Moderation. In CHI Conference on Human Factors in Computing Systems (CHI ’22), April 29-May 5, 2022, New Orleans, LA, USA. ACM, New York, NY, USA, 21 pages.