Crying Wolf: Moderation System Abuses

Renee Gittins
Spirit AI
Jul 25, 2019

In this digital age, online communities are thriving. People interact on social media, phone apps, online games, and countless other online environments.

These social spaces can be very freeing for the people who participate in them, particularly those that offer anonymity. However, the lack of social and legal repercussions in these spaces often fosters a darker side in these interactions, leading to abuse, harassment, threats, and more.

The main line of defense against this toxicity is moderation systems. They exist to protect participants in the space from unwanted behavior by others, but these systems can themselves be twisted into tools of harassment if they are not regulated.

Because most online communities are far too large for human moderators to read every message and interaction, moderation systems are layered, and each layer has a different amount of oversight. Each of these layers has its own flaws and potential for abuse. Being aware of these weaknesses helps, though the moderation needs and potential moderation abuses vary from space to space.

Personal moderation systems are a set of tools and powers given to users so they can curate their own online experience. Personal moderation is the fastest-acting layer of moderation and the least subject to abuse, as the user is defining their own personal experience rather than enforcing one upon others.

Personal moderation has no negative repercussions in most online communities, but it can be abused in competitive game environments. For example, blocking can be abused in competitive games with random or ranked matchmaking. A user can block opponents they do not wish to compete against in hopes of gaining an advantage in achieving higher ranks.

Furthermore, when a user blocks a significant number of other players, they may end up in a situation where they are unable to find a match that does not already contain a player they have blocked, resulting in effectively infinite queue times. Thus, if blocking is implemented in a competitive game with matchmaking, the development team should consider allowing a blocked user to match and compete against the user who blocked them, while automatically applying other limitations (e.g. a mute) to prevent abuse or conflict.
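As a rough illustration, a matchmaking step might apply this policy as sketched below; the structures and names here are hypothetical, not drawn from any particular game or engine:

```python
from dataclasses import dataclass, field

@dataclass
class Player:
    player_id: str
    blocked: set = field(default_factory=set)  # ids of players this user has blocked

def find_match(player, queue):
    """Pick an opponent from the queue; block lists never affect eligibility,
    so mass blocking cannot be used to dodge opponents or starve the queue."""
    if not queue:
        return None
    opponent = queue[0]  # assume the queue is already ordered by rating / wait time
    auto_mutes = []
    # Instead of refusing the match, automatically mute the pair for its duration.
    if opponent.player_id in player.blocked:
        auto_mutes.append((player.player_id, opponent.player_id))  # player won't see opponent's chat
    if player.player_id in opponent.blocked:
        auto_mutes.append((opponent.player_id, player.player_id))  # opponent won't see player's chat
    return opponent, auto_mutes
```

The key design choice in this sketch is that block lists shape what a player sees and hears, not who they can be matched against, which removes both the ranked-queue exploit and the infinite-queue problem.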

While personal moderation empowers users, it does not prevent harassers from moving on to their next target and continuing to spread toxicity throughout the community.

Community moderation, which observes and manages the community and the people within it by outlining and enforcing rules, helps prevent the internal spread of toxic behavior.

Moderation of a community can be managed actively, with moderators present within the community space itself, or more passively, through actions taken in response to tickets. There are three systems that can provide community moderation: company-hired moderators, outstanding members of the community who have been granted moderation powers, and a system that automatically reacts to complaints from the community, allowing for community self-moderation.

Moderators hired directly by the developer can be thorough and unbiased in their moderation of the community. On the other hand, designated community moderators are members of the community who have been granted moderation powers and are much more cost-effective, especially for small studios. Both types of moderators help control the tone of a virtual space and can be instructed to uphold specific values for the community.

Moderators who are not given compensation have less accountability than paid moderators and are therefore more likely to abuse their powers. They may also have their access to private communications restricted by privacy regulations.

While both types of moderators can review reports from the community, these reports do not capture all community behavior; at Spirit AI, our Ally system detected that one chronic abuser was being reported in only 2.5% of his attacks.

Good community moderators will work tirelessly to foster a positive community and can be a huge boon to the community culture of the project, but unscrupulous ones can cause chaos and hurt the image of the project and company. A single corrupt community moderator can create a significant amount of damage, so oversight with a tracking or review process helps prevent and mitigate abuses by the moderators themselves.

Communities can also regulate themselves through self-moderation. While self-moderation can greatly decrease the overall resource costs of moderation, it must be watched carefully for misuse and abuse.

The greatest risk of community self-moderation is abuse by users. In competitive games, the best-performing players are likely to be reported by their opponents without proper reason, either maliciously or simply as an expression of frustration. Similarly, easily understood community self-moderation systems can be used by malicious groups of users to harass a targeted user. If four reports in under 24 hours result in a ban, then four users can join together to ban anyone of their choosing.

Thus, when implementing a community self-moderation system, it is important to include checks, such as comparing the frequency of reports against a player to their playtime, along with other safeguards to prevent abuse of the system.
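One possible safeguard, sketched below with illustrative thresholds and hypothetical names rather than values from any real system, is to escalate suspicious report patterns to human review instead of triggering automatic punishment:

```python
from dataclasses import dataclass

@dataclass
class Report:
    reporter_id: str
    timestamp: float  # seconds since epoch

def should_escalate(reports, hours_played_last_day, now, window_hours=24.0):
    """Decide whether a reported player should be queued for human review.
    No automatic ban is ever issued from report counts alone."""
    window_start = now - window_hours * 3600
    recent = [r for r in reports if r.timestamp >= window_start]
    distinct_reporters = {r.reporter_id for r in recent}

    # A small group of friends filing reports together is not enough evidence.
    if len(distinct_reporters) < 4:
        return False

    # Compare report volume to actual playtime: a flood of reports against
    # someone who barely played suggests coordinated abuse of the system.
    reports_per_hour = len(recent) / max(hours_played_last_day, 0.25)
    if reports_per_hour > 10:
        return False  # implausible rate; review the reporters instead of the reported

    return True  # plausible pattern of misconduct: send to a human moderator
```

Checks along these lines make a coordinated "four reports in 24 hours" attack far less effective, because raw report counts alone never trigger a punishment.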

All of these potential abuses need to be considered when developing layers of moderation systems. Weaknesses of some layers can be covered by the strengths of others, but oversight and prevention of moderation abuse is key to maintaining a confident and thriving community.

Renee Gittins is a Solutions Architect at Spirit AI and a passionate advocate and connector for developers and diversity in the game industry.