Cutting Through the Clutter: Why Content Moderation is Hard

Picture this: you wake up on yet another normal day, going about your usual routine until you pick up your phone to browse aimlessly. But what’s this? A barrage of notifications? What happened overnight to warrant 410 emails?

As you start to untangle the mess of missed calls and texts, you realize one thing: you’ve become the new owner of a top social media platform overnight! Not just any platform at that, but one with a reputation for being a global public square of opinions and ideas. One that seemed so much nicer five years ago but had been steadily declining in quality recently, you thought privately. Well, you have the reins in your hands now; how difficult could it be to run a social media platform?

Famous last words, you realize slowly over the course of the next few weeks and months. But we’re getting ahead of ourselves here a little.

But before we go further, you might ask: why do this thought exercise at all? I’m not in danger of owning a social media platform any time soon. The exercise is crucial because most people, including some of the richest people on the planet, underestimate the difficulty and complexity of content moderation. Content moderation is hard! And if one doesn’t acknowledge its magnitude, platforms can really suffer under policies that foster toxicity and harm.

Back inside the thought experiment, your mind is racing at a million miles an hour. Am I rich? What do I need to learn? What will I change? The one question that seems to come up repeatedly is — what is my vision? Well, you thought that the platform seemed to be unfairly banning some high-quality content you enjoyed. Yet conspiracy theories seemed to fester in corners of it that you’d take care to stay away from. So a common-sense vision then: promoting good content and ensuring the platform isn’t bogged down by garbage and hate. That seems like a good start.

Meetings, meetings, meetings. Your calendar is never free, yet there is still so much left to learn about the company. Employees seem to both fear and disdain you (or is it just your imagination?), unaware of what this transition means for them. Is the company just inefficient in ways that seem obvious, or is that a consequence of something you don’t know yet? You can’t seem to decide. Simply put, the week is a blur. But slowly and steadily, you start to feel like you know what makes the place tick.

Now is the time for action. Equipped with fine teams of the best engineers that money can buy and a platform bankrolled by advertisers vying for coveted spots to target your audience, you decide to start “improving” the policies. Time to implement your common-sense vision — out with the bad and in with the good.

Let’s start with content moderation, your pet peeve from before. Time to actually clamp down on conspiracy theories and “fake news.” The engineers set out with a tightened algorithm to nip this problem in the bud, and what’s this? Your feed is flooded with posts decrying mutes and bans for innocuous content. And some of it is legitimate, backed with screenshots. At the same time, the media seems to have uncovered communities using dog whistles and workarounds to slip past your carefully crafted filters.

No problem, you say as you head back to the drawing board. You ask the engineers to simply deploy Artificial Intelligence (AI) and Machine Learning (ML) to fix the problem. After all, ChatGPT seems smart enough to make academia shudder and to make white-collar workers question whether their jobs could be automated away. Surely the best engineers at your company can build something that understands when content is questionable and when it is not, right?

Uh oh, a new story dropped about a person banned because they posted a picture of their child. And more stories about conspiracies still thriving. Stories that make your advertisers nervous. And nervous advertisers mean nervous shareholders. What’s going on?

Your team talks to you about adversarial ML: adding a little bit of noise to an image or video can deceive your models completely while being indiscernible to the naked eye. And some rabbit holes have started to camouflage their terms. When is an innocent word innocent, and when is it really just a veiled reference?

Then it’s time for another meeting, this time with a team of human moderators (but wait, didn’t algorithms do all the work? Why do we have humans?), who both describe their job (sifting through mountains of terrible, rule-breaking content) and ask what your plan is to combat the burnout and turnover their team is facing. This is especially concerning because of the recent increase in hateful content since the transition in leadership.

At the same time, a report drops that pains you as a graduate of the University of Michigan: it cites a paper written by a few researchers at Michigan showing that content moderation on social media platforms (including your own) appears to disproportionately target certain groups.

But this was the last thing you wanted.

All these fires raging everywhere are so overwhelming. Where should one even begin? Can anything be salvaged? What can you do? Is running the platform more trouble than it is worth? But who will buy it from you amid all the controversy?

Now is a good time to step away from the thought experiment and ask why we barrelled straight into such a disastrous outcome. The exercise made clear that content moderation is more complicated than it first appears, but why is that?

In an increasingly connected world, social media platforms are often public squares of information and conversations. However, while these conversations might occur in hundreds of different languages all around the globe, moderators and algorithms might not cover all of them adequately.

Let’s take a look at the world’s most used social media platform: Facebook. Facebook had about 2.9 billion active users in 2022 and officially supported at least 111 languages. But official support doesn’t cover every language in which discourse takes place; in 2019 alone, 31 languages were found in use on the platform without any support. Not to mention that even when official support is present, it may be extremely lacking. And when this happens, there are real-world consequences that disrupt many lives, like the proliferation of content encouraging violence against the Rohingya in Myanmar.

But maybe that is a Facebook-specific problem of struggling to cope with the scale of the internet’s demands, right? Here is YouTube lagging behind on major de-platforming decisions and struggling against the spread of conspiracies on its platform. Twitter’s fumbles have been widely publicized, including some that echo the scenarios in the thought experiment. Here is TikTok’s algorithm leading users down dangerous rabbit holes.

Clearly, this is not a platform-specific problem, even though not all platforms suffer to the same extent.

Then there are the people powering much of the moderation apparatus. These jobs, hidden from the platforms’ users, are thankless, low paying, and take a significant mental toll on the humans who perform the work (even inducing PTSD in some). Humans have to do this work because algorithms aren’t equipped to make many of the judgments that moderation requires. Remember the questions from the experiment: when is something innocent, and when is it a veiled reference to something that violates the content policy? How does an algorithm pick up on a novel form of misinformation? Unfortunately, algorithms aren’t there yet. Somewhat paradoxically, modern algorithms have instead intensified the need for identification and labeling, requiring humans to sift through mountains of flagged, potentially rule-breaking content.
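
To make that division of labor concrete, here is a minimal sketch (in Python) of the flag-then-review pattern. Everything in it, the Post class, the score field, and the thresholds, is a hypothetical stand-in for illustration, not any real platform’s pipeline or policy.

```python
# A minimal sketch of a flag-then-review pipeline with made-up scores and
# thresholds. "score" stands in for whatever confidence a moderation model
# produces; the cutoffs are illustrative, not real policy.
from dataclasses import dataclass

@dataclass
class Post:
    post_id: int
    text: str
    score: float  # model's estimated probability that the post breaks the rules

AUTO_REMOVE = 0.98  # model is very sure the post violates policy: act automatically
AUTO_ALLOW = 0.10   # model is very sure the post is fine: leave it alone

def route(post: Post) -> str:
    """Decide what happens to a post based on the model's confidence."""
    if post.score >= AUTO_REMOVE:
        return "removed automatically"
    if post.score <= AUTO_ALLOW:
        return "left up"
    # Everything in between lands in a human review queue, and at platform
    # scale that middle band is a mountain of content.
    return "sent to human review"

queue = [
    Post(1, "lovely sunset", 0.02),
    Post(2, "ambiguous slang here", 0.55),
    Post(3, "clear policy violation", 0.99),
]
for p in queue:
    print(p.post_id, "->", route(p))
```

The uncomfortable part is that middle band: the more content a model flags as borderline, the more borderline content lands in front of a human.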

This is not to say that software bots and algorithms don’t do a lot of heavy lifting in content moderation. But with the introduction of Machine Learning (ML) into the mix, we are moving increasingly into territory where algorithms become “black boxes”: they are fed a set of training inputs and produce outputs, yet the processing that turns input into output is unclear even to the engineers who build them. This means that even efforts to be more transparent about the algorithms powering our social media platforms are hampered by a lack of meaningful knowledge. It also makes it difficult to make small, intentional tweaks, since it is not immediately obvious how changes propagate across inputs.
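
To see what “black box” means in practice, consider a toy sketch: a tiny text classifier trained on a handful of invented posts (the phrases, labels, and model choice below are all made up for illustration). It produces scores readily enough, but the only thing left to inspect afterward is a pile of learned weights, not a rule anyone can read or edit.

```python
# A toy "black box": a tiny text classifier trained on invented example posts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

posts = [
    "have a wonderful day everyone",
    "this community is so supportive",
    "you are all idiots and should leave",
    "get out of here, nobody wants you",
    "great photo, thanks for sharing",
    "i will find you and hurt you",
]
labels = [0, 0, 1, 1, 0, 1]  # 0 = fine, 1 = remove

model = make_pipeline(
    TfidfVectorizer(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(posts, labels)

# The model happily scores new content...
print(model.predict_proba(["nobody wants your photo here"]))

# ...but the "explanation" behind that score is just matrices of learned
# weights. There is no single rule an engineer can point to or tweak.
print(model.named_steps["mlpclassifier"].coefs_[0].shape)
```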

This also opens up another can of worms, adversarial ML, where adding a little bit of noise to images or videos can completely deceive algorithms. This can manifest in innocuous examples, like when researchers fooled image recognition software into thinking a picture of a cat was guacamole. But it can also let malicious actors bypass the removal of harmful content.
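
For a flavor of how little it takes, here is a sketch of a gradient-based (FGSM-style) perturbation against a simple classifier. scikit-learn’s digits dataset and a plain logistic regression stand in for real content and a real moderation model, and the epsilon value is an arbitrary illustrative choice; the per-pixel changes stay small, yet accuracy on the perturbed images typically drops sharply.

```python
# A sketch of an FGSM-style adversarial perturbation against a simple image
# classifier. The dataset, model, and epsilon are illustrative stand-ins.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale pixel values to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)

# For a softmax model, the gradient of the cross-entropy loss with respect to
# the input is (p - onehot(y)) @ W, where W is the coefficient matrix.
probs = clf.predict_proba(X_test)        # shape (n, 10)
onehot = np.eye(10)[y_test]              # shape (n, 10)
grads = (probs - onehot) @ clf.coef_     # shape (n, 64)

# FGSM: take a small step in the direction of the gradient's sign.
epsilon = 0.2
X_adv = np.clip(X_test + epsilon * np.sign(grads), 0.0, 1.0)

print(f"accuracy on clean images:     {clf.score(X_test, y_test):.2%}")
print(f"accuracy on perturbed images: {clf.score(X_adv, y_test):.2%}")
print(f"largest per-pixel change:     {np.abs(X_adv - X_test).max():.2f}")
```

Real attacks on real moderation models are more sophisticated, but the underlying idea, nudging an input in the direction the model is most sensitive to, is the same.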

There are also biases baked into the algorithms’ decision-making by their training sets. Were they trained on diverse datasets representative of the user base? Not necessarily, as one researcher found with facial recognition software.
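
One of the simplest checks a platform can run for this kind of problem is comparing error rates across user groups. The sketch below does exactly that on simulated data; the groups, base rates, and noise levels are entirely made up to show the shape of the check, not to describe any real platform.

```python
# A minimal fairness check: false-positive rates of a (simulated) moderation
# classifier broken down by user group. All numbers here are invented.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], size=n, p=[0.7, 0.3]),
    "actually_violating": rng.random(n) < 0.05,
})

# Simulate a classifier that is noisier on the smaller group B, e.g. because
# it saw less of that group's language or imagery during training.
noise = np.where(df["group"] == "B", 0.10, 0.03)
df["flagged"] = df["actually_violating"] ^ (rng.random(n) < noise)

# False-positive rate per group: the share of benign posts that got flagged.
benign = df[~df["actually_violating"]]
print(benign.groupby("group")["flagged"].mean())
```

In this toy simulation, benign posts from the smaller group are flagged roughly three times as often as benign posts from the larger one, which is exactly the kind of gap the research above points at.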

Clearly, there are many moving parts and a lot of challenges when it comes to implementing a content moderation scheme for a large, global social media platform. However, this is not to say that we must simply give up and accept the status quo. Content moderation is hard, yes! But it is not impossible. As in any other complex field, one must defer to experts and commit resources to adequately address shortcomings, whether that means improving fairness in our algorithms with better datasets, boosting efforts in languages not named English, treating the humans who undertake the thankless task of manually sifting through content better, or staying on top of research in the field.

There’s a lot to be done, but if one starts with the right understanding of the challenge, then major gains can be made!
