Instagram’s content moderation co-opted to make a misogynistic joke

Alan P. Kyle
4 min read · Sep 15, 2020


Imagine coming across a post on Instagram that simply says “women are funny” and then seeing that Instagram had declared it false information. Questions immediately come to mind: Is Instagram being sexist? Who’s responsible for this? Was it a rogue Instagram employee? Or was it an algorithm?

When it comes to content moderation, we often hear about decisions on highly publicized issues such as fact-checking the president’s social media accounts, or fending off misinformation about conspiracy theories and public health crises. But many don’t realize that the challenges of content moderation reach far beyond what’s reported in the news. There are seemingly endless ways that Internet services can be abused, sometimes with content moderation systems being gamed in the process.

While this example is a relatively low-stakes situation, it serves as one case study among many that illustrate why content moderation at scale is nearly impossible to do well. The likely critical failure here, as in many cases, is a failure to recognize context when making moderation decisions.

The setup

In December 2019, while browsing through the stories in my Instagram feed, I saw that someone had shared a post from a meme account. It grabbed my attention because Instagram had blurred it out with an interstitial, a tool that requires users to click through to view the content. It appeared as in the screenshot below:

After clicking through, the following photo is revealed:

The “fact-checked” post is a photo of a breaching great white shark with superimposed text that says “women are funny.” In addition to the interstitial, Instagram appended a message at the bottom that reads “See why fact-checkers say this is false.”

How is it that a photo of a shark that says women are funny could be labeled as false information by Instagram?

What’s going on?

Clicking on the appended message, we find that Instagram did not intend to fact-check the claim that women are funny, but instead to challenge the claim that the photo of the shark had won an award from National Geographic.

The message says: “No, this is not National Geographic’s ‘Picture of the Year’”

Only after reading the supplemental information did I get the full picture. In December 2016, a tweet claiming the photo had won an award from National Geographic gained traction.

See the tweet below:

The claim was debunked when it was revealed that nothing about the tweet was true. The award doesn’t exist, no one named Bob Burton works at NatGeo, and the photo is not a photo at all, but the work of a computer graphics artist.

But then why was the Instagram post labeled as false information if it didn’t claim the image had won an award from NatGeo?

Resolution

What likely happened is that the post was initially flagged by an algorithm and then queued for a human moderator to act on. The moderator then made the wrong decision, labeling it false information.

There are many factors working against the moderator making the right decision. Facebook (Instagram’s parent company) outsources content review to several thousand workers who sift through flagged content, much of it horrific. These workers, who moderate hundreds of posts per day, have little time to decide a post’s fate in light of frequently changing internal policies. On top of that, many of these outsourced workers are based in places like the Philippines and India, where they may be less aware of the cultural context of what they are moderating.

The Instagram moderator may not have understood that it is the image of the shark in connection with the claim that it won a NatGeo award that deserves the false information label, not the image on its own.

The challenges of content moderation at scale are well documented, and this shark tale joins countless others in a sea of content moderation mishaps. Indeed, this case study reflects Instagram’s own challenged content moderation model: to move fast and moderate things. Even if it means moderating the wrong things.


Alan P. Kyle

Privacy & policy analyst with an interest in Internet policy and content moderation.