How To Prevent Offensive Images From Appearing in Your Social Platform

Community Sift · Sep 30, 2016 · 6 min read

If you manage a social platform like Instagram or Tumblr, you’ll inevitably face the task of removing offensive UGC (user-generated content) from your website, game, or app.

At first, this is simple, with only the occasional inappropriate image or three to remove. Since it seems like such a small issue, you just delete the offending images as needed. However, as your user base grows, so does the number of users who refuse to adhere to your terms of use.

There are some fundamental issues with human moderation:

  • It’s expensive. It costs much more to assess images manually, as each image needs to be reviewed by flawed human eyes.
  • Moderators get tired and make mistakes. As you throw more pictures at people, they get sick of looking for needles in haystacks, and fatigue sets in.
  • Increased risk. If your platform allows for ‘instant publishing’ without an approval step, then you take on the additional risk of exposing users to offensive images.
  • Unmanageable backlogs. The more users you have, the more content you’ll receive. If you’re not careful, you can overload your moderators with massive queues full of stuff to review.
  • Humans aren’t scalable. When you throw human time at the problem, you’re spending human-resource dollars on cleanup rather than on growth.
  • Stuck in the past. If you’re spending all of your time moderating, you’re reacting to what has already happened rather than building for the future.

At Two Hat, we believe in empowering humans to make purposeful decisions with their time and brain power. We built Community Sift to take care of the crappy stuff so you don’t have to. That’s why we’ve worked with leading professionals and partners to provide a service that automatically assesses and prioritizes user-generated content based on probable risk levels.

Do you want to build and maintain your own anti-virus software and virus signatures?

Here’s the thing: you could build some sort of in-house image system to evaluate the risk of incoming UGC. But here’s a question for you: would you create your own anti-virus system just to protect yourself from viruses on your computer? Would you make your own project management system just because you need to manage projects? Or would you build a bug-tracking database system just to track bugs? In the case of anti-virus software, that would be kind of nuts. After all, if you create your own anti-virus software, you’re the first one to get infected with new viruses as they emerge. And humans are clever… they create new viruses all the time. We know because that’s what we deal with every day.

Offensive images are much like viruses. Instead of managing your own set of threat signatures, you can use a third-party service and reduce the work required to keep those images at bay. By running an automated text and image classification system over your user-generated content, you can protect your users at scale, without an army of human moderators leafing through the content.
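That hand-off can be just a few lines of code. The sketch below is a minimal illustration, assuming a hosted classification endpoint; the URL, parameters, and response shape are hypothetical placeholders, not a documented Community Sift API.

```python
# Minimal sketch: delegate image risk assessment to a hosted service.
# The endpoint and response shape are hypothetical, not a real API.
import requests

MODERATION_ENDPOINT = "https://api.example-moderation.com/v1/classify"  # hypothetical
API_KEY = "your-api-key"

def assess_image_risk(image_path: str) -> dict:
    """Upload an image and return the service's risk assessment."""
    with open(image_path, "rb") as image_file:
        response = requests.post(
            MODERATION_ENDPOINT,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"image": image_file},
        )
    response.raise_for_status()
    return response.json()  # e.g. {"topics": {...}, "risk": "high"}
```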

Here are some offensive image types we can detect:

  • Pornography
  • Graphic Violence
  • Weapons
  • Drugs
  • Custom Topics
[Figure: example image analysis result]
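Since the original example image doesn’t survive here, a hypothetical result for a single upload might look like the following; the field names, topics, and score scale are illustrative assumptions, not Community Sift’s actual schema.

```python
# Hypothetical analysis result for a single uploaded image.
# Field names and the 0-1 score scale are illustrative only.
example_result = {
    "image_id": "abc123",
    "topics": {
        "pornography": 0.97,       # per-topic probability-style scores
        "graphic_violence": 0.02,
        "weapons": 0.01,
        "drugs": 0.00,
    },
    "risk": "high",                # overall prioritization bucket
    "suggested_action": "filter",  # block before publication
}
```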

Some benefits of an automated threat-prevention system like Community Sift:

  • Decreased costs. Reduces moderation queues by 90% or more.
  • Increased efficiency. Prioritized queues, sorted by risk, enable purposeful moderation.
  • Empowers automation. Instead of pre-moderating or reacting after inappropriate images are published, you can let the system filter the images or prevent them from being posted in the first place (see the sketch after this list).
  • Increased scalability. You can grow your community without worrying about the scope of work required to moderate the content.
  • Safer than managing it yourself. In the case of Community Sift, we’re assessing images, videos, and text across multiple platforms. You gain a lot from the network effect.
  • Shape the community you want. You can educate your user base proactively. For example, instead of just accepting inbound pornographic images, you can warn the user that they are uploading content that breaks your terms of use. A warning system is one of the most practical ways to encourage positive user behavior in your app.
  • Get back to what matters. Instead of trying to tackle this problem, you can focus on building new features and ideas. Let’s face it… that’s the fun stuff, and that’s where you should be spending your time — coming up with new features for the community that’s gathered together because of your platform.
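Here is the pre-publish gate referenced in the list above, as a rough sketch. It builds on the hypothetical assess_image_risk() helper from earlier; the risk buckets, warning message, and queue behavior are illustrative assumptions, not the actual Community Sift workflow.

```python
# Sketch of a pre-publish gate: block severe content and warn the user,
# queue borderline content for prioritized review, publish the rest.
import heapq

review_queue: list = []  # (-score, image) pairs: riskiest first

def handle_upload(user_id: str, image_path: str) -> str:
    result = assess_image_risk(image_path)  # hypothetical helper from the earlier sketch
    score = result["topics"].get("pornography", 0.0)

    if result["risk"] == "high":
        # Educate rather than silently reject: the warning tells the user
        # that the upload breaks the terms of use.
        print(f"Warning to {user_id}: this upload breaks our terms of use.")
        return "blocked"

    if result["risk"] == "medium":
        # Hold for human review, sorted so moderators see the riskiest first.
        heapq.heappush(review_queue, (-score, image_path))
        return "pending_review"

    return "published"  # low risk: goes live immediately
```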

In the latest release of the Community Sift image classification service, the system has been rebuilt from the ground up with our partners using machine learning and artificial intelligence. This new incarnation of the image classifier was trained on millions of images so it can distinguish between, for example, a pornographic photo and a picture of a skin-colored donut.

Classifying images can be tricky. In earlier iterations of our image classification service, the system wrongly believed that plain glazed donuts and fingernails were pornographic, since both image types contained skin-tone colors. We’ve since fixed this, and the classifier now runs at a 98.14% detection rate and a 0.32% false positive rate for pornography. The remaining 1.86%? Likely blurry images or pictures taken from a distance.

On the image spectrum, some content is so severe it will always be filtered — that’s the 98.14%. Some content you will see again and again, and requires that action be taken on the user, like a ban or suspension — that’s when we factor in user reputation. The more high-risk content they post, the closer we look at their content.

Some images are on the lower end of the severity spectrum. In other words, there is less danger if they appear on the site briefly, are reported, and then removed — that’s the 1.86%.
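One way to express “the more high-risk content they post, the closer we look” is a reputation-adjusted review threshold. The sketch below is a made-up illustration of that idea, not Community Sift’s actual reputation model; the base threshold and penalty values are arbitrary.

```python
# Sketch: users with a history of high-risk posts face stricter review.
from collections import defaultdict

high_risk_history = defaultdict(int)  # user_id -> prior high-risk uploads

def review_threshold(user_id: str) -> float:
    """Lower threshold (more scrutiny) for users with a worse track record."""
    base = 0.80
    penalty = 0.10 * min(high_risk_history[user_id], 5)
    return max(base - penalty, 0.30)

def flag_for_review(user_id: str, porn_score: float) -> bool:
    flagged = porn_score >= review_threshold(user_id)
    if flagged:
        high_risk_history[user_id] += 1  # reputation worsens with each flag
    return flagged
```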

By combining the image classifier with the text classifier, Community Sift can also catch less-overt pornographic content. Some users post obscene text inside an image rather than an explicit photo, while others pair an innocent-looking picture that hints at innuendo with a very graphic text description.
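A rough sketch of that combination, assuming an OCR step feeds the extracted text into a text classifier: pytesseract is a real OCR wrapper, but is_text_obscene() below is a hypothetical stand-in for a full text classification service.

```python
# Sketch: catch obscene text hidden inside images by running OCR first,
# then handing the extracted text to a text classifier.
from PIL import Image
import pytesseract

def is_text_obscene(text: str) -> bool:
    # Placeholder: in practice this would call the text classifier.
    banned = {"example-slur", "example-obscenity"}
    return any(word in text.lower() for word in banned)

def image_contains_risky_text(image_path: str) -> bool:
    extracted = pytesseract.image_to_string(Image.open(image_path))
    return is_text_obscene(extracted)
```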

Keeping on top of incoming user-generated content is a huge amount of work, but it’s absolutely worth the effort. In studies conducted by our Data Science team, we’ve observed that users who engage in social interactions are 3x more likely to continue using your product and less likely to leave your community.

By creating a social platform that allows people to share ideas and information, you have the ability to create connections between people from all around the world.

Community is built through connections between like-minded individuals who bond over shared interests. The relationships between people in a community are strengthened, and harder to break, when individuals come together through shared beliefs. MMOs like World of Warcraft and Ultima Online mastered the art of gaming communities, resulting in long-term businesses rather than short-term wins.

To learn more about how we help shape healthy online communities, reach out to us anytime. We’d be happy to share more about our vision to create a harassment-free, healthy social web.
