Content moderators asked for better tools. We listened.

Justin Davis
Published in Spectrum Labs
Jan 14, 2020 · 4 min read

Guardian gives Trust & Safety professionals the ability to moderate with confidence.

Spectrum Labs was founded in 2016. The year matters: we had seen how misinformation influenced the US presidential election, and we wanted to protect online communities from complex forms of toxicity like radicalization, cyberbullying, grooming, human trafficking, and more.

My co-founder and CTO Josh Newman’s background in the military (he was a Sergeant in the United States Marine Corps), his time as a researcher with the James Martin Center for Nonproliferation Studies, and his time as a Senior Architect at Salesforce gave him a unique perspective on the problem — and possible solutions. My background in data collection and product management gave us the ability to take our ideas from concept to reality.

Together, with the help of some very talented teammates, we built a truly amazing piece of technology: an astonishingly accurate Contextual AI system that identifies toxic behaviors like hate speech, radicalization, threats, and other ugly behaviors that drive users away from online communities.

The market validated our idea; our solution is live and already producing great results at several of the largest social media companies in the world and at one of the largest online dating companies in the world (among others).

But, as anyone who has had to maintain community health knows, identifying disruptive behaviors is only half the equation.

The other half of the equation is still human judgment. Trust & Safety leaders need to be able to respond — at scale — to threats to their community while maintaining a close, personal connection to those communities; in short, staying in the loop.

At the same time, not all Trust & Safety people think in datasets or code. We wanted to make it easy for non-technical people to understand what is happening on their platforms — any time, in real-time… and give them a way to weigh in.

INTRODUCING GUARDIAN

That’s why we are excited to introduce Guardian. We built Guardian to give Trust & Safety leaders the power to identify and easily respond to alarming behaviors.

Simply put, Guardian is a dead-simple user interface that sits on top of our behavior identification system. It gives Trust & Safety leaders the ability to:

  • Digitize their guidelines
  • Develop automated responses for certain behaviors
  • Deliver more sensitive incidents to a team for review
  • See how healthy their community is against industry benchmarks and their own KPIs.

WHY THIS MATTERS

Who cares if there is toxic behavior in an online community? Why bother enforcing guidelines? How does it impact revenue & growth?

Unbelievably, we get asked those questions a lot by executives who prioritize growth over health (and don’t see how health leads to growth).

It matters because toxic behaviors online harm people in real life. Victims of cyberbullying on social media commit suicide. People radicalized within video game forums commit terrorist acts. Children are coerced into producing sexual materials on social platforms. Every time a platform doesn’t respond, or responds inadequately, the perpetrators grow bolder and the number of victims grows.

For the growth-before-health executives reading this, allow me to connect the dots: people leave your platform when they experience toxicity. All that effort on growth? Wasted.

Guardian gives Trust & Safety teams at online communities of all shapes and sizes the power to protect their communities and keep them — you guessed it — growing.

SO…WHAT IS IT, EXACTLY?

A Powerful Automation Builder

Say you decide, in reaction to a recent increase in anti-Semitic behavior, to take proactive steps to prevent Hate Speech in your online community. First, we’d customize our Hate Speech model for your community (yes, there are ‘official’ definitions, but some online communities decide to broaden them). Then, you’d use Guardian’s Automation Builder to automate what happens when the model identifies Hate Speech: perhaps you’d want new users banned for n days, established users with lower reputation scores banned from the platform permanently, and established users with higher reputation scores sent to your Moderation Team for personal responses. Guardian allows for nuance in responses.
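To make that branching concrete, here is a minimal sketch in Python of the kind of rule such an automation encodes. The names and thresholds are hypothetical, and in Guardian you would build this in the UI rather than in code:

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; real values would come
# from your own guidelines.
NEW_USER_BAN_DAYS = 7
ESTABLISHED_ACCOUNT_AGE_DAYS = 30
REPUTATION_THRESHOLD = 0.5


@dataclass
class User:
    account_age_days: int
    reputation: float  # 0.0 (low trust) to 1.0 (high trust)


def hate_speech_response(user: User) -> dict:
    """Decide what happens when the Hate Speech model flags this user's content."""
    if user.account_age_days < ESTABLISHED_ACCOUNT_AGE_DAYS:
        # New users: temporary ban.
        return {"action": "temp_ban", "days": NEW_USER_BAN_DAYS}
    if user.reputation < REPUTATION_THRESHOLD:
        # Established users with lower reputation scores: permanent ban.
        return {"action": "permanent_ban"}
    # Established users with higher reputation scores: personal response
    # from the Moderation Team.
    return {"action": "escalate_to_moderation_team"}
```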

A Prioritized Moderation Queue

Hate Speech incidents appear in Guardian’s Moderation Queue, prioritized in an order customized by you. Incidents are reported with contextual information, so your Moderation Team doesn’t have to dig around for what actually happened.
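Conceptually, a prioritized moderation queue is just incidents ordered by a severity key you choose, with the surrounding context carried along. Here is a rough sketch with made-up field names, not Guardian’s actual data model:

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class Incident:
    priority: int                              # lower value = reviewed sooner
    incident_id: str = field(compare=False)
    behavior: str = field(compare=False)       # e.g. "hate_speech"
    context: List[str] = field(compare=False)  # surrounding messages, so moderators don't dig


class ModerationQueue:
    """Hands reviewers the highest-priority incident first."""

    def __init__(self) -> None:
        self._heap: List[Incident] = []

    def add(self, incident: Incident) -> None:
        heapq.heappush(self._heap, incident)

    def next(self) -> Incident:
        return heapq.heappop(self._heap)
```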

An Actionable Analytics Dashboard

When you set out to prevent Hate Speech in your online community, you would have worked with us to develop key performance indicators to measure your efforts. You’d use Guardian’s Analytics Dashboard to track those KPIs. The Analytics Dashboard also surfaces trends you can act on. For example, you may see that incidents tend to spike around certain times, which may inspire you to proactively communicate with your community ahead of those times.
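As a simple sketch of the kind of trend analysis a dashboard like this might surface (a hypothetical function and data shape, not Guardian’s API), you could bucket incidents by hour of day and flag hours that run well above the average:

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List


def spike_hours(incident_times: Iterable[datetime], threshold: float = 1.5) -> List[int]:
    """Return hours of day whose incident count exceeds `threshold` times the hourly mean."""
    counts = Counter(t.hour for t in incident_times)
    if not counts:
        return []
    hourly_mean = sum(counts.values()) / 24  # average incidents per hour over a full day
    return sorted(hour for hour, count in counts.items() if count > threshold * hourly_mean)
```

If, say, the 8 p.m. to 10 p.m. hours keep coming back, that’s your cue to post reminders or add moderator coverage ahead of those hours.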

WRAPPING IT UP

Guardian answers an unmet need for Trust & Safety leaders. Prior to Guardian, these folks were relying on keyword-based detection approaches that either overfired so frequently as to be useless or failed to fire at all (just as bad). They couldn’t carry out nuanced responses to incidents. They couldn’t address incidents consistently. And, perhaps most importantly, keywords that look only at the limited content of a single message can’t identify the patterns that build over time. Predators don’t introduce themselves as pedophiles; they follow a script of gaining trust and isolating their prey. Keywords can’t find that. We can.

With Guardian, Trust & Safety workers finally have the visibility, confidence and power they need to improve & maintain community health.

Interested in a test drive? DM me on Twitter: @_JustinDavis.
