Flattening the Other Curve

Sentropy Technologies · Apr 18, 2020

Relying on our deep expertise in digital communities, we leveraged machine learning and human intelligence to assess new forms of anti-Asian language emerging online since the early days of the pandemic.

Disclaimer: the text below references obscene and hateful language.

In recent weeks, it seems like we’ve all had a vocabulary lesson. Phrases like “the hidden enemy” and “flatten the curve” are settling (un)comfortably into our everyday conversations. While we adjust to this new world, we’re also seeing some of the most unseemly parts of our old reality migrate into the new normal, namely in the form of Coronavirus-themed racism. Like the virus we are fighting, racism is infectious, often undetected, and dangerous. It forms its own sharp curve — one we want to make sure we can flatten by measuring, identifying, and subduing it.

As we continue to hear unnerving reports of COVID-19-related harassment against people of Asian descent in the physical world, it’s important we also pay attention to the place we’re all spending a majority of our time these days: online. While digital communities have acted as a critical lifeline during this pandemic, our online spaces also have a dark underbelly that can propagate harm when left unchecked. After spending the last several years deeply involved with digital communities, I’ve learned all too well that these spaces can hide aggressive hate in plain sight — hate that can ultimately make the leap from URL to IRL.

Critically, this leap can be attributed to another kind of “patient zero.” Not the one that scientists, doctors, and journalists have been trying to pin down to assuage public fear about the origin of COVID-19. The patient zero, in the case of online abuse, is the user who jumpstarts the outbreak of harmful language rapidly and insidiously. We’ve seen this with anti-Muslim sentiment spiking online after 9/11, or more recently, with the Christchurch shooting and its beginnings on 8chan. Now, we find ourselves in yet another moment of uncertainty and crisis, when it seems everyone is looking for a scapegoat and a single user can spread hate through an entire online community.

This is why our team decided to take a closer look at how racist language online is reacting to COVID-19. We’ve found that the web’s vernacular is morphing more rapidly than ever and the hate that’s spread online has already spilled into the physical world.

What we found

Drawing on that expertise, we combined machine learning with human review to assess new forms of anti-Asian language that have emerged online since the early days of the pandemic. Our analysis confirmed a grim but expected hypothesis: new hateful language in these communities was heavily directed at people of Asian (and in many cases, specifically Chinese) descent.

We’re learning more with each passing day, but to date, our research has uncovered and compiled more than 100 variants of abusive language directed towards Asian people and cultures, 85 percent of which are specifically related to COVID-19. That means dozens of new racial slurs have emerged in online forums specifically as a result of COVID-19. We expect this evolution to compound for as long as COVID-19 is garnering front-page media coverage.

To make sense of the language now being used across the *chans and Reddit, it’s important to remember the history of anti-Asian hate speech in the U.S., which has traditionally relied on slurs popularized between the 1940s and the 1970s, while the U.S. was at war with Japan, Korea, and Vietnam. In recent months, however, we’ve seen a departure from this pattern. Racist content now builds on these older modes or integrates new, COVID-19-specific terminology that has never been seen before.

Digging deeper into this new language usage, one feature we identified is the noticeable combination of dehumanizing language, such as “insect,” “bug,” or “roach,” with more “traditional” slurs. Dehumanization isn’t novel when it comes to hate speech, but we were interested to see a high correlation between dehumanizing content and dietary lexicons, a feature we believe is heavily related to unconfirmed theories that the virus originated in a Wuhan “wet market.” Below, you can find some examples of COVID-specific trends in racist language employed in recent months:
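A co-occurrence pattern like the one described above can be approximated with a simple lexicon check. This is a minimal sketch, not our production system: the slur lexicon is deliberately replaced with hypothetical placeholders, and whitespace tokenization stands in for real text normalization.

```python
# Minimal sketch of a lexicon co-occurrence check. The slur terms below
# are hypothetical placeholders; the actual terms are deliberately omitted.
DEHUMANIZING_TERMS = {"insect", "bug", "roach"}
SLUR_TERMS = {"slur_a", "slur_b"}  # placeholder stand-ins

def has_combined_pattern(text: str) -> bool:
    """Return True when a message mixes dehumanizing language with a slur."""
    tokens = set(text.lower().split())
    return bool(tokens & DEHUMANIZING_TERMS) and bool(tokens & SLUR_TERMS)

print(has_combined_pattern("you insect slur_a"))  # True
print(has_combined_pattern("harmless comment"))   # False
```

Either lexicon alone does not trigger the check; only the combination does, which is what makes this pattern distinctive.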

The velocity with which these terms are being invented and spread is worrisome. In addition to identifying racist content on the *chan websites, we were also concerned with mapping the rate at which these terms spread across more mainstream websites like Reddit. You can see a portion of our findings below. The first visualization shows how the new COVID-19-related hate slurs have grown across Reddit:
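Measuring growth like this boils down to counting, per day, how many messages contain at least one tracked term. A minimal sketch, assuming a stream of `(date, text)` pairs as a stand-in for scraped comment data and placeholder terms in place of real slurs:

```python
from collections import Counter
from datetime import date

# Placeholder lexicon; the real tracked slurs are deliberately omitted.
TRACKED_TERMS = {"term_a", "term_b"}

def daily_term_counts(messages):
    """Count messages per day containing at least one tracked term.

    `messages` is an iterable of (date, text) pairs, a stand-in for a
    scraped comment stream.
    """
    counts = Counter()
    for day, text in messages:
        if set(text.lower().split()) & TRACKED_TERMS:
            counts[day] += 1
    return counts

sample = [
    (date(2020, 3, 17), "an ordinary comment"),
    (date(2020, 3, 18), "a comment containing term_a"),
    (date(2020, 3, 18), "another with term_b"),
]
print(daily_term_counts(sample))
```

Plotting these daily counts over time produces exactly the kind of growth curve shown in the visualization.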

Notice how different the shape of this curve is compared to the linear shape of the graph below, which shows a consistent volume of “traditional” anti-Asian slurs during the same time period:

One final graph demonstrates two distinct spikes in anti-Asian sentiment online on specific dates: the first on January 28, and the second on March 18. The initial spike falls within a week of Wuhan’s quarantine, as well as the WHO declaring COVID-19 a global public health emergency. The March date corresponds to the same week the first American cities issued shelter-in-place orders, following the U.S. declaration of a national emergency.

This second spike is particularly interesting. The usage of newly coined terms such as “kung flu” peaked dramatically, here by a factor of 100. This is likely tied to remarks made by the White House on that same date. Additionally, on March 18, 21% of racist messages we identified contained a new COVID-specific term, compared to 1.5% the prior week. With these graphs, we begin to see the jumping-off point at which a slur can spread from one user or source (the “patient zero,” if you will) to more widespread, common adoption.
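The week-over-week share described above is simply the fraction of flagged messages in each week that contain at least one new term. A sketch under the same assumptions as before (placeholder terms, `(date, text)` pairs standing in for real flagged messages):

```python
from collections import Counter
from datetime import date

NEW_TERMS = {"term_a"}  # placeholder for the new COVID-specific lexicon

def weekly_new_term_share(flagged):
    """Per ISO week, the fraction of flagged messages containing a new term."""
    totals, hits = Counter(), Counter()
    for day, text in flagged:
        week = day.isocalendar()[:2]  # (ISO year, ISO week number)
        totals[week] += 1
        if set(text.lower().split()) & NEW_TERMS:
            hits[week] += 1
    return {week: hits[week] / totals[week] for week in totals}

flagged = [
    (date(2020, 3, 11), "old slur only"),
    (date(2020, 3, 18), "contains term_a"),
    (date(2020, 3, 18), "old slur only"),
]
print(weekly_new_term_share(flagged))
```

A sharp jump in this ratio from one week to the next is one signal that a newly coined term is crossing from a fringe source into mainstream adoption.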

What now?

Hate begins as an issue with a single user. It becomes a problem when we don’t do anything about it. As COVID-19 has demonstrated in recent weeks, it’s not always easy to act, especially when you can’t tell it’s happening right beneath your nose. I believe that’s where we find the beginnings of a solution: we have to be able to identify that first click that leads to a viral, mass spread of hateful information. Like screening for a physical virus, we must invest in tools to help detect hateful behaviors and identify their sources in our online communities.

I’m focusing my energy on working with my colleagues to build solutions to one of the biggest problems facing the internet today — the same internet our children will inherit from us down the road. Even though our company is still in stealth, my colleagues and I felt it was important to publish our findings during this crisis, because our expertise lies in data and in detecting abuse online. I’ll be updating this space as we keep identifying and analyzing data to see how we can be of service to the populations most affected by this issue. Stay with me as we work to flatten the other curve.


We all deserve a better internet. Sentropy helps platforms of every size protect their users and their brands from abuse and malicious content.