Identifying child predators and toxic behavior using natural language processing
“Don’t talk to strangers.” For children who heard this advice 30 years ago, it was easy to tell who they knew from who they didn’t. Now, with electronic devices aplenty, flexible online communications, and high rates of digital media consumption, the internet has given the powers of access and anonymity to anyone, anywhere. A single username could represent multiple people behind screens who are drastically different from how they present themselves. And as opportunities move online, so follow illicit activities. The many children who fall victim to these crimes often face lasting mental, social, and physical harm.
It is up to governments, companies, and communities to actively prevent online toxic behavior, and 4D Sight hopes to contribute to this movement.
Online risks to children have increased: streaming, gaming, and social platforms face mounting reports
- Child sexual abuse material (CSAM) is any sexually explicit material involving a minor
- CSAM distribution has grown drastically in the past few years, with prepubescent children as a majority of victims
- Snapchat, Facebook, Twitter, Twitch, and gaming sites are common venues for grooming, the manipulation of children to initiate abuse
- Thorn is an organization at the forefront of innovating new products to identify sexual exploitation of children and stop it
In 2018 alone, over 45 million videos and images representing child sexual abuse material (CSAM) were circulated online, approximately twice the amount in 2017 and over 100,000 times the amount in 2001. CSAM encompasses child pornography and any sexually explicit material involving a person under 18 years of age. A jarring 98% of victims were children 13 years old or younger, as identified by a 2018 Internet Watch Foundation study. With recent COVID-19 conditions making online the new normal, the National Center for Missing & Exploited Children (NCMEC) reported a 106% increase in exploitation reports for March compared to the same month a year earlier.
There is no doubt that the issue is a growing concern demanding responsibility from companies whose platforms are used for initiation and distribution of CSAM.
For the streaming and gaming industry, the presence of chat rooms, voice calls, and live-streams has made it extremely easy for predators to initiate contact with young children and direct them to perform inappropriate acts in real time with minimal digital tracing. In a process known as grooming, offenders build trust and desensitize their victims to abuse, often inviting children to meet offline or in other applications where CSAM is then created and circulated. Bullying and aggression are often used in tandem in online communities to threaten victims into compliance. The UK’s National Society for the Prevention of Cruelty to Children (NSPCC) found that Twitter and Twitch were the second most popular sites for grooming, after Snapchat and Facebook. Ineffective age verification and large-scale video sharing have opened Twitch to toxic behavior, while popular games such as Minecraft, Fortnite, and Roblox have had their own incidents of abuse. Nevertheless, many more cases are left undiscovered and perpetrators left unpunished.
To tackle the challenge of identification, Thorn: Digital Defenders of Children, a non-governmental organization that uses tech-led approaches to fight the sexual exploitation of children, has taken it upon itself to create solutions for law enforcement and online platform companies. Its innovations have cut law enforcement search times by 60% through a growing database of images and videos tagged and fingerprinted with numeric identifiers called hashes.
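As a simplified illustration of how hash-based matching works, a file can be fingerprinted with a cryptographic hash and checked against a database of known material. This is a minimal sketch, not Thorn’s actual system: production tools rely on perceptual hashes (such as Microsoft’s PhotoDNA) that survive re-encoding and resizing, and the hash values and function names below are placeholders invented for illustration.

```python
import hashlib

# Placeholder database of fingerprints of previously identified material.
# In practice this would come from a shared registry maintained by
# organizations such as NCMEC or Thorn; this value is illustrative only
# (it is simply the SHA-256 of the bytes b"test").
KNOWN_HASHES = {
    "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
}

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as a file's fingerprint."""
    return hashlib.sha256(data).hexdigest()

def is_known_material(data: bytes) -> bool:
    """Check a file's fingerprint against the database of known hashes."""
    return fingerprint(data) in KNOWN_HASHES
```

Because the database stores only hashes, platforms can match uploads against known material without redistributing or even viewing the underlying files, which is what makes hash sharing between organizations practical.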
Inspired by Thorn’s efforts, 4D Sight recognizes the need to act and to create technologies to uncover offenders and keep children safe, especially when dealing with the streaming/gaming ecosystem.
Steps toward prevention: 4D Sight identifying toxic behavior in chat logs using NLP
Technology companies have a social, moral, and personal responsibility to ensure that the millions of children who come across their platforms are protected from perpetrators lurking online. As a deep advertising company, 4D Sight takes this responsibility to heart as we develop new products that offer both safety and functionality.
Beyond providing advertising that is blended into streaming and gaming environments, 4D Sight is also producing chat room analysis tools with natural language processing (NLP) that provide insights about audience sentiment and engagement. The ability to analyze audience and streamer interactions also enables identification of toxic behavior and easier mitigation of threats. Those who employ 4D Sight’s technology can actively enforce child safety in daily operations, creating another protective barrier for children.
Currently, we are designing an NLP tool that flags inappropriate behavior characteristic of grooming and distributing CSAM. Our preliminary analysis to gather data is based on Twitch chat logs and can be broken down into three main steps:
- Search for key phrases involving commands to move into private chats, as well as queries of personal details such as age, gender, and location
- Follow-up on the sender’s username, identifying successive messages and content regarding inappropriate actions in other streams or chat rooms
- Identify the sender’s and receiver’s backgrounds via public history to confirm or deny predatory intent
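The first of these steps can be sketched as a simple pattern scan over a chat log, grouping hits by sender so that flagged usernames can be followed up in the later steps. The phrase patterns and names below are illustrative placeholders, not our production rule set:

```python
import re
from collections import defaultdict

# Illustrative phrase patterns characteristic of grooming attempts:
# invitations into private channels and queries for personal details
# such as age, gender, and location. A real deployment would use a
# far larger, curated, and regularly updated phrase list.
FLAG_PATTERNS = [
    re.compile(r"\b(dm|pm|whisper|private)\s+me\b", re.IGNORECASE),
    re.compile(r"\bhow\s+old\s+are\s+(you|u)\b", re.IGNORECASE),
    re.compile(r"\basl\b|\ba/s/l\b", re.IGNORECASE),
    re.compile(r"\bwhere\s+do\s+(you|u)\s+live\b", re.IGNORECASE),
]

def scan_chat_log(messages):
    """Flag messages that match key phrases, grouped by sender.

    `messages` is a sequence of (sender, text) pairs; the return value
    maps each flagged sender to the list of their flagged messages.
    """
    hits = defaultdict(list)
    for sender, text in messages:
        if any(pattern.search(text) for pattern in FLAG_PATTERNS):
            hits[sender].append(text)
    return dict(hits)
```

Grouping by sender rather than returning individual messages is what supports the second step: once a username is flagged, its full message history across streams and chat rooms can be examined for context.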
After creating a database of common phrases and interactions flagged as grooming, we can then use an NLP model to predict whether an online interaction has a high probability of being, or leading to, toxic behavior and notify the appropriate authorities. Although internet identities are difficult to track and even harder to confirm due to erasable and encoded methods of communication, we hope that by providing an evaluative tool, online communities will be made safer for users of all ages.
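Once such a labeled database exists, the prediction step could, for example, be prototyped with a standard text-classification pipeline: TF-IDF features feeding a logistic regression. This is one plausible baseline, not the model we ship; the tiny training messages below are invented purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy corpus standing in for a real labeled database of
# chat interactions. 1 = flagged as potential grooming, 0 = benign.
train_messages = [
    "dm me so we can talk in private",
    "how old are you and where do you live",
    "great play on that last round",
    "anyone know when the next stream starts",
]
train_labels = [1, 1, 0, 0]

# TF-IDF over unigrams and bigrams, classified by logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(train_messages, train_labels)

# predict_proba yields a grooming probability for each new message;
# high-probability hits would be escalated for human review rather
# than acted on automatically.
probs = model.predict_proba(["can you dm me your age"])[:, 1]
```

Outputting a probability rather than a hard label matters here: moderation teams can tune the alert threshold to trade off false positives against missed cases, and keep a human reviewer in the loop before any report is filed.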
Beyond current efforts, it is essential that we all recognize that this is an issue that everyone must take part in resolving.
Resources and next steps: what everyone can do to help
Global Strategic Response to Online Child Sexual Exploitation, WePROTECT
Broken down by sector, this infographic displays actions that people from every background can take to contribute. WePROTECT also offers more information on child safety during COVID-19 conditions and related resources.
National Center for Missing and Exploited Children, NCMEC
Learn more about child safety resources. The CyberTipline allows the public and electronic service providers to report suspected sexual exploitation.
Further helpful material:
- Child Sexual Abuse Material: Model Legislation & Global Review, 2018
- Sextortion Infographic, Thorn
- How we can eliminate child sexual abuse material from the internet, Julie Cordua, Thorn
- Captured on Film: Survivors of child sex abuse material are stuck in a unique cycle of trauma
- Trends in Online Child Sexual Exploitation, 2018