Project Artemis: An Overview

Jonah H. Harris
4 min read · Jan 30, 2020


From online forums to social media, child predators are more prevalent than ever before and use every technology available to them in an attempt to find potential victims. Preventing child exploitation should be of paramount importance to any organization that permits user-generated content on its platform. Recognizing the need for collaboration around online safety, many well-respected technology companies came together to form a consortium called Project Artemis. The culmination of this multi-year, multi-company effort was a multi-faceted text-based analysis system explicitly designed to detect child grooming.

While several companies have developed technology systems to detect predatory behavior, almost no information about them has been shared publicly. As a result, instead of working together, development efforts to combat a global problem have been conducted in isolation. To solve this industry-wide problem and build a better all-around system, we must share knowledge, experience, data, and technology.

While The Meet Group does not permit users under the age of 18 on its platforms, teenagers sometimes masquerade as adults to bypass age-based restrictions. We strive to ensure the safety of all our users, including those who may have manipulated the system controls to gain access. To aid us in accomplishing this, we are in the process of supplementing our current systems with Artemis.

How does Artemis work? At its core, it is an artificial intelligence system that relies on a database of common textual patterns derived from confirmed child predator conversations. It also includes a risk detection engine designed to assess the degree of risk a conversation poses based on its content. The flexible design allows companies to incorporate their own custom detection methods, making the system suitable for use across multiple industries.
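To make that architecture concrete, here is a minimal sketch of what a pluggable detector interface might look like. The class names, method signatures, and pattern-matching logic are illustrative assumptions for this post, not Artemis’ actual API:

```python
from abc import ABC, abstractmethod

class Detector(ABC):
    """Hypothetical base class for a pluggable conversation detector."""

    @abstractmethod
    def score(self, conversation: list[str]) -> float:
        """Return a risk contribution in [0.0, 1.0] for the conversation."""

class PatternDetector(Detector):
    """Scores a conversation against a database of known grooming phrases."""

    def __init__(self, patterns: set[str]):
        self.patterns = {p.lower() for p in patterns}

    def score(self, conversation: list[str]) -> float:
        # Fraction of messages containing at least one known pattern.
        hits = sum(
            1 for message in conversation
            if any(p in message.lower() for p in self.patterns)
        )
        return hits / len(conversation) if conversation else 0.0
```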

For each conversation between users, Artemis applies a series of detectors to analyze the content and identify specific behaviors. The information returned by Artemis’ various detectors is then combined to produce an overall risk identification score. Using that score, a company can determine how likely a given conversation is to exhibit known grooming behavior. Based on that likelihood, companies may perform an additional manual review or take automated action.
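One simple way to combine detector outputs, purely as an illustration, is a weighted average followed by policy thresholds. The weights and threshold values below are assumptions, not figures from the project:

```python
def risk_identification_score(detectors, conversation, weights):
    """Combine per-detector scores into one overall score in [0.0, 1.0].
    `detectors` maps a name to a Detector from the sketch above."""
    total = sum(
        weights[name] * detector.score(conversation)
        for name, detector in detectors.items()
    )
    return total / sum(weights.values())

# Hypothetical policy: route mid-risk conversations to human moderators
# and take automated action on the highest-risk ones.
REVIEW_THRESHOLD = 0.6
ACTION_THRESHOLD = 0.9

def triage(score):
    if score >= ACTION_THRESHOLD:
        return "automated_action"
    if score >= REVIEW_THRESHOLD:
        return "manual_review"
    return "no_action"
```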

There are several advantages to Artemis’ multi-faceted approach. First, the database of common textual patterns employed by child predators was compiled using data from multiple companies and platforms. Second, unlike most text processing engines, Artemis is capable of analyzing statements over multiple turns of a conversation rather than in isolation. Lastly, Artemis’ modular architecture permits adaptation and expansion based on the needs of different platforms, allowing companies like The Meet Group to add their own features and functionality.
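Multi-turn analysis implies keeping per-conversation state rather than scoring each message on its own. A minimal sketch of that bookkeeping, with a hypothetical ConversationTracker class:

```python
from collections import defaultdict

class ConversationTracker:
    """Accumulates turns so detectors can score the whole exchange,
    not just the most recent message."""

    def __init__(self):
        # conversation_id -> list of (sender, text) tuples
        self.turns = defaultdict(list)

    def add_turn(self, conversation_id, sender, text):
        self.turns[conversation_id].append((sender, text))

    def transcript(self, conversation_id):
        # Flatten to the message list the detectors above expect.
        return [text for _, text in self.turns[conversation_id]]
```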

As a distributed, inter-company agile development project, the first release of Artemis focused on solidifying fundamental methods and techniques. Once those requirements were satisfied, The Meet Group added real-time processing functionality, ensuring messages are analyzed within seconds of being sent. Additionally, we included support for profile metadata as well as two natural language processing algorithms we’ve used in the past based on latent data analysis: one to detect distinct speech patterns and another to detect mirroring behavior. These additions improve Artemis not only for The Meet Group but for all companies that use it.
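As a rough illustration of what a real-time hook could look like, the following rescores a conversation as each message arrives, reusing the hypothetical helpers from the earlier sketches:

```python
def on_message_sent(tracker, detectors, weights, conversation_id, sender, text):
    """Hypothetical event handler: rescore the conversation on every new
    message so risk is assessed within seconds of sending."""
    tracker.add_turn(conversation_id, sender, text)
    transcript = tracker.transcript(conversation_id)
    score = risk_identification_score(detectors, transcript, weights)
    return triage(score)
```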

As Artemis doesn’t have access to a user’s account-related details by default, it attempts to infer them from the text. Our inclusion of profile metadata provides additive information that helps Artemis’ core risk detection algorithm make a more accurate assessment. Likewise, by incorporating this additional data, we have enabled Artemis to determine when a user has lied about his or her age, a technique The Meet Group developed prior to Artemis.
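One way to picture the age check, as a simplified assumption rather than the actual technique: compare the age stated on the profile with an age a trained model predicts from the user’s writing, and flag large divergences:

```python
def flag_possible_age_deception(stated_age, predicted_age, tolerance=5):
    """Flag when the profile's stated age diverges sharply from the age a
    text-based model predicts. `predicted_age` would come from a trained
    classifier; the tolerance is an illustrative value, not a published one."""
    return abs(stated_age - predicted_age) > tolerance
```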

Our custom detectors also analyze latent data, which consists of identifiable hidden attributes created unconsciously by the author. For example, adults who pretend to be children often use the same emojis and slang as children, but the psychological processes behind their writing often produce text that is statistically distinguishable. Because emojis and slang evolve constantly, they are not a natural form of writing for individuals of other ages. Using a similar method, we can also detect when one user is mirroring the behavior of another, a rapport-building technique predators commonly use.
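To give a flavor of latent-data analysis, here is a toy sketch that represents writing style as character trigram frequencies and treats rising style similarity across turns as a possible sign of mirroring. The features and scoring are deliberately crude stand-ins for the production algorithms:

```python
import math
from collections import Counter

def style_vector(text):
    """Crude stylometric features: normalized character trigram counts."""
    grams = [text[i:i + 3] for i in range(len(text) - 2)]
    counts = Counter(grams)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {g: c / norm for g, c in counts.items()}

def cosine_similarity(a, b):
    return sum(weight * b.get(gram, 0.0) for gram, weight in a.items())

def mirroring_trend(user_a_turns, user_b_turns):
    """Positive values suggest user B's style is converging toward user A's
    over successive turns (a simplified heuristic, not Artemis' method)."""
    sims = [
        cosine_similarity(style_vector(a), style_vector(b))
        for a, b in zip(user_a_turns, user_b_turns)
    ]
    return sims[-1] - sims[0] if len(sims) > 1 else 0.0
```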

Over the years, The Meet Group has dedicated significant resources toward the detection and identification of child predators. While we have made significant progress, technologies are continually evolving, and we strive constantly to stay ahead of the problem. Given the collaboration behind Artemis, we believe it is an essential component of any social-oriented company’s predator detection toolkit.

Keeping children safe is a responsibility that we take very seriously. Preventing child exploitation has always been important to us, and we are honored to have participated in this project. We commend Microsoft on its openness in broadly sharing this capability across the industry to protect children everywhere, and we look forward to continuing our work with them on this project.
