Cloudera Cares + Thorn = Social Impact at Grace Hopper 2016

Alison Yu
Oct 28, 2016
Participants and Mentors for the Cloudera Cares + Thorn Open Source Day Project at Grace Hopper 2016

Data Impact Award winner Thorn and Cloudera Cares hosted a hackathon at the recent Grace Hopper Open Source Day. Women attending the conference were invited to take part in a day-long hackathon to benefit a social impact project. The event served several purposes: to draw attention to the insidious problem of child sexual exploitation, to look for ways to stop it, to encourage women starting out in their tech careers to contribute, and to offer those same women the opportunity to be mentored by experienced engineers and data scientists.

“When Cloudera Cares was invited back to Grace Hopper’s Open Source Day, our amazing Cloudera Cares team immediately thought of working with Thorn as our non-profit partner. We seized the opportunity to mentor women early on in their careers while simultaneously helping to end the exploitation of children. One of our goals with Cloudera Cares is to ensure that the technology we produce is also used to make a real difference in the world,” said Britt Sellin, VP of Human Resources and the Executive Sponsor of Cloudera Cares.

The Issue

With the explosion of internet use, people with shared interests have been able to connect more easily. While the vast majority of the time this doesn’t pose a danger to society, the internet also facilitates interactions of a darker nature, including the distribution of child sexual abuse material. In an effort to remain anonymous, those who trade in this material operate on the “dark web”. The nature of the dark web allows producers, distributors, and consumers of child abuse material to keep their identity and location hidden while interacting and sharing content with others. In addition to providing anonymity, sites hosted on the dark web frequently shift addresses, evading detection. Law enforcement efforts to identify and rescue child victims have become increasingly difficult as sharing expands and shifts on the dark web.

The project that Thorn and Cloudera Cares focused on at the Grace Hopper Open Source Day hackathon addresses the challenges of the dark web, using data analytics to help sift through large amounts of data from the dark web to track behavior.

Path to Success

During the hackathon, sample chat room data was provided to participants. With the data, hackathon participants were asked to do a deep dive: identifying important and prolific users, understanding their traffic patterns, identifying sub-communities of users, and investigating the spread of content over time by analyzing and visualizing the data.
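To give a feel for what two of these tasks can look like in practice, here is a minimal sketch in plain Python. It is not the code written at the hackathon: the `messages` list, the user IDs, and the function names are all made up for illustration, and connected components are used as a simple stand-in for real community detection, which a library such as NetworkX would handle more thoroughly.

```python
from collections import Counter, defaultdict

# Hypothetical, synthetic sample: (sender, recipient) pairs from a chat log.
messages = [
    ("u1", "u2"), ("u1", "u3"), ("u2", "u1"), ("u3", "u1"),
    ("u1", "u2"), ("u4", "u5"), ("u5", "u4"),
]

def prolific_users(messages, top_n=3):
    """Rank senders by the number of messages they sent."""
    counts = Counter(sender for sender, _ in messages)
    return counts.most_common(top_n)

def sub_communities(messages):
    """Group users into connected components of the undirected chat graph."""
    neighbors = defaultdict(set)
    for a, b in messages:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`.
        stack, group = [start], set()
        while stack:
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(neighbors[user] - group)
        seen |= group
        groups.append(sorted(group))
    return groups

print(prolific_users(messages, top_n=1))
print(sub_communities(messages))
```

On this toy data, the most active sender is `u1`, and the users fall into two groups that never talk to each other.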

Not only were the women able to volunteer their time for a great cause, they were also introduced to a bevy of different tools that they may not have used before, including GitHub, Docker, Jupyter Notebooks, Pandas, NetworkX and other popular tools.

Students were encouraged to “think like an investigator”. If they discovered particularly relevant content, they were asked to follow the trail to figure out which user first posted it and how it spread. Or, if they focused on particular users, they were asked to visualize that user’s chat patterns. The work that the attendees contributed has the potential to inform and be included in tools being developed by Thorn to identify child abuse victims faster.
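“Following the trail” can be sketched in a few lines as well. The example below assumes a hypothetical log of `(timestamp, user, content_id)` records; the field layout, the IDs, and the function names are invented for illustration and do not reflect the actual hackathon data or Thorn’s tools.

```python
from datetime import datetime

# Hypothetical, synthetic sample: (ISO timestamp, user, content ID).
posts = [
    ("2016-01-03T10:00", "u2", "c1"),
    ("2016-01-01T09:00", "u1", "c1"),
    ("2016-01-02T12:30", "u3", "c1"),
    ("2016-01-02T08:00", "u3", "c2"),
]

def trace_content(posts, content_id):
    """Return (timestamp, user) pairs for one content item, oldest first."""
    hits = [
        (datetime.fromisoformat(ts), user)
        for ts, user, cid in posts
        if cid == content_id
    ]
    return sorted(hits)

def first_poster(posts, content_id):
    """Return the user who posted the item earliest, or None if unseen."""
    trail = trace_content(posts, content_id)
    return trail[0][1] if trail else None

print(first_poster(posts, "c1"))
print([user for _, user in trace_content(posts, "c1")])
```

Sorting by timestamp gives both the likely original poster and the order in which the item moved from user to user, which is the kind of trail that could then be plotted on a timeline.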

Sugreev Chawla, Lead Data Scientist at Thorn, shared his excitement over the promising results from the hackathon: “The methods students devised to investigate the spread of content will provide a great foundation for my large scale analysis with our full dataset. Some of the students dove into visualization tools that I haven’t used before, so I’m excited to take what they’ve built and apply to other datasets.”

A HUGE round of applause for the insights that the women were able to generate in such a short period of time, and a big thank you to our mentors from University of Buffalo and Microsoft. Thanks for joining us, and we hope to see you next year at Grace Hopper!

Learn more:
>> Learn about Thorn and Cloudera’s collaboration
>> Find out why Thorn won a Data Impact Award

This blog was co-written by the following authors:

Alison Yu (@alisonjudy) is the Social Media Manager at Cloudera. In this role she leads Cloudera’s global social media strategy. Alison also helps lead Cloudera Cares, Cloudera’s philanthropic arm. Prior to joining Cloudera, Alison led the global social media efforts at Informatica and SunPower. She holds a B.A. in Communication from University of California, Davis.

Nikisha Vashee is the Development Associate at Thorn. She supports Thorn in its mission of driving technology innovation to fight child trafficking through cultivation of donors, grants, and corporate partners. Prior to joining Thorn, Nikisha worked as a non-profit professional responsible for donor, brand, and event management for a diverse group of organizations. She holds a B.S. in Psychobiology from the University of California, Los Angeles.


