Combating bullying through data and tech

For the last two years our team has been busy building technology to help parents with 21st-century parenting. The prevalence of technology has shifted much of kids' time online, putting many parents in a new and uncomfortable position. Whereas parents could previously see many of the interactions their kids had with friends and peers, an increasing number of those interactions now happen online.

Of course the online world has many positives, offering support, information, and entertainment. But it has a dark side too: it seems a week doesn't go by without news of a tragic incident caused by online activity. Whether it's predators, bullying, or thinspiration, kids today are exposed to dangers not only offline but online. These issues inspired us to create VISR: to notify parents of potential issues, including bullying and mental health concerns, and to open the lines of communication between parents and kids, helping them navigate technology in a safe and productive way.

Today we released our dramatically improved alert detection system, probably our most significant update yet. While the direct impact of today's release is significantly improved bullying detection, there is more to it: the new system lays the groundwork for self-learning and predictive modelling based on historical and population data over time.

How we do it

To mark this milestone, we thought we'd take a moment to share a bit more about what we're working on, and some of the effort that went into the updated bullying identification system.

When building an algorithm that looks at text and other metadata and then gives its best guess about whether a parent should look at it, we need to think about more than just which mathematical techniques will perform best. It is not enough to build a smart algorithm if the algorithm is not looking for the right things in the first place.

In the case of bullying, we needed to find examples of children engaging in bullying, defending bullying victims, supporting bullies (e.g. by liking a hurtful comment), and reporting bullying events that happened offline. And we needed to find a lot of them. This is not easy. To solve this problem, we put together a team of young adults who know when teenagers are bullying, even when they use language we adults may not understand.

How to train your…computer

Then a fascinating conversation began between humans and computers. We provided the computer with a large set of publicly available posts from Twitter, Facebook, YouTube, and similar platforms that we thought might be examples of bullying. We found plenty of insulting jokes, angry comments, flame wars, and the like from public sources, and trained the computer to look for similar posts in our own user data, which had been thoroughly anonymized. The computer found a lot of examples of what it thought was bullying, but we did not trust its judgement at this point, so we asked the team of young adults to confirm its decisions. At first, the team felt the computer was getting things wrong quite often. So we corrected the computer's decisions and trained it again. This time, things were a little better, but still not good enough. So we repeated the process. Eventually, we got to a place where when the computer said a social media post looked like bullying, we humans usually agreed. At that point, we knew we had something really good!
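
To make that back-and-forth concrete, here is a minimal sketch in Python of a human-in-the-loop training cycle. It assumes a hypothetical reviewers.judge interface standing in for our labelling team; the model choice and helper names are illustrative, not our actual pipeline.

```python
# A minimal sketch of the human-in-the-loop cycle described above.
# "reviewers.judge" is a hypothetical stand-in for the labelling team.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_with_human_feedback(seed_posts, seed_labels, unlabeled_posts,
                              reviewers, rounds=3):
    """Train, flag suspected bullying, fold human corrections back in, repeat."""
    posts, labels = list(seed_posts), list(seed_labels)
    remaining = list(unlabeled_posts)
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    for _ in range(rounds):
        model.fit(posts, labels)
        if not remaining:
            break
        # Posts the model currently believes are bullying (1 = bullying).
        flagged = [p for p, guess in zip(remaining, model.predict(remaining))
                   if guess == 1]
        for post in flagged:
            posts.append(post)
            labels.append(reviewers.judge(post))  # human confirms or corrects
            remaining.remove(post)
    return model
```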

Learning grammar

But there is another part of the story. Sometimes algorithms can learn on their own, but usually they need a helping hand. Ultimately, the algorithms look for evidence of bullying, weigh it together, and then produce a judgement: This is bullying or this is not bullying. Some of the techniques used come from linear algebra, some from probability theory, some from neuroscience, and some from theoretical logic.
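
To make "weigh the evidence, then produce a judgement" concrete, here is a toy Python illustration, not our production model: each piece of evidence gets a weight, the weighted sum is squashed into a probability, and a threshold turns that into a yes-or-no call. The feature names and weights are invented for the example.

```python
# A toy "weigh the evidence, then judge" model; the weights are invented
# for illustration and are not VISR's real parameters.
import math

WEIGHTS = {"insult_word": 2.1, "all_caps": 0.8, "second_person": 0.6}
BIAS = -2.5

def judge(evidence):
    """evidence maps feature names to 0/1 indicators."""
    score = BIAS + sum(WEIGHTS[name] * value for name, value in evidence.items())
    probability = 1.0 / (1.0 + math.exp(-score))  # logistic squashing
    return "bullying" if probability > 0.5 else "not bullying"

print(judge({"insult_word": 1, "all_caps": 1, "second_person": 1}))  # bullying
print(judge({"insult_word": 0, "all_caps": 1, "second_person": 0}))  # not bullying
```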

Grammar is not a waste of time!

While these can be highly effective, none of them know that BY CAPITALIZING WORDS, PEOPLE ARE EFFECTIVELY SCREAMING AT EACH OTHER. So we had to tell the algorithms to look out for words in all caps. Similarly, expressions like "you're an idiot" mean something completely different when stated negatively, as in "you're no idiot". We could have hoped the algorithms would learn this on their own, but why not give them a hand? So we programmed a little English grammar into the system. Our algorithms know that 'idiot' and 'idiots' share a root, in the same way 'go', 'going', and 'gone' do. So if we see a sentence like 'way to go, idiot', we also learn something about sentences like 'good going, idiots' for free.
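
As a sketch of what such hand-coded grammar features might look like (our real feature set is broader), here is a small Python example using NLTK's Porter stemmer; the negator list is illustrative only.

```python
# Illustrative grammar features: shouting, negation, and shared word roots.
from nltk.stem import PorterStemmer  # pip install nltk

stemmer = PorterStemmer()
NEGATORS = {"no", "not", "never"}  # a tiny illustrative list

def grammar_features(text):
    tokens = text.split()
    words = [t.strip(",.!?").lower() for t in tokens]
    return {
        # ALL-CAPS words read as screaming.
        "shouting": any(t.strip(",.!?").isupper() and len(t) > 1 for t in tokens),
        # "you're no idiot" should not score like "you're an idiot".
        "negated": any(w in NEGATORS for w in words),
        # Stemming maps 'idiots' to 'idiot' and 'going' to 'go'.
        "stems": [stemmer.stem(w) for w in words],
    }

print(grammar_features("way to GO, idiots"))
# {'shouting': True, 'negated': False, 'stems': ['way', 'to', 'go', 'idiot']}
```

A plain stemmer misses irregular forms like 'gone'; a lemmatizer can map those back to 'go' as well, but stemming already gives us 'idiot'/'idiots' for free.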

This all sounds great, but does it work? Sometimes providing an algorithm with some extra help actually makes things worse. Algorithmic decisions sound good in theory, but are not always so great in practice. This is why we run a huge number of tests.

Test, test, and then test again

We use 10-fold cross-validation on everything. That is, we first take all of our examples of bullying and divide them into two groups: i) a training group and ii) a validation group used to determine whether our training was successful. We then set the validation group aside and forget about it until our tests are done.

We then take the training group and split it ten different ways into new training and testing groups. For each of those ten splits, we train on one part and then test on the other. Each test gives us a number that tells us how well the algorithm performed, so across all ten splits we get ten such numbers. Their average gives us a pretty good idea of the algorithm's success. We then compare different algorithms in this same way and find the best one. Finally, we take the very best algorithm and pull out the validation group of examples that we set aside when we began. We do one final test on that group, to be extra sure that we have the very best algorithm possible.
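
In Python, the whole procedure looks roughly like the sketch below, using scikit-learn. The two candidate models, the 80/20 split, and accuracy as the score are illustrative choices here, not a description of our production setup.

```python
# A minimal sketch of the evaluation procedure described above.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def pick_best_model(posts, labels):
    # Set the validation group aside and forget about it until the end.
    train_x, val_x, train_y, val_y = train_test_split(
        posts, labels, test_size=0.2, stratify=labels, random_state=0)

    candidates = {
        "logistic": make_pipeline(TfidfVectorizer(),
                                  LogisticRegression(max_iter=1000)),
        "naive_bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    }
    # 10-fold cross-validation: ten train/test splits, ten scores, one average.
    averages = {name: cross_val_score(m, train_x, train_y, cv=10).mean()
                for name, m in candidates.items()}
    best_name = max(averages, key=averages.get)
    best = candidates[best_name].fit(train_x, train_y)

    # One final test on the held-out validation group.
    print(best_name, "validation accuracy:", best.score(val_x, val_y))
    return best
```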

This process keeps surprising us. For example, we found that when looking at anxious teenagers it helps to know their age but not their gender. By contrast, it helps to know a bully's gender, but knowing their age doesn't help very much. Why is that? We have no clue! But this is why we perform so many tests. Eventually, research like this will lead us to new breakthroughs, which we believe will help us better understand people and keep them safer and healthier in the future.

This article was written by David Van Bruwaene, lead NLP scientist at VISR, which provides a preventive wellness app designed to safeguard children online.

Enjoyed learning about how we’re using data science? Please heart the article to help us get the story in front of more people! Thank you :) ❤