I use Reddit a lot. In fact, I use it a little too much. My comment score puts me in the top 40% of Reddit users — slightly better than the average Redditor (i.e. someone who has spent too long on the website).
When you use Reddit a lot, you start noticing useful bots running on the website, automatically commenting on posts. My favourite has to be this one: a bot which converts your ancient black-and-white photos into colour photos automatically!
Besides bots that can comment just like humans, the Reddit community has some interesting quirks — certain phrases become very popular among users. The longer you spend on Reddit, the more lingo you pick up. For example, cakeday, throwaway, alt, risky click, ITT, FTFY, etc. are phrases with specific meanings for hundreds of thousands of anonymous Redditors.
One issue, however, with Reddit and other similar websites is that when you give users the mask of anonymity, there’s a reasonable chance that someone is going to link to porn or gore.
Now when that happens on Reddit, a user who isn’t sure whether they should click on it might respond with the phrase ‘risky click’.
Another Reddit user, brave enough to risk seeing something horrifying, might click that link and then either warn others away or reassure them that it was actually harmless.
Outside of Reddit, automated warning systems for risky links already exist. Porn filters keep your YouTube/Google/Twitter/Facebook/Instagram feeds clean. Yahoo, in fact, open-sourced one such system in September 2016.
So why not make a system for Reddit?
And that’s what I set out to do! The phrase ‘risky click’ is used often on Reddit in exactly the context described above. You can set up a system which reads the firehose that is the Reddit comment stream, and then jumps into action whenever that phrase is found.
So now we are looking out for a phrase. When that phrase is found, the system needs to scan the comment for links, and figure out whether those links are risky to click on (at least when others are around :P).
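As a rough sketch, the watch loop could look like this in Python with the PRAW library. The credentials are placeholders and the handling is simplified — this is an illustration of the idea, not the bot’s actual code:

```python
TRIGGER = "risky click"


def is_trigger(body: str) -> bool:
    """True if a comment contains the trigger phrase (case-insensitive)."""
    return TRIGGER in body.lower()


def watch() -> None:
    """Stream every new Reddit comment and react to the trigger phrase."""
    import praw  # third-party; pip install praw

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",          # placeholder credentials
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="RiskyClickerBot sketch",
    )
    # r/all is Reddit's combined feed -- effectively the comment firehose.
    for comment in reddit.subreddit("all").stream.comments(skip_existing=True):
        if is_trigger(comment.body):
            # The risky link usually lives in the comment being replied to,
            # so the parent comment is what gets analyzed.
            parent = comment.parent()
            ...  # extract links from parent.body and analyze them
```

The phrase check is deliberately a plain substring match, so ‘Risky click!’, ‘risky click of the day’ and friends all trigger it.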
The links in this context are mostly albums of images, direct links to images, gifs, etc. So at the end of the day, this system can be expressed as a fairly straightforward problem in Computer Vision.
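Picking those links out of a comment is a simple pattern match. A hypothetical extractor might look like this — the host and extension lists here are my assumptions, not an exhaustive set:

```python
import re

# Match direct image links (jpg/jpeg/png/gif, optionally with a query
# string) and imgur albums/galleries. Non-capturing groups keep findall()
# returning whole URLs.
IMAGE_LINK = re.compile(
    r"https?://\S+\.(?:jpe?g|png|gif)(?:\?\S*)?"
    r"|https?://(?:www\.)?imgur\.com/(?:a|gallery)/\w+",
    re.IGNORECASE,
)


def extract_links(text: str) -> list:
    """Return every image-like URL found in a comment body."""
    return IMAGE_LINK.findall(text)
```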
Given an image or a set of images, tell me if it’s Safe for Work (SFW) or Not (NSFW).
(At the time of writing, the system is able to handle multiple images. Gifs pose a bigger challenge, so I haven’t added functionality for that yet.)
The detection system that Yahoo released requires a lot of work to set up for personal use. Clarifai, on the other hand, is a company that offers a similar system, free to use through API calls.
So, now we have a system to detect risky links — mainly images — and another to analyze those images.
Put those two together and we now have /u/RiskyClickerBot!
And in case the bot didn’t reply to a risky link automatically, it can be called into action too.
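With PRAW, the bot would poll reddit.inbox.mentions() for username mentions; deciding whether a comment actually summons it can be as simple as a case-insensitive check (again, a sketch, not the bot’s actual logic):

```python
BOT_NAME = "RiskyClickerBot"


def is_summons(body: str) -> bool:
    """True if the comment mentions the bot by its u/ handle."""
    return ("u/" + BOT_NAME).lower() in body.lower()
```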
So there you go.
/u/RiskyClickerBot: Making Reddit safer for work, one comment thread at a time.
The code for this bot is open source, although I haven’t yet added a README or a license to the repo. So check it out if you’re interested :)