Dextro’s computer vision API recognizing crowds and a police van in Twitter video

Why We Added Computer Vision to Our Editorial Stack

Over the past year, some of the biggest news stories on Mic began when a non-journalist took a video with her phone. Mobile cameras have captured on-the-ground action ranging from the politically interesting to the utterly outrageous. Organizations like the ACLU are designing apps to ensure that important user-generated video gets the attention it deserves. As billions of people come to own smartphones, the amount of newsworthy video shot by non-journalists will explode.

This trend is huge for Mic. Our mission is to enable the connected generation to access and understand newsworthy information. The Internet and Moore’s Law have democratized the capture of this kind of information. Today, anyone with a smartphone can be a stringer. Our job as journalists is to spread daylight to dark places, and the ubiquity of smartphones has dramatically expanded our means of doing this.

But a technical challenge for journalists is the sheer amount of user-generated video online. 400 hours of video are uploaded to YouTube alone every minute. It would take 66 years to watch a single day of YouTube uploads. Filtering by trending keywords shrinks this pool by an order of magnitude or two, but still leaves reporters with an unmanageable amount of footage. Keyword filters also presume that hundreds of millions of users are all tagging their posts in the one standard way you expect.
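The "66 years" figure follows directly from the upload rate. A quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the YouTube numbers above.
# 400 hours of video are uploaded per minute (figure from the text).
hours_per_minute = 400

# One day of uploads, measured in hours of footage:
day_of_uploads = hours_per_minute * 60 * 24  # 576,000 hours

# Watching nonstop, 24 hours a day, 365 days a year:
years_to_watch = day_of_uploads / 24 / 365
print(round(years_to_watch))  # → 66
```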

Another way you could isolate newsworthy video is by looking only at videos with high social engagement. For example, you could use third-party tools to identify the Twitter videos that have been retweeted most over the past few hours. The problem with this method is that by the time it surfaces a video, that video has already been widely covered by the media. This is why you’ll often see the same clip covered by dozens of outlets.
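Engagement-based filtering reduces to a recency cutoff plus a sort. A minimal sketch, using hand-written sample data rather than a live feed (a real implementation would pull these records from a third-party tool or the Twitter API):

```python
from datetime import datetime, timedelta, timezone

# Sample video tweets; in practice these would come from an API.
now = datetime.now(timezone.utc)
tweets = [
    {"id": 1, "retweets": 12000, "posted": now - timedelta(hours=9)},
    {"id": 2, "retweets": 800,   "posted": now - timedelta(hours=1)},
    {"id": 3, "retweets": 4500,  "posted": now - timedelta(hours=3)},
]

# Keep only the past few hours, then rank by engagement.
recent = [t for t in tweets if now - t["posted"] < timedelta(hours=4)]
ranked = sorted(recent, key=lambda t: t["retweets"], reverse=True)
print([t["id"] for t in ranked])  # → [3, 2]
```

Note the built-in lag: by the time a clip accumulates enough retweets to rank highly, other outlets have usually found it through the same signal.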

In partnership with Dextro, a computer vision startup, we’ve automated the first level of the video sourcing process. Dextro’s new Sight, Sound, and Motion (SSM) platform analyzes a clip’s video and audio tracks to discern its relevance to a given topic. Dextro can identify places, things, spoken names, and activities of interest in a clip. And it can do this for a functionally unlimited number of clips per hour.

Mic + Dextro

We’ve developed a custom dashboard on top of Dextro’s SSM API, which ingests video from Twitter. The dashboard shows our editors a few things:

  1. A stream of SSM-surfaced videos in general content areas like politics, showing the clips the algorithm judges most compelling. This automatically eliminates a huge percentage of video noise, reducing the deluge of social video to a timely trickle editors can scan as they would TweetDeck.
  2. A search feature that allows editors to see what video exists for a breaking or evergreen topic they consider newsworthy.
  3. Video streams specific to trending topics. We tap the Facebook Trends API for topics with the highest mention velocity, then pass those topics to Dextro, which scans Twitter for relevant video. This lets editors identify original video tied to conversations people are already having, and supplement those conversations with new, primary-source information.
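The trending-topics flow in item 3 can be sketched as a small pipeline. The two fetch functions below are stand-ins: neither the Facebook Trends API nor Dextro's SSM API is public, so their interfaces, the topic names, and the URLs here are all assumptions for illustration.

```python
def fetch_trending_topics(limit=5):
    """Stand-in for a Facebook Trends API call returning the topics
    with the highest mention velocity (hypothetical data)."""
    return ["pope visit", "super blood moon"][:limit]

def dextro_search(topic):
    """Stand-in for a Dextro SSM query that scans Twitter for video
    relevant to a topic (hypothetical URL)."""
    return ["https://twitter.com/example/video-about-" + topic.replace(" ", "-")]

# Topics flow from the trends source into per-topic video streams
# that the dashboard presents to editors.
streams = {topic: dextro_search(topic) for topic in fetch_trending_topics()}
for topic, clips in streams.items():
    print(topic, "→", len(clips), "clip(s)")
```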

The output in any of these three use-cases is a stream of recent, relevant, and compelling clips shot and uploaded by Twitter users. Last week we saw primary-source footage from the terrible Mecca stampede right after it happened. Over the weekend we saw footage from the various locations the Pope was scheduled to visit. And on Sunday we saw just how bad people’s #SuperBloodMoon shots could be.

We gave this tool to a handful of editors last week for beta-testing, and we’re letting them figure out the best use-cases. One application has been scripted political campaign events, where attendees’ cameras can offer alternatives to the boring, pre-approved CNN or MSNBC feed.

The Internet’s democratization of information access coupled with advances in information processing technologies like computer vision will be huge for journalism. Our partnership with Dextro is the first, small experiment in building tools to help journalists capitalize on this trend, which itself is just beginning.