Hi, bot!

Uli Köppen
Feb 8 · 5 min read

What’s up with news automation?

Well, a lot. As part of my Nieman year I’m looking into automating the news. The Computation and Journalism Conference at the University of Miami last weekend was a great opportunity to meet people both from the industry and academia dealing with the challenge of automation.

Why automate?

Let’s get this right out of the way: automation doesn’t really cut costs in the beginning, as you need new skill sets and people dedicated to designing, building and maintaining automated tasks. But if these efforts are supported by an overall digital strategy for the company (which is REALLY important), it definitely helps news companies drive digital disruption on both the content and the product side. And for some companies it might even be part of a new revenue model.

Lisa Gibbs on the output of the AP and how automation helps the news agency to scale its products

What can Bots do for Journalism?

As Lisa Gibbs from AP puts it: “Automation frees journalists for real journalism.” She talks about the “augmented journalist” who uses automated products to break news faster and better informed, and who leaves repetitive tasks to algorithms.

Tag it, Baby!
I’ll say right away what I’m most excited about: structured journalism is back, and more powerful than ever before. No one calls it that, but a lot of the ideas discussed right now are linked to the core of structured journalism: breaking up ongoing news stories into understandable bits, tagging stories while producing them, and using those tags in the future for new formats and to better link archive and news.

I love the idea of using methods of structured journalism to bring the archive closer to the newsroom and to re-publish content so that it gains a second life. Text-to-speech automation links this idea to smart speakers, enabling audio hyperlinks in human-machine interaction: just imagine you could ask Alexa whenever you haven’t understood a certain detail or want more context, and Alexa has the answers thanks to the smart tagging system of a powerful media archive in the background. Lots of people are working from different directions on solutions to connect archives to newsrooms, such as the UC Berkeley research project “newsLens”, an algorithmic threading tool that gives users a quick overview of the variety of stories around a bigger news event.
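To make the tagging idea a bit more concrete, here is a minimal sketch of how tags attached at production time could later surface related archive pieces. Everything in it is an assumption for illustration: the story records, the field names and the ranking by number of shared tags are hypothetical and not taken from newsLens or any newsroom system.

```python
from collections import defaultdict

# Hypothetical archive: each story was tagged while it was produced.
ARCHIVE = [
    {"id": 1, "headline": "City council approves housing plan",
     "tags": ["housing", "city council"]},
    {"id": 2, "headline": "Rents rise for third year",
     "tags": ["housing", "rents"]},
    {"id": 3, "headline": "Local team wins cup", "tags": ["sports"]},
]


def build_tag_index(stories):
    """Map each lower-cased tag to the ids of the stories carrying it."""
    index = defaultdict(set)
    for story in stories:
        for tag in story["tags"]:
            index[tag.lower()].add(story["id"])
    return index


def related_stories(index, story, stories_by_id):
    """Rank archive stories by how many tags they share with `story`."""
    shared = defaultdict(int)
    for tag in {t.lower() for t in story["tags"]}:
        for other_id in index.get(tag, ()):
            if other_id != story["id"]:
                shared[other_id] += 1
    # Most shared tags first; ties are broken by story id.
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return [(stories_by_id[sid]["headline"], count) for sid, count in ranked]


if __name__ == "__main__":
    index = build_tag_index(ARCHIVE)
    by_id = {s["id"]: s for s in ARCHIVE}
    new_story = {"id": 4, "headline": "Tenants protest rent hikes",
                 "tags": ["Housing", "Rents"]}
    for headline, count in related_stories(index, new_story, by_id):
        print(f"{count} shared tag(s): {headline}")
```

A real archive would need a controlled vocabulary so that “rents” and “rent prices” end up under the same tag; the sketch only normalizes upper and lower case.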

Scaling up

At the moment a huge slice of the automation cake is text automation: news agencies like AP or Bloomberg automate earnings and sports stories at a large scale. Some work with customized templates, some are fully automated with natural language generation. By the end of 2019, AP will be automating 40,000 stories per year.

Titus Plattner (3rd from left) works on personalizing and customizing automated news stories

During his JSK Fellowship at Stanford, Titus Plattner developed the idea of automating customized hyperlocal news stories. Back in Switzerland at the media group Tamedia, he used a commercial natural language generator to build a textbot that produced 40,000 geographically customized news stories covering voting results in each of the 2222 Swiss municipalities.
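For readers who wonder what a “customized template” looks like in practice, here is a minimal sketch of the template approach applied to local voting results. It is not Tamedia’s textbot or AP’s system: the template wording, the field names and the example municipality are all made up for illustration.

```python
# Hypothetical sentence template; the placeholders are filled per municipality.
TEMPLATE = (
    "{name}: Voters {verb} the proposal with {yes_pct:.1f} percent in favor. "
    "Turnout was {turnout:.1f} percent."
)


def render_story(result):
    """Turn one municipality's raw vote counts into a short text."""
    votes_cast = result["yes"] + result["no"]
    yes_pct = 100 * result["yes"] / votes_cast
    turnout = 100 * votes_cast / result["eligible"]
    # Assumption: a tie counts as a rejection.
    verb = "approved" if result["yes"] > result["no"] else "rejected"
    return TEMPLATE.format(
        name=result["name"], verb=verb, yes_pct=yes_pct, turnout=turnout
    )


if __name__ == "__main__":
    # Invented numbers for an invented municipality.
    example = {"name": "Exampleville", "yes": 600, "no": 400, "eligible": 2000}
    print(render_story(example))
```

Running the same function over one record per municipality is what makes this approach scale: the data changes, the template stays the same. Natural language generation tools go further by varying wording and picking the most newsworthy angle per place.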

Algorithms for Verification
Verification still doesn’t work without human intervention, but some fact-checking steps can be automated. A lot of AI tools are emerging around verification, saving reporters precious time when verifying user-generated content or live-commenting on speeches.

The possibility of automatically detecting claims in texts is fascinating. Imagine large amounts of text or a live speech where fact-checkers don’t have to sift through tons of material to get to the claims they want to check. It’s still a challenge, but it works: a team of scientists at the University of Texas at Arlington is pursuing a project for modeling factual claims. Their ClaimBuster algorithm extracts claims from texts or automatically transcribed speeches. And best of all, it also matches each new factual claim against earlier fact-checks in a database. So the fact-checkers not only get the extracted claims but also the matches with existing fact-checks.
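The matching step can be illustrated with a very simple sketch. To be clear, this is not how ClaimBuster works internally; it swaps in plain word overlap (Jaccard similarity) as a stand-in, and the fact-check database, the verdict labels and the threshold of 0.5 are placeholders chosen for the example.

```python
import re


def tokens(text):
    """Lower-case a text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(claim, fact_checks, threshold=0.5):
    """Return the most similar earlier fact-check, or None if too dissimilar."""
    claim_tokens = tokens(claim)
    best, best_score = None, 0.0
    for fact_check in fact_checks:
        score = jaccard(claim_tokens, tokens(fact_check["claim"]))
        if score > best_score:
            best, best_score = fact_check, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


if __name__ == "__main__":
    # Placeholder database; the claims and verdicts are invented.
    database = [
        {"claim": "Unemployment is at a record low", "verdict": "placeholder A"},
        {"claim": "Crime doubled in the last decade", "verdict": "placeholder B"},
    ]
    match, score = best_match("Unemployment is at a record low today", database)
    print(match, round(score, 2))
```

Word overlap breaks down as soon as a speaker paraphrases a claim, which is one reason why this remains a research challenge and why a human fact-checker still has to confirm every match.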

The Duke Reporters’ Lab at Duke University experiments with fully automated real-time fact-checking and also uses the ClaimBuster algorithm. A few days ago the lab tested its products during Trump’s State of the Union speech, which gave a sense of how well automated claim recognition and matching with existing fact-checks already work. They wrote about their experiences here.

AP is developing a verification tool, mostly for user-generated content, with a Google grant (Lisa Gibbs talked about it at the CplusJ Symposium); it is still in beta. The software breaks down videos frame by frame and analyzes each image for potential manipulation. It also looks for the source material by matching the frames with original sources on the web, and shows the path the material has taken across the web so far. As output, the algorithm categorizes the URL the reporter pastes into the tool as either a good or a bad match.
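How might matching frames against known sources work in principle? The sketch below uses a simple “average hash”, a common perceptual hashing trick, and labels each frame as a good or bad match. This is purely an assumption for illustration and not AP’s method; the tiny 2x2 grayscale frames and the distance threshold are invented, and a real tool would first decode the video and scale every frame to the same size.

```python
def average_hash(frame):
    """Fingerprint a grayscale frame: 1 where a pixel is brighter than average."""
    pixels = [value for row in frame for value in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if value > mean else 0 for value in pixels)


def hamming(a, b):
    """Count the positions where two equally long fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_frames(video_frames, source_frames, max_distance=1):
    """Label each frame 'good' if it is close to any known source frame.

    Assumes all frames have the same dimensions and that at least one
    source frame is given.
    """
    source_hashes = [average_hash(frame) for frame in source_frames]
    labels = []
    for frame in video_frames:
        fingerprint = average_hash(frame)
        distance = min(hamming(fingerprint, s) for s in source_hashes)
        labels.append("good" if distance <= max_distance else "bad")
    return labels


if __name__ == "__main__":
    # Invented 2x2 grayscale frames (0 = black, 255 = white).
    original = [[10, 200], [200, 10]]
    slightly_recompressed = [[12, 190], [210, 8]]
    unrelated = [[200, 10], [10, 200]]
    print(classify_frames([slightly_recompressed, unrelated], [original]))
```

The point of a perceptual hash is that small changes such as recompression keep the fingerprint stable, while a different image produces a very different one.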

Next Level Data Journalism
We’ve been talking a lot about the product side of automation. There’s also a win on the content side, and that’s another thing I’m especially excited about as part of BR Data, our investigative data journalism team for BR/German Public Broadcasting. We’re using algorithmic research methods to investigate algorithms like the Schufa credit rating score, an investigation we did together with our colleagues from Der Spiegel and BR Recherche. Another example is our large-scale investigation of the German housing rental market, where we found clear discrimination against people with foreign-sounding names.

The growing possibilities of AI and automation for journalism are leading to more interdisciplinary teams exploring those technologies for journalistic investigations. Julia Angwin and Jeff Larson, two former ProPublica journalists, are about to launch The Markup, a new team investigating the impact of technology on society, with a major donation from Craig Newmark, founder of Craigslist. Quartz is building an AI Studio focused on AI research methods for journalists. Of course ProPublica has a lead role in algorithmic accountability reporting, and data teams in other countries are also working to take data journalism to the next level.

What else to read?

I recommend the papers of the Computation and Journalism Symposium, along with all the videos. Nick Diakopoulos is offering a free MOOC on the impact of automation and AI on journalism at the Knight Center for Journalism in the Americas and will publish his book Automating the News in June. And this is a good summary of some of the projects presented during the conference.

As part of my Nieman year I’m also looking into interdisciplinary newsroom management and I’ll publish my findings here. Looking forward to your comments and an interesting discussion down there!

Uli Köppen

Written by

Journalism #Nieman Fellow ’19: Studying Algorithmic Bias and Automation at #Harvard and #MIT