Sooner than many predicted (and hoped), journalism is wrestling with the ethics and practicalities of robot-generated news.
By Rebecca Heilweil
Last November, Switzerland held a referendum to decide whether the government should subsidize farmers who don’t dehorn their cows and goats. The measure failed. But for one of the country’s media giants, Tamedia, the vote was a smashing success. The company’s text-automation tool, Tobi, produced nearly 40,000 articles on the referendum results in less than five minutes, using vote data and pre-formulated templates constructed by Tamedia journalists. Written in both German and French, and localized to each of Switzerland’s 2,222 municipalities, Tobi’s work was read by more than 100,000 people.
Tobi’s byline was accompanied by a robot emoji.
But while a bot may have won the initial byline, Tamedia eventually added the journalists who worked on Tobi. There’s also a note on the bottom of Tobi’s pieces that explains how the bot works.
“Transparency about the process is very important,” said Titus Plattner, an innovation project manager and investigative reporter at Tamedia. His team had discussed how best to disclose the use of automation — an evolving process — and how users could give their feedback on the technology. “We are proud about the fact that we state that we use this automatic generation of text, but we also want to make clear to the reader that behind the templates we had experienced reporters.”
Automation has already enabled media outlets to produce massive amounts of content, providing reporters much-needed time for more in-depth journalism. (Tamedia estimates it would have taken a traditional human journalist at least 800 working days to produce the same volume of stories that Tobi did.) But norms surrounding how media organizations should disclose their use of automated content production are still developing. And many of them, including the largest legacy institutions, are eager to use artificial intelligence.
The Associated Press has been producing automated articles based on earnings reports since 2014. In March, the wire service announced that it will be adding hockey games to its automated coverage. The Washington Post started using automation to produce reporting on sports and congressional and gubernatorial elections in 2016.
Meanwhile, tech firms such as Narrative Science and Automated Insights are developing natural language generation platforms that can be used by media outlets. Both Tobi’s vote coverage and the Associated Press’ earnings reports rely on Automated Insights’ Wordsmith, a tool that allows users to construct templates that, when connected to a data set, automatically generate text.
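To make the template-plus-data idea concrete, here is a minimal sketch in Python. It is not Wordsmith's interface or Tamedia's actual templates: the template wording, the `render_result` function, and the sample vote counts are all invented for illustration.

```python
# Hypothetical templates; real newsroom templates are written by journalists
# and are far more varied than these two sentences.
TEMPLATES = {
    "approved": "Voters in {municipality} approved the measure, with {yes_pct}% voting yes.",
    "rejected": "Voters in {municipality} rejected the measure, with {no_pct}% voting no.",
}


def render_result(row):
    """Turn one row of vote data into a sentence by picking and filling a template."""
    total = row["yes"] + row["no"]
    if total == 0:
        raise ValueError("no votes recorded for " + row["municipality"])
    yes_pct = round(100 * row["yes"] / total, 1)
    no_pct = round(100 - yes_pct, 1)
    # The data decides which template applies; a tie is treated as a rejection here.
    key = "approved" if row["yes"] > row["no"] else "rejected"
    return TEMPLATES[key].format(
        municipality=row["municipality"], yes_pct=yes_pct, no_pct=no_pct
    )


# Invented sample data, one row per municipality.
rows = [
    {"municipality": "Town A", "yes": 400, "no": 600},
    {"municipality": "Town B", "yes": 750, "no": 250},
]
for row in rows:
    print(render_result(row))
```

Running the same function over thousands of rows is what lets a small set of human-written templates yield thousands of localized articles.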
Newsrooms must introduce readers to these technologies. Expanding media use of automation comes at a time when trust in the media is still recovering from its historic low in 2016, according to Gallup. While existing research on audience reception to automation in journalism has been limited, a few studies indicate that readers are, at the very least, wary. In 2017, University of Southern Mississippi professor Mary Lou Sheffer found that journalism produced by algorithms was unpopular among readers.
“Credibility was a huge issue. It all hinged on lack of the human element,” said Sheffer. Respondents expressed concern that robo-journalism might sacrifice emotion and the human aspects of writing.
In 2016, German researchers found that articles attributed to human authors were seen more positively, regardless of whether the text was actually produced by a human or computer. Yet, when unaware of the true author, respondents believed computer-written texts were more credible, and demonstrated more expertise, than the work of human authors. That result indicates that attribution, and not the quality of the content, could be automated journalism’s bigger hurdle. (In that vein, OpenAI, the nonprofit AI research firm founded by Elon Musk and Sam Altman, took the rare step of not releasing its findings on its AI text generator because it was so effective that its creators considered it dangerous.)
Interestingly, Dutch researchers found, in a survey of Europeans, that readers, when considering just the identity of an author, saw no difference between human, automated, or hybrid sources. When presented with the actual content of articles produced by human, automated, or hybrid sources, readers also perceived no difference in credibility — if the subject was finance. When the subject was sports, readers found automated content to be more credible.
A 2016 study by Israeli researchers Tal Montal and Zvi Reich found that some outlets were fully transparent about their use of automation, which could mean revealing a data source and describing the software vendor used, or explaining that the piece was written by an algorithm and providing developers’ names. However, other outlets were less transparent, sometimes providing no byline, or attributing automated content to the news organization as a whole.
Kelly McBride of the Poynter Institute has said that what’s most important is making clear that an article is produced through automation, and naming who readers can contact. “If you view the byline through the lens of responsibility and accountability, then it’s more important to identify the people who are responsible. It could be the reporter, it could be someone who wrote the software,” says Nicholas Diakopoulos, a communications professor at Northwestern. “I think attributing to the data provider [also] makes a lot of sense.” He adds that the question of the byline gets more complex when human journalists supplement a story with more reporting, or when a local outlet adapts a wire service’s automation-produced copy.
Sheffer added that readers sometimes use bylines to come to their own interpretation of the authors’ biases. She explained, “If I’m reading something that’s generated by a computer, and I have no idea who the programmer is, I have no idea how the story is slanted.”
These technologies make fewer errors than human reporters do, a positive sign for outlets hoping to increase audience trust. According to Plattner, Tobi’s one mistake during the cattle dehorning referendum was caused by a small and easily fixed issue with the templates. The AP has also said that the earnings reports produced through automation contain fewer mistakes than those written manually. “When we have seen errors crop up in automated content, it’s oftentimes due to poor quality data. If there’s an error in the data, that’s just going to get translated through the system and end up as an error in the content that’s produced,” said Diakopoulos.
Like Tobi, the AP’s earnings reports aren’t individually checked before publication, though Automated Insights says that editors play an aggressive role in creating, checking and signing off on the automation’s text architecture before it’s finalized. By contrast, the L.A. Times’ earthquake-reporting QuakeBot, which uses data from the U.S. Geological Survey, still requires that a human approve the publication of every piece. “The post is drafted and posted to the content management system, but not published. Then a notification is [automatically] sent in Slack to our copy desk, who have their own channel, and that tells them QuakeBot has detected a new earthquake [and to] please go review the post,” explains Ben Welsh, the L.A. Times’ data desk editor. The post is then reviewed, and sometimes edited. But the pieces aren’t always published, since editors don’t find every earthquake QuakeBot reports newsworthy.
For Heliograf, the automated journalism tool used by the Washington Post to produce election and sports content (among other uses), quality control occurs at multiple steps. As with Tobi, editors extensively review the templates, and also test the system with trial data.
The Post uses numeric thresholds to find data that needs a second look. “Did the data provider leave off a decimal point? Did they just accidentally leave off one of the digits in the score? It is unlikely, for example, that the women’s 100 meter sprints champion is going to do 10 percent better than the Olympic record, or 10 percent worse. That way we can guard against errors in the data by just saying, ‘this is the boundary of normality,’” explains Jeremy Gilbert, the Post’s director of strategic initiatives. After publication, the Post also samples automation-produced articles to double-check.
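A threshold check of the kind Gilbert describes might look something like the sketch below. It is an assumption-laden illustration, not the Post's code: the function name `needs_review`, the 10 percent default tolerance, and the reference time of 10.6 seconds are all placeholders chosen for the example.

```python
def needs_review(value, reference, tolerance=0.10):
    """Return True if value differs from reference by more than the given fraction.

    A True result does not mean the number is wrong, only that it falls
    outside the "boundary of normality" and deserves a second look.
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    return abs(value - reference) / reference > tolerance


# Illustrative reference time in seconds; a placeholder, not an official record.
REFERENCE_TIME = 10.6

print(needs_review(10.7, REFERENCE_TIME))  # plausible result, within 10 percent
print(needs_review(107.0, REFERENCE_TIME))  # decimal point dropped by the data provider
print(needs_review(1.07, REFERENCE_TIME))  # decimal point shifted the other way
```

Values that trip the check would be held for a human to verify rather than published automatically; the right tolerance depends on how much the underlying numbers normally vary.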
For now, outlets’ bots are relatively simple, and tend to feed from a single, external data source. Still, automated journalism has already demonstrated that it can wield significant influence. The massive volume of automated content produced by the AP, which relies on financial information from Zacks Investment Research, has managed to impact stock trading, according to a Stanford study.
As these technologies get more sophisticated, and their subject matter more complex, outlets should consider whether their automation tools could produce libelous content, according to researchers from Northwestern, the University of Oregon, and the University of Minnesota. It would likely be difficult for public figures to win libel cases against media organizations, since their attorneys, in order to prove actual malice, would have to demonstrate that a programmer knew, or should have known, that their algorithm would produce false statements harmful to someone’s reputation. The authors speculate that a public figure could possibly succeed by “introduc[ing] evidence that the editor or publisher knew, or should have known, that the algorithm was not foolproof but published the content without editorial review,” but caution that the plaintiff might still have to prove malice.
More concerning for media organizations, the authors argue, might be private citizens who pursue libel cases, who often just need to prove that outlets were negligent. Negligence could be “failing to properly clean input data or fact-check the algorithms’ output” that implicates private individuals.
In the meantime, Tamedia’s journalists are preparing Tobi to pump out thousands more articles in May. The bot will report on votes related to fiscal policy and gun regulation, as well as the Grand Prix von Bern, a popular ten-mile race in Switzerland. If all goes well, Tobi should more than keep pace.
Rebecca Heilweil is a journalist and researcher based in New York. She has written for Wired, Slate, and the Wall Street Journal. Find her on Twitter @rebeccaheilweil.
Production Details
V. 1.0.1
Last edited: April 24, 2019
Author: Rebecca Heilweil
Editor: Alexander Zaitchik
Artwork: Photo by Alex Knight on Unsplash