AI is the new horizon for news
Augmented Newsroom, Conversational Journalism, and Algorithmic Accountability
In January 1818, Mary Shelley published the first edition of Frankenstein; or, The Modern Prometheus. Her work became the template for all modern science fiction.
Two centuries later, we still have young and promising scientists such as Victor Frankenstein and we still have ‘creatures’ generated by humans. But today, Shelley would call these creatures artificial intelligence, machine learning, voice AI, and big data.
The first difference from Shelley’s novel is that there is no longer a single ‘creature’: today, we have millions of creatures that we must live with. The second difference is that Shelley’s creature kills itself at the end of the story, but we cannot rationally expect that Facebook, Google, Microsoft, Apple, Baidu, Toutiao, and Tencent will kill theirs.
These global tech companies are the post-modern Prometheus, but how can we distinguish the myth, the fantasy, and the reality behind the AI creatures they brought to life? What impact will they have on news media and journalists?
The McKinsey Global Institute predicts that automation technologies will have an economic impact of between $5.2 trillion and $6.7 trillion by 2025. How can we imagine that the news industry will not be affected by such a revolution?
Let’s start with a first assertion: if driverless cars already exist in 2018, journalist-less news will happen very soon. We don’t know where this 100 percent automated news media will be created first (China, the USA, Japan, Europe, or an emerging country), but it will happen, and it will be quite different from ancestors such as Google News, Krishna Bharat’s baby, and other aggregators of articles written by human beings. My goal is not to play Cassandra, but to understand what the implications for our profession will be.
One obvious myth is that more machines and algorithms will mean fewer jobs in newsrooms, which raises questions about the fruits of AI productivity. Even if the transition is challenging for many media workers, innovation transforms jobs and skills by creating new services and products, which in turn leads to more employment opportunities. According to McKinsey again, ‘automation and AI will lift productivity and economic growth, but millions of people worldwide may need to switch occupations or upgrade skills’. That includes journalists. Two Accenture technologists develop the same idea in this op-ed.
The 19th-century Luddites were English workers who destroyed manufacturing machinery, hoping that it would save their jobs. Journalists and editors cannot afford to be this century’s Luddites. Ignoring the development of new technologies is not the solution. Nevertheless, the questions asked by the likes of Stephen Hawking or Elon Musk about humanity’s place in the future, alongside computers and algorithms, are 100 percent valid. We cannot be naive or careless about privacy, mass surveillance, the misinformation age, and the ethics of autonomous decisions.
This is why we, at the Global Editors Network, decided to focus the GEN Summit 2018 (in Lisbon from 30 May to 1 June) on AI and its byproducts: voice AI, news bots, robo-writing software, and integrative intelligence, in addition to the intersections of AI with other tech topics like blockchain and data journalism.
I believe AI will be the catalyst of the third disruption in journalism. Twenty-five years ago, the first disruption was the widespread availability of the Internet and the free access to information. The second one was the rise of the smartphone, which meant a single device with one small screen for news, services, entertainment, and social networks. This third disruption will have the same amplitude as the first two and will potentially change the way we produce and consume news.
Three shifts in less than one generation is no small burden for newsrooms. We understand that it’s exhausting for many journalists and editors, but there is no way to avoid it: It’s time to think about the ‘augmented newsroom’, the new space — virtual and real — where journalists will have to combine machine-written news with in-depth reporting; photojournalism with surveillance camera images; professional videos with user-generated posts.
Here are some things to prepare for the shift from a digital newsroom to a genuine ‘augmented newsroom’:
1. AI is giving the gift of ubiquity to news
Text, audio, and video are no longer just different content types; they are now mediums that can intersect and overlap thanks to text-to-video, speech-to-text technologies and speech-to-speech automated translation: any article can become a video and any audio or video report can be transformed into a text piece (when it is relevant).
A first example is the shift to video, a major trend when you look at figures provided by platforms, particularly YouTube and Facebook. News will not be an ivory tower that’s separate from this trend: legacy media will have to follow users’ habits and offer more visual journalism. Even if video teams are growing in the majority of newsrooms, it will not be enough to satisfy the current video crunch. Nobody believes that ‘text is dead’, but there is huge development potential for companies such as Wibbitz or Wochit in transforming text, without any human intervention, into video segments incorporating voice, images, and video.
Another development of AI will come when automated translation allows a Norwegian reader of Aftenposten to read articles in El País and La Vanguardia on the Catalan elections in the reader’s native language. The next step will be instantaneous translation, but not for a few years!
2. Voice-powered AI is not a tech issue, it is the birth of conversational journalism
Users will be able to ask more of their voice assistants, such as Amazon’s Alexa and Google Home, and machine learning will allow journalists to provide better interaction with the machine. At the moment, interaction is quite limited: you can ask for the latest news, but it is very difficult to get more in-depth information. In a few years, however, we can expect voice search to be as sophisticated as classical search engines: your voice will replace your fingers and voice AI will become an endless source of news and of revenue, especially when it is connected to other IoT devices or smart TVs. Today less than 10 percent of customer interactions involve voice-powered AI, but it will be more than 50 percent in a few years, according to studies from Servion or McKinsey.
My prediction is that by 2020, some newsrooms will set up a newsroom assistant (NA) similar to Alexa or Siri, but based on the newsroom’s data (text, photos, videos, messages, Slack discussions, etc.). The 2020 NA will be a mix of knowbot, chatbot and probot: a great new tool empowering journalists, facilitating human-AI collaboration, and simplifying the daily workflow. Potentially the NA will be used through smart glasses, and young journalists will speak to it as if they were speaking to a colleague or a stringer. Clearly, the goal of this newsroom assistant will not be to produce automated news, but better-informed articles, podcasts, or videos. The next stage for newsroom assistants will be to pilot the production of automated and personalised news.
This means that, for automated news, the traditional AI cycle (data to prediction to decision) will become data to content to conversation.
3. Automated journalism will affect all newsroom departments and it will enhance news personalisation
The future of automated journalism will not be limited to sports, business, or election coverage as it is today at the Associated Press, Reuters, the Guardian, or Le Monde, thanks to pioneers Narrative Science and Automated Insights in the US or Syllabs in France. Robot-writing software today depends on databases of pre-registered texts and a limited number of sentences, but this format will probably be considered media archaeology in a few years. Machine learning, natural language processing, and natural language generation will produce more sophisticated text, while geolocation and behaviour analysis will allow more personalisation. It will become very difficult to distinguish a story written by a journalist from a story produced by a machine, and a tailor-made article from a classical one.
Take the example of an article about the US president: an automated story-writing system will be able to deliver a pro-Trump and an anti-Trump article simultaneously from the same data, but the headline, the tone, the writing, and even the conclusion will be different. In this case, AI will amplify the filter bubble phenomenon on social networks, where you only see things that match your opinion. No doubt it will make marketers very happy and editors very unhappy.
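To make the idea of ‘pre-registered texts and a limited number of sentences’ concrete, here is a minimal sketch of template-based story writing. It is an illustration only, not how any of the companies named above actually build their systems; the team names, the field names, and the three sentence templates are invented for the example.

```python
# A minimal sketch of template-based "robot writing".
# All names and data below are hypothetical, chosen only for illustration.

# The "database of pre-registered texts": one sentence per outcome.
TEMPLATES = {
    "win": "{home} beat {away} {home_goals}-{away_goals} on {date}.",
    "loss": "{home} lost {home_goals}-{away_goals} to {away} on {date}.",
    "draw": "{home} and {away} drew {home_goals}-{away_goals} on {date}.",
}


def pick_template(home_goals, away_goals):
    """Choose which pre-written sentence fits the result."""
    if home_goals > away_goals:
        return "win"
    if home_goals < away_goals:
        return "loss"
    return "draw"


def write_story(match):
    """Fill the chosen template with the structured match data."""
    key = pick_template(match["home_goals"], match["away_goals"])
    return TEMPLATES[key].format(**match)


match = {
    "home": "Lisbon FC",       # invented team
    "away": "Porto United",    # invented team
    "home_goals": 2,
    "away_goals": 1,
    "date": "30 May",
}
print(write_story(match))
```

The limits are easy to see: the system can only say what its templates already contain, which is why this format may soon look dated next to machine-learning approaches.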
Take the example of an obituary. A human author will be rather personal, but the machine will produce a more detailed biography with more photos, quotes, and comments updated in real time, or even an interactive story based on dates, locations, and topics.
Interviews are another example. Today, journalists prepare their own questions using search engines, video platforms, and digital archives. Tomorrow, they will rely on a robot to find more in-depth questions linked to the very latest news. Additionally, the same journalist who can today manage one major interview per day will be able to write two or three if he or she works with an automated transcription service.
A last example concerns comments: Jigsaw (formerly Google Ideas) launched Perspective, ‘a new tool for web publishers to identify toxic comments that can undermine a civil exchange of ideas.’ The New York Times, The Guardian, and The Economist partnered with Jigsaw to help filter comments much more efficiently than dozens of moderators could.
4. AI will improve engagement when combined with UGC and crowdsourcing
Basically, you need a database to produce AI-based narratives. And there are three ways of getting one: it can be produced by your journalists or analysts, hacked or bought from a third party, or generated by your users. This third possibility is the most promising. Your users will provide content not in a disorganised fashion but through a frame defined by the software and remotely curated by a journalist. Not every user will become a content provider, but individuals committed to their community or their activity will be the first targets. It is the role of media organisations to gather those doers and makers, even if turnover will be frequent. Limiting them to filling in forms and templates will also become an issue after a certain period.
5. AI will allow newsrooms to set up a meta-newsroom covering new topics and beats
A newsroom is logically limited by its number of content producers. Some media organisations are more productive than others, but generally, there is a ratio of the number of journalists to the quantity of content produced. With AI, you can get a meta-newsroom that produces a lot of content with a limited number of curators. This will allow newsrooms to broaden coverage on hyperlocal issues like community sports, local union activity, or citizen initiatives. Machine learning will also allow reporters to write better and more in-depth stories.
Today’s best example of AI-powered news is provided by Urbs Media and Press Association for local news in the UK and Ireland. Alan Renwick, co-founder, says that ‘since December 2017, we are publishing dozens of automated news stories in UK local newspapers and websites every day’. In 2019, the system called RADAR will supply a daily dose of stories for every local market in the two countries and the target is 30,000 stories per month. Hopefully, Alan will be with us in Lisbon at the GEN Summit.
Another example is given by John Keefe, developer in the Quartz Bot Studio. For him, new kinds of stories written by humans who understand how to use machine learning were already produced in 2017 and he gave two examples in a recent Nieman Lab article:
- ProPublica’s Jeremy Merrill used machine learning to detect the issues uniquely important to each member of Congress.
- The Atlantic’s Andrew McGill used machine learning to figure out whether Donald Trump is writing his own tweets.
John — another speaker at the GEN Summit — is quite optimistic for the future: “In the new year, we’ll be talking about how often reporters deployed artificial intelligence to land big stories”.
6. Algorithmic accountability and transparency will be essential
AI is not a ‘black box’. Journalists must ask for more accountability and transparency. As algorithms increasingly shape content production, it is apparent that they are new power structures that warrant more scrutiny. Journalists must analyse the biases of algorithms, both to hold their creators accountable for the power they exert and to allow a de-biasing process within data sets. For instance, one notable bias of algorithms is that they reproduce stereotypes about gender, race, and poverty.
Recently, at the 34th Chaos Communication Congress (34C3) in Leipzig, Katharine Jarmul, founder of data analysis company kjamistan, presented the latest trends in adversarial machine learning and showed how people can modify their own photos or videos in order to avoid facial recognition by Facebook or Google, and consequently some forms of mass surveillance. Anonymising data will be a key issue for citizens and data owners, but it contradicts the essence of AI: personalisation. Unfortunately, I’m not sure that the power of the crowd will balance that of hundreds of thousands of talented engineers.
Another issue to solve is transparency: is it important to know whether an article is produced by a robot or a human? In a couple of years, will we have to add a disclaimer at the end of some stories saying that they are ‘machine-written’? This debate is as important for journalists (the production side) as it is for readers (the consumption side).
7. Automated news and automated fake news and videos are two sides of the same coin
The worst is never certain, but what is certain is that mass misinformation will be powered and facilitated by AI. If engineers create automated news today, they will also create automated fake news and fake videos tomorrow. It has in fact already started, according to enquiries into so-called “Russian memes” during the recent US and UK elections.
Fake news is very similar to fake reviews of services, hotels, restaurants, books and products or, more recently, to damaging faked videos (made with FakeApp, an AI-based desktop program that uses face-swapping algorithms) that enable blackmail and/or harassment.
Nicholas Confessore of The New York Times also revealed that fake Twitter followers and fake persuaders provided by bots are becoming common. According to Eric Schneiderman, New York State Attorney General, “the growing prevalence of bots means that real voices are too often drowned out in our public conversation. Those who can pay the most for followers can buy their way to apparent influence”.
Gravwell analysed 22 million comments submitted to the FCC in the US and found that only a small minority of them (17%) were unique; all the other messages were repetitive or were produced and/or distributed through algorithms. Is this the future we want for automated news?
At the moment, the debate around so-called ‘fake news’ is a bit naive: it’s all about trust and how to debunk hoaxes and fabricated information, as if the truth would triumph thanks to an army of human fact-checkers or even automated fact-checking systems. The reality will be different, because the fight will happen on social media, blogs, peer-to-peer communications, and the darknet, not on mainstream or legacy media. Potentially, millions of small fake news items, fake comments and fake accounts will never be checked, and that will be enough to weaken democracies.
Tomorrow, the second inherent risk of AI will not be the loss of data, but the loss of its integrity. Hackers will be able to enter our databases and systems and slightly modify or manipulate the data: maybe just 0.1% of it, but a crucial piece of information. It means that news media might not know whether the integrity of their data has been damaged or altered.
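One common way to notice this kind of silent alteration is to keep a cryptographic fingerprint of a dataset and compare it later. The sketch below is a minimal illustration using Python’s standard library; the records and field names are invented, and it assumes the trusted fingerprint is stored somewhere an intruder cannot also rewrite, which is the hard part in practice.

```python
import hashlib
import json


def fingerprint(records):
    """Return a SHA-256 digest of the records in a canonical JSON form."""
    # sort_keys and fixed separators make the serialisation deterministic,
    # so identical data always yields an identical digest.
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical newsroom dataset (invented values).
records = [
    {"constituency": "A", "votes": 10412},
    {"constituency": "B", "votes": 9877},
    {"constituency": "C", "votes": 15003},
]

# Computed once, when the data is known to be good, and stored separately.
trusted = fingerprint(records)

# Simulate a small, hard-to-spot manipulation of a single value.
tampered = [dict(r) for r in records]
tampered[1]["votes"] += 10

print(fingerprint(records) == trusted)   # unchanged data matches
print(fingerprint(tampered) == trusted)  # altered data does not match
```

A fingerprint only shows that something changed, not what changed or who changed it; finding the altered value still requires a trusted copy to compare against.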
Blockchain
Blockchain is not AI. The former is decentralised and the latter is hyper-centralised. However, media organisations must understand that both will have a similar impact in the coming years because they are consumer-centric.
While the media industry as a whole is still warming up to blockchain, some companies like Hubii and Civil are trying to use it to solve the basic issues of news personalisation, monetisation, misinformation, and fact-checking.
‘Journalism will be one of the first, truly consumer-facing applications of blockchain technology’, said Daniel Sieberg, co-founder of Civil and a GEN Summit speaker in Lisbon at the end of May.
I also agree with David Schlesinger, an advisor at Hubii, who said that ‘blockchain, by disintermediating gatekeeper companies, can put control in the hand of makers and users’. Blockchain will potentially allow content creators to earn more, distributors to pay less, and consumers to have more choices. In other words, blockchain will help us set up a new content marketplace for news.
Last but not least, in a world of mass misinformation, blockchain can become the essential tool for traceability and fact-checking, which is why we also invited Jacobo Toll-Messia, Hubii CEO, to Lisbon.
In conclusion, I want to highlight seven ideas to digest for the year 2018:
- AI is giving the gift of ubiquity to news: any article can become a video and vice-versa, any audio or video report can become text.
- Long live conversational journalism. Applied to news, the AI cycle (data to prediction to decision) will become data to content to conversation.
- Automated journalism will affect all of a newsroom’s departments and it will enhance news personalisation.
- AI will improve engagement when combined with UGC and crowdsourcing. Don’t assume that your users will distrust AI.
- AI will allow newsrooms to set up a meta-newsroom that produces a lot of content with a limited number of curators.
- Algorithmic accountability and transparency will be essential for the development of AI within the journalistic community.
- Mass misinformation will be powered by AI: automated news and automated fake news or purchased bot followers are two sides of the same coin.
All these ideas will be discussed in Lisbon at the GEN Summit 2018 and I hope to see you there. A very Happy New Year to all of you.
Bertrand Pecquerie, Chief Executive Officer, Global Editors Network