What I’ve learned about AI and why journalists should care

Florencia Coelho
Jun 6 · 7 min read

And why I’m excited about satellite imagery, the environment, and human rights

Deep Solar by Stanford

What we need to understand as journalists is that AI offers a new dimension of opportunities and challenges to build upon computational and data journalism.

It’s not yet clear how useful it will be for investigative journalism, but there are definitely interesting solutions, and I think its use is going to spread in newsrooms around the world, as happened with blogs, video, mobile, social media, and data journalism.

Scientists argue about whether the name Artificial Intelligence is the right one and whether the field should be renamed. Either way, there are several definitions of AI, and I’ll stay with this one from Google Dictionary.

“Artificial Intelligence is the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.”

It’s intrinsically related to the world of computer science (CS) and data science (DS). I’ve heard AI described as “fancy statistics.”

Image: inspired by the original work of

TYPES OF AI

The first recorded use of the term “artificial intelligence” was in 1955, by John McCarthy, who was then at Dartmouth College and later became a professor at Stanford University.

Interested in history? You can navigate a prepared by Stanford’s new

There are two main types of AI you should understand right now. They’re usually called Strong/General AI and Weak/Narrow AI.

Strong or General AI. This type implies that computer systems have consciousness, sentience and self-awareness. They could ideally multitask among different challenges and projects.

This type is NOT happening anytime soon and, from what I heard from scientists and researchers, it’s doubtful it will ever exist.

This is the scary kind of AI you might see in movies, where robots come after the humans. At one conference, I heard that some consultants are raising money for research on General AI by exploiting people’s fear of it.

Weak or Narrow AI. This is the type that has been evolving to solve specific problems. It can reason logically, find patterns, and “learn” within a scope of focus.

This is the kind of AI that is referred to, tested and used in academic, governmental and business areas.

As journalists, we should perhaps be more concerned about the harms associated with Weak AI, such as autonomous weapons and increased automation in specific job markets.

Subdomains of Narrow AI.

From there, different subsets of AI have been developed.

The most relevant for journalism that you’ll probably read about are Machine Learning, Natural Language Processing (NLP), Speech, Vision, Expert Systems, Robotics, etc.

These subdomains have other subsets. For example, Machine Learning can be Supervised or Unsupervised. But you will also read about and deep learning’s neural networks.

DO NOT PANIC!

What journalists need to know is that there are different subsets of those subdomains of AI. They have different goals and they even intertwine within projects and challenges.

Machine Learning is the predominant subdomain and has general-purpose algorithms that are also used in the more specific AI subdomains (e.g., language, speech, vision). Expert systems are an older version of AI but it has been

In this graphic, one of many included in the , you can see how Machine Learning has led AI research papers over the last 15 years.

Scopus is a citation database of peer-reviewed publications

MOMENTUM

Why is AI exploding? What’s happening?

The most basic answer is: data + technology

Data: Large amounts of data are needed to train computer systems and models using AI techniques. Governments and corporations are producing those required amounts. Journalists can work with a large volume of data obtained through leaks, scraping, Freedom of Information Act requests, open data, etc.

Technology: Data storage capacity and computing processing power have increased too. The challenge is money. The financial cost can become a burden, depending on the quantity of data being analyzed. News organizations and universities will probably need to collaborate to help some journalism projects happen.

AI AND JOURNALISM

So, why should we care?

1) To fulfill our journalistic mission.

2) To take advantage of a great opportunity for newsrooms.

Any journalist with a strong mission for public service should understand the challenges AI presents to society.

It’s already happening.

Decision-making algorithms being run on government and corporate projects are producing .

Is your city using facial recognition tools to pursue criminals? What are they doing with the data? How many false positives do they have? How are they preserving privacy rights? Is it a worthwhile balance of security versus privacy? What happens if an abusive government uses the technology for the oppression of dissidents and control of its own followers?

On the other hand, in an era of shrinking newsrooms and competition with digital-only players, a long-term strategy using AI solutions for the different phases of the news cycle sounds like a must-have.

I’ve been collecting inspiring examples for journalists and share some of them below, grouped by the subset of narrow AI they relate to.

Machine Learning (general purpose algorithms)

Supervised (implies labelled training data): “…collected more than 100,000 disciplinary documents. To assist us in identifying those involving sexual misconduct, we then created a computer program based on “machine learning” to analyze each case and, based on keywords, [gave] each a probability rating that it was related to a case of physician sexual misconduct. We then read all the documents in over 6,000 cases to determine the nature of each case and board action. …” (
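To make the idea of a keyword-based “probability rating” more concrete, here is a minimal sketch in Python. It is not the program the reporters built: the training snippets are made up, the function names are hypothetical, and the choice of a small Naive Bayes classifier is an assumption for illustration only.

```python
import math
from collections import Counter

# Hypothetical, hand-labelled snippets (1 = relevant case, 0 = not relevant).
TRAINING = [
    ("patient reported inappropriate touching during exam", 1),
    ("physician engaged in sexual contact with patient", 1),
    ("license suspended for improper prescribing of opioids", 0),
    ("failed to keep adequate medical records", 0),
]


def tokenize(text):
    """Lowercase the text and split it into words."""
    return text.lower().split()


def train(examples):
    """Count how often each word appears in each labelled class."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def probability_relevant(text, model):
    """Return a Naive Bayes estimate that the text belongs to class 1."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in (0, 1):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing so an unseen word/class pair is not zero.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Turn the two log scores into a probability for class 1.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


model = train(TRAINING)
print(probability_relevant("sexual contact with patient", model))
print(probability_relevant("improper prescribing records", model))
```

The rating only ranks documents for review. As in the quoted project, people still have to read the documents to decide what each case is really about.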

Unsupervised: “used unsupervised machine learning to find hidden patterns in a data set of 140,000 human-entered incidents documented by the Gun Violence Archive (GVA). After discovering a host of errors in the initial data, the AP used unsupervised machine learning to simplify the data and flag certain entries for further review without specific guidance” ( p. 10. )
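Here is a rough sketch of what “without specific guidance” can mean in practice: no labels, only similarity between entries. This is not the AP’s method. The entries are invented, the function names are hypothetical, and the technique (a simple greedy clustering by word overlap) is a stand-in chosen because it fits in a few lines.

```python
def tokens(text):
    """Lowercase an entry, drop commas, and return its set of words."""
    return set(text.lower().replace(",", " ").split())


def jaccard(a, b):
    """Share of words two sets have in common (1.0 means identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(entries, threshold=0.5):
    """Group each entry with the first cluster whose first member is similar.

    Returns a list of clusters, each a list of indices into `entries`.
    """
    clusters = []
    for i, entry in enumerate(entries):
        for group in clusters:
            representative = entries[group[0]]
            if jaccard(tokens(entry), tokens(representative)) >= threshold:
                group.append(i)
                break
        else:
            clusters.append([i])  # nothing similar yet: start a new cluster
    return clusters


# Hypothetical human-entered incident descriptions.
ENTRIES = [
    "accidental shooting at home",
    "Accidental shooting, at home",
    "home accidental shooting",
    "drive-by shooting on highway",
    "drive-by shooting highway",
    "unknown",
]

clusters = cluster(ENTRIES)
# Entries that resemble nothing else are flagged for a human to check.
needs_review = [ENTRIES[group[0]] for group in clusters if len(group) == 1]
print(clusters)
print(needs_review)
```

Similar spellings of the same thing collapse into one group, which simplifies the data, and the entries left on their own are the ones worth a second look.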

Natural Language Processing (NLP)

The most comprehensive guide I’ve found during these last months is, in which he states different examples that use custom NLP.

Topic modeling, is about a small group of lawyers and its outsized influence on the U.S Supreme Court. ( J. Stray, 4.1)

Sentiment analysis. The Washington Post’s ( J. Stray, 4.2)
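If sentiment analysis sounds abstract, the simplest version is just counting words. The sketch below is a toy, lexicon-based scorer in Python. The word lists and the function name are made up for illustration, and it is not the custom NLP used in the examples Stray describes.

```python
# Tiny, hypothetical word lists; real lexicons contain thousands of entries.
POSITIVE = {"good", "great", "praised", "success"}
NEGATIVE = {"bad", "failed", "criticized", "scandal"}


def sentiment_score(text):
    """Return a score from -1.0 (all negative) to 1.0 (all positive).

    Texts with no words from either list score 0.0.
    """
    words = [w.strip(".,!?;:\"'").lower() for w in text.split()]
    positive_hits = sum(w in POSITIVE for w in words)
    negative_hits = sum(w in NEGATIVE for w in words)
    total = positive_hits + negative_hits
    if total == 0:
        return 0.0
    return (positive_hits - negative_hits) / total


print(sentiment_score("The plan was praised as a great success."))
print(sentiment_score("The rollout failed and was criticized."))
```

A counter like this misses sarcasm, negation (“not good”), and context, which is one reason newsroom projects tend to build or tune their own models.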

Speech

Speech to text. In project, “…We also captured and stored TV spots and videos posted on Youtube, using Google’s Speech-to-text API to [transcribe] the audio content of these videos to text. By the end of the first round campaign, we had [transcribed] more than 95 hours of videos. …”

I’ve found more examples of different AI subdomains and subsets. You can check them on my, using tag combinations with “journalism.”

What am I excited about? Satellite imagery, the environment, and human rights.

I’m interested in the opportunity to combine satellite imagery and AI tools.

Take a look at some examples.

Environmental projects

Stanford has projects to identify in North Carolina and in the United States.

Stanford.edu

This year’s datathon winners worked to detect, using satellite imagery.

project “…used a deep learning model to search satellite images of 70,000 km² in northern Ukraine for traces of illegal amber mining.”

And Reuters used satellite imagery and AI that provided a first pass. Then their team manually edited the initial results to ensure there weren’t any false positives or missed buildings to.

Human Rights

Amnesty International’s project,, used crowdsourcing and transfer learning to automatically analyze satellite imagery on a country-wide scale in Sudan.

In AP’s “Seafood from Slaves” story, the AP used satellite imagery to secure high-resolution images of sea vessels in Southeast Asia. Reporters gathered critical evidence for an investigative project on abuses in the seafood industry that won a Pulitzer Prize for Public Service in 2016. (Detailed behind the scenes, in, Vision p. 14)

Human Rights Watch wishes to apply a neural network to scale an expert eye and tell the difference between smoke plumes and puffy white clouds,.

This is the end of my journey at Stanford. Now I’m returning to Argentina and LA NACION, where I plan to deepen my understanding of this field, work on related projects, and exchange experiences with my extended data community.

Keep in touch! Twitter @fcoel or fcoelho [at] stanford [dot] edu

BIBLIOGRAPHY

To learn more about opportunities and tools for newsgathering, production, and distribution, these have been my favorite sources.

by F. Marconi and A. Siegman

by Jonathan Stray

by Jonathan Stray (NLP)

with Nicholas Diakopoulos.

(videos and papers)

by Nicholas Diakopoulos

by Cathy O’Neil (algorithm bias)

by Meredith Broussard (expert systems)

by Daniel Kirsch and Julius Tröger (explanation of different machine learning algorithms). It’s in German, but you can use Google Translate.

JSK Class of 2019

Insights and updates from members of the John S. Knight Journalism Fellowships Class of 2019 at Stanford University

Written by Florencia Coelho

JSK Stanford Fellow. Class of 2019. LA NACION (Argentina). #neverstoplearning
