Gaining Traction in AI: Machine Learning and Data
AI is the Engine, Data is the Fuel
This is Part 3 of an ongoing series on Gaining Traction in AI, which will end up as a comprehensive report on AI. You can read Part 1, The Economics of AI, and Part 2, Standing in the Shadow of AI. Please sign up here to be alerted about other posts and the report.
Please participate in our newest survey about AI adoption, here. Thank you for your input. It really helps our research. We will share your responses in a future post in this series.
This post is subtitled AI is the Engine, Data is the Fuel, with a nod to the rise of driverless vehicles, perhaps. But another aspect of the engine and fuel analogy is the hard reality: building workable AI applications requires data — a lot of data — before the apps can do any work for us. So before we can delve very deep into AI, we have to start at data.
Start with the Data
It’s commonly understood that teaching an AI a complex behavior, like facial recognition, means first exposing its neural network to a large set of facial images before it can identify people at a level of precision comparable to humans. But someone developing such a system would likely want to know at the outset how many images are needed. In practice, this learning threshold depends on the model used by the Machine Learning (ML) system.
For example, imagine a facial recognition approach based on finding certain nodes in a scan of a face (like the tip of the nose, cheeks, center of the chin) and then computing the relative distances between all these nodes. The number of parameters (the locations of the 20 or so nodes, the set of all the calculated distances, and other factors based on the ML approaches being used) is a starting point for estimating the sample size required.
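To make the parameter count concrete, here is a minimal sketch of the node-and-distance idea. It assumes 2-D landmark coordinates and plain Euclidean distance; the function names and sample points are hypothetical and not taken from any particular facial recognition system.

```python
from itertools import combinations
from math import dist


def landmark_features(landmarks):
    """Return the Euclidean distance for every unordered pair of landmarks."""
    return [dist(a, b) for a, b in combinations(landmarks, 2)]


def feature_count(num_landmarks, dims=2):
    """Count raw coordinates plus one distance per unordered pair of landmarks."""
    coords = num_landmarks * dims
    pairs = num_landmarks * (num_landmarks - 1) // 2
    return coords + pairs


# Three made-up landmarks as (x, y) pixel positions.
sample = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
distances = landmark_features(sample)

# With 20 landmarks: 40 coordinates + 190 pairwise distances.
total_for_20 = feature_count(20)
print(distances, total_for_20)
```

Under these assumptions, 20 nodes already yield 230 numbers per face before any other model factors are counted, which is why the parameter count is only a starting point for sizing the training set.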
But to give a simple answer to “how much data is needed”: in one recent study, a machine learning model required hundreds of thousands of faces to make the facial recognition work.
A Google AI researcher recently discovered two new Earth-like planets by training Google’s machine learning systems on a massive NASA dataset from its Kepler mission.
Kepler observed about 200,000 stars for four years, taking a picture every 30 minutes, creating about 14 billion data points. Those 14 billion data points translate to about 2 quadrillion possible planet orbits! It’s a huge amount of information for even the most powerful computers to analyze, creating a laborious, time-intensive process. To make this process faster and more effective, we turned to machine learning.
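The 14 billion figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes uninterrupted observation and 365.25 days per year, which is a simplification; the constant names are ours.

```python
STARS = 200_000
YEARS = 4
DAYS_PER_YEAR = 365.25          # assumption: average year length
OBSERVATIONS_PER_DAY = 24 * 2   # one picture every 30 minutes

# Observations of a single star over the whole period.
per_star = YEARS * DAYS_PER_YEAR * OBSERVATIONS_PER_DAY

# Observations across all stars.
total_points = STARS * per_star

print(f"{per_star:,.0f} observations per star")
print(f"{total_points / 1e9:.1f} billion data points")
```

Each star contributes roughly 70,000 observations, and multiplying by 200,000 stars lands at about 14 billion, in line with the figure quoted above.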
In another field, but still relying on image-based analysis, Stanford researchers have developed a system called CheXNet to identify pneumonia in x-rays using a public dataset of over 100,000 images. These images are annotated with indicators associated with 14 causative diseases visible to a well-trained radiologist.
And as a third example, Yann LeCun recounts that 60,000 images were required to train a system to identify letters in images with greater than 99% accuracy.
It’s worth noting that until recently the gating factor in developing AIs was compute time: earlier approaches to ML, based on more formal math and less experimental models, could be so complex that processing would take calendar months. New hardware and new ML strategies have cut that time by several orders of magnitude. Similarly, the world is creating more data all the time, so in many fields of interest there is more data than can be productively used. In some areas, the AIs are themselves creating new data: Go-playing and chess-playing AIs improve by playing games against versions of themselves, adding exponentially to the data available for training.
The practical answer to the question of “how much data is needed” is manifold. First of all, people working in this field do not start from scratch when building an AI. They start with some existing framework, or system, and modify it. As a result, they have an empirical understanding of what was involved in the earlier model. And most importantly, modern approaches are iterative and geared towards improving the performance of an AI, one step at a time. As such, they incrementally add more data — for example, more tagged images of faces — cycle by cycle, until the performance gets to the levels desired, like 90%+ facial recognition. Therefore, fewer projects require some calculated threshold of data at the outset, and by the end the amount of data used is both irrelevant and constantly growing.
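The cycle-by-cycle approach described above can be sketched as a simple loop: add a batch of labeled data, retrain, measure, and stop once the target is met. Everything here is a hypothetical stand-in: the data is synthetic, the model is a toy nearest-mean classifier, and the target, batch size, and function names are assumptions chosen for illustration.

```python
import random


def make_samples(n, rng):
    """Synthetic stand-in for labeled data: two noisy one-dimensional classes."""
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        samples.append((rng.gauss(center, 1.0), label))
    return samples


def train(samples):
    """Fit a nearest-mean classifier: one mean value per class."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in samples if y == label]
        # Fall back to 0.0 if this class has not been seen yet.
        means[label] = sum(values) / len(values) if values else 0.0
    return means


def accuracy(means, samples):
    """Fraction of samples whose nearest class mean matches their label."""
    correct = sum(
        1 for x, y in samples
        if min(means, key=lambda label: abs(x - means[label])) == y
    )
    return correct / len(samples)


def grow_until_good(target=0.90, batch=50, max_rounds=20, seed=0):
    """Add one batch per round until held-out accuracy reaches the target."""
    rng = random.Random(seed)
    held_out = make_samples(1000, rng)
    training, history = [], []
    for _ in range(max_rounds):
        training.extend(make_samples(batch, rng))
        score = accuracy(train(training), held_out)
        history.append((len(training), score))
        if score >= target:
            break
    return history


history = grow_until_good()
for size, score in history:
    print(f"{size} training samples -> {score:.3f} held-out accuracy")
```

Note that the loop never computes a required sample size up front. It simply stops when the measured performance is good enough, or when the round limit is reached, which mirrors why a calculated threshold matters less in practice.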
Before They Think, They Have To Learn
What all this boils down to is this: For AIs to be able to learn enough so they can surpass human performance, even as they parrot our behaviors, they need access to enormous volumes of data.
Facial recognition software requires the AI to look at hundreds of thousands of faces just to start the training, and that quickly grows to millions of faces to get the accuracy up to levels where it is useful.
Nonetheless, companies are barreling ahead with facial recognition for a hundred different purposes. Apple has embraced facial recognition as the primary way to unlock your phone for its new iPhone X, and as a means of user identification.
The next step is not merely identifying faces, but reading their expressions. Apple acquired a company (and DEMO alum) called Emotient a few years ago that can do just that.
Others are also working on this challenge. Rana el Kaliouby at Affectiva is exploring how to gauge emotional response through AI facial analysis. One obvious application for this technology is marketing. At a big data conference she described analyzing the meaning of smiles and furrowed brows:
She [Kaliouby] said that her company had analyzed more than two million videos, of respondents in eighty countries. “This is data we have never had before,” she said. When Affectiva began, she had trained the software on just a few hundred expressions. But once she started working with Millward Brown [a global market research company] hundreds of thousands of people on six continents began turning on Web cams to watch ads for testing, and all their emotional responses — natural reactions, in relatively uncontrolled settings — flowed back to Kaliouby’s team.
Affdex [Affectiva’s product] can now read the nuances of smiles better than most people can. As the company’s database of emotional reactions grows, the software is getting better at reading other expressions. Before the conference, Kaliouby had told me about a project to upgrade the detection of furrowed eyebrows. “A brow furrow is a very important indicator of confusion or concentration, and it can be a negative facial expression,” she said. “A lot of our customers want to know if their ad is offending people, or not really connecting. So we kicked off this experiment, using a whole bunch of parameters: should the computer consider the entire face, the eye region, just the brows? Should it look at two eyebrows together, or one and then the other?” By the time Kaliouby arrived in New York, Affdex had run the tests on eighty thousand brow furrows. Onstage, she presented the results: “Our accuracy jumped to over ninety per cent.”
An AI deciding that the specific angle of a person’s brow indicates a deep emotional response to a new pair of sneakers would match or surpass what the most skilled shoe salesperson could do. And it is perhaps more likely to help Nike sell more shoes.
Learning To Drive
Teaching cars to drive themselves also requires massive amounts of data. Building the ‘maps’ that allow autonomous cars to work requires millions of miles of driving video and sensor data, and once on the street, an autonomous car can generate a gigabyte of data per second. Other data, such as communications between cars or weather telemetry, can also be mined for information of importance to driverless vehicles, whether cars, trucks, or delivery drones.
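To put a gigabyte per second in perspective, the short calculation below converts that rate into volume per hour of driving. It assumes the rate is sustained and uses decimal units (1 TB = 1,000 GB); the function name and the eight-hour example are illustrative only.

```python
GB_PER_SECOND = 1        # rate cited in the text
SECONDS_PER_HOUR = 3600
GB_PER_TB = 1000         # assumption: decimal units


def data_volume_tb(hours_driven, gb_per_second=GB_PER_SECOND):
    """Terabytes produced at a sustained data rate over the given driving time."""
    return gb_per_second * SECONDS_PER_HOUR * hours_driven / GB_PER_TB


one_hour = data_volume_tb(1)
eight_hours = data_volume_tb(8)
print(f"{one_hour:.1f} TB per hour, {eight_hours:.1f} TB over eight hours")
```

At that rate a single car would produce several terabytes for every hour on the road, before counting any of the other data sources mentioned above.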
This data falls into various categories: relatively slow-to-change information, like the position of buildings, traffic lights, and trees relative to each other and to the street; and highly variable information, like cars moving through an intersection or pedestrians in a crosswalk. And of course, there are other sorts of information with intermediate levels of variability, like bus schedules or planned construction activities that have a major impact on traffic patterns.
The Value of Data
The same sorts of considerations are likely to apply in widely diverse domains, such as AIs looking for trends in the commodities markets, or supply chain AIs trying to optimize across an ecosystem of part manufacturers, logistics firms, and raw materials prices. In these and other economic niches, AIs will require huge data sets just to get started, and will subsequently need a constant flow of data from the domain of interest to keep fueling their learning. That’s why hospitals, grocery chains, and urban traffic systems will need to install and connect billions of sensors: to provide a baseline of data and an ongoing stream of updates to feed the voracious data hunger of AI-driven activity.
One critical takeaway is that the data to make these AIs run will be incredibly important, and valuable. How the world ultimately deals with the costs and benefits of this explosion of data is perhaps the central question of economics for the near future. We are moving onto a footing where the data to make AIs work may become the foundation of most new economic value, displacing land and energy in importance.