Artificial Intelligence — The New Electricity
These notes are a transcript of Professor Andrew Ng’s Stanford lecture “Artificial Intelligence — The New Electricity.” They have been edited and supplemented with further notes (by me). The lecture itself is written from the perspective of Professor Andrew Ng.
I decided to share this transcript and these lecture notes because of the impact the lecture has had on me. AI has become a new buzzword, but only a few really understand what AI actually is, the current state of its capabilities, and the profound implications AI poses on a societal level. Professor Andrew Ng is one of the few within the field of AI who is bridging the gap between insiders and outsiders through comprehensible, high-level lectures. I am grateful to him for doing so, as I believe this lecture has really opened my mind about AI.
Why do I believe learning about AI is important? As Professor Ng states, and as I am convinced, AI will have a tremendous impact on our lives and the future before us. As a matter of fact, it already does in numerous ways.
Professor Ng speaks about AI as being the new electricity, the application of AI in different industries, and the economic value it has driven. He teaches us about how supervised learning works, how AI is impacting the advertising industry, and how it fits into the bigger picture.
He then proposes a rule of thumb that he gives to many of his product managers as to what AI can currently automate. He admits that the rule is an imperfect one but nonetheless a good reference point. He mentions AI’s current limitations, its speed and performance (in comparison to humans), and the job implications this has for us. Professor Ng then introduces us to neural networks, often described as deep learning, compares them to a human brain, and then provides a great example of how a neural network would work based on an example problem.
Professor Ng explains the state of AI research and the scarcity of talent and data. He then adds his opinion about the hype behind AI and about the concern many have about AI taking over the human race. Further, he expands upon AI’s applicability to speech recognition and face recognition, and on AI’s potential effect on healthcare, finance and education. Professor Ng concludes with the job displacement issue, advocates for universal basic income, and believes that much growth can occur in AI when business-minded individuals connect with computer scientists. Read the lecture below.
Background — Andrew Ng
Professor Andrew Ng is one of the leading thinkers in artificial intelligence, with research focusing on deep learning. He has taught machine learning to over 100,000 students through his online course at Coursera. He founded and led the Google Brain project, which developed massive-scale deep learning algorithms.
He’s currently the VP and chief scientist of Baidu, the co-chairman and co-founder of Coursera, and last but not least, an adjunct professor right here at Stanford University. However, as of March 22nd, 2017, news has appeared about him leaving Baidu to search for his next big AI mission.
What I want to do today is talk to you about AI. Right now I lead a large AI team at Baidu, about 1300 scientists and engineers. I’ve been fortunate to see a lot of AI applications, a lot of research in AI, as well as a lot of uses of AI in many industries and many different products. As I was preparing for this presentation, I asked myself what I thought would be most useful to you. And what I thought I’d talk about is four things.
AI — The New Electricity
I want to share with you what I think are the major trends in AI. Because I guess the title of this talk was AI is the New Electricity. Just as electricity transformed industry after industry 100 years ago, I think AI will now do the same.
Trends in AI
I’ll share with you some of these exciting AI trends that I, and many of my friends, are seeing. I want to discuss with you some of the impact of AI on business. Whether, I guess, to the GSPC and to the Sloan Fellows, whether you go on to start your own company after you leave Stanford, or whether you join a large enterprise, I think that there’s a good chance that AI will affect your work. So I’ll share with you some of the trends for that.
And then I’ll talk a little bit about the process of working with AI. This is some kind of practical advice for how to think about, not just how it affects businesses, but how AI affects specifically products and how to go about growing those products. And then finally, I think in the sign-up for this event, there was a space for some of you to ask some questions, and quite a lot of you asked questions about the societal impact of AI. I’ll talk a little bit about that as well, all right?
AI — The New Electricity
I think on the website the title was listed as AI is the New Electricity. It’s an analogy that we’ve been making for over half a year or something. About 100 years ago, we started to electrify the United States, to develop electric power. And that transformed transportation. It transformed manufacturing, using electric power instead of steam power. It transformed agriculture, it transformed healthcare and so on. And I think that AI is now positioned to have an equally large transformative effect on many industries.
AI in IT Industry
The IT industry, which I work in, is already transformed by AI. Today at Baidu, web search and advertising are all powered by AI. The way we decide whether or not to approve a consumer loan, really that’s AI. When someone orders takeout through the Baidu on-demand food delivery service, AI helps us with the logistics. It routes the driver to your door and helps us estimate how long we think it’ll take to get to your door. So it’s really up and down: both the major services and many other products in the IT industry are now powered by AI, and would be just literally impossible without AI.
AI in Other Industries
But we’re starting to see this transformation of AI technology in other industries as well. I think FinTech is well on its way to being totally transformed by AI. We’re seeing the beginnings of this in other industries as well. I think logistics is part way through its transformation. I think healthcare is just at the very beginning, but there are huge opportunities there. Everyone talks about self-driving cars. I think that will come as well; it will take a little bit of time to land, but that’s another huge transformation.
But I think that we live in a world where just as electricity transformed almost everything almost 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI will transform in the next several years. And maybe throughout this presentation, maybe at the end of doing Q and A, if you can think of an industry that AI won’t transform, okay, like a major industry, not a minor one, raise your hand and let me know. I can just tell you now, my best answer to that. Sometimes my friends and I actually challenge each other to name an industry that we don’t think would be transformed by AI. My personal best example is hairdressing, cutting hair. I don’t know how to build a robot to replace my hairdresser. Although I once said this same statement on stage and one of my friends, who is a robotics professor, was in the audience. And so my friend stood up, and she pointed at my head, and she said, Andrew, for most people’s hairstyles, I would agree you can’t build a robot. But for your hairstyle, Andrew, I can.
Economic Value of AI
Despite all this hype about AI, what is AI doing? What can AI really do? It’s driving tremendous economic value, easily billions. At least tens of billions, maybe hundreds of billions of dollars worth of market cap.
But what exactly is AI doing? It turns out that almost all of this ridiculously huge amount of value of AI, at least today, and the future may be different, but at least today almost all this massive economic value of AI is driven by one type of AI, by one idea. The technical term for it is Supervised Learning. What that means is using AI to figure out a relatively simple A to B mapping, or A to B response. Relatively simple A to B, or input to response, mappings. So, for example, given a piece of email, I ask you to tell me if this is spam or not. Given an email, output 0 or 1 to tell me if this is spam or not, yes or no? This is an example of a problem where you have an input A, an email, and you want a system to give you a response B, 0 or 1. And this today is done with Supervised Learning. Or, given an image, tell me what the object in this image is, out of maybe a thousand objects or 10,000 objects. Just try to recognize it. So you input a picture and output a number from, say, 1 to 1,000 that tells you what object this is. This, AI can do. Some more interesting examples: when you’re given an audio clip, maybe you want to output the transcript. Input an audio clip and output the text transcript of what was said, so that’s speech recognition.
The way that a lot of AI is built today is by having a piece of software learn, I’ll say exactly in a second what I mean by the word “learn”, what it means for a computer to learn, but a lot of the value of AI today is having a machine learn these input to response mappings. Given a piece of English text, output the French translation. I talked about going from audio to text or maybe you want to go from text and have a machine read out the text in a very natural-sounding voice.
It turns out that the idea of supervised learning is that when you have a lot of data, A and B both, then today, a lot of the time, we have very good techniques for automatically learning a way to map from A to B. For example, if you have a giant database of emails, as well as annotations of what is spam and what isn’t spam, you could probably learn a pretty good spam filter. I’ve done a lot of work on speech recognition. If you have, let’s say, 50,000 hours of audio, and if you have the transcript of all 50,000 hours of audio, then you could do a pretty good job of having a machine figure out what the mapping between audio and text is.
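As a supplementary illustration (not part of the lecture), here is a minimal Python sketch of learning an A to B mapping from annotated examples, using the spam filter case. The six training emails are made up, and the learning method is a tiny Naive Bayes word-count model chosen for brevity; it is an assumption for illustration, not the technique used in any production spam filter.

```python
import math
from collections import Counter

# Hypothetical toy training set: (input A = email text, response B = 1 for spam, 0 for not spam).
TRAINING_DATA = [
    ("win a free prize now", 1),
    ("free money click now", 1),
    ("claim your free prize", 1),
    ("meeting moved to monday", 0),
    ("lunch with the team on monday", 0),
    ("please review the meeting notes", 0),
]


def train(examples):
    """Count words per label; these counts are everything the model 'learns'."""
    word_counts = {0: Counter(), 1: Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set(word_counts[0]) | set(word_counts[1])
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return 1 (spam) or 0 (not spam) for a new input A."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label in (0, 1):
        # Log prior: how common this label was in the training data.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen word/label pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0


model = train(TRAINING_DATA)
print(predict(model, "free prize now"))                  # classified as spam (1)
print(predict(model, "notes from the monday meeting"))   # classified as not spam (0)
```

Nothing in the code says which words signal spam. That comes entirely from the annotated (A, B) pairs, which is why more annotated data generally helps.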
The reason I want to go into this level of detail is because despite all the hype and excitement about AI, it’s still extremely limited today, relative to what human intelligence is. And clearly you and I, every one of us can do way more than figure out input to response mappings. But this is driving incredible amounts of economic value, today.
AI and Ad Relevance
Just one example. Given some information about an ad, and about a user, can you tell me whether the user will click on this ad? Leading Internet companies have a ton of data about this, because we have shown people some number of ads and we saw whether they clicked on them or not. So we have incredibly good models for predicting whether a given user will click on a particular ad. This is actually good for users because you see more relevant ads, and this is incredibly lucrative for many of the online advertising companies. This is certainly one of the most lucrative applications we have today, possibly the most lucrative, I don’t know.
Fitting AI into Bigger Application
At Baidu, we’ve worked with a lot of product managers. And one question that I got from a lot of product managers is, when you’re trying to design a product, how can you fit AI into some bigger product? Do you want to use this for a spam filter? Do you want to use this to maybe tag your friends’ faces?
Or do you want to use this, where do you want to build speech recognition in your app? But can AI do other things as well? Where can you fit AI into a bigger product or a bigger application? Some of the product managers I was working with were struggling to understand what can AI do and what can’t AI do. I’m curious, how many of you know what a product manager is or what a product manager does? Okay good, like half of you. I asked the same question at an academic AI conference and I think only about one fifth of the hands went up, which is interesting.
Just to summarize, in the workflow of a lot of tech companies, it’s the product manager’s responsibility to work with users and look at data to figure out what product users desire, to design the features and sometimes also the marketing and the pricing as well. But let me just say design the features and figure out what the product is supposed to do. For example, should you have a like button or not? Do you try to have a speech recognition feature or not? It’s really to design the product. Then you give the product spec to engineering, which is responsible for building it. That’s a common division of labor in technology companies between product managers and engineers. Product managers, when I was working with them, were trying to understand what can AI do?
Rule of Thumb
Anything a typical human can do with ≤1 sec of thought, we can probably now or soon automate with AI.
There’s this rule of thumb that I gave many product managers, which is that anything that a typical human can do with, at most, one second of thought, we can probably now or soon automate with AI. And this is an imperfect rule. There are false positives and false negatives with this heuristic, so the rule is imperfect, but we found it to be quite helpful. Today, actually at Baidu, there are some product managers running around looking for tasks that they could do in less than a second and thinking about how to automate them. I have to say, before we came up with this rule, they were given a different rule by someone else. Before I gave this heuristic, someone else told the product managers, assume AI can do anything. That actually turned out to be useful. Some progress was made with that heuristic, but I think this one was a bit better. A lot of these things on the left you could do with less than a second of thought. One of the patterns we see is that there are a lot of things that AI can do, but AI progress tends to be fastest if you’re trying to do something that a human can do. For example, build a self-driving car, right? Humans can drive pretty well, so AI is making actually pretty decent progress on that. Or diagnose medical images. If a human radiologist can read an image, the odds of AI being able to do that in the next several years are actually pretty good.
Feasibility, Data, Insights
There are some examples of tasks that humans cannot do. For example, I don’t think, well, very few humans can predict how the stock market will change, right? Possibly no human can. It is much harder to get an AI to do that as well. And there are a few reasons for that. First is that if a human can do it, then you’re at least guaranteed that it’s feasible. If a human can’t do it, like predict the stock market, maybe it’s just impossible, I don’t know. A second reason is that if a human can do it, you can usually get data out of humans. We have doctors that are pretty good at reading radiological images. And so if A is an image and B is a diagnosis, then you can get these doctors to give you a lot of data, give you a lot of examples of both A and B, right? So for things that humans can do, you can usually pay people, hire people or something, and get them to provide a lot of data most of the time. Finally, if a human can do it, you can use human insight to drive a lot of progress. Say an AI makes a mistake diagnosing a certain radiology image, like an x-ray image. Then if a human can diagnose this type of disease, you can usually talk to the human, get some insights about why they think this patient has lung cancer or whatever, and try to code that into an AI.
One of the patterns you see across the AI industry is that progress tends to be faster when we try to automate tasks that humans can do. And there are definitely many exceptions, but I see so many dozens of AI projects and I’m trying to summarize trends I see. They’re not all 100% true, but 80 or 90% true. So for a lot of projects, you find that if the horizontal axis is time and this is human performance, in terms of how accurately you can diagnose x-ray scans or how accurately you can classify spam email or whatever, then over time the AI will tend to make rapid progress until you get up to human-level performance. And if you ever surpass it, very often your progress slows down because of these reasons. And so this is great, because this gives AI a lot of space to automate a lot of things.
The downside to this is the jobs implication. If we’re especially good at doing whatever humans can do, then I think AI software will be in direct competition with a lot of people for a lot of jobs. I would say probably already a little bit now, but even more so in the future. I’ll say a little about that later as well. The fact is that we’re just very good at automating things people can do, and we’re actually less good at doing things people can’t do. That actually makes the competition between AI and people for jobs laborious.
Why Is AI Only Now Taking Off?
Let me come back to the AI trends. I bet some of you will be asked by your friends afterward, what’s going on in AI? And I hope to give you some answers that let you speak intelligently to others about AI. It turns out many of the ideas about AI have been around for many years, frankly, several decades.
But it’s only in the last several years, maybe the last five years, that AI has really taken off. So why is this? When I’m asked this question, why is AI only now taking off? There’s one picture that I always draw. So I’m going to draw that picture for you now.
On the horizontal axis, I plot the amount of data, and on the vertical axis, I plot the performance of our AI system. It turns out that several years ago, maybe ten years ago, we were using earlier generations of AI software, earlier generations of the most common machine learning algorithms, to learn these A to B mappings. Let me call these traditional machine learning algorithms. It turns out that for the earlier generations of machine learning algorithms, even as we fed them more data, their performance did not keep on getting better. It was as if beyond a certain point, they just didn’t know what to do with all the additional data you were now giving them. And here by data, I mean the amount of A, B data, with both the input A as well as the target B that you want to output. And what happened over the last several years is that, because of Moore’s law and also GPUs, maybe especially GPU computing, we finally have been able to build machine learning pieces of software that are big enough to absorb these huge data sizes that we have.
What we saw was that, if you feed your data into a small neural network, I’ll say a little bit later what a neural network is, but it’s an example of machine learning technology. If you’ve heard the term deep learning, which is working really well but is also a bit overhyped, neural network and deep learning are roughly synonyms. Then with a small neural network, the performance looks like that. If you build a slightly larger neural net, the performance looks like that. And it’s only if you have the computational power to build a very large neural net that your performance kind of keeps on going up. What this means is that in today’s world, to get the best possible performance, in order to get up here, you need two things.
First, you need a ton of data. And second, you need the ability to build a very large neural network. And large is relative, but because of this I think the leading edge of AI research, the leading edge of neural net research is today shifting to supercomputers, or HPCs, or high performance computers or super computers.
So in fact today, the leading AI teams tend to have this org structure where you have an AI team with some machine learning researchers (abbreviated to ML) and some HPC, or high performance computing or supercomputing, researchers working together to build the really giant computers that you need in order to hit today’s levels of performance. I’m seeing more and more teams that kind of have an org structure like this. And the org structure is organized like this because, frankly, some of the things we do at Baidu, for example, require such specialized expertise in machine learning and such specialized expertise in HPC that there’s no one human on this planet that knows both subjects to the levels of expertise needed.
Neural Network vs. Human Brain
In the questions that some of you asked on the website when signing up for this event, some of you asked about evil AI killer robots taking over humanity and so on. People do worry about that. So to kind of address that, I actually want to get just slightly technical and tell you what a neural network is. A neural network is loosely inspired by the human brain. That analogy I just made is so easy for people like me, right, to make to the media, that it tends to make people think we’re building artificial brains, just like the human brain. The reality is that today, frankly, we have almost no idea how the human brain works. So we have even less idea of how to build a computer that works just like the human brain. And even though we like to say neural networks are a little bit like the brain, they are so different that I think we’ve gone past the point where that analogy is still that useful. It’s just that maybe we don’t have a better analogy right now to explain it.
What is a Neural Network?
Let me actually tell you what a neural network is, and I think you’ll be surprised at how simple it is. Let me show you an example of the simplest machine learning problem, which is, let’s say you have a data set where you want to predict the price of a house. You have the data set where the horizontal axis is the size of the house, and the vertical axis is the price of the house, square feet, dollars. So you have some data set like this. What do you do? You fit a straight line to this.
This can be represented by a simple neural network, where you input the size and you output the price. This straight line function is represented via a neuron, which I’m going to draw in pictures as a little circle. And if you want a really fancy neuron, maybe it’s not just fitting a straight line. If you’re smart, you realize that price should never be negative or something, so maybe you don’t want it to be negative. But as a first approximation, let’s just say it’s fitting a straight line. This is maybe the simplest possible neural network: one input, one output, with a single neuron. So what is a neural network? Well, it’s just taking a bunch of these things and stringing them together.
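As a supplementary illustration (not part of the lecture), the single neuron described above can be sketched in a few lines of Python. The four houses are made-up numbers chosen to lie exactly on a line, and the line is fitted with closed-form least squares for brevity; real neural networks are usually trained with iterative methods instead, so treat the fitting step as an assumption of this sketch.

```python
# Hypothetical (size in square feet, price in dollars) pairs, chosen to lie exactly on a line.
HOUSES = [(1000, 200_000), (1500, 300_000), (2000, 400_000), (2500, 500_000)]


def fit_line(points):
    """Ordinary least squares for price = weight * size + bias."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    weight = covariance / variance
    bias = mean_y - weight * mean_x
    return weight, bias


def neuron(size, weight, bias):
    """One input, one output; max(0, ...) keeps the predicted price from going negative."""
    return max(0.0, weight * size + bias)


weight, bias = fit_line(HOUSES)
print(weight, bias)                # slope in dollars per square foot, and intercept
print(neuron(1800, weight, bias))  # predicted price for an 1,800 sq ft house
print(neuron(-500, weight, bias))  # a nonsensical input is clipped at 0 instead of going negative
```

The `max(0.0, ...)` step is the "price should never be negative" refinement; without it the neuron is just the straight line.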
Instead of predicting the price of a house just based on the size, maybe you think that the price of a house actually depends on several things. First, there’s the size, and then there’s the number of bedrooms. And depending on the square footage and the number of bedrooms, this tells you what family size this can comfortably support. Can this support a family of two, a family of four, a family of six, whatever. Based on the zip code of the house, as well as the average wealth of the neighborhood, maybe this tells you about the school quality. We have two little neurons, one that tells us the family size a house can support and one that tells us the school quality, and maybe the zip code also tells us how walkable this is. If I’m buying a house, ultimately what I care about is: can it support my family size, is this a walkable region, what’s the school quality. So let’s take these things and string them into another neuron, another linear function that then outputs the price.
This is a neural network, and one of the magics of a neural network is this. I gave this example as if, when we’re building this neural network, we have to figure out that family size, walkability and school quality are the three most important things that determine the price of a house. As I drew this neural network, I talked about those three concepts. Part of the magic of the neural network is that when you are training one of these things, you don’t need to figure out what the important factors are. All you need to do is give it the input A and the response B, and it figures out by itself what all of these intermediate things are that really matter for predicting the price of a house. And part of the magic is that when you have a ton of data, when you have enough data, A and B, it can figure out an awful lot of things by itself.
I’ve taught machine learning for a long time. I was full-time faculty at Stanford for over a decade, and now I’m still adjunct faculty in the CS department. But whenever I teach people the mathematical details of a neural network, often I get from the students almost a slight sense of disappointment. Like “is this really this simple, you gotta be fooling me”, but then you implement it and it actually works when you feed it a lot of data. Because all the complexity, all the smarts of the neural network, comes from us giving it tons of data, maybe tens of thousands or hundreds of thousands or more of houses and their prices, and only a little bit of it comes from the software. Writing the software is really not that easy, but the software is only a piece of what the neural network kind of knows. The data is a vastly larger source of information for the smarts of the neural network than the software that we have to write.
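As a supplementary illustration (not part of the lecture), here is a minimal Python sketch of the small house-price network described above: three intermediate neurons strung into one output neuron. Every weight is a hand-picked, made-up value, and `zip_score` and `wealth_score` are hypothetical numeric stand-ins for the zip code and neighborhood wealth inputs. In a trained network these weights would be learned from (A, B) data rather than written by hand, and the intermediate neurons would not necessarily correspond to named concepts.

```python
def relu(value):
    """Clip negative values to zero."""
    return max(0.0, value)


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, clipped at zero."""
    return relu(sum(x * w for x, w in zip(inputs, weights)) + bias)


def predict_price(size_sqft, bedrooms, zip_score, wealth_score):
    # Hidden layer: each neuron only sees the inputs the lecture pairs it with.
    # All weights below are made-up illustration values, not learned ones.
    family_size = neuron([size_sqft, bedrooms], [0.001, 0.5], 0.0)
    walkability = neuron([zip_score], [1.0], 0.0)
    school_quality = neuron([zip_score, wealth_score], [0.5, 0.5], 0.0)
    # Output neuron: another linear function of the three intermediate values.
    return neuron(
        [family_size, walkability, school_quality],
        [50_000.0, 20_000.0, 30_000.0],
        0.0,
    )


# Hypothetical house: 2,000 sq ft, 4 bedrooms, zip_score 8, wealth_score 6.
print(predict_price(2000, 4, 8, 6))
```

Training replaces the hand-written numbers with values fitted to data, which is the step that lets the network work out the useful intermediate quantities by itself.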
AI Research Community
One of the implications of this is, when you think about building businesses, when you think about building products or businesses, what is the scarce resource? If you want to build a defensible business that deeply incorporates AI, what are the moats? Or how do you build a defensible business in AI?
Today, we’re fortunate that the AI community, the AI research community is quite open. Almost all, maybe all of the leading groups, tend to publish our results quite freely and openly. And if you read our papers at Baidu, we don’t hold anything back. If you read our state of the art speech recognition paper, our state of the art face recognition paper, we really try to share all the details. And we’re not trying to hide any details. Many leading researchers in AI do that, so it’s difficult to keep algorithms secret anyway. So how do you build a defensible business using AI? I think today, there are two scarce resources. One is data, it’s actually very difficult to acquire huge amounts of data, right, A, B.
Maybe to give you an example, well, a couple of examples. Speech recognition: I mentioned just now we’ve been training on 50,000 hours of data. This year, we expect to train on about 100,000 hours of data. That’s over 10 years of audio data, right? So literally, if I pull out my laptop and start playing audio to you to go through all the data our system listens to, we’ll still be here listening until the year 2027. This is a massive amount of data that is very expensive to obtain.
Or take face recognition. We’ve done work on face recognition. So to say some numbers, the most popular academic computer vision benchmark slash competition has researchers work on about 1 million images, and the very largest academic papers in computer vision work on maybe 15 million images, of the kind of recognizing objects from pictures or whatever. At Baidu, to train our really leading edge, possibly best in the world, but I can’t prove that, definitely very, very good face recognition system, we train it on 200 million images. This scale of data is very difficult to obtain. And I would say that, honestly, if I were leading a small team of five or ten people, I would have no idea, frankly, how to replicate this scale of data and build a system like we’re able to in a large company like Baidu, with access to just massive-scale data sets.
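As a supplementary check of the arithmetic (not part of the lecture), the Python snippet below converts hours of audio into years of continuous playback. It assumes nonstop playback, 24 hours a day, and ignores leap days.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def hours_to_years(hours):
    """Continuous playback time, in years, for a given number of audio hours."""
    return hours / HOURS_PER_YEAR


print(round(hours_to_years(50_000), 1))   # prints 5.7
print(round(hours_to_years(100_000), 1))  # prints 11.4
```

About 11.4 years of nonstop listening is consistent with the "over 10 years" figure quoted above.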
And in fact, at large companies, sometimes we’ll launch products, not for the revenue, but for the data. We actually do that quite often. Often I get asked, can you give me a few examples, and the answer, unfortunately, is usually no, actually. But I frequently launch products where my motivation is not revenue but is actually data, and we monetize the data through a different product.
I would say that today in the world of AI, the most scarce resource is actually talent, because AI needs to be customized for your business context. You can’t just download an open source package and apply it to your problem. You need to figure out where the spam filter fits in your business or where speech recognition fits in your business. In what context, where, can you fit in this AI machine learning thing? And so this is why there is a talent war for AI, because for every company to explore its data, it needs AI talent that can come in to customize the AI, figure out what is A and what is B, where to get the data, and how to tune the algorithm to work for its business context. I’d say maybe that’s a scarce resource today.
And then second is data. It is proving to be a defensible barrier for a lot of AI-powered businesses.
Virtuous Circle of AI
There’s this concept of a virtuous circle of AI that we see in a lot of products as well. You might build a product. For example, we built a speech recognition system to enable voice search, which we did at Baidu. The US search companies have done that, too. So you build the speech recognition system, or whatever, some product, and because it’s a great product, we get a lot of users. The users using the product naturally generate data, and then the data through ML feeds into our product to make the product even better. This becomes a positive feedback loop. That often means that the most successful product, the best product, often has the most users. Having the most users usually means you get the most data, and with modern ML, having the most data often means you can do the best AI, that is, machine learning, and therefore have an even better product. This results in a positive feedback loop into your product. And so when we launch new products, we often explicitly plan out how to drive this cycle as well. I’m seeing pretty sophisticated strategies in terms of deciding how to roll out the product, sometimes by geography, sometimes by market segment, in order to drive this cycle. Now this concept has been around for a long time, but this is really a much stronger positive feedback loop just recently, because of the following reasons.
Traditional AI algorithms work like that, so beyond a certain point, you didn’t need more data. This is data versus performance. I feel like ten years ago data was valuable, but it created less of a defensive barrier because beyond a certain threshold, the data just didn’t really matter. But now that AI works like that, the data is becoming even more important for creating defensible barriers for AI kinds of businesses.
Non-Virtuous Circle of Hype
Robbie was kind enough to take the audience questions from the sign-up form and summarize them into major categories. So he summarized the questions into your major heading categories. One of them was AI society impact. One was your practical questions for AI. One of the headings that Robbie wrote was scared. As in, “will AI take over the human race or kill humans or whatever”? I feel like there is this, a circle of AI. I’m going to call it the non-virtuous circle of hype. When preparing for this talk, I actually went to a thesaurus to look up antonyms, opposites, of the word virtuous, and vile came up. But I thought, vile circle of hype was a bit too provocative. Unfortunately, there is this evil AI hype.
AI take over the world instead of humans, whatever. Unfortunately, some of that evil AI hype, right, fears of AI, is driving funding, because what if AI could wipe out the human race? Then sometimes we have the individuals, or sometimes government organizations or whatever. They now think, well, let’s fund some research, and the funding goes to anti-evil AI. The results of this work drive more hype, and I think this is actually a very unhealthy cycle that a small part of AI communities are getting into.
Unfortunately, I see a small group of people with a clear financial incentive to drive the hype, because the hype drives funding to them. I’m actually very unhappy about this hype. I’m unhappy about it for a couple of reasons. First, I think that there is no clear path to how AI can become sentient. Part of me hopes that there will be a technological breakthrough that enables AI to become sentient, but I just don’t see it happening. It might be that the breakthrough happens in decades. It might happen in hundreds of years. Maybe it’ll happen in thousands of years. I don’t know. I really don’t know.
The timing of technology breakthroughs is very hard to predict. I once made this analogy that worrying about evil AI killer robots today is a little bit like worrying about overpopulation on the planet Mars. I do hope that someday we’ll colonize Mars, and maybe someday Mars will be overpopulated. And some will ask me, “Andrew, there are all these young, innocent children dying of pollution on Mars, how can you not care about them?” And my answer is, “We haven’t landed on the planet yet, so I don’t know how to work productively on that problem.” If you ask me whether I support doing research on almost any subject, I usually want to say yes, of course. Research on anti-evil AI is a positive thing. But I do see that there’s a massive misallocation of resources. I think if there were two people in the United States, maybe ten people in the United States, who were working on anti-evil AI, it’s fine. Ten people working on overpopulation of Mars is actually fine: form a committee, write some papers. But I do think that there is much too much investment in this right now. Sleep easy.
Societal Impact of AI
Quite a lot of you asked about the societal impact, in varying ways. The other thing I worry about is this evil AI hype being used to whitewash a much more serious issue, which is job displacement. I know a lot of leaders in machine learning. I talk to them about their projects. There are so many jobs that are squarely in the crosshairs of my friends’ projects, and the people doing those jobs, frankly, just don’t know. In Silicon Valley, we’re responsible for creating tremendous wealth, but part of me feels like we need to be responsible as well for owning up to the problems we cause, and I think job displacement is the next big one. We shouldn’t whitewash this issue by pretending that there’s some other futuristic fear to fearmonger about, and then trying to solve that while ignoring the real problem.
AI Product Management
The last thing I want to talk about is AI product management. AI is evolving rapidly, it’s super exciting, and there are just opportunities left and right, but I want to share with you some of the challenges I see as well. Some of the things we’re working on are at the bleeding edge, and I feel like our own thinking is not yet mature, but these are challenges that you’ll run into if you try to incorporate AI into a business. Maybe many of you know what a PM is, but let me just draw for you a Venn diagram. That’s my simple model of how PMs and engineers should work together.
Let’s say this is the set of all things that users will love, all the possible products that users will love. And this is the set of all things that are feasible, meaning that the technology of today or the near future enables us to build them. For example, I would love a teleportation device, but I don’t think that’s technologically feasible. So a teleportation device will be here: we’ll all love one, but I don’t think it’s feasible. There are a lot of things that are feasible but that no one wants. We build a lot of those in Silicon Valley as well. I think the secret is to try to find something in the middle. I think of the PM’s job as figuring out what is in this set on the left, and research engineering’s job as figuring out what’s in this set on the right. And then the two work together to build something that’s actually in the intersection.
Now, one of the challenges is that AI is such a new thing that the workflows and processes that we’re used to in tech companies are not quite working for AI products. For example, in Silicon Valley we have pretty well established processes for product managers and engineers to do their work. For a lot of apps, the product manager will draw a wireframe.
For example, for the search app, the PM might decide we’ll put a logo there, put a search bar there, put a microphone there, put a camera there, and then put a news feed here, and then actually, we’ll move our microphone button down here and we’ll have a social button. A product manager would draw this on a piece of paper or in a CAD tool, and an engineer would look at the drawing and write a piece of software. This is actually a rough sketch of the Baidu search app: the search bar here and the news here. Baidu combines search with a social newsfeed (not very social, really a newsfeed), both in one. If you pull open your app, or you build a lot of apps like a news app or a social feed app or whatever, there is an established process for working together this way. But how about an AI app? You can’t wireframe a self-driving car. Or say you want to build a speech recognition system. The PM draws this button, but I don’t know how accurate my speech recognition system needs to be. The wireframe was a way for the PM and the engineer to communicate, and for AI we are still frankly trying to figure out what are good ways for a PM and an engineer to communicate a shared vision of what a product should be. Does that make sense? The PM does a lot of work, goes out, figures out what’s important to users, and has in their head some idea of what this product should be. But how do they communicate that to the engineer?
Speech Recognition System
- Noisy environments
- Low bandwidth audio
- Accented speech
Let’s say that you’re trying to build a speech recognition system. I do know how to work on speech recognition. My team and I all work on speech recognition, so we talk about it a lot. If you’re trying to build a speech recognition system, say to enable voice search, there are a lot of ways to improve it.
Maybe you want it to work better even in noisy environments, right? But a noisy environment could mean a car environment, or it could mean a cafe environment: people talking versus car noise on the highway. Or maybe you really need it to work on low bandwidth audio. Maybe sometimes users are just in a bad cell phone coverage setting, so you need it to work better on low bandwidth audio. Or maybe you need it to work better on accented speech. I guess the US has a lot of accents. China also has a lot of accents. What does accented speech mean? Does it mean a European accent, or an Asian accent? If European, does it mean British or Scottish? What does accent really mean? Or maybe you really care about something else.
One of the practices we’ve come up with is that one of the good ways for a PM to communicate with an engineer is through data. What I mean is that for many of my projects we ask the PM to be responsible for coming up with a data set. For example, give me, let’s say, 10,000 audio clips that really show me what you really care about. If the PM comes up with a thousand or ten thousand example recordings of speech and gives this data to the engineer, then the engineer has a clear target to aim for. We found that having a PM responsible for collecting what is really a test set is one of the most effective processes for letting the PM specify what they really care about. If all 10,000 audio clips have a lot of car noise, this is a clear way to communicate to the engineer that you really care about car noise. If it’s a mix of these different things, then it communicates to the engineer exactly what mix of these different phenomena the PM wants you to optimize for. I have to say, this is one of those things that’s obvious in hindsight, but surprisingly few AI teams do it. One of the bad practices I’ve seen is when the PM gives an engineer 10,000 audio clips, but they actually care about a totally different 10,000. That happens surprisingly often in multiple companies. I feel like we’re still in the process of advancing the bleeding edge of these workflow processes for how to think about new products.
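As a supplementary note (not part of the lecture), the idea of a PM-supplied test set can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Clip` record, the condition tags such as `car_noise`, and the use of exact-match sentence accuracy, which is a simplification of how speech systems are usually scored. The point is only that the mix of conditions in the test set determines how much each condition counts toward the number the engineer optimizes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Clip:
    """One hypothetical test-set entry supplied by the PM."""
    condition: str   # illustrative tag, e.g. "car_noise" or "cafe_noise"
    reference: str   # what the speaker actually said
    hypothesis: str  # what the recognizer produced


def condition_mix(clips):
    """Fraction of the test set that falls under each condition."""
    counts = Counter(c.condition for c in clips)
    return {cond: n / len(clips) for cond, n in counts.items()}


def accuracy_by_condition(clips):
    """Exact-match sentence accuracy per condition (a simplification)."""
    seen, correct = Counter(), Counter()
    for c in clips:
        seen[c.condition] += 1
        if c.hypothesis.strip().lower() == c.reference.strip().lower():
            correct[c.condition] += 1
    return {cond: correct[cond] / seen[cond] for cond in seen}


def overall_accuracy(clips):
    """Overall accuracy, written as per-condition accuracy weighted by the mix."""
    mix = condition_mix(clips)
    acc = accuracy_by_condition(clips)
    return sum(mix[cond] * acc[cond] for cond in mix)


# A tiny made-up test set: three car-noise clips and one cafe clip.
test_set = [
    Clip("car_noise", "call mom", "call mom"),
    Clip("car_noise", "navigate home", "navigate home"),
    Clip("car_noise", "play music", "pay music"),
    Clip("cafe_noise", "weather today", "weather today"),
]

print(condition_mix(test_set))
print(accuracy_by_condition(test_set))
print(overall_accuracy(test_set))
```

Because three of the four clips are tagged `car_noise`, errors under car noise move the overall number three times as much as errors in the cafe clip, which is how the test set tells the engineer what the PM cares about.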
I’ve done a lot of work on conversational agents. I might say to the AI, “May you please order takeout for me?”, and then the AI says, “Well, what restaurant do you want to order from?” And you’d say, “I feel like a hamburger.” So you’d go back and forth in a conversation with a chatbot that helps you order food or whatever. If you were to draw a wireframe, the wireframe would be: you say this, the chatbot says this, you say this, the chatbot says this. But this is not a good spec for the AI, right?
The wireframe is the easy part, the visual design. You can do that, but how intelligent is this really supposed to be? So the process that we developed for this is to ask the PM and the engineer to sit down together and write out 50 conversations that the chatbot is meant to have with you. For example, suppose you sit down and write the following. Let’s say the user, U for user, says, “Please book a restaurant for my anniversary next Monday.”
U: Please book a restaurant…Anniversary…Monday
AI: Okay. Do you want flowers?
The PM then says, well, in this case I want the AI to say, “OK, and do you want me to order flowers?” What we found is that this then creates a conversation between the PM and the engineer, where the engineer asks the PM, “Wait, do you want me to suggest an appropriate gift for all circumstances, or is it only for anniversaries that you want to buy flowers, and I don’t have to suggest any other gift on any other occasion?” We found that the process of writing out 50 conversations for the conversational agent, with the engineer and the PM working through these conversations together, is a good process to enable the PM to specify what they think is in the set on the left, and for the engineer to tell the PM what the engineer thinks is feasible given today’s chatbot technology. This is actually a process that we’re using in multiple products. AI technology is advancing rapidly and there are so many shiny things in AI.
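As a supplementary note (not part of the lecture), written-out conversations like the one above can double as an executable spec. The sketch below is a minimal illustration under stated assumptions: `Turn`, `Conversation`, `check_conversation`, and `toy_agent` are hypothetical names, the agent is stateless, the second conversation is invented to capture the engineer's question about whether flowers apply only to anniversaries, and replies are compared by exact string match, which is far stricter than a real chatbot evaluation would be.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    user: str         # what the user says
    expected_ai: str  # what the PM wants the AI to answer


@dataclass
class Conversation:
    name: str
    turns: List[Turn]


def check_conversation(agent: Callable[[str], str], convo: Conversation) -> List[str]:
    """Run each user turn through the agent and collect mismatches."""
    failures = []
    for i, turn in enumerate(convo.turns):
        reply = agent(turn.user)
        if reply != turn.expected_ai:
            failures.append(
                f"{convo.name}, turn {i}: expected {turn.expected_ai!r}, got {reply!r}"
            )
    return failures


def toy_agent(utterance: str) -> str:
    """A stand-in agent: offers flowers only when an anniversary is mentioned."""
    if "anniversary" in utterance.lower():
        return "OK, and do you want me to order flowers?"
    return "OK."


# Two of the (up to 50) conversations the PM and engineer might write together.
spec = [
    Conversation("anniversary booking", [
        Turn("Please book a restaurant for my anniversary next Monday.",
             "OK, and do you want me to order flowers?"),
    ]),
    Conversation("plain booking", [
        Turn("Please book a restaurant for next Monday.", "OK."),
    ]),
]

for convo in spec:
    problems = check_conversation(toy_agent, convo)
    print(convo.name, "passed" if not problems else problems)
```

Writing the second conversation forces the scope question into the open: if the PM expected a gift suggestion there too, the spec and the agent would disagree, and that disagreement is the conversation the PM and engineer need to have.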
The things you see the most in PR are often the shiniest technology, but the shiniest technology is often not the most useful. I think we’re still missing a lot of the downstream parts of the value chain of how to take the shiny AI technology that we find in research papers and bring it into products. Software engineering today has established processes like code review and agile development. Some of you know what those are, right? But these are established processes for writing code. I think we’re still in the early phases of trying to figure out how to organize the work of AI and the work of AI products.
Speech Recognition Take Off
This is actually a very exciting time to enter this field. I want to share with you some specific examples of short-term opportunities in AI. These are things that are coming in the very near future. I mentioned fintech, so I’m not going to talk about that. In the near-term future, I think speech recognition will take off. It’s just in the last year or two that speech recognition reached the level of accuracy where it is becoming incredibly useful.
So about four or five months ago, there was a Stanford University-led study done by James Landay, who is a professor of Computer Science, together with us at Baidu and the University of Washington, that showed that input on a cell phone is 3x faster using speech recognition than typing. Speech recognition has passed the accuracy threshold where you actually are much faster and much more efficient using speech recognition than typing on the cell phone keyboard, and that’s true for English and Chinese.
At Baidu over the past year we saw 100% year-on-year growth in the use of speech recognition across all of our properties. I think we’re beyond the knee of the curve where speech recognition will take off rapidly. In the U.S. there are multiple companies doing smart speakers. Baidu has a different vision. I think that a device that you can command with your voice in your home will also take off rapidly. An operating system used on home hardware would enable that.
Computer Vision — Face Recognition
Computer vision is coming a little bit later. I see some things take off faster in China than in the US. Because all of us living in the US are familiar with the US, I might share more things that I see from China. One thing that is taking off very rapidly is face recognition. China is a mobile-first society. Most of us in the U.S. were first on a laptop or a desktop, then we got our smartphone. A lot of people in China really just have a smartphone, or first get a smartphone and then a laptop or a desktop. I’m not sure who buys desktops anymore. Because of that, in China you can apply for an educational loan on your cellphone. And just by pressing buttons, just by using your cellphone, we will send you a lot of money for your education. Because these very material financial transactions are happening over your cellphone, before we send you a lot of money we would really like to verify that you are who you say you are, so that we don’t send it to someone who claims to be you but isn’t you. This in turn has driven a lot of pressure for progress in face recognition, and so face recognition on mobile devices as a means of biometric identity verification is taking off in China.
Today at Baidu’s headquarters, instead of having to swipe an ID card to get inside the office building, I can just walk up and a face recognition system recognizes my face, and I just walk right through. Just yesterday or the day before, I posted a video on my personal YouTube channel demoing this. You can look that up later if you want. But we now have face recognition systems that are good enough that we trust them with pretty security-critical applications. If you look just like me, you can actually get inside my office at Baidu. We really trust our face recognition system, so it’s pretty easy. I think both of these have been obvious to us for some time, so our capital investments have been massive. These are well beyond the point where a small group could be competitive with us unless there’s some unexpected technological breakthrough.
AI Effect on Health Care
I’m personally very bullish about the impact of AI on healthcare. I’ve spent quite a bit of time on this myself. The obvious one that a lot of people talk about is medical imaging. I do find it challenging. I do think that a lot of radiologists who are graduating today will be impacted by AI, definitely, sometime in the course of their careers. If you’re planning for a 40-year career in radiology, I would say that’s not a good plan. But beyond radiology, I think that there are many other verticals, some of which we’re working on, and there’s a huge opportunity there.
Transforming Industries Through Supervised Learning
Fintech is there. I hope education will get there, but I think education has other things to solve before it reaches the point of being impacted by AI. Still, I really think that AI will be incredibly impactful in many different verticals. What I talked about today was AI technology as it is today, so really supervised learning. I will say that for the transformation of all of these industries, there’s already a relatively clear road map for how to transform multiple industries using just supervised learning.
There are researchers working on other forms of AI as well. You might hear about unsupervised learning or reinforcement learning or transfer learning, other forms of AI that maybe don’t need as much data or maybe have other advantages. Most of those are in the research phase, and most of them are used in relatively small ways. They’re not what’s driving economic value today, but many of us hope that there will be a breakthrough in these other areas, and if that comes to pass, then that will unlock additional waves of value. The field of AI has had several winters before, when the field was overhyped and then went down. I think there were maybe two winters in AI, right, but many disciplines undergo a few winters and then an eternal spring, and I actually think that AI has passed into the phase of eternal spring.
I think one of the questions someone asked was when AI will no longer be the top technology, or something like that. I feel like if you look at silicon technology, I think we’re at the eternal spring of silicon technology. Maybe some other material will surpass it, but the concept of a transistor and computational circuits seems like it’s going to be with the human race for a long time. And I think we have reached that point for AI, where AI, neural networks, and deep learning will be with us for a long time. It could be a very long time, because it’s creating so much value already and because there is this clear road map for transforming several industries even with the ideas we have, but hopefully there will be even more breakthroughs and even more of these technologies.
Job Displacement Issue
On the jobs issue, I think that to the extent that we’re causing these problems, we should own up to the job displacement issue. Just as AI displaces jobs, similar to the earlier waves of job displacement, I think that AI will create new jobs as well, maybe even ones we can’t imagine. I think one of the biggest challenges of education is motivation. It is really good for you to take these courses and study, but it’s actually really difficult for an individual to find the time, and the space, and the energy to do the learning that gives them these long-term benefits.
After automation replaced a lot of agricultural work, the United States built its current educational system, your K-12 and university. It was a lot of work to build the world’s current educational system. With AI displacing a lot of jobs, I’m confident that there will be new jobs, but I think we also need a new educational system to help people whose jobs are displaced reskill themselves to take on the new jobs. One of the things that we should move toward is a model of basic income, but not universal basic income where you’re paid to “do nothing”. I think government should give people a safety net, but pay the unemployed to study, and provide the structure to help the unemployed study, so as to increase the odds of gaining the skills needed to re-enter the workforce and contribute back to the tax base that is paying for all this basic income.
I think we need a new New Deal in order to evolve society towards this new world where there are new jobs, but job displacements are also happening faster than before. I have been saying more about that.
GSB (Graduate School of Business) and CS (Computer Science)
I know that here at the GSB, many of you have fantastic product, business, or social change ideas. One of the things I hope to do is try to connect, frankly, GSB and CS. I think that GSB and CS have really complementary sets of expertise, but for various complicated reasons that we could get into, the two communities don’t seem very connected.
I’m in the process of organizing some events that I hope will bring together some CS, some GSB, maybe also some VC and capital investors, for those of you interested in exploring new opportunities that AI creates. So if you want to be informed of that, sign up for this mailing list at bit.ly/gsb-ai. Actually, instead of taking a picture of this, you can just go and sign up on your cellphone right now. You can do this while I’m taking questions. Some of these things are already underway, and when they’re ready to be announced, I’ll announce them to the mailing list there, so that you can come and be connected to some of these other pieces on campus. So with that, I’m happy to take questions, but let me say thank you all very much.