If you want to build artificial intelligence into every product, you’d better retrain your army of coders. Check.
Carson Holgate is training to become a ninja.
Not in the martial arts — she’s already done that. Holgate, 26, holds a second-degree black belt in Tae Kwon Do. This time it’s algorithmic. Holgate is several weeks into a program that will inculcate her in an even more powerful practice than physical combat: machine learning, or ML. A Google engineer in the Android division, Holgate is one of 18 programmers in this year’s Machine Learning Ninja Program, which pulls talented coders from their teams to participate, Ender’s Game-style, in a regimen that teaches them the artificial intelligence techniques that will make their products smarter. Even if it makes the software they create harder to understand.
“The tagline is, Do you want to be a machine learning ninja?” says Christine Robson, a product manager for Google’s internal machine learning efforts, who helps administer the program. “So we invite folks from around Google to come and spend six months embedded with the machine learning team, sitting right next to a mentor, working on machine learning for six months, doing some project, getting it launched and learning a lot.”
For Holgate, who came to Google almost four years ago with a degree in computer science and math, it’s a chance to master the hottest paradigm of the software world: using learning algorithms (“learners”) and tons of data to “teach” software to accomplish its tasks. For many years, machine learning was considered a specialty, limited to an elite few. That era is over, as recent results indicate that machine learning, powered by “neural nets” that emulate the way a biological brain operates, is the true path towards imbuing computers with the powers of humans, and in some cases, superhumans. Google is committed to expanding that elite within its walls, with the hope of making it the norm. For engineers like Holgate, the ninja program is a chance to leap to the forefront of the effort, learning from the best of the best. “These people are building ridiculous models and have PhDs,” she says, unable to mask the awe in her voice. She’s even gotten over the fact that she is actually in a program that calls its students “ninjas.” “At first, I cringed, but I learned to accept it,” she says.
Considering the vast size of Google’s workforce — probably almost half of its 60,000 headcount are engineers — this is a tiny project. But the program symbolizes a cognitive shift in the company. Though machine learning has long been part of Google’s technology — and Google has been a leader in hiring experts in the field — the company circa 2016 is obsessed with it. In an earnings call late last year, CEO Sundar Pichai laid out the corporate mindset: “Machine learning is a core, transformative way by which we’re rethinking how we’re doing everything. We are thoughtfully applying it across all our products, be it search, ads, YouTube, or Play. And we’re in early days, but you will see us — in a systematic way — apply machine learning in all these areas.”
Obviously, if Google is to build machine learning into all its products, it needs engineers who have mastery of those techniques, which represents a sharp fork from traditional coding. As Pedro Domingos, author of the popular ML manifesto The Master Algorithm, writes, “Machine learning is something new under the sun: a technology that builds itself.” Writing such systems involves identifying the right data, choosing the right algorithmic approach, and making sure you build the right conditions for success. And then (this is hard for coders) trusting the systems to do the work.
“The more people who think about solving problems in this way, the better we’ll be,” says a leader in the firm’s ML effort, Jeff Dean, who is to software at Google as Tom Brady is to quarterbacking in the NFL. Today, he estimates that of Google’s 25,000 engineers, only a “few thousand” are proficient in machine learning. Maybe ten percent. He’d like that to be closer to a hundred percent. “It would be great to have every engineer have at least some amount of knowledge of machine learning,” he says.
Does he think that will happen?
“We’re going to try,” he says.
For years, John Giannandrea has been Google’s key promoter of machine learning, and, in a flashing neon sign of where the company is now, he recently became the head of Search. But when he arrived at the company in 2010 (as part of the company’s acquisition of MetaWeb, a vast database of people, places and things that is now integrated into Google Search as the Knowledge Graph), he didn’t have much experience with ML or neural nets. Around 2011, though, he was struck by news coming from a conference called Neural Information Processing Systems (NIPS). It seemed every year at NIPS some team or other would announce results using machine learning that blew away previous attempts at solving a problem, be it translation, voice recognition, or vision. Something amazing was happening. “When I was first looking at it, this NIPS conference was obscure,” he says. “But this whole area across academia and industry has ballooned in the last three years. I think last year 6,000 attended.”
These improved neural-net algorithms, along with more powerful computation driven by Moore’s Law and an exponential increase in data drawn from the behavior of huge numbers of users at companies like Google and Facebook, began a new era of ascendant machine learning. Giannandrea joined those who believed it should be central to the company. That cohort included Dean, a co-founder of Google Brain, a neural net project originating in the company’s long-range research division Google X. (Now known simply as X.)
Google’s bear-hug-level embrace of machine learning does not simply represent a shift in programming technique. It’s a serious commitment to techniques that will bestow hitherto unattainable powers on computers. The leading edge of this is “deep learning”: algorithms built around sophisticated neural nets inspired by brain architecture. Google Brain is a deep learning effort, and DeepMind, the AI company Google bought for a reported $500 million in January 2014, also concentrates on that end of the spectrum. It was DeepMind that created the AlphaGo system that beat a champion of Go, shattering expectations of intelligent machine performance and sending ripples of concern among those fearful of smart machines and killer robots.
While Giannandrea dismisses the “AI-is-going-to-kill us” camp as ill-informed Cassandras, he does contend that machine learning systems are going to be transformative, in everything from medical diagnoses to driving our cars. While machine learning won’t replace humans, it will change humanity.
The example Giannandrea cites to demonstrate machine learning’s power is Google Photos, a product whose definitive feature is an uncanny — maybe even disturbing — ability to locate an image of something specified by the user. Show me pictures of border collies. “When people see that for the first time they think something different is happening because the computer is not just computing a preference for you or suggesting a video for you to watch,” says Giannandrea. “It’s actually understanding what’s in the picture.” He explains that through the learning process, the computer “knows” what a border collie looks like, and it will find pictures of it when it’s a puppy, when it’s old, when it’s long-haired, and when it’s been shorn. A person could do that, of course. But no human could sort through a million examples and simultaneously identify ten thousand dog breeds. A machine learning system can. If it learns one breed, it can use the same technique to identify the other 9,999. “That’s really what’s new here,” says Giannandrea. “For those narrow domains, you’re seeing what some people call superhuman performance in these learned systems.”
To be sure, machine learning concepts have long been understood at Google, whose founders are lifetime believers in the power of artificial intelligence. Machine learning is already baked into many Google products, albeit not always in the more recent flavors centered on neural nets. (Earlier machine learning often relied on a more straightforward statistical approach.)
In fact, over a decade ago, Google was running in-house courses to teach its engineers machine learning. In early 2005, Peter Norvig, then in charge of search, suggested to a research scientist named David Pablo Cohn that he look into whether Google might adopt a course in the subject organized by Carnegie Mellon University. Cohn concluded that only Googlers themselves could teach such an internal course, because Google operated at such a different scale than anyone else (except maybe the Department of Defense). So he reserved a large room in Building 43 (then the headquarters of the search team) and held a two-hour class every Wednesday. Even Jeff Dean dropped in for a couple of sessions. “It was the best class in the world,” Cohn says. “They were all much better engineers than I was!” The course was so popular, in fact, that it began to get out of hand. People in the Bangalore office were staying past midnight so they could call in. After a couple of years, some Googlers helped turn the lectures into short videos; the live sessions ended. Cohn believes it might have qualified as a precursor to the Massive Open Online Course (MOOC). Over the next few years there were other disparate efforts at ML training at Google, but not in an organized, coherent fashion. Cohn left Google in 2013 just before, he says, ML at Google “suddenly became this all-important thing.”
That understanding hadn’t yet hit in 2012 when Giannandrea had the idea to “get a bunch of people who were doing this stuff” and put them in a single building. Google Brain, which had “graduated” from the X division, joined the party. “We uprooted a bunch of teams, put them in a building, got a nice new coffee machine,” he says. “People who previously had just been working on what we called perception — sound and speech understanding and so on — were now talking to the people who were trying to work on language.”
More and more, the machine learning efforts from those engineers began appearing in Google’s popular products. Since key machine learning domains are vision, speech, voice recognition, and translation, it’s unsurprising that ML is now a big part of Voice Search, Translate, and Photos. More striking is the effort to work machine learning into everything. Jeff Dean says that as he and his team have begun to understand ML more, they are exploiting it in more ambitious ways. “Previously, we might use machine learning in a few sub-components of a system,” he says. “Now we actually use machine learning to replace entire sets of systems, rather than trying to make a better machine learning model for each of the pieces.” If he were to rewrite Google’s infrastructure today, says Dean, who is known as the co-creator of game-changing systems like Bigtable and MapReduce, much of it would not be coded but learned.
Machine learning also is enabling product features that previously would have been unimaginable. One example is Smart Reply in Gmail, launched in November 2015. It began with a conversation between Greg Corrado, a co-founder of the Google Brain project, and a Gmail engineer named Bálint Miklós. Corrado had previously worked with the Gmail team on using ML algorithms for spam detection and classifying email, but Miklós suggested something radical. What if the team used machine learning to automatically generate replies to emails, saving mobile users the hassle of tapping out answers on those tiny keyboards? “I was actually flabbergasted because the suggestion seemed so crazy,” says Corrado. “But then I thought that with the predictive neural net technology we’d been working on, it might be possible. And once we realized there was even a chance, we had to try.”
Google boosted the odds by keeping Corrado and his team in close and constant contact with the Gmail group, an approach that is increasingly common as machine learning experts fan out among product groups. “Machine learning is as much art as it is science,” says Corrado. “It’s like cooking — yes, there’s chemistry involved but to do something really interesting, you have to learn how to combine the ingredients available to you.”
Traditional AI methods of language understanding depended on embedding rules of language into a system, but in this project, as with all modern machine learning, the system was fed enough data to learn on its own, just as a child would. “I didn’t learn to talk from a linguist, I learned to talk from hearing other people talk,” says Corrado. But what made Smart Reply really feasible was that success could be easily defined — the idea wasn’t to create a virtual Scarlett Johansson who would engage in flirtatious chatter, but plausible replies to real-life emails. “What success looked like is that the machine generated a candidate response that people found useful enough to use as their real response,” he says. Thus the system could be trained by noting whether or not users actually clicked on the suggested replies.
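The success criterion Corrado describes — a suggested reply that the user actually sends — translates into a simple labeling scheme for training data. Here is a minimal sketch of that idea, with hypothetical field names and toy data; this is not Google’s actual pipeline:

```python
# Sketch of the click-based feedback loop described above.
# Field and structure names are hypothetical, not Google's pipeline.

def label_suggestions(interaction_log):
    """Turn user interactions into (email, reply, label) training examples.

    Each log entry records the incoming email, the candidate reply the
    model suggested, and whether the user actually sent it. A click
    (label 1) is the definition of success; an ignored suggestion is 0.
    """
    examples = []
    for entry in interaction_log:
        label = 1 if entry["user_clicked"] else 0
        examples.append((entry["email_text"], entry["suggested_reply"], label))
    return examples

log = [
    {"email_text": "Can you meet at 3pm?",
     "suggested_reply": "Sure, see you then!", "user_clicked": True},
    {"email_text": "Status update?",
     "suggested_reply": "I love you", "user_clicked": False},
]
print(label_suggestions(log))
```

Examples labeled this way can then be fed back to retrain the reply-ranking model, which is what makes the system trainable without any hand-written rules of language.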
When the team began testing Smart Reply, though, users noted a weird quirk: it would often suggest inappropriate romantic responses. “One of the failure modes was this really hysterical tendency for it to say, ‘I love you’ whenever it got confused,” says Corrado. “It wasn’t a software bug — it was an error in what we asked it to do.” The program had somehow learned a subtle aspect of human behavior: “If you’re cornered, saying, ‘I love you’ is a good defensive strategy.” Corrado was able to help the team tamp down the ardor.
Smart Reply, released last November, is a hit — users of the Gmail Inbox app now routinely get a choice of three potential replies to emails that they can dash off with a single touch. Often they seem uncannily on the mark. Of responses sent by mobile Inbox users, one in ten is created by the machine-learning system. “It’s still kind of surprising to me that it works,” says Corrado with a laugh.
Smart Reply is only one data point in a dense graph of instances where ML has proved effective at Google. But perhaps the ultimate turning point came when machine learning became an integral part of search, Google’s flagship product and the font of virtually all its revenues. Search has always been based on artificial intelligence to some degree. But for many years, the company’s most sacred algorithms, those that delivered what were once known as the “ten blue links” in response to a search query, were deemed too important to hand over to ML’s learning algorithms. “Because search is such a large part of the company, ranking is very, very highly evolved, and there was a lot of skepticism you could move the needle very much,” says Giannandrea.
In part this was a cultural resistance — a stubborn microcosm of the general challenge of getting control-freaky master hackers to adopt the Zen-ish machine learning approach. Amit Singhal, the long-time master of search, was himself an acolyte of Gerard Salton, a legendary computer scientist whose pioneering work in document retrieval inspired Singhal to help revise the grad-student code of Brin and Page into something that could scale in the modern web era. (This put him in the school of “retrievers.”) He teased amazing results from those 20th-century methods, and was suspicious of integrating learners into the complicated system that was Google’s lifeblood. “My first two years at Google I was in search quality, trying to use machine learning to improve ranking,” says David Pablo Cohn. “It turns out that Amit’s intuition was the best in the world, and we did better by trying to hard-code whatever was in Amit’s brain. We couldn’t find anything as good as his approach.”
By early 2014, Google’s machine learning masters believed that should change. “We had a series of discussions with the ranking team,” says Dean. “We said we should at least try this and see, is there any gain to be had.” The experiment his team had in mind turned out to be central to search: how well a document in the ranking matches a query (as measured by whether the user clicks on it). “We sort of just said, let’s try to compute this extra score from the neural net and see if that’s a useful score.”
It turned out the answer was yes, and the system is now part of search, known as RankBrain. It went online in April 2015. Google is characteristically fuzzy on exactly how it improves search (something to do with the long tail? Better interpretation of ambiguous requests?) but Dean says that RankBrain is “involved in every query,” and affects the actual rankings “probably not in every query but in a lot of queries.” What’s more, it’s hugely effective. Of the hundreds of “signals” Google search uses when it calculates its rankings (a signal might be the user’s geographical location, or whether the headline on a page matches the text in the query), RankBrain is now rated as the third most useful.
“It was significant to the company that we were successful in making search better with machine learning,” says Giannandrea. “That caused a lot of people to pay attention.” Pedro Domingos, the University of Washington professor who wrote The Master Algorithm, puts it a different way: “There was always this battle between the retrievers and the machine learning people,” he says. “The machine learners have finally won the battle.”
Google’s new challenge is shifting its engineering workforce so everyone is familiar with, if not adept at, machine learning. It’s a goal pursued now by many other companies, notably Facebook, which is just as gaga about ML and deep learning as Google is. The competition to hire recent graduates in the field is fierce, and Google tries hard to maintain its early lead; for years, the joke in academia was that Google hires top students even when it doesn’t need them, just to deny them to the competition. (The joke misses the point that Google does need them.) “My students, no matter who, always get an offer from Google,” says Domingos. And things are getting tougher: just last week, Google announced it will open a brand new machine-learning research lab in Zurich, with a whole lot of workstations to fill.
But since academic programs are not yet producing ML experts in huge numbers, retraining workers is a necessity. And that isn’t always easy, especially at a company like Google, with many world-class engineers who have spent a lifetime achieving wizardry through traditional coding.
Machine learning requires a different mindset. People who are master coders often become that way because they thrive on the total control that one can have by programming a system. Machine learning also requires a grasp of certain kinds of math and statistics, which many coders, even gonzo hackers who can zip off tight programs of brobdingnagian length, never bothered to learn.
It also requires a degree of patience. “The machine learning model is not a static piece of code — you're constantly feeding it data,” says Robson. “We are constantly updating the models and learning, adding more data and tweaking how we're going to make predictions. It feels like a living, breathing thing. It’s a different kind of engineering.”
“It’s a discipline really of doing experimentation with the different algorithms, or about which sets of training data work really well for your use case,” says Giannandrea, who despite his new role as search czar still considers evangelizing machine learning internally as part of his job. “The computer science part doesn’t go away. But there is more of a focus on mathematics and statistics and less of a focus on writing half a million lines of code.”
As far as Google is concerned, this hurdle can be leapt over by smart re-training. “At the end of the day the mathematics used in these models is not that sophisticated,” says Dean. “It’s achievable for most software engineers we would hire at Google.”
To further aid its growing cadre of machine learning experts, Google has built a powerful set of tools to help engineers choose the right models and to expedite the process of training and refining them. The most powerful of those is TensorFlow, a system that streamlines the process of constructing neural nets. Built out of the Google Brain project, and co-invented by Dean and his colleague Rajat Monga, TensorFlow helped democratize machine learning by standardizing the often tedious and esoteric details involved in building a system — especially since Google made it available to the public in November 2015.
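The “tedious and esoteric details” that TensorFlow standardizes are things like the layer-by-layer arithmetic of a neural net. As a rough illustration of what that plumbing looks like when written by hand, here is a tiny two-layer forward pass in plain NumPy — illustrative only, not TensorFlow code and not anything Google ships:

```python
# The kind of by-hand plumbing a framework like TensorFlow automates:
# a tiny two-layer neural net forward pass in plain NumPy. A real
# framework also handles gradients, devices, and training loops.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Standard rectified-linear activation: zero out negative values.
    return np.maximum(0.0, x)

def forward(x, w1, b1, w2, b2):
    """Two dense layers: a hidden ReLU layer, then a linear output."""
    hidden = relu(x @ w1 + b1)
    return hidden @ w2 + b2

# Random weights for a net with 4 inputs, 8 hidden units, 2 outputs.
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

x = rng.normal(size=(3, 4))              # a batch of 3 examples
print(forward(x, w1, b1, w2, b2).shape)  # (3, 2)
```

Writing every model this way — and then deriving its gradients by hand — is exactly the toil that a shared framework removes, which is why standardizing it lowered the barrier to entry so dramatically.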
While Google takes pains to couch the move as an altruistic boon to the community, it also acknowledges that a new generation of programmers familiar with its in-house machine learning tools is a pretty good thing for Google recruiting. (Skeptics have noted that Google’s open-sourcing TensorFlow is a catch-up move with Facebook, which publicly released deep-learning modules for an earlier ML system, Torch, in January 2015.) Still, TensorFlow’s features, along with the Google imprimatur, have rapidly made it a favorite in ML programming circles. According to Giannandrea, when Google offered its first online TensorFlow course, 75,000 people signed up.
Google still saves plenty of goodies for its own programmers. Internally, the company has a probably unparalleled tool chest of ML prosthetics, not the least of which is an innovation it has been using for years but announced only recently — the Tensor Processing Unit. This is a microprocessor chip optimized for the specific quirks of running machine learning programs, much as Graphics Processing Units are designed with the single purpose of speeding the calculations that throw pixels on a display screen. Many thousands (only God and Larry Page probably know how many) are inside servers in the company’s huge data centers. By super-powering its neural net operations, TPUs give Google a tremendous advantage. “We could not have done RankBrain without it,” says Dean.
But since Google’s biggest need is people to design and refine these systems, just as the company is working feverishly to refine its software-training tools, it’s madly honing its experiments in training machine-learning engineers. They range from small to large. The latter category includes a quick-and-dirty two-day “Machine Learning Crash Course with TensorFlow,” with slides and exercises. Google hopes this is a first taste, and that the engineers will subsequently seek out resources to learn more. “We have thousands of people signed up for the next offering of this one course,” says Dean.
Other, smaller efforts draw outsiders into Google’s machine learning maw. Earlier this spring, Google began the Brain Residency program, an effort to bring in promising outsiders for a year of intense training within the Google Brain group. “We’re calling it a jump start in your Deep Learning career,” says Robson, who helps administer the program. Though it’s possible that some of the 27 machine-learning-nauts from different disciplines in the initial program might wind up sticking around Google, the stated purpose of the class is to dispatch them back into the wild, using their superpowers to spread Google’s version of machine learning throughout the data-sphere.
So, in a sense, what Carson Holgate learns in her ninja program is central to how Google plans to maintain its dominance as an AI-focused company in a world where machine learning is taking center stage.
Her program began with a four-week boot camp where the product leads of Google’s most advanced AI projects drilled them on the fine points of baking machine learning into projects. “We throw the ninjas into a conference room and Greg Corrado is there at the white board, explaining LSTM [“Long Short Term Memory,” a technique that makes powerful neural nets], gesturing wildly, showing how this really works, what the math is, how to use it in production,” says Robson. “We basically just do this with every technique we have and every tool in our toolbox for the first four weeks to give them a really immersive dive.”
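The LSTM technique Corrado sketches at the whiteboard can be illustrated, very roughly, as a single cell update: gates decide what the network forgets, what it stores, and what it emits at each timestep. This is the standard textbook formulation with made-up sizes, not Google’s production code:

```python
# A minimal sketch of one LSTM ("Long Short Term Memory") cell step.
# Gates control what to forget, what to store, and what to output.
# Weight shapes are illustrative only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One timestep of an LSTM cell.

    x: input vector; h_prev / c_prev: previous hidden and cell state.
    W stacks the four gate weight matrices; b stacks the four biases.
    """
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)        # input, forget, output, candidate
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # update cell state
    h = sigmoid(o) * np.tanh(c)                        # new hidden state
    return h, c

n_in, n_hid = 3, 5
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(n_in + n_hid, 4 * n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape, c.shape)  # (5,) (5,)
```

The persistent cell state `c` is what lets these nets remember context across long sequences, which is why the technique shows up in products like Smart Reply.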
Holgate has survived boot camp and now is using machine learning tools to build a communications feature in Android that will help Googlers communicate with each other. She’s tuning hyperparameters. She’s cleansing her input data. She’s stripping out the stop words. But there’s no way she’s turning back, because she knows that these artificial intelligence techniques are the present and the future of Google, maybe of all technology. Maybe of everything.
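For readers wondering what “stripping out the stop words” looks like in practice, here is a minimal sketch: high-frequency, low-information words are dropped before the text reaches the model. The stop-word list here is a tiny illustrative sample, not any real production list:

```python
# Stripping stop words: drop common filler words before training.
# This stop-word set is a small illustrative sample only.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "and", "in"}

def strip_stop_words(text):
    """Lowercase, tokenize on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

print(strip_stop_words("The model is learning to rank replies"))
# → ['model', 'learning', 'rank', 'replies']
```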
“Machine learning,” she says, “is huge here.”
Photography by Jason Henry. Creative Art Direction by Redindhi.