Google’s flagship product has been part of our lives for so long that we take it for granted. But Google doesn’t. Part One of a study of Search’s quiet transformation.
Why is the sky blue?
Children often ask this question, and very few parents are able to provide the answer unaided. Not too long ago, finding the correct reply would have taken, at the very least, a dive into the encyclopedia, maybe even a trip to the library. In more recent years, moms and dads have simply rushed to the computer, fired up Google, assessed the links presented in response to the question, quickly read through the explanations, and parsed them so their rug rat could understand.
But in 2015, even that seemingly expedited process won’t do. For one thing, questions are less likely to be typed into a search field than dictated to a mobile device. And while selecting the most relevant of a ranked list of links remains a valid approach for certain queries, people who ask questions with well-defined answers — like this one — have learned to expect answers right now. They are disappointed, even angry, when Google cannot provide them.
So . . . . “Okay, Google….why is the sky blue?”
It takes less than a second for an Android phone to respond to this spoken query in an intelligible but obviously automated voice:
“A clear cloudless day-time sky is blue because molecules in the air scatter blue light from the sun more than they scatter red light.”
Amit Singhal, who heads Google’s search team, uses this example to help illustrate how the world’s most popular — by far — search engine has transformed itself in the past few years. In interviews I’ve conducted with him over the years, Singhal has analogized making a major change in Google search to switching jet engines mid-air, as Google alters its algorithmic flight plan to refine its rankings, to add new corpora of information (like images, books, or travel), or to begin searching for queries before users finish typing them. In the past few years, though, Google changed not only the engines but much of the cockpit. The inexorable momentum of mobile — 2015 will probably be the year when people conduct more searches on phones and tablets than on the desktop — forced something bigger: a reconsideration of the entire mission in light of that sea change.
“We have all had to take a deep look at what search really means in a world that has gone mobile,” he says. “Our heads explode when we think about this.”
For its entire 17-year history, Google Search has always been evolving, a process that the company often marks with celebratory blog items and an occasional press event. (Though when it comes to quantifying the changes, Google reverts to its usual dogged stinginess — for years, it has described the number of hints it uses to help rank search queries, known as “signals,” as “more than 200.”) Because Search remains the company’s flagship product — and the platform supporting the ads that are still Google’s dominant source of income — Google has never been complacent about improving the way that over a billion people find information in the course of a day. But in the past few years, the pace has accelerated, both in short- and long-term efforts, to keep Google ahead of its competitors.
Users can’t miss some of the changes. Search is faster, it’s fresher, and it’s more social (though after a big push on “social search” around the time Google Plus was launching, you don’t hear much about that now). Search even looks different. “In [earlier] days there was a lot less going on — we had the home page and search results,” says Tamar Yehoshua, one of the executives responsible for what Google calls the search experience. “Today we have lots of different features and products on the search results page.”
Looking ahead, Google has been pushing the frontiers of artificial intelligence to build a giant “brain” that will be able to much better understand both its users and the world, delivering on-target results even before people think to ask for them. (More on this later in this series.)
Yet some critics argue that Google search is on a downward slope. They gripe about too many spam results, or an overemphasis on newer information that occludes more relevant earlier results. You hear that the vaunted “ten blue links” have been polluted by a confusing and self-serving plethora of features like shopping, news, and multimedia results. (On the other hand, Google’s biggest US competitor, Microsoft’s Bing, crows that Google still has too much reliance on the ten blue links.) A Buzzfeed headline last year bleated, “We Are Entering the Worst Period in Modern History,” followed by the outright claim that “Google is becoming less useful.”
Singhal fiercely disputes this charge. “The truth is completely reverse,” he says. “I’ve looked into [these complaints] and I’ve discovered there is some nostalgia that they have for the past. Our search today is far better than it was last year or two years before.”
Singhal’s comments are indicative of the pride and confidence among key people in Google search these days. Only a few years ago, despite Google’s conviction that its search quality was incomparable, there were real fears that the company’s dominance might weaken. Google was in the throes of Facebook panic. “We don’t have those connections,” Singhal told me in 2011, clearly referring to Facebook’s network, which forbade Google’s crawlers. “I don’t know how information flows in those networks.” At the height of this mania, Singhal took on something of a Cassandra role at Google, at one point collaring the company’s head of social, Vic Gundotra, to unleash a self-described rant about how closed networks might threaten Google’s existence. If you suggested to Singhal that the threat did not square with Google’s unchallenged power in search, he had an answer to that: “Pan Am seemed pretty powerful when I was a little child, too,” he told me in that conversation.
But the fears turned out to be overblown; it’s unlikely that someone at Google these days would analogize the company to the failed airline that was once the world’s premier carrier. Facebook’s Graph Search, while still a nascent product, is off to a slow start and has little impact on Google. Bing, while Microsoft has made it a respectable competitor in search quality, still holds less than a fifth of the market. While Google Plus fell far short of the company’s attempt to finally create a blockbuster social networking product, it did succeed in getting many more search users signed in.
Instead of an Internet closed to Google by a single powerful competitor, the threat to search now appears to be an exodus from the web to a variegated archipelago of apps. (The movement to mobile also presents challenges to Google’s search ad revenue, but that’s a story for another series.) Google sees the rise of information within apps as something it can overcome — after all, mobile developers, like webmasters, want their information discoverable. Since the fall of 2013, Google has set up an App Indexing effort to incorporate data inside mobile apps into its general index. Fifteen percent of searches from signed-in Android users now yield results with information inside apps. App Indexing, though, does not currently include iOS apps, a serious gap. “There’s still a long way to go,” says Lawrence Chang, product manager for Apps Indexing. “But we’re building the fundamental blocks.”
But for now, the challenge of crawling the apps universe hasn’t affected Google’s search dominance. The statistics remain staggering. Google accepts over 3 billion search queries a day. In the US, two-thirds of all searches use Google — worldwide, there’s similar dominance. (A recent dip in market share is largely attributed not to search quality but to Yahoo’s deal to dislodge Google as the default search engine on Firefox.) Even more impressive, Google hosts well over 80% of mobile searches. When Google suffered a five-minute outage in 2013, global web traffic dropped forty percent.
No search competitor has Google’s infrastructure, its deep talent, or its experience. Few have its ambition. So while news coverage of Google has dealt with regulatory issues, the fortunes and misfortunes of Glass, and the adolescent superstars of YouTube, Search has been going through a steady but intense reinvention.
In some ways, the changes are simply a continuation of the way Google has been evolving search since the beginning. On a micro basis, Google makes subtle changes in its algorithm, blessing the adjustments in its weekly search quality launch meeting. Then, every two or three years, there’s a major dustup in the ranking system, creating winners and losers among businesses scuffling to be highly associated with important keywords. The most recent of these was 2013’s update, dubbed Hummingbird, which involved a rejiggering of the importance of certain signals associated with search terms. According to Ben Gomes, who has been Singhal’s lieutenant in search leadership for over a decade, Google has made more changes to its rankings in the past three years than it did in the previous thirteen.
The biggest challenge, in every way, has been adjusting to the shift from the desktop to devices that you carry with you. As with many Internet companies, Google search adopted a mobile-centric approach, beginning with how users interacted with it. “Mobile is having a huge impact on how we approach design,” says Jon Wiley, the principal designer for search. (Wiley recently took a role involving design of a wider range of products.) One of the first things he did when he took leadership of search design was to integrate his mobile and desktop teams into one group. Originally, the idea was to put massive efforts into the phone — now, he says, it’s all about viewing search as a multi-device experience, with the Google cloud as the constant.
When it comes to major changes, there’s little mystery as to the ones Google’s search team regards as most significant. Search czar Singhal can tick them off easily. “The huge thing was the Knowledge Graph (Google’s vast organization of the world’s data) — as soon as you build that, you basically know facts about real world things. And the second piece is voice — because I really can’t type here,” he says as he gestures to the Samsung smart watch on his wrist. “And then we realized that we need some science behind predictions, so that people don’t have to ask all the time, and that’s where we built Google Now.”
Knowledge Graph structures the world’s information in a vast database. Voice Search incorporates spoken language into Search. Google Now tells people what they want to know before they ask. All three, not coincidentally, are tied to Google’s focus on mobile. Though certainly not an exhaustive list, those components — and the way they work together — have helped transform Google Search in the past three years, from a delivery system of “ten blue links” into something almost psychic: a system that behaves not like a computer but like an intelligent hive of knowledge that wisely interprets and satisfies your information needs. And it did it all when you weren’t looking.
When Google bought a company called Metaweb in 2010, the announcement didn’t make much of a splash. Yet that acquisition turned out to be the lynchpin of one of the most significant changes ever in Google Search, augmenting the ten blue links with what is essentially a vetted dossier on the subject of queries involving people, places and things.
Metaweb was founded in 2005 by Danny Hillis, a well-known computer scientist and entrepreneur. Operating from his company called Applied Minds, Hillis concocts an array of innovative projects, but he deemed this one so big that he spun it off into a separate company. Metaweb, which launched in 2007, was one of the first big exploitations of what was known as the Semantic Web — essentially a means of processing multiple databases into a format where the information within was easily readable, as if everything resided in one huge repository. “We’re trying to create the world’s database, with all the world’s information,” said Hillis. Since it scanned the Internet to answer questions, Metaweb was widely seen as a rival to Google. But after a few years and over $50 million in funding, Hillis realized that the idea could only reach fruition as part of a bigger company — namely, Google.
At the time, Google was already providing some direct answers to questions: if you typed in Barack Obama birthday, it would cough back, on top of the search results, “August 4, 1961.” But, as Google explained in a July 2010 blog item announcing the purchase of Metaweb (and its database of 12 million “entities” of people, places, and things), its search engine was at a loss to answer questions like, “colleges on the west coast with tuition under $30,000” or “actors over 40 who have won at least one Oscar.” The blog post promised that Metaweb would help Google provide such answers.
“When Google bought Metaweb it knew that the notion of ‘things’ would become a very important part of search,” says Emily Moxley, a product manager who has been on the project since 2011. “We were thinking this is a great way to surface some really quick summary facts and information about things people care about.”
In May 2012, Google rolled out the Metaweb material, dubbing it the Knowledge Graph. It had grown from 12 million entities to half a billion. The product provides, when it deems it appropriate, a supplementary answer to the organic search results: a panel of key facts about the subject, placed to the right of the normal rankings. It’s kind of “I’m feeling lucky” on steroids.
In describing how Google figures which queries might merit a Knowledge Graph result, Moxley references the Interstate highway system in the Richmond, Virginia, area. Travelers from the Northeast headed towards Florida know this situation well — just north of Richmond, Route 95 forks, giving drivers the choice of sticking with the main north-south artery and going through downtown, or splitting off to take Route 295, which circles around the city and re-joins 95 south of Richmond.
As she explains it, when you provide a search query, Google expands it into alternate forms and synonyms and such, and then applies an algorithmic test to see if it might be relevant to a Knowledge Graph result. “And then what happens is you sort of get off that 295 exit and say, ‘Okay, what are some possible Knowledge Graph things that might be interesting for this query?’ — and we search all of these documents and return relevant ones. Then you join back up with 95 and we say, ‘Okay, we thought this stuff was interesting, so let’s surface that information more prominently.’”
In the two-plus years that Google Search has integrated the Knowledge Graph, the company has continued to evolve the product. (Google won’t say officially what percentage of queries evoke a Knowledge Graph answer but appears comfortable with a ballpark estimate of about 25 percent.) Originally, the Knowledge Graph was somewhat static. But the product is slowly taking on some of the learning capabilities of Google search itself, in terms of analyzing user behavior. Moxley cites the example, “Who plays Barf in Spaceballs?” The Knowledge Graph has seen enough queries by now that it knows how to apply a schema that involves an actor and a movie — and, presto, Google will deliver a panel with John Candy’s name and picture on it. (You can try this “Who plays X in Y?” question, Mad Libs style, with any movie and role.)
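For readers who want to see what “applying a schema” to a question could look like, here is a minimal sketch in Python. It is not Google’s implementation, which the company has not described in this detail. The tiny CAST table, the WHO_PLAYS pattern, and the answer function are all hypothetical names invented for illustration, and a real system would learn its schemas from query behavior rather than hard-code a pattern.

```python
import re

# Toy stand-in for a knowledge graph: (movie, role) -> actor.
# These entries and this layout are illustrative assumptions only.
CAST = {
    ("spaceballs", "barf"): "John Candy",
    ("spaceballs", "lone starr"): "Bill Pullman",
}

# One hypothetical schema: "Who plays <role> in <movie>?"
WHO_PLAYS = re.compile(
    r"^who plays (?P<role>.+) in (?P<movie>.+?)\??$", re.IGNORECASE
)


def answer(query):
    """Return the actor if the query fits the schema and the fact is known."""
    match = WHO_PLAYS.match(query.strip())
    if match is None:
        return None  # Not this schema: fall back to ordinary ranked results.
    movie = match.group("movie").strip().lower()
    role = match.group("role").strip().lower()
    return CAST.get((movie, role))  # None if the toy graph lacks the fact.


print(answer("Who plays Barf in Spaceballs?"))
```

The point of the sketch is the division of labor: the pattern recognizes the shape of the question, and the table supplies the fact. A query that fits neither simply falls through to the normal results.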
The Knowledge Graph has also made strides in another important area — freshness. Since Google is putatively supplying the single correct answer to a question, its information must be up to date. Otherwise it’s simply wrong, and the user is worse off than not searching at all. Moxley says that when Knowledge Graph first launched in 2012, if one of its entities changed — say, if Volkswagen decided to hire a new CEO, as it did the week before our interview — it might have taken the system as long as two weeks to reflect the change. Now the system can process that news and make an adjustment in minutes. Yet she admits that this specific Volkswagen CEO query is both a success and a failure for the Knowledge Graph. The new CEO is not assuming the post for several months. The Knowledge Graph still shows the current leader, but many of those who type Volkswagen CEO into Google are probably seeking information on the new guy. So though the Graph is correct, its response may not satisfy users.
Google has a vast road map of improvements to make, first of all in adding domains; it recently added knowledge of automobiles, video games and Hugo Award winners. But Moxley says that Google is also trying to figure out how to deliver more complex results — to go beyond quick facts and deliver more subjective, fuzzier associations. “People aren’t interested in just facts,” she says. “They are interested in subjective things like whether or not the television shows are well-written. Things that could really help take the Knowledge Graph to the next level.” It’s almost as if Google doesn’t want you to feel like you are conducting a machine look-up, but instead consulting an Oracle who’s not only omniscient but a culture snob as well.
But there is still very far to go, and the raised expectations from what the Knowledge Graph can surface are bound to cause continued frustration with its lapses. Moxley herself got peeved recently when she realized that the Knowledge Graph, which does know about television shows, lacked information about new episodes and when they are streamed. “I want an alert that tells me there’s a new episode this week, and I also want to know where it’s up on the website so I can watch it,” she says, vowing that eventually Google will pass through this “middle stage” when it has yet to catalog just about everything.
Speaking of raised expectations, perhaps the Graph’s most glaring lapses are those two questions cited by Google itself when it bought Metaweb in the summer of 2010. Four years later, its search engine doesn’t provide one-stop answers for either “colleges on the west coast with tuition under $30,000” or “actors over 40 who have won at least one Oscar.”
Once it realized how pervasive mobile technology would become, Google decided to make a subtle but huge change to search. Instead of viewing queries as instructions submitted to a computer system, Google would regard all input as conversational. “It’s pretty clear that when you have this sort of device [he lifts a phone to illustrate], speech is going to be important,” says Ben Gomes. “It’s also pretty clear that people speak more naturally than when they type.”
This reversal not only involved changing the way the search engine processed queries. It meant changing us. We were now supposed to regard the search field — whether on the desktop or on mobile — as something one speaks to, even when we are typing. “People didn’t think in queries before Google came along — we educated people for years to think in queries,” says Tamar Yehoshua. “But wouldn’t it be easier if you just conversed in the normal way you conversed and you didn’t have to think about it and if you did that always? That would be ideal.”
Making that change required two initiatives. First, Google’s search engine had to up its game to listen more carefully, parsing even semi-garbled audio input with the skill that only humans had previously displayed at that task. Then Google had to make sure that when people spoke to their phones — or conversed colloquially by text on the search field — its system would know what the heck people were talking about.
To be sure, Google had been dabbling in voice recognition for some time. “We definitely saw many years ago that these building blocks — voice, natural language processing — would be important,” says Yehoshua. “We knew those were investments, unsolved problems in technology and it would take years to get to fruition.” For a few years in the mid-2000s it ran a service called Google 411 that did the same thing the telephone company did when customers dialed its (paid) number lookup system. Google used those millions of free calls to learn how to correctly interpret voices from multiple languages and multiple accents among same-language speakers. This was incredibly useful, but in certain parts of the world Google wasn’t getting the samples of around 2500 phrases that it needed to parse voice input. So the company started dispatching small teams to various regions, preceding the visit by circulating a message within the Google network saying the company wanted to collect voice samples. The Indonesia effort was typical. “Nine hundred people showed up the next day,” says Linne Ha, a Google speech specialist who came to the company with an MFA in creative writing. When Google does such studies, it collects samples in field conditions appropriate to the region: it recorded subjects on the streets of Hong Kong and in the Parisian subways.
The effort is paying off — Google Search works with 159 languages, and Voice Search is now operating in 58 of those. Google claims that the app’s “word error rate” has been cut down to 8%.
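The article does not say how Google measures that figure, but “word error rate” has a standard definition in speech recognition: the number of word substitutions, deletions, and insertions needed to turn the system’s transcript into the correct one, divided by the number of words in the correct one. Below is a minimal Python sketch of that textbook calculation. The function name and the sample sentences are illustrative assumptions, not anything taken from Google.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # Matching i reference words to nothing costs i deletions.
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of five spoken words.
print(word_error_rate("why is the sky blue", "why is this sky blue"))
```

By this definition, a rate of 8% corresponds to roughly eight word-level mistakes for every hundred words actually spoken.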
Gomes himself proudly points to one milestone in Google’s progress here: he now does voice demos himself. “My accent is very non-standard,” says the Indian-born engineer. “My vowels are American but I don’t pronounce the R.” Before this voice initiative, Gomes would never demonstrate Google’s first efforts at speech recognition himself: instead, the company used an in-house specialist, a fellow with a perfect American accent who was blissfully simpatico with the machine. Now Gomes has lost track of the guy. “He’s not as critical for the job anymore,” he says. “I can do demos. You put me in front of a reporter and ask me to do the queries, and I’m not terrified.”
Google also had to give some thought to the way the phone would talk back to people. Should it have a quasi-human personality like Siri, or use an identifiably robotic tone (sans cute name) to sustain the user’s awareness that he or she is talking to a system and not a replicant? It chose the latter. (You want personality in Google search? Enjoy the doodles.) Google’s search design lead Wiley says that in order to properly pull off the illusion of speaking to a conscious entity, you would need to automate a Pixar-level mastery of storytelling. “I think we’re a long way from computers being able to evoke personality to the extent that human beings would relate comfortably to it.”
But technology has moved quickly enough for Google (and, to be fair, some other companies) to provide a level of voice interaction that had eluded researchers for decades. “I think three or four things happened to make this possible,” says Gomes. “Obviously computers got faster and processing got better. The hardware — microphones — got a lot better too. There was also progress on the software algorithms. But the biggest change was our ability to understand language.”
Fernando Pereira, who holds the title of Distinguished Research Scientist in the Search group, has been working on natural language processing (NLP) for three decades. Over the years, he says, Google had gotten very, very good at figuring out how to take search queries and match them with documents from the web and other corpora of information. “When you’re doing search, there’s a good chance that the words you use in your query will appear in some of the results,” he says. But adding databases like the Knowledge Graph to the search engine brings in new challenges and opportunities. “It’s harder to anticipate whether the language you use matches the way the database is designed,” he says.
On one hand, this is difficult. When Google gets a query like, “Where do the Giants play?” it has to know a lot of things: that the query involves sports, that a team “plays” at a home stadium, and so on. And it has to make choices — is this the baseball Giants or the football team? Does the user want to know where the team usually plays its games, i.e. the home stadium, or where it’s playing next week? Google uses signals and previous user behavior to nail the answer. “All that figuring out, all that inference, is stuff we do now that we were not doing a few years ago,” says Pereira.
Once those hurdles are cleared, Google’s NLP system can get a further boost from the Knowledge Graph. “We begin to understand things in the world,” says Gomes. This enables Google to correctly guess what the user is asking for even if the query is inelegantly phrased or even garbled. For instance, says Gomes, when someone says “David Cameron” into a phone, the system already knows that those two words are commonly paired, and that it is a male person — who could thereafter be referred to by the pronoun “he.” The Knowledge Graph also can figure out that the British PM is the subject matter if the microphone doesn’t quite capture the surname.
The more Google understands, the better it understands you.
In 2004, I asked Larry Page and Sergey Brin about their long-term vision for search. Larry suggested it would be included in people’s brains. “When you think about something and you don’t really know much about it, you will automatically get information.” Sergey noted the key point was, “You can just have devices you talk into or you can have computers that pay attention to what’s going around them, suggesting useful information.”
In 2010, two engineers working in the Android organization, Baris Gultekin and Andrew Kirmse, embarked on an extracurricular “20 percent” project very much in the spirit of that vision, creating what has become Google Now.
According to Gultekin (who moved from Google Now last fall into another project at the company), the product has hewed closely to their original pitch document. “The core statement was that your phones aren’t smart today, but they can be,” he says. “What if we could combine the power of that sensing, powerful, connected device, with the power of Google?”
In other words, Google Now would answer the queries that you were too lazy or otherwise occupied to ask. This meant combining information in multiple domains to address something important. Gultekin says that creating a system to do this was at first terrifying, but he and his partner began breaking down how this could be done for a single domain, commuting. Even something as limited as that required substantial knowledge on the system’s part: locations of home and office, best routes to take, traffic patterns. It certainly helped that Google Maps (and later, its Waze traffic app) knew how to navigate the grid — but that was very much the point. Google would use all its powers to augment this search tool. Soon, they had a credible app to help commuters. “But we didn’t want it to be just a commute app,” says Gultekin. “We wanted it to be a proactive assistant that handles many things.” So they launched Google Now in July 2012 with seven domains: commuting, flights, sports, nearby places, travel, public transit and weather. Now it has over 70, and counting quickly. “My ambition is that Google Now should provide you with most of what you need and everything else becomes a fallback in case we didn’t have what you needed,” says Gultekin.
The effectiveness of Google Now hinges on merging a deep knowledge of the world — all that Google search can provide, including the Knowledge Graph — with an abundance of personal information. That’s why one might argue that this subset of search is really a synecdoche for Google itself: every time it delivers a “card” of just-in-time information, Google Now draws on a vast array of Google services. A typical card might combine information from one’s personal mail, calendar, and contact list with transit schedules, traffic information and weather.
Quite often, people don’t know exactly what Google Now does until it does it. For instance, when you park your car, Google Now will take note that you stopped driving and remember exactly where you left your car — just in case you forget where it was. If your emails tell Google Now you’re looking for a home, the service might push you photos of open houses in the areas you hope to live in.
As Google Now evolved, it shifted from a 20 percent project into a full-time service. But the biggest accelerant probably came in 2011, when Apple released Siri, generating a mini-panic in the Googleplex and plenty more resources for this voice-oriented project. It became an official part of the Search organization, though the team officially is co-located in both search and Android. This was fitting as, in addition to the non-queried messages Google Now pushes to users, all that personalized information will eventually become available through the signed-in user’s search field. (Working today: How long will it take me to get to work? Coming soon: Where’s my car?) “Search and Google Now are very complementary,” says Gultekin. “We’d like to give you the information before you search, but there’s going to be so many cases when we won’t know that your pipe just broke, and you need a plumber.”
(Of course, in the future, via the Google-owned smart-home company Nest, Google will know when your pipes burst, or whether your house is on fire. Gultekin says that Nest integration is “maybe in the future but not today.”)
In contrast to the old version of search, Google-Now-ified search only clicks if you are all-in with Google products. “Larry [Page] has a saying — ‘Search should understand what you mean and give you what you need,’” says Yehoshua. “It’s one Google ecosystem — if you’re signed in on your phone and your desktop, we can leverage that. If you want to get your flight information and track your packages and any of the information that we get from Gmail, you’ll get that. If you’re not using Gmail, you won’t — but you’ll still get the richness of our voice and answers and all that.”
No way around it — if you want to use Google Now and Gmail is not your preferred system, you’re not going to get full value from Google Now or even Google Search. “It would be very nice to live in a world where we could share all this information,” says Yehoshua. “I don’t see that world happening tomorrow. Clearly on Apple, there are things that are harder for us to do.”
Google quite consciously does not offer Google Now as a separate product. Instead it includes Now as part of its search app. And that app itself is not labeled Search; it is dubbed, quite simply, “Google.” That eponymy indicates not only how closely Search is affiliated with Google, but how important Google Now is to the company.
Nonetheless, the Google Now component is opt-in. No one gets it without a chance to consider the privacy caveats; the product’s seeming omniscience can still be an unsettling reminder about how much this giant company knows about us. As Google’s grip on our personal information has become more troubling — particularly in Europe, where governments are pushing back with regulations and fines and even a breakup threat — the company’s ambitions to serve us might be thwarted by privacy concerns. Even to those who trust Google, the Snowden revelations demonstrate how easily governments can get access to our information. If Google Now knows where you parked your car, does your local intelligence agency know it, too?
Amit Singhal thinks the first era of search was marked by people visualizing interactions as taking place between them and some wall of distant machinery. The new era — one that we slipped into gradually, perhaps in synch with the way we adopted mobile devices as cyborg-ish appendages — removed that barrier. We expect phones to know what we mean. And we expect search to be equally proficient in providing answers involving our personal information as it is in unearthing facts from web pages, documents, and public databases.
“I see search as the interface to all computing,” says Singhal. “When devices become vanishing or minimal or ambient, how are you going to interact with them? Because most of the time, you need to take an action — maybe as simple as play music, or more complex like Write a note to remind me to buy milk when I’m near a grocery store. Or you have questions like Is my wife’s flight on time?, or How tall is Barack Obama?”
People may take it for granted — they may even complain that Google search is not what it was. But Singhal contends that search has leaped a barrier that he’s been butting against for decades. “For twenty years I failed as a researcher to do this,” says Singhal, referring to the achievements in Google Search that his team has now accomplished. And he admits that there are plenty more problems to solve. But he bursts with pride when describing the science behind a kind of query he no longer fails at — the successful response Google provides when someone asks a simple question:
Why is the sky blue?