Tech Interview For Non-Tech PMs: Building Google’s AutoComplete Engine

Noticed The AutoComplete Feature On Google Search Or The Chrome Address Bar? Ever Wondered How It Was Built And How It Knows What You Want To Look For?

Ayan Halder

Published in Product School

8 min read, Jul 5, 2020

What Is AutoComplete?

AutoComplete is the feature by which a system (typically a search engine) predicts what you intend to type. It is now widespread, with applications in Google Search, Bing, and Amazon, among others. It is extremely helpful because it predicts at scale what users want to write and personalizes those predictions based on context.

If you aspire to be a product manager at a FAANG company, it’s not a stretch that you might be asked how AutoComplete works. Today, let’s walk through how an AutoComplete system can be designed.

The Goals And A User-First Approach

Before diving deeper, it's important to explore why AutoComplete exists. This has to do with understanding why Search exists. What does Google search do? It helps us uncover information from sources that we might not know exist.

AutoComplete is an AI-assisted step beyond that: users type less to uncover the same information, which improves the user experience.

As usual, let’s start with the users of this system and see how we can proceed from there. Read my previous blog on why it’s important for PMs to start system design questions with users.

Taking a simple use case of Google search, imagine a user (you or me) wants to know the months when it is summer in California. So, typically, the user would use the keywords summer, California, months.

Now, the goal of the system is to take the keywords and the context into consideration to come up with suggestions. As we see in the picture below, Google actually did that. How did it know?

The Algorithm

Okay, so let’s discuss some ideas through which this can be done (in Google’s context). This is a problem of association. We need to know which words need to be associated with the ones that the user has already typed.

Now, a common misconception is that you need to code the algorithm on a whiteboard. As a PM, you’re not expected to code the algorithm. The word “algorithm” sends chills down the spine of non-tech PMs and subconsciously, most lose the battle there.

Rather, let’s do something interesting to come up with the algorithm.

Remember the famous game “What comes to your mind when you hear X”?

How do you play that?

Say I ask you what comes to your mind when you hear “beach”. If you’re an Indian, almost instantly you’ll say Goa.

Why?

A few reasons:

You have personally visited the place and you know it’s awesome.
You may not have visited, but you have friends who did and they can’t stop raving about how awesome Goa is.
Goa has more beaches than possibly any other location in India, and the diversity between the North and the South is noticeable. So you know you’ll get to enjoy multiple experiences in the same place.
You may not have visited, nor have friends who did, but you have read about it on the internet and in newspapers, and you know that it’s a famed international destination.

Now, let’s try to pull parallels between what we talked about above and the Google Summer example. What are some of the ways Google might know what you’re looking for?

This can be done in two ways:

1. Using Existing Information (i.e., using publicly available knowledge to associate words):

Fame: Associating summer with famous things such as songs, movies, etc., as you see with the “Summer In The City” example in the snapshot above.
Destination: Summer is not just a season, it’s an emotion for many people. And, like me, many associate destinations with the season. California (and specifically Los Angeles) is one such destination, but there are many that can be associated with the word “summer”.
Food: Continuing on the same train of thought, many associate food with seasons. So associating the word “food”, or even showing suggestions with food names, is worth considering.

You get the point. Any number of such ideas can be generated and tested.

2. Personalizing the information. Not every food and every destination will resonate with everyone. So how do we do it?

Localization: One of the first things to consider is matching relevance to the location of the user. So, if I’m near California and not Miami, then show “Summer in California” to make it relevant.
Historical Searches: Suppose a user lives in California and loves hiking to the Hollywood sign every summer morning. But recently, he has been searching for Miami, keying in travel dates (say, in Google Travel) that are 2 weeks from now, and is also searching for hotels. Can we infer that the user wants to shake up his regular summer plans and travel to Miami? In that case, a suggestion such as “Summer In Miami” or, even better, “Summer Blogs From Miami” can be shown to improve the user experience.
Friends & Personal Associations: Let’s make it a tad more complicated. Say a user living in California has a friend who traveled to the famed Egyptian pyramids and sent photos of the adventure to this user over Gmail. Pulling from Google Photos, it can be inferred that the user has been to other international destinations (and so has moderate to high disposable income). So would showing “Summer Destinations: Egyptian Pyramid” ring a bell for the user? Probably.

In the last scenario, one might question the privacy implications. So it’s always better to say in the interview that there might be privacy concerns and that we’ll take user consent or other appropriate actions to proceed here.

So far, our algorithm was built on 6 specific inputs (or stories) and we didn't code a single thing. What we did was explore how users tend to use the system, and make it more favorable for them.
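To make the six inputs a little more concrete, here is a minimal Python sketch of how general popularity and personalization signals could be combined into one ranking. Everything in it is an assumption made for illustration: the phrases, the popularity numbers, the weights, and the names UserContext, score, and suggest are all hypothetical, and this is not how Google actually ranks suggestions.

```python
from dataclasses import dataclass

# Hypothetical "existing information": how popular each phrase is in general.
POPULARITY = {
    "summer in the city": 0.9,
    "summer in california": 0.6,
    "summer in miami": 0.6,
    "summer food": 0.5,
}


@dataclass
class UserContext:
    """Hypothetical personalization signals for one user."""
    location: str = ""
    recent_searches: tuple = ()


def score(suggestion, user):
    """Combine general popularity with personalization (weights are made up)."""
    total = POPULARITY.get(suggestion, 0.0)
    # Localization: boost phrases that mention where the user is.
    if user.location and user.location.lower() in suggestion:
        total += 0.5
    # Historical searches: boost phrases that share a word with recent queries.
    for query in user.recent_searches:
        if any(word in suggestion for word in query.lower().split()):
            total += 0.3
    return total


def suggest(typed, user, limit=3):
    """Return the best-scoring phrases that start with what the user typed."""
    candidates = [p for p in POPULARITY if p.startswith(typed.lower())]
    return sorted(candidates, key=lambda p: score(p, user), reverse=True)[:limit]


local = UserContext(location="California")
planner = UserContext(location="California",
                      recent_searches=("miami hotels", "miami flights"))
print(suggest("summer", local))    # California ranks first
print(suggest("summer", planner))  # recent Miami searches push Miami to the top
```

The point of the sketch is only that each story from the list becomes one small, testable rule that nudges the ranking.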

The Data Structure

Another very scary topic.

Many aspiring non-tech PMs wonder how they can master a topic that takes a computer programmer years to learn. The interesting part is that you’re not expected to.

In fact, you’ll probably not be asked about specific data structures and their run-time complexities, but you may be asked how you envision this information being stored so that it can be shown to users effectively when needed.

But, What Is A Data Structure?

According to GeeksforGeeks, “A data structure is a particular way of organizing data in a computer so that it can be used effectively.”

As you see, it’s nothing but organizing information effectively, with the goal that when the program searches for something, it gets the information in the minimum number of lookups.

So, let’s come back to our example and see how we can organize our information effectively.

What do we have in hand here? A Word — SUMMER

What do we need to do when someone types that word in? Show terms and keywords that can be associated with that word.

Great! Can there be multiple words associated with Summer? Absolutely. That is something we have already established.

Now, can we think of some ways this information can be stored? Let’s try.

1. A Table

Here we have two simple tables. The first holds the central word “Summer” along with the locations of its associated words in the second table. The second table holds the associated words themselves. So, when someone keys in “Summer”, the system can look up locations 1, 2, 3 in the second table and take those words.

This is okay, but see how the system has to do 2 different searches? First, it has to read the locations associated with “summer”. Then it goes to a different table, does another search, and then pulls out the information.

Imagine what happens when there is another set of words associated with each word in the second table! That is, what if we decide to associate California and Miami with the word Destination? Now the system repeats the process and does an additional round of searches. That’s time-consuming and expensive.
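The two-table idea can be sketched in a few lines of Python. The table contents below are assumptions chosen to match the examples in this post, and the names WORD_TO_LOCATIONS, ASSOCIATED_WORDS, lookup, and lookup_two_levels are hypothetical. The sketch counts table reads to show how the work grows once associated words have associations of their own.

```python
# Table 1: central word -> locations of its associated words in table 2.
WORD_TO_LOCATIONS = {
    "summer": [1, 2, 3],
    "destination": [4, 5],
}

# Table 2: location -> associated word.
ASSOCIATED_WORDS = {
    1: "fame",
    2: "destination",
    3: "food",
    4: "california",
    5: "miami",
}


def lookup(word):
    """Return (associated words, number of table reads) for one word."""
    reads = 1  # first search: read the word's row in table 1
    words = []
    for location in WORD_TO_LOCATIONS.get(word, []):
        reads += 1  # second search: one read in table 2 per location
        words.append(ASSOCIATED_WORDS[location])
    return words, reads


def lookup_two_levels(word):
    """Repeat the whole process for every associated word."""
    first, reads = lookup(word)
    second = []
    for associated in first:
        more, extra_reads = lookup(associated)
        reads += extra_reads
        second.extend(more)
    return first, second, reads


print(lookup("summer"))             # (['fame', 'destination', 'food'], 4)
print(lookup_two_levels("summer"))  # 9 reads to also reach california and miami
```

Every extra level of association means going back to the first table again for each word, which is the repeated round of searching described above.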

2. A Graph

Let’s talk again about the problem at hand. It’s a problem of association. What’s a way to “associate” one word with another? A Graph! This can be thought of as similar to how Facebook or LinkedIn store user connections. Think of it as a network of N elements (a 2° graph is shown below) that are connected based on some relationship.

Here, in the image below, Summer is one of the nodes that is connected to 4 others. Each, in turn, is connected to some more.

What’s good about this structure is that you can store as many degrees of relationship as you need.

Beyond that, apart from the technical fact that searches are faster, a simple look tells us that this structure provides a robust way of personalizing searches for users.

When someone searches for “summer”, the program can walk this graph as far as Miami, for example, and suggest that as a travel option without having to explore any other branch, provided the program is written properly. That saves time and processing power, which in turn makes the response very fast.
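Here is a rough Python sketch of that idea, storing the graph as a dictionary that maps each word to its neighbors. The node names are assumptions based on the examples in this post ("ice cream" in particular is made up), and the function names within_degrees and follow_branch are hypothetical.

```python
from collections import deque

# Each word points to the words it is directly associated with.
GRAPH = {
    "summer": ["fame", "destination", "food"],
    "fame": ["summer in the city"],
    "destination": ["california", "miami"],
    "food": ["ice cream"],
}


def within_degrees(graph, start, degrees):
    """Breadth-first walk: every word reachable within `degrees` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == degrees:
            continue  # do not go beyond the requested degree
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


def follow_branch(graph, start, branch):
    """Explore only one branch, ignoring every other neighbor of `start`."""
    if branch not in graph.get(start, []):
        return []
    return list(graph.get(branch, []))


print(within_degrees(GRAPH, "summer", 1))            # first-degree associations
print(follow_branch(GRAPH, "summer", "destination")) # ['california', 'miami']
```

If personalization has already told us the user is thinking about travel, follow_branch reaches Miami by touching only the destination branch, which is the saving described above.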

Now, AutoComplete does use a graph, and here is a cool render of that, but you don’t need to know that in an interview. Also, the data is not stored the way I have shown above. It is broken down to the more elementary level of letters and forms a more complex graph than the one shown. But the fact that you identified a graph as a possible storage mechanism is enough to earn brownie points.
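For the curious, one common letter-level structure of this kind is a trie (also called a prefix tree), where each node is a single letter and each path from the top spells out a phrase. The post does not name the exact structure Google uses, so treat the following Python as an illustrative sketch under that assumption; the class names, method names, and sample phrases are all hypothetical.

```python
class TrieNode:
    """One letter in the tree; children are the letters that can follow it."""

    def __init__(self):
        self.children = {}
        self.is_end = False  # True if a stored phrase ends at this letter


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, phrase):
        """Store a phrase one letter at a time."""
        node = self.root
        for letter in phrase:
            node = node.children.setdefault(letter, TrieNode())
        node.is_end = True

    def complete(self, prefix, limit=5):
        """Return up to `limit` stored phrases that start with `prefix`."""
        node = self.root
        for letter in prefix:
            node = node.children.get(letter)
            if node is None:
                return []  # nothing stored begins with this prefix
        results = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_end:
                results.append(text)
            # Push in reverse order so phrases come out alphabetically.
            for letter in sorted(current.children, reverse=True):
                stack.append((current.children[letter], text + letter))
        return results


trie = Trie()
for phrase in ["summer", "summer in california", "summer in miami", "sunset"]:
    trie.insert(phrase)

print(trie.complete("summer in"))  # ['summer in california', 'summer in miami']
```

The useful property is that the lookup only follows the letters the user has typed so far, so the cost depends on the length of the prefix rather than on how many phrases are stored.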

You can think of other cool ways of storing this information. Be sure to discuss the pros and cons of what you suggest.

System Design

I won’t run through a complete system design since I have explained that in detail here. But, in general, there’s one acronym to always keep in mind: SLA, used here as a mnemonic for Scalability, Latency, and Availability (elsewhere, SLA usually means Service Level Agreement).

Now, a lot of scalability and latency have to do with the data structure. Beyond that, the usual process of talking about multiple servers, load balancing, caching, etc. absolutely helps.
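As a small illustration of the caching point, here is a Python sketch that remembers the suggestions for a prefix once they have been computed, using the standard library’s functools.lru_cache. The phrase list, the CALLS counter, and the function names are assumptions made for the example; a real system would more likely use a shared cache across many servers.

```python
from functools import lru_cache

PHRASES = ("summer in california", "summer in miami", "summer in the city")
CALLS = {"count": 0}  # counts how often the slow path actually runs


def slow_lookup(prefix):
    """Stand-in for an expensive search over the stored phrases."""
    CALLS["count"] += 1
    return tuple(p for p in PHRASES if p.startswith(prefix))


@lru_cache(maxsize=1024)
def cached_suggestions(prefix):
    """Remember recent prefixes so repeated requests skip the slow lookup."""
    return slow_lookup(prefix)


first = cached_suggestions("summer in m")   # computed by slow_lookup
second = cached_suggestions("summer in m")  # served from the cache
print(first, CALLS["count"])
```

Because many users type the same popular prefixes, even a simple cache like this can cut latency for a large share of requests.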

Summary

Technical interviews are not difficult for an aspiring product manager. Do what you do best: put yourself in the user’s shoes and you’ll find the direction.

Image Credits: Google and my imagination

If you are looking for product management interview preparation and resume review assistance, reach out to me at ayan.halder@live.in.

This post has been published on www.productschool.com communities.