Designing search for your product

Find your way around one of the most universal features 🔍

Shashank Sahay
Muzli - Design Inspiration

In our digital experience, we’ve all interacted with the feature Search. We probably interact with it every day, on different surfaces, devices and applications. Search has made our lives easier countless times and it still has a long way to go.

Having worked on Search for about two years, I can now say it’s one of the most open-ended and complex problem spaces I’ve come across. It calls for a solution that uses the best tech on earth, yet can be used even by the least privileged user. No wonder prominent companies like Google, Microsoft, Spotify, and more have hundreds of talented individuals working on it.

In this article, I’ll try to explain the different stages in the Search funnel, how they work, what the crucial decisions in each stage are, and how to design for them.

An ever-changing problem space

Search is a very universal feature. Almost all the digital products today have Search. It’s a shortcut that allows the users to skip the line and directly get to the information they desire, almost like teleportation within the realm of an app.

But in the words of Mr. Pandu Nayak, “Search is far from a solved problem.”

As Ben Gomes puts it, “There’s actually no end in sight, in terms of when this will actually be solved. Because the world keeps evolving.”

In the context of Search, you rarely encounter problems that have a definite answer. It’s all about possibilities and probabilities. Why is that? To keep up with the growing information repositories online, search systems need to come up with better ways to understand, cluster and make sense of the available information. And no matter how evolved search systems get, there will still be situations where it is not possible to predict the search results a user is looking for with complete certainty. If you look at Google search as an example, you’ll see that they collect feedback at every stage of the search flow. They are aware that their system could generate incorrect results and expect us, the users, to help highlight those anomalies.

Given all the complexity surrounding search, how does one design for it? Well, that involves understanding the steps in the search funnel and learning how to tackle each of those steps.

The Search funnel has five stages: Capture, Interpret, Match, Rank and Presentation. Let’s look at each of these steps in a little more depth.

The Stages of Search

1. Capture

What we have so far: User intent to look for something on the product
Goal: Minimize the possibility of errors and aid user’s intent

This intent is shaped into a query and fired in the search box. And that’s all you get to serve meaningful results to the user.

This stage is all about capturing the user’s query and logging it in the system. But this is also the most open-ended part of the search funnel. What does that mean?

A user has complete freedom to type anything in the search box. The same query can be framed in different ways. Sometimes differently worded queries mean the same thing (“ice cream shop camden town” vs “where can i find ice shops in camden town”), and sometimes the same words in a different order mean completely different things (“race horse” vs “horse race”).

The most popular way of capturing a query is via text. However, voice is another mode that is rapidly emerging, and it enhances the accessibility of a lot of products.

Because capturing a query is so open-ended, there’s great scope for error here. The user could mistype the query or make spelling mistakes, or the voice capture could be unclear, all of which lead to wrong queries and ultimately wrong results.

Things to keep in mind here

  • Provide visual feedback: Always provide visual feedback that communicates where the system is while the query is being captured. Ensure that the user can see what they’ve typed or spoken before firing the query. This includes the different states of voice search (capturing query, query captured). It allows the user to check for mistakes and correct them right there.
  • Use autocomplete: For text queries, use autocomplete to minimize errors and save the user from typing out the entire query. Autocomplete is the small dialog box that appears under the search bar with query suggestions based on whatever the user has typed so far. It is an easy piece to put on the interface, but its true magic lies in the intelligence that powers it. Learn more about autocomplete in “How Google autocomplete works in Search”.
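
To make the autocomplete idea a little more concrete, here is a minimal sketch in Python. It assumes the simplest possible intelligence: suggest past queries that start with what the user has typed, most frequent first. The `PAST_QUERIES` log and the `autocomplete` function are hypothetical names for illustration; real systems use far richer signals than prefix matching.

```python
from collections import Counter

# Hypothetical log of past queries; a real product would use its own query logs.
PAST_QUERIES = [
    "ice cream shop camden town",
    "ice cream shop camden town",
    "ice cream recipes",
    "ice skating near me",
    "horse race",
]


def autocomplete(prefix: str, past_queries: list[str], limit: int = 3) -> list[str]:
    """Return up to `limit` past queries that start with `prefix`, most frequent first."""
    normalized = prefix.strip().lower()
    if not normalized:
        return []
    counts = Counter(q.lower() for q in past_queries)
    matches = [q for q in counts if q.startswith(normalized)]
    # Sort by descending frequency, then alphabetically for a stable order.
    matches.sort(key=lambda q: (-counts[q], q))
    return matches[:limit]


print(autocomplete("ice", PAST_QUERIES))
```

Even a sketch like this shows why suggestions reduce errors: the user picks a query that is already known to the system instead of typing a new, possibly misspelled one.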

2. Interpret

What we have so far: User query in the form of text or voice input
Goal: Finding the winning query (explained ahead)

In order to fetch the best possible results, the system first needs to interpret the user’s query. As queries become more conversational and complex in nature, it is imperative that the system is able to understand and satisfy the user’s information needs with information that is available on the product.

Every product employs algorithms specific to its use cases to interpret the user’s query. Once received, the query is evaluated against several checks such as spelling, grammar, past successful searches, quality, volume of results and more. From this, the system generates a set of possible queries that the user might be looking for; let’s call these competing queries. If the system believes the original query is good as is, it treats that as the winning query. However, if the system believes the original query is not optimal, it evaluates all the competing queries against the same checks and assigns a score to each of them. The query with the highest score wins. The winning query is usually a data-informed guess made by the system. For example, if a user searches for “home thetra” on Google, the system maps the query to “home theatre” and shows results for it. The system could have matched it to other competing queries such as “home theater” or “home theory”, but the search engine took an educated guess and mapped it to the winning query “home theatre” to best satisfy the user’s needs.
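
The idea of competing queries and a winning query can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `KNOWN_QUERIES` popularity values are made up, the 0.7/0.3 weights are arbitrary, and string similarity from the standard library's `difflib` stands in for the many checks a real system (Google's included) would run.

```python
from difflib import SequenceMatcher

# Hypothetical popularity of known queries (e.g. share of past successful searches).
KNOWN_QUERIES = {
    "home theatre": 0.9,
    "home theater": 0.6,
    "home theory": 0.2,
}


def score(original: str, candidate: str, popularity: float) -> float:
    """Blend spelling similarity with popularity; the weights are assumptions."""
    similarity = SequenceMatcher(None, original, candidate).ratio()
    return 0.7 * similarity + 0.3 * popularity


def winning_query(original: str, known: dict[str, float]) -> str:
    """Return the original query if it is already known, else the best-scoring competitor."""
    original = original.strip().lower()
    if original in known:
        return original
    return max(known, key=lambda candidate: score(original, candidate, known[candidate]))


print(winning_query("home thetra", KNOWN_QUERIES))
```

With these made-up numbers, “home theatre” and “home theater” are equally close to the typed query, so popularity decides the winner. That is the "data-informed guess" in miniature.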

Interpret the query to the best of the system’s understanding

What we just looked at is actually a very simple example of the system interpreting a basic query. However, query interpretation goes way beyond that. Today, multilingual audiences search for regional terms in English. For example, a Hindi-speaking user may search for mustard oil by typing in “sarson tel”, “sarso ka tel”, or “sarson ka tel”. There is no correct spelling here. However, there might be a prominently used spelling that the system is aware of, which then becomes the winning query.

We might assume it’s enough for the system to be aware of the searchable entities within the product, but that’s not always the case. Some systems need to be aware of entities that are not on the product as well. For example, if you search for a movie that’s not available on Netflix, it still needs to showcase the next best result for the interaction to be meaningful. For that, Netflix needs to be aware of the entire movie database. As a streaming platform, their goal might be optimising for maximum engagement and a null result page would probably kill that.

For interpreting the query, intricate AI systems and natural language processing capabilities come into play. It’s important to understand your search system’s capabilities, and design accordingly.

Things to keep in mind here

  • Handle all the possibilities with the queries: Define how correct and incorrect queries should be handled, with the goal of providing the best possible results in the context of your product. Focus on why a query might be incorrect for the system. For example, is it because the system has no results to provide for it, or because the system isn’t mature enough to understand the language the user is searching in? Detecting these problems needs human intervention. Once you’ve defined how to handle all the possibilities on the back end, you might need to design new interface components to handle them on the front end, such as null results, showing results for a corrected query, “did you mean”, and so on. A lot of information components (more in the last section) would come from these possibilities.
  • Ensure transparency & freedom: Always keep the user apprised of the system’s current status. For example, if the system doesn’t think the user’s query was right and is showing results for a competing query, that needs to be communicated to the user. At the same time, the user must not be bound to a flow just because the system thinks it’s correct. They need the freedom to abandon the flow and choose what they want instead.
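
As a rough sketch of the transparency and freedom principle, the interpreted query can carry both what the user typed and what the system searched for, so the interface can always say which one it used and offer a way back. The `InterpretedQuery` class and the message wording below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InterpretedQuery:
    original: str  # what the user typed or spoke
    winning: str   # what the system decided to search for


def status_message(query: InterpretedQuery) -> str:
    """Tell the user which query the results are for, and keep the original reachable."""
    if query.original == query.winning:
        return f'Results for "{query.winning}"'
    return (
        f'Showing results for "{query.winning}". '
        f'Search instead for "{query.original}"'
    )


print(status_message(InterpretedQuery("home thetra", "home theatre")))
```

The second half of the message is what gives the user their freedom: in an interface it would be a link that reruns the search with the original query.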

3. Match

What we have so far: The winning query
Goal: Finding all the matches that exist in the system

This is the part where the system tries to process the query and look for matches in its repository.

According to Cathy Edwards, the search algorithm at Google is updated almost 6 times a day on average, and these updates are informed by 200,000 to 300,000 experiments run over the course of a year.

To be honest, the semantics behind this is too complex for me to completely understand or explain (maybe someday!). However, I’ll try to paint a picture that helps you get the gist of it.

Once the search system has a winning query, it tries to look for matches in its information repository. The information repository is often divided into categories. For example, in the case of Spotify, songs, albums, artists and playlists are the different categories. If the user does not select a category through autocomplete or advanced search before firing the query, the system ends up looking in multiple categories for the match.

Most systems operate on text to find matches in the repository. However, there are image searches (Google Reverse Image Search) and audio searches (Shazam) in the world too. Depending on how vast and nuanced the repository is, the system might find no matches, one match, a few matches, a ton of matches, or matches in multiple categories.

Curation plays an important role here. It refers to the selecting and organising of content that is added to a product. Some products don’t curate their content, and therefore have a low barrier of creation (YouTube, Twitter, Instagram). On the flip side, some products have curation (such as Spotify, Netflix, the App Store) and therefore have a high barrier of creation. Curated information repositories are cleaner, with more predictable data. You may have noticed this already: if you search for “that’s life” on Spotify, you get a limited set of matches, whereas on YouTube, you’d get a ton of matches for the same query.

When we have figured out the matches that need to be shown to the user, we need to define how these matches will look on the interface. You could take a generic approach and follow one design schema across the board, or you could choose to cater to different sets of matches separately. When you fire the query “champions league” on Google, you’re shown a good-looking card containing all the relevant information. That’s a call Google made to showcase high-precision matches differently instead of using the same design schema for them. Similarly, you can see Google using specific schemas for actors, movies, diseases, sports leagues, and more.

Things to keep in mind here

  • Finite & Infinite Spaces: Understand the perception your users have of the kind of results they expect on your product. The two possible perceptions here are virtually finite or infinite. This perception matters because it defines the user’s intent behind searching. In a finite space, the user knows what they’re looking for; all they need to do is spot it and move forward. Here, the interface needs to be easy to scan. For example, when you fire a query on WhatsApp (finite), you’re looking for a specific chat or group. But in an infinite space, multiple results could match the query. The user needs to consume the results, decide which one they wish to proceed with and then move forward. There’s significant decision-making involved here. The interface needs to be easy to consume, and what that means changes depending on the information type. For example, when you fire a query on Google (infinite), you don’t necessarily know which website you might be looking for.
    Searching within a curated information repository would generally lead to a finite space of results, whereas searching through an uncurated repository would lead to a virtually infinite space. However, given enough time, curated information repositories might also grow into infinite spaces (Amazon).
  • Homogeneous or Heterogeneous Experience: Identify the searchable categories in your product (songs, albums, artists, playlists and so on). When you have multiple categories to surface results from, you could work with a heterogeneous experience or a homogeneous experience, depending on what serves the user’s needs quicker. In a heterogeneous experience, results from multiple categories could be presented together. For example, Amazon showcases results from Prime Video on the e-commerce store. In a homogeneous experience, the categorization is respected, and results from one category are presented together. For example, LinkedIn’s search. (Visual examples ahead)
  • Don’t forget the edge cases: When a system finds a lot of good matches, it lists them in some order on the interface, following a design schema to present the results. But when the system finds no matches, one match or an unusually low number of matches, the interface may look blank or the user may not know how to proceed. The user is looking for answers, and even if the system can’t find them, it should communicate that to the user. The goal here is to provide the user with alternatives or ways to proceed further with their information need. Some of the popular methods in such cases are showcased below: null search, error states, specific matches, category tree, edit search.
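
One way to make sure the edge cases are not forgotten is to decide the state of the results page from the number of matches before choosing a layout. The sketch below assumes four hypothetical states and an arbitrary threshold for "unusually low"; your product would define its own.

```python
def result_state(match_count: int, few_threshold: int = 3) -> str:
    """Map the number of matches to a results-page state (state names are hypothetical)."""
    if match_count < 0:
        raise ValueError("match_count cannot be negative")
    if match_count == 0:
        return "null_search"   # explain that nothing was found and offer alternatives
    if match_count == 1:
        return "single_match"  # consider showing the one match in more detail
    if match_count <= few_threshold:
        return "few_matches"   # pad the page with ways to broaden or edit the search
    return "result_list"       # the regular, ranked list of results


for count in (0, 1, 2, 50):
    print(count, result_state(count))
```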

4. Rank

What we have so far: All the matches in the product
Goal: Ranking the matches in an order that’s useful to the user

By now the system has processed the query and found matches to it as well. But here’s the tricky part. These systems, in spite of employing a ton of algorithms, lack the perception that we humans have. Where a human sees an iPhone 12 Pro with its shiny stainless steel design, the system sees a title, some images, and a bunch of metadata. We are able to differentiate between a pair of Nike shoes vs. a pair of Puma shoes. The system might not be able to since it primarily relies on metadata to find this differentiation. Hence, the system might not be as accurate as the human receiving the results.

In theory, the matches found could be a mix of good and bad matches: ones that satisfy the user’s information need and ones that do not. In reality, however, the system rarely separates matches in such a binary fashion; instead, it ends up with matches spread across a spectrum. In order to rank them, the system needs to assign a score to each match. Based on these scores, there are all kinds of matches: very good, good, not so good, bad and very bad.

But how would the system go about scoring the matches? A simple answer could be scoring them by relevance. The good matches should be shown first, and the bad matches should be shown later. But it’s not really that simple. Let me elaborate with an example.

“A few years ago, when users on Google searched for ‘did the holocaust happen’, the top results promoted pages that claimed that the holocaust did not happen. The reason behind this was that the higher quality webpages may not really bother to explicitly say that ‘the holocaust did happen’. These webpages are talking about the holocaust from the perspective that we, as informed citizens are aware that the holocaust did happen. And so the only kind of websites that had the exact combination of terms that seemed to closely match the query were the ones that say that “No, the holocaust didn’t actually happen, it’s all a big hoax”. These results were low quality results, even though they’re more relevant to the query.” — Meg Aycinena Lippow

This is merely one use case, but it opens up the possibility of an entire class of problems. And just solving the specific problem wouldn’t really work. Because the problems you witness with Search are rarely singular, there are a lot of frameworks in play beneath the interface that we see.

This example helps us understand that relevance alone might not be enough to score these matches; the quality of the results should also influence the scores. Parameters like relevance and quality are specific to the product, and every product might have a different definition for them. To Google, web pages coming from The New York Times are good-quality results, whereas to Amazon, a good-quality result might be a product with a complete set of metadata. Adding to this, there’s no end to the number of parameters you might need to score the matches. It depends entirely on the use cases your product is supposed to handle.
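
A small sketch shows how letting quality influence the score changes the order. The `Match` class, the two example pages, their relevance and quality values, and the weights are all made up for illustration; how relevance and quality are actually measured is specific to each product.

```python
from dataclasses import dataclass


@dataclass
class Match:
    title: str
    relevance: float  # 0..1, how closely the match fits the query wording
    quality: float    # 0..1, how trustworthy or complete the match is


def rank(matches: list[Match], relevance_weight: float = 0.4, quality_weight: float = 0.6) -> list[Match]:
    """Order matches by a weighted sum of relevance and quality, best first."""
    def score(match: Match) -> float:
        return relevance_weight * match.relevance + quality_weight * match.quality

    return sorted(matches, key=score, reverse=True)


matches = [
    Match("Page repeating the exact query wording", relevance=0.9, quality=0.1),
    Match("Well-sourced history page", relevance=0.6, quality=0.9),
]

print([m.title for m in rank(matches)])
print([m.title for m in rank(matches, relevance_weight=1.0, quality_weight=0.0)])
```

Ranked by relevance alone, the page that repeats the query wording comes first. Once quality carries more weight, the well-sourced page moves to the top.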

Things to keep in mind here

  • Optimize for Precision or Recall: Depending on the product and its use cases, a system could be optimised for precision or recall. A system that is optimised for precision would minimise the number of bad matches returned. For example, a user using the Cmd/Ctrl + F feature to search within a document probably cares more about the preciseness of the match. A system optimised for recall would try not to miss any good match, even if that means showing a mix of good, not so good, and bad matches to best satisfy the user’s query. When a user looks for photos on the Google Photos app, chances are that they are looking for an object, a person, a location, a pet and so on. It might be hard for the system to predict what might be meaningful to the user. Therefore, it might be worthwhile to show partial matches or bad matches as well to help with recall and recognition.
    Define what your product should focus on: precision or recall. It’s possible that you might need to define this at the level of a use case and not the product. To facilitate that, define the types of use cases and what the matches for each should be optimised for.
  • Classify the different types of queries and define the system’s response: Understand the different kinds of queries your product receives and devise a way to classify them. Consider looking at it from the user’s perspective (e.g. song, movie, shoe) and the product’s perspective (e.g. broad or specific query). It’s possible that you end up with two or more sets of classification. Defining the types of queries will help you define what their corresponding experiences should be, and whether any of these query types need special treatment. This will help you define the design schema for different types of results. For example, on Amazon, shoes are showcased in a grid view giving prominence to the thumbnail, whereas mobile phones are showcased in a list view giving prominence to the textual information. Not just that, depending on the category the user is searching in, the hierarchy of information also changes.
  • Define the scoring parameters: Define the parameters your product should consider in order to rank the matches. These parameters could be generic, such as relevance, recency and quality, or specific to your product, such as stock count or number of images. You will also need to define how much influence each parameter should have over the scores. In certain cases, like the example “did the holocaust happen”, one might prioritize quality over relevance, i.e., quality would have more influence on the scores than relevance. On the basis of these parameters, every match would be scored. In certain cases, not all the parameters apply to all the query types, and that also needs to be defined in the context of your product and use cases.
  • Don’t forget uncategorized queries: As users evolve, their behaviours evolve too, so there might always be some new kind of query that your product receives. Have a default set of scoring parameters to handle these uncategorized/new queries.
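
To put numbers on the precision versus recall trade-off mentioned above, here is a minimal sketch using the standard definitions: precision is the share of returned matches that are good, and recall is the share of good matches that were returned. The `SCORES` and `RELEVANT` data are made up, and using a score threshold to decide what gets returned is an assumption for illustration.

```python
def precision_recall(returned: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for a set of returned matches."""
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical match scores, and the matches the user would actually consider good.
SCORES = {"a": 0.9, "b": 0.8, "c": 0.5, "d": 0.4, "e": 0.2}
RELEVANT = {"a", "b", "d"}


def returned_above(threshold: float) -> set[str]:
    """Return only the matches whose score clears the threshold."""
    return {match for match, score in SCORES.items() if score >= threshold}


strict = precision_recall(returned_above(0.7), RELEVANT)  # fewer results, no bad ones
loose = precision_recall(returned_above(0.3), RELEVANT)   # more results, one bad one

print("strict threshold:", strict)
print("loose threshold:", loose)
```

The strict threshold returns only good matches but misses one of them. The loose threshold finds every good match at the cost of showing a bad one. Which of the two is right depends on the use case.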

5. Presentation

What we have so far: A ranked order of matches to serve the user’s intent
Goal: Presenting the results in a structured and meaningful manner

Now that we’ve figured out what needs to go on the interface, let’s also discuss how it should look. Note that these are just some of the best practices. Depending on your product needs and intuition, you might end up with different presentations. Let’s go through the building blocks of the results page to understand this step better.

Components of the search results page

a. Result Schema: The first part of defining the presentation is designing the default design schema for search results. If your product has a need to treat results from different categories differently, you will need to define schemas for all these categories. For example, WhatsApp uses the same schema for contacts and groups, whereas Amazon uses different schemas to showcase shoes and phones. Products use different schemas for different categories to make their consumption easier and to fulfil any user needs specific to those categories. Not all products might need different schemas to differentiate between categories.

The design of a schema should help the user consume information in a structured and easy manner. At the same time, it should be flexible enough to accommodate the different use cases that may exist in the product. The task of designing schemas sounds easy, but it can get fairly complex depending on the metadata associated with the results. These schemas also need to be robust enough to accommodate the different states of the metadata.

b. Categories: Your product could have the need to showcase results from multiple categories. The two ways to do that are homogeneous results and heterogeneous results. When all the results from one category are kept together in one cluster, they’re called homogeneous results, whereas if results from different categories are listed together with the ranking defining the order, they’re called heterogeneous results.

The image above sheds some light on the popular design patterns for homogeneous and heterogeneous results. Tabs are one of the most popular patterns for managing multiple categories. Some products also give the user the flexibility to use the tabs before typing the query, so that the search is carried out only in the selected category (LinkedIn). After firing the query, the tab can be used as a filter (Instagram). Generally, the first tab in such cases is “All” or “Top”, which means the results under it are heterogeneous, but the results under the rest of the tabs (Images, Songs, Accounts, etc.) are homogeneous. Another popular way to work with multiple categories is to cluster results from one category together and present them as blocks, one category after another. The user has the freedom to drill down into the desired category, but do note that this could affect the ranking of results.
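
The difference between the two presentations, and the way clustering can affect ranking, can be sketched as follows. The `RESULTS` list with its categories, titles and scores is hypothetical.

```python
# Hypothetical ranked matches: (category, title, score).
RESULTS = [
    ("songs", "That's Life", 0.95),
    ("albums", "That's Life", 0.90),
    ("songs", "Life on Mars?", 0.60),
    ("playlists", "Life Is Good", 0.70),
]


def heterogeneous(results: list[tuple[str, str, float]]) -> list[tuple[str, str, float]]:
    """One mixed list: the score alone defines the order (an "All" or "Top" tab)."""
    return sorted(results, key=lambda result: result[2], reverse=True)


def homogeneous(results: list[tuple[str, str, float]]) -> dict[str, list[str]]:
    """Results clustered by category; categories appear in order of their best match."""
    grouped: dict[str, list[str]] = {}
    for category, title, _score in heterogeneous(results):
        grouped.setdefault(category, []).append(title)
    return grouped


print([title for _category, title, _score in heterogeneous(RESULTS)])
print(homogeneous(RESULTS))
```

In the mixed list, the playlist (0.70) sits above “Life on Mars?” (0.60). In the clustered version, the lower-scored song is shown before the playlist because the songs block comes first. This is the ranking side effect to keep in mind when presenting categories as blocks.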

c. Information Components: You could have the need to communicate crucial information to the user. Take all such pieces of information into account and design components for them. You could follow the same schema for all of them or treat them differently; it’s your and your product’s call. These information components are often shown before the results, as they could carry crucial information about the user’s query, possibly impacting some of their decisions. These components tend to be vertically small and also house links and buttons for quick and contextual actions, as they’re often perceived as a part of the header.

d. Detour Components: Products often make use of additional components that allow users to direct their search flow, letting them diverge or converge in their ongoing journey. These components derive their usefulness from being close to the user’s original intent. That means they could be useful in the user’s journey; however, if they’re not, we should ensure that they are easy to ignore by making them blend well with the interface. These components look like a part of the product and aren’t too loud. In order to be spotted by the user, they have a different layout than the results schema, which breaks the monotony of the results page. Because they need to house a good amount of information but cannot take up too much real estate, these components often end up taking the form of a horizontally scrolling menu. The component’s rank in the page can be determined by the kind of user need it serves. For example, if on average you see user drop-offs after a certain scroll depth, it might be helpful to place a diverging component there, giving users an avenue to explore along the lines of their original intent. Some products also use these components for ads and sponsored products.

e. Pagination or Infinite Scroll: Now you’ve got all your components in place. The end of the page/screen could have either pagination, which lets users explore results on different pages as they go deeper, or infinite scroll. Pagination leaves room for additional components that house non-crucial information and links. Infinite scroll removes the scope for any such components: as the user keeps scrolling, more results keep loading on the page/screen.
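
Under the hood, both patterns can be served by the same slicing logic; what differs is when the next slice is requested. Below is a minimal sketch, assuming a hypothetical `get_page` helper over an already ranked list of results.

```python
def get_page(results: list, page: int, page_size: int = 10) -> tuple[list, bool]:
    """Return the results for a 1-based page number and whether more results remain."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    start = (page - 1) * page_size
    items = results[start:start + page_size]
    has_more = start + page_size < len(results)
    return items, has_more


ranked_results = list(range(25))  # stand-in for 25 ranked matches

print(get_page(ranked_results, page=1))
print(get_page(ranked_results, page=3))
```

With pagination, the user asks for the next page explicitly. With infinite scroll, the interface would request the next page on its own as the user nears the end of the list, for as long as more results remain.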

Putting it together

Once you’ve got all the components you need for all your use cases, all you need to do is put them together in an order in which you’d like to answer the question asked by the user.

Conclusion

Search is a very volatile space and the problems keep evolving with the users and the product. When you optimize the system for one use case, you also need to assess what its implications might be on other use cases across the spectrum, which makes it difficult to solve for. I’ve thoroughly enjoyed working on Search and would love to learn more about it and work on it in the future.

Hope you found this article useful. Feel free to drop any feedback in the comments. And if you want to chat about design, product or tech, feel free to drop me a DM on any of the platforms.

Product Designer at WhatsApp. Prev at Flipkart, Microsoft. Mostly wears black. Bear hug expert. Loves food a lot more than he should.