What are people searching for? What are we showing them? How do they like it?

Yahoo Search had all the tools and data needed to generate rich insights into user behavior, yet most designers (barring a few) were not making data-driven design decisions. I investigated this problem and came up with a solution that people wanted to use.

The Problem

Let's suppose we want to improve the book author search experience on Yahoo Search. To understand this problem, we may ask:

How many people search for authors?
What kinds of queries do people use to search for these authors?
What are the top intents behind these searches?
What user experiences do we provide for these queries and how much do users engage with them?
Do we show ads for these queries? If so, how many and how much do these ads monetize?

The problem was that one needed to go to different resources and people to get answers to each of these questions. Not everyone was aware of these resources. Few were comfortable going through so many reports before every project. Nobody found the notion of designs emerging out of a deep analysis of spreadsheets exciting.


Of the numerous resources at our disposal, I found the following to be most relevant.

1. Query categorization by domains and entities

Queries are grouped by the entities they refer to, and similar entities combine into a domain. Consider the query ‘movies near me’: it would be categorized under the entity ‘movies in theaters’ and the domain ‘movies’.

Queries (Q) grouped by Entities (E). Entities combine into Domains (D)
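
As a rough sketch of this grouping (not the actual Yahoo Search pipeline), flat categorized queries can be folded into a Domain to Entity to Queries map. The record shape, the function name groupQueries and the second sample query are assumptions made for illustration; only ‘movies near me’ and its entity and domain come from the example above.

```typescript
// Hypothetical record shape: one row per already-categorized query.
interface CategorizedQuery {
  query: string;
  entity: string;
  domain: string;
}

// Fold flat rows into Domain -> Entity -> list of queries.
function groupQueries(
  rows: CategorizedQuery[]
): Map<string, Map<string, string[]>> {
  const domains = new Map<string, Map<string, string[]>>();
  for (const row of rows) {
    let entities = domains.get(row.domain);
    if (entities === undefined) {
      entities = new Map<string, string[]>();
      domains.set(row.domain, entities);
    }
    const queries = entities.get(row.entity) ?? [];
    queries.push(row.query);
    entities.set(row.entity, queries);
  }
  return domains;
}

const sampleRows: CategorizedQuery[] = [
  { query: "movies near me", entity: "movies in theaters", domain: "movies" },
  // Illustrative second query, not taken from real data.
  { query: "showtimes tonight", entity: "movies in theaters", domain: "movies" },
];

const grouped = groupQueries(sampleRows);
```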

2. Top use cases

Use cases discovered by analyzing user engagement on search results pages for queries under a given entity.

Multiple use cases (U) behind each searched entity (E)

3. Top Cards

Most frequent information cards or search results at first position on search results pages for queries under a given entity and domain.

Top card (C) for a domain (D), top card (C) for an entity (E)

4. Ad performance

Average number of ads, ad clicks and monetization for queries under a given entity and domain.

Ad performance (A) for a domain (D), Ad performance (A) for an entity (E)

5. Query analysis

Allows detailed analysis of user engagement on the search results page for a given query.

6. Visual comparison

For a given query, user experiences from Yahoo and its competitors placed side by side.

Bringing it all together

After a few torturous nights :) a structure emerged for combining these resources into a single view: placing them within the domain-entity hierarchy.
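
One way to picture that structure is a tree in which every node carries the resources that apply at its level. The sketch below is a guess at such a shape, not the data model actually used: the type HierarchyNode, its field names, the function zoomTo and the entity name are all hypothetical, and the numbers are the illustrative ones quoted in the case studies in this write-up.

```typescript
type Level = "root" | "domain" | "entity" | "useCase";

// Hypothetical node shape: each level can carry its own resources.
interface HierarchyNode {
  name: string;
  level: Level;
  queryShare?: number; // percent of all queries
  topCards?: string[]; // most frequent cards at first position
  avgAdsPerQuery?: number; // ad performance summary
  children: HierarchyNode[];
}

// Walk from the root along a list of node names (Domain, Entity, Use case).
function zoomTo(
  root: HierarchyNode,
  path: string[]
): HierarchyNode | undefined {
  let node = root;
  for (const name of path) {
    const next = node.children.find((child) => child.name === name);
    if (next === undefined) return undefined;
    node = next;
  }
  return node;
}

const tree: HierarchyNode = {
  name: "all queries",
  level: "root",
  children: [
    {
      name: "authors",
      level: "domain",
      queryShare: 0.24,
      topCards: ["People", "News"],
      avgAdsPerQuery: 1.5,
      children: [
        {
          name: "author", // hypothetical entity name
          level: "entity",
          children: [
            { name: "read profile", level: "useCase", children: [] },
            { name: "use quotes", level: "useCase", children: [] },
            { name: "know books", level: "useCase", children: [] },
          ],
        },
      ],
    },
  ],
};

const authors = zoomTo(tree, ["authors"]);
```

Keeping every resource on the node it describes means the details panel only needs the node in focus, whichever level the user has zoomed to.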

I then went through a number of iterations to convert this structure into a user interface.

Final design*

*All volume, coverage, performance and engagement numbers shown here have been changed and don’t reflect real values for Yahoo Search.

Zoom in from Domains to Entities to Use cases
Details and queries associated with the node in focus appear on the right side. Clicking a query takes the user to the query analysis and visual comparison tools.
Users can pivot the visualization to the information most relevant to them.

I chose to implement this vision in D3.js, starting with only a rudimentary knowledge of the library and learning more as the project progressed.

Case Studies

Let's see how easy Babushka makes answering some of the questions that led to its creation.

How many people search for authors?

0.24% of all queries. Similarly, zooming into other domains would tell us that this number is lower than those for actors, musicians and other celebrities but higher than that for politicians.

What kinds of queries do people use to search for these authors?

Like the ones shown here on the right side.

What are the top intents behind these searches?

Primarily to read their profiles, use their quotes or learn about their books.

What user experiences do we provide for these queries and how much do users engage with them?

Primarily we provide People and News cards. User engagement with these cards is low.

Do we show ads for these queries?

We show around 1.5 ads per query.

How do these ads monetize?

Not very well, when compared to highly monetized categories like sports and travel.


Yahoo Search designers, product managers, engineers and leadership all appreciated Babushka for making valuable research resources accessible and easy to use, thus promoting data-driven design in the organization.

Looking Ahead

Resources supporting data-driven design exist in perhaps all tech companies today. However, to the designers who need them, these resources are only as useful as they are easy to use. Project Babushka was an attempt to bridge this gap between designers and these resources at Yahoo Search. There are many ways to improve upon it or to come up with even more useful and powerful tools.