Evaluating and improving the search experience starts with analyzing the queries searchers are making. But search queries are not the same as search intents.
I’m not talking about ambiguous queries like “java” or “jaguar” — examples that information retrieval researchers often use to illustrate how a single search query can map to multiple search intents. Ambiguous queries are fascinating in theory, but in practice they tend to be rare edge cases.
I’m talking about the opposite: when multiple queries map to the same intent. For example, queries like “mens shoes” and “shoes for men”.
Recognizing when two or more search queries represent the same intent opens up a variety of opportunities to improve the search experience.
Recognizing Query Equivalence
How do we recognize that two queries represent the same search intent? We can take two approaches:
- Surface query similarity. Queries that differ only in stemming or lemmatization often express the same intent, especially if they represent the singular and plural forms of a noun phrase. So do queries that differ only in word order or the inclusion of stop words. In synthetic languages like German, there may also be differences in compounding.
- Similar post-search behavior. Queries that express the same intent will be followed by the same behavior, i.e., engagement with the same kinds of results. Post-search behavior can be represented in a vector space, e.g., by using the embeddings of result titles. Of course, this behavioral similarity is only possible if both queries allow users to find the same kinds of results.
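To make the behavioral approach concrete, here is a minimal sketch of one way to turn post-search behavior into a vector: a click-weighted average of result title embeddings. The function names, the toy two-dimensional embeddings, and the click counts are all assumptions for illustration; in practice the embeddings would come from whatever text embedding model you already use.

```python
import math

def behavior_vector(engagements, title_embeddings):
    """Click-weighted mean of the embeddings of the results a query led to."""
    dim = len(next(iter(title_embeddings.values())))
    total = [0.0] * dim
    weight = 0
    for title, clicks in engagements.items():
        for i, x in enumerate(title_embeddings[title]):
            total[i] += clicks * x
        weight += clicks
    return [x / weight for x in total] if weight else total

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy embeddings (hypothetical): axis 0 ~ "shirts", axis 1 ~ "dresses".
title_embeddings = {
    "Oxford Dress Shirt": [1.0, 0.0],
    "Slim Fit Dress Shirt": [0.9, 0.1],
    "Floral Shirt Dress": [0.0, 1.0],
}

# Clicks observed after each query (made-up numbers).
shirts = behavior_vector(
    {"Oxford Dress Shirt": 3, "Slim Fit Dress Shirt": 1}, title_embeddings
)
dresses = behavior_vector({"Floral Shirt Dress": 5}, title_embeddings)

similarity = cosine(shirts, dresses)  # low: the two queries lead to different results
```

Weighting by clicks is only one choice; purchases, dwell time, or other engagement signals could be substituted without changing the overall shape of the computation.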
We can combine these two approaches, e.g., as follows:
- Group queries into equivalence classes based on surface similarity, canonicalizing each query by removing stop words, stemming each token, and sorting the tokens alphabetically. For example, “mens dress shirts” and “dress shirts for men” both map to the canonical form “dress men shirt”.
- Split these equivalence classes into clusters of similar post-search behavior based on the cosine similarity of the vectors associated with the queries, thus ensuring that queries like “dress shirt” and “shirt dress” are not considered equivalent despite their surface similarity.
After these two steps, we know that any pair of queries in the same cluster exhibits both surface query similarity and similar post-search behavior.
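The two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not a production recipe: `toy_stem` is a crude stand-in for a real stemmer, the stop word list is deliberately tiny, the behavior vectors are made-up numbers, and the 0.9 threshold and the greedy single-pass clustering are arbitrary simplifications.

```python
import math
from collections import defaultdict

STOP_WORDS = {"a", "an", "for", "in", "of", "the", "to"}  # illustrative, not exhaustive

def toy_stem(token):
    # Crude stand-in for a real stemmer: drop a trailing "s" unless the word ends in "ss".
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token

def canonicalize(query):
    """Remove stop words, stem each token, and sort the tokens alphabetically."""
    tokens = [t for t in query.lower().split() if t not in STOP_WORDS]
    return " ".join(sorted(toy_stem(t) for t in tokens))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def equivalence_clusters(behavior_vectors, threshold=0.9):
    # Step 1: group queries by canonical surface form.
    by_surface = defaultdict(list)
    for query in behavior_vectors:
        by_surface[canonicalize(query)].append(query)

    # Step 2: split each group greedily, comparing each query against
    # the first member of every cluster found so far.
    clusters = []
    for queries in by_surface.values():
        split = []
        for q in queries:
            for cluster in split:
                if cosine(behavior_vectors[q], behavior_vectors[cluster[0]]) >= threshold:
                    cluster.append(q)
                    break
            else:
                split.append([q])
        clusters.extend(split)
    return clusters

# Hypothetical behavior vectors (axis 0 ~ "shirts", axis 1 ~ "dresses").
vectors = {
    "mens dress shirts": [0.9, 0.1],
    "dress shirts for men": [0.88, 0.12],
    "dress shirt": [0.8, 0.2],
    "shirt dress": [0.1, 0.9],
}
clusters = equivalence_clusters(vectors)
```

With these toy inputs, the two men's dress shirt queries land in one cluster, while “dress shirt” and “shirt dress” share a canonical form but are split apart by their behavior vectors.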
Is Surface Query Similarity Necessary?
Relying on surface query similarity has pros and cons.
On one hand, it minimizes false positives, since two queries with high surface similarity usually express equivalent or near-equivalent intent.
On the other hand, it excludes a large number of query pairs that express equivalent intents through synonyms, redundant words, or complete rephrasings, e.g., “iphone headphones dongle” and “lightning to 3.5 mm”.
It’s possible to generalize surface query similarity to recover some of these pairs (e.g., through query expansion), but many will slip through.
A more aggressive alternative is to rely entirely on post-search behavior. But doing so places a heavy burden on precisely comparing post-search behavior so as not to equate queries whose intents are similar but not quite equivalent (e.g., “pants” and “dress pants”). Making these subtle distinctions requires a lot of care in working with the vectors that represent post-search behavior.
Using Query Equivalence to Improve Search
Once we are able to recognize equivalent queries, what can we do with this knowledge? As it turns out, a lot!
- Query rewriting. Given a pair or set of equivalent queries, we can rewrite all of them to a common query representation that ensures the optimum retrieval, ranking, and other search features.
- Analytics. Rather than analyzing the performance of each query separately, we can group queries into equivalence classes, thus aggregating behavioral signals that would otherwise be fragmented.
- Machine learning. The same aggregation that improves analytics also allows us to capture more robust signals to train machine-learned models.
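As a small illustration of the aggregation idea, the sketch below rolls per-query counts up to the level of an equivalence class. The query-to-class mapping, the field names, and the numbers are all hypothetical; the mapping is assumed to come from whatever equivalence method you have in place.

```python
from collections import defaultdict

def aggregate_by_class(query_stats, query_to_class):
    """Sum searches and clicks per equivalence class and compute a pooled click-through rate."""
    totals = defaultdict(lambda: {"searches": 0, "clicks": 0})
    for query, stats in query_stats.items():
        # Queries without a known class are treated as their own class.
        key = query_to_class.get(query, query)
        totals[key]["searches"] += stats["searches"]
        totals[key]["clicks"] += stats["clicks"]
    return {
        key: {**t, "ctr": t["clicks"] / t["searches"] if t["searches"] else 0.0}
        for key, t in totals.items()
    }

# Made-up per-query counts.
query_stats = {
    "mens shoes": {"searches": 120, "clicks": 30},
    "shoes for men": {"searches": 40, "clicks": 14},
    "pants": {"searches": 200, "clicks": 50},
}

# Hypothetical mapping from raw query to canonical class.
query_to_class = {"mens shoes": "men shoe", "shoes for men": "men shoe"}

report = aggregate_by_class(query_stats, query_to_class)
```

Here the two shoe queries are pooled into a single row with 160 searches and 44 clicks, so the less frequent phrasing contributes to the signal rather than being analyzed in isolation. The same pooled counts could serve as features or labels for a machine-learned model.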
In summary, recognizing query equivalence allows us to transform search queries into canonical representations of search intent, which in turn establish a more robust foundation for optimizing the search experience.