In e-commerce, solve 3/4 of search issues

We often rely on anecdotal evidence about search quality, which can blindside us to the real issues. For example, my aunt searched for “show me silk sarees for cousin’s marriage” and found “Men silk ties”. I do not dispute that this is a pathetic search result, but how many such queries do your users type in a month? Solving such queries would wow the user, but before solving them, check whether normal searches are working fine.

Many broken keywords get reported to the search team. Given all the challenges in search, the art is to cut through the noise and solve the issues that have the most impact. I believe that spending time on 3/4 of queries can put search in the 95% relevance range. Beyond these issues, it becomes easier to move into advanced search territory such as NLP, semantics and context-based search to chase 99+% accuracy targets.

The key to identifying these important 3/4 of search problems is to look at how users navigate and what they primarily type in the search box.

In case you do not have data on searched keywords, I would suggest fixing analytics immediately, as the first step, so that you can validate anecdotes.

Moving on to the main subject, there are some quick insights that can help you sift out the real issues.

Insight 1: Find the ratio of browse to search users on the site. For fashion e-commerce this can be an 80%-20% ratio. For horizontal players, this can be a 60%-40% ratio due to their huge catalog.

Also keep browse users, who navigate using links and banners, in mind when taking up any change. In many organisations, a common infrastructure serves navigation links, banners and typed searches. The issues that cut across all three discovery paradigms are the most important; beyond those, pick the issues that affect the largest number of users based on their behaviour.
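The browse vs search split from Insight 1 can be measured from any event log that records how a user reached a product listing. Here is a minimal sketch in Python. The log format, the source labels ("search", "nav_link", "banner") and the rule that one typed query makes someone a search user are all assumptions for illustration, so adapt them to your own analytics.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, discovery_source) pairs.
# The source labels are assumed names, not from any real analytics tool.
events = [
    ("u1", "nav_link"),
    ("u1", "banner"),
    ("u2", "search"),
    ("u3", "nav_link"),
    ("u4", "banner"),
    ("u5", "search"),
    ("u5", "nav_link"),
]

def browse_search_split(events):
    """Return (browse_share, search_share) over distinct users.

    Assumption: a user who typed at least one query is a search user;
    everyone else is a browse user.
    """
    sources_by_user = defaultdict(set)
    for user_id, source in events:
        sources_by_user[user_id].add(source)

    total = len(sources_by_user)
    if total == 0:
        return 0.0, 0.0

    searchers = sum(1 for sources in sources_by_user.values() if "search" in sources)
    return (total - searchers) / total, searchers / total

browse_share, search_share = browse_search_split(events)
print(f"browse: {browse_share:.0%}, search: {search_share:.0%}")
```

You may prefer to split by sessions or by page views instead of by users. The right unit depends on how your team reports impact.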

Insight 2: About 90% of search queries are unigrams (single words) or bigrams (two words). Anecdotal evidence may not follow this pattern.

Why would that be? Most first-time searchers use Google, which has been a text-based interface. Further, typing a long query takes a lot more effort. Hence, when using the text box, users are trained to type only the most important words that would yield relevant results. Instead of saying “I am looking for Levis jeans”, which is what they would ask in an offline store, users have been trained to type “Levis Jeans”. Essentially, Google and its users have made your job in search easier. That said, I believe the conversational style is suited to voice-based interfaces (a different beast altogether).
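To check Insight 2 against your own data, count how much search volume falls into each query length. Here is a rough sketch in Python. The sample queries and counts are invented for illustration, and splitting on whitespace is a simplifying assumption that ignores spelling variants and stop words.

```python
from collections import Counter

# Hypothetical search log: (query, number_of_searches). Counts are made up.
query_counts = [
    ("jeans", 500),
    ("levis jeans", 300),
    ("red dress", 150),
    ("silk sarees for wedding", 40),
    ("show me silk sarees for cousin's marriage", 10),
]

def length_distribution(query_counts):
    """Return {number_of_words: share_of_search_volume}."""
    volume_by_length = Counter()
    for query, count in query_counts:
        n_words = len(query.split())  # whitespace tokenisation (assumption)
        volume_by_length[n_words] += count

    total = sum(volume_by_length.values())
    if total == 0:
        return {}
    return {n: volume / total for n, volume in sorted(volume_by_length.items())}

dist = length_distribution(query_counts)
# Share of volume from single-word and two-word queries.
short_share = dist.get(1, 0.0) + dist.get(2, 0.0)
print(f"1-2 word queries: {short_share:.0%} of search volume")
```

Running the same count on distinct queries, without weighting by volume, usually gives a different picture, because long queries are many in number but individually rare.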

Insight 3: In fashion, about 75% of keywords are directly based on Brand, Category and Gender (B-C-G) combinations. However, by volume, around 95% of e-commerce searches map directly to catalog names.

You must find an equivalent number for your site. As a preliminary step, take the help of category managers to build a glossary of relevant keywords and synonyms for all the brands and categories. Also, before launching any new category or brand, review this glossary and map all important terms in search. From a system standpoint, build classifiers that identify all the store entities in the site glossary. This would ensure that 95% of users find relevant results.
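The glossary idea above can be sketched in a few lines of Python. This is a plain dictionary lookup standing in for a real classifier. The glossary entries, the sample queries and their counts are all hypothetical, and the lookup handles only single-word terms, so multi-word brand names would need extra work.

```python
# Hypothetical glossary of the kind category managers might maintain:
# surface term -> canonical catalog name, grouped by entity type.
GLOSSARY = {
    "brand": {"levis": "Levis", "levi's": "Levis", "nike": "Nike"},
    "category": {"jeans": "Jeans", "denims": "Jeans",
                 "shoes": "Shoes", "sneakers": "Shoes"},
    "gender": {"men": "Men", "mens": "Men", "women": "Women", "womens": "Women"},
}

def tag_query(query):
    """Return (entities, unknown_words) for one query."""
    entities, unknown = {}, []
    for word in query.lower().split():
        for entity_type, terms in GLOSSARY.items():
            if word in terms:
                entities[entity_type] = terms[word]
                break
        else:
            # No glossary group contained this word.
            unknown.append(word)
    return entities, unknown

def coverage(query_counts):
    """Return (share of distinct keywords, share of volume) fully covered."""
    covered_keywords = covered_volume = total_volume = 0
    for query, count in query_counts:
        entities, unknown = tag_query(query)
        total_volume += count
        if entities and not unknown:
            covered_keywords += 1
            covered_volume += count
    if not query_counts or total_volume == 0:
        return 0.0, 0.0
    return covered_keywords / len(query_counts), covered_volume / total_volume

# Made-up sample log: (query, number_of_searches).
sample = [
    ("levis jeans", 300),
    ("mens sneakers", 200),
    ("nike shoes women", 450),
    ("party wear under 999", 50),
]
keyword_share, volume_share = coverage(sample)
print(tag_query("Nike shoes women"))
print(f"keywords covered: {keyword_share:.0%}, volume covered: {volume_share:.0%}")
```

The queries that come back with unknown words are a useful work list: they are either missing glossary terms or the longer queries that need the advanced techniques mentioned earlier.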

We should develop similar insights for any website and concentrate on the issues that affect the largest number of user searches.

*All the data is based on experience and not on actual numbers seen. These can vary by site, duration, season and technology trends.*

P.S. I will be writing weekly product bytes on discovery and notifications. If there is anything specific you would like to read, please mention it in the comments. Also, please share your feedback below.