Reviving an e-commerce search engine using Elasticsearch

Published in Quantyca · 10 min read · Jan 20, 2020

Appropriately designing the search function for an e-commerce site is essential for correctly retrieving items a user is looking for. Nowadays, Elasticsearch is one of the most interesting tools for implementing search functions. In this blog post, I will describe how we refactored and significantly improved a poorly performing search engine using Elasticsearch.

Introduction

Searching is an everyday task we all perform while surfing the Internet. Whenever we type a keyword into a search bar, a set of results is presented. Generally, result items are sorted by relevance with respect to our keyword: the most relevant items are at the top of the result set, while less relevant items are in its tail. This minimizes the time and effort needed to find what we are looking for.

The software system that retrieves items, computes the relevance function and produces the final result set is called a search engine. Depending on the context, the results returned by a search engine vary in type: they could be textual information, images, videos, products of an e-commerce site, and so on.

After web search platforms, e-commerce sites are the area where search engines are used the most. Having the most relevant items returned at the top of the list is fundamental for e-commerce, especially in terms of potential sales. Appropriately designing the search function is the key to meeting this business target.

Problem outline

Quantyca was engaged to refactor the search engine of an Italian retail company’s e-commerce site. To address the request, we put together a team in charge of exploring the AS-IS system and proposing a solution to our customers.

The starting point was the analysis of the AS-IS system. We examined the performance of the original search engine by running some searches on the site. Based on analytics data collected in the preceding six months, we mainly searched for the products users clicked the most. Throughout this process, we spotted several issues (pain points), which we summed up in the four categories listed below.

🔀 Consistency

Given a search keyword, the engine returned products mismatched with the keyword.

Searching for “clementines”, on top of the result set we got: sandwiches, clementine orange juice, biscuits, milk chocolate.

Clementines are not present at all!

🔽 Sorting by relevance

Given a search keyword, relevant results were not returned on top of the result set.

Searching for “sliced bread”, on top of the result set we got: biscuits, breadsticks, whipped cream, sliced bread.

Sliced bread is present, but it’s only in the fourth position!

#️⃣ Cardinality

Given a search keyword, irrelevant results were returned in the tail of the result set.

Searching for “distilled water”, we got: mineral water, distilled water, …, desserts, ice creams, convenience food.

Searching for a very specific item, such as distilled water, we got a result set containing more than 4000 products!

🔤 Attributes significance

Given a search keyword, the engine retrieved irrelevant products because there was a match on unimportant attributes.

Searching for “orange juice”, we got the following products: orange juice, red-orange juice, bitter orange juice, washing machine additive.

WAT??? Why was the washing machine additive returned by the search?

The answer lies in the concept of indexing. When designing a search engine, an index has to be specified. An index is a set of items’ attributes on which the search algorithm seeks a match with the search keyword.

Back to the example, there was an indexed attribute of the washing machine additive, specifically named Marketing statement of the product, containing the following text:

… Captures and eliminates more than 100 oxidizable stains: vinegar, pineapple, watermelon, orange juice, beer, coffee, cola, …

The keyword “orange juice” is matched on an attribute that, according to the logic that a user generally applies in a search task, we consider absolutely irrelevant. In fact, a user searching for orange juice is most likely searching for a drink, not for an item that cleans orange juice stains!
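The effect can be reproduced with a toy matcher. The sketch below is not how Elasticsearch works internally: it is a minimal illustration in plain Python, with hypothetical field names (`name`, `marketing_statement`) and a naive rule (all keyword tokens must appear in one indexed field), of how the choice of indexed attributes decides which products can match.

```python
import re

# Toy catalogue. Field names are hypothetical; the marketing text is taken
# from the washing machine additive example above.
products = [
    {"name": "orange juice", "marketing_statement": ""},
    {
        "name": "washing machine additive",
        "marketing_statement": (
            "Captures and eliminates more than 100 oxidizable stains: vinegar, "
            "pineapple, watermelon, orange juice, beer, coffee, cola"
        ),
    },
]


def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(keyword, catalogue, indexed_fields):
    """Return the names of products where one indexed field contains every keyword token."""
    wanted = set(tokens(keyword))
    hits = []
    for product in catalogue:
        for field in indexed_fields:
            if wanted <= set(tokens(product.get(field, ""))):
                hits.append(product["name"])
                break
    return hits


with_marketing = search("orange juice", products, ["name", "marketing_statement"])
name_only = search("orange juice", products, ["name"])
print(with_marketing)
print(name_only)
```

With the marketing statement indexed, the additive is returned next to the drink; with only the name indexed, it is not.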

Objectives

In order to present a new solution to our customers, we built a Proof of Concept (POC). Our objective was to address all the pain points and to produce a search function able to return better results in any situation. In particular, we chose Elasticsearch and focused on the following functionalities:

indexing;
matching, filtering, sorting;
boosting;
custom score computation.

In the rest of this post, I will first analyze the Data Model we designed, then the Query Model we used, and finally I’ll briefly show the results we achieved.

Data Model

The first step was the definition of the Data Model to describe the product entity. We selected a set of attributes useful for the search task and created the Elasticsearch index.

The Data Model we proposed was deliberately minimal.

In fact, we picked the following product attributes:

product name
product class 1 (high-level class)
product class 2 (mid-level class)
product class 3 (low-level class)
brand

The aim of such a simple Data Model is to stress that, even with a minimum set of attributes, it is possible to achieve excellent results.
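To make the five attributes concrete, here is a minimal sketch of what an index mapping for them could look like, written as a Python dictionary. The field names (`name`, `class_1`, `class_2`, `class_3`, `brand`) and the choice of the `text` type for every field are assumptions for illustration; the post does not show the actual mapping.

```python
import json

# Hypothetical mapping for the five product attributes listed above.
# A body like this is what the Elasticsearch create index API accepts.
index_body = {
    "mappings": {
        "properties": {
            "name": {"type": "text"},     # product name
            "class_1": {"type": "text"},  # high-level class
            "class_2": {"type": "text"},  # mid-level class
            "class_3": {"type": "text"},  # low-level class
            "brand": {"type": "text"},
        }
    }
}

print(json.dumps(index_body, indent=2))
```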

The second step in setting up the Data Model was the definition of the analysis criteria for each attribute. In Elasticsearch, analysis is the process of converting text into tokens, which are added to the index so that they can be matched at search time. The analysis is performed by an Elasticsearch component called an analyzer.

We decided to add a custom analyzer to the product name, which is the most important attribute during the search phase. In particular, we built it on the Edge NGram Tokenizer. This tokenizer indexes the leading substrings (prefixes) of each word in the product name, so a keyword matches even when it covers only the beginning of a word (e.g. “clementin” matches both “clementine” and “clementines”). It is extremely useful for matching products whose name differs slightly from the search keyword: singular vs. plural, masculine vs. feminine, spelling errors near the end of a word.
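As a rough illustration of what edge n-grams are, the sketch below generates them in plain Python and shows analysis settings that declare an `edge_ngram` tokenizer inside a custom analyzer. The names `name_edge_ngram` and `name_analyzer` and the `min_gram`/`max_gram` values are assumptions, not the settings used in the POC. Note that edge n-grams are anchored at the start of each token, so what they enable is prefix matching.

```python
def edge_ngrams(token, min_gram=3, max_gram=10):
    """Return the prefixes of `token` with length between min_gram and max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# Hypothetical analysis settings using the edge_ngram tokenizer.
analysis_settings = {
    "analysis": {
        "tokenizer": {
            "name_edge_ngram": {
                "type": "edge_ngram",
                "min_gram": 3,   # assumed value
                "max_gram": 10,  # assumed value
                "token_chars": ["letter", "digit"],
            }
        },
        "analyzer": {
            "name_analyzer": {
                "type": "custom",
                "tokenizer": "name_edge_ngram",
                "filter": ["lowercase"],
            }
        },
    }
}

singular = edge_ngrams("clementine")
plural = edge_ngrams("clementines")
shared = [gram for gram in singular if gram in plural]
print(shared)
```

Singular and plural forms share most of their prefixes, which is why a keyword in one form can still reach a product named with the other.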

Having defined the specifics of the Data Model, we can now explore the Query Model we created to tackle the product search problem.

Query Model

The goal of the Query Model design phase was to draw up a generic search template that supports different types of searches and addresses the pain points listed at the beginning. Within this framework, we focused on matching criteria, boosting, score normalization and filtering.

During this step, we used an incremental approach: we started by defining a simple match query, then enriched it through a fine-tuning process. Here are all the steps we followed.

✅ Matching

We designed the query to seek an exact match on all the indexed attributes. As just mentioned, the only field selected for a partial match was the product name.

The matching query we applied was a Common Terms Query. In Elasticsearch, a Common Terms Query automatically gives more weight to the terms that are less frequent in the dataset and less weight to the more frequent ones. This is a great alternative to a stop-words list, which is hard to maintain.

Then, we used the Elasticsearch dis_max query to compute the final score of each retrieved product. If an item matches multiple clauses, the dis_max query assigns it the highest relevance score among the matching clauses. During the fine-tuning process, we found that this was the best solution for score computation.
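The following sketch shows the shape of a dis_max query and mimics its scoring rule in plain Python. It is an illustration under assumptions: the field names are hypothetical, and plain `match` clauses stand in for the Common Terms Query, which Elasticsearch has deprecated since version 7.3 in favor of `match`.

```python
def build_dis_max_query(keyword, fields=("name", "class_1", "class_2", "class_3", "brand")):
    """Build a search body with one match clause per (hypothetical) field."""
    return {
        "query": {
            "dis_max": {
                "queries": [{"match": {field: {"query": keyword}}} for field in fields],
                "tie_breaker": 0.0,  # 0.0 means: keep only the best clause score
            }
        }
    }


def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the scores of the other matching clauses."""
    if not clause_scores:
        return 0.0
    best = max(clause_scores)
    return best + tie_breaker * (sum(clause_scores) - best)


body = build_dis_max_query("distilled water")
print(len(body["query"]["dis_max"]["queries"]))
print(dis_max_score([2.0, 1.0, 0.5]))
print(dis_max_score([2.0, 1.0, 0.5], tie_breaker=0.5))
```

With the default `tie_breaker` of 0.0, a product matching on several attributes is scored by its single best match rather than by the sum of all matches, which keeps long, keyword-rich records from crowding out precise ones.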

🚀 Boosting

In the boosting scenario we explored two types of boosting:

field-level boosting: used for weighing the single match on one attribute. The boost is applied at search time;
global-level boosting: used for customizing the score associated with each item and for changing the order of the products in the result set. This boost is applied after the score is computed and makes it possible to boost only a subset of items.

Regarding field-level boosting, we applied the highest boost factor to the exact match on the product name. Then, we boosted the match on the product name tokens, on product class 2 and on product class 3 using factors in the interval (1.0, 2.0). For the remaining attributes, we did not apply any boost factor.
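Field-level boosts of this kind can be expressed with the `boost` parameter of a `match` clause. In the sketch below, every number is an assumed placeholder that only respects the ordering described above, and `name.ngram` is a hypothetical sub-field holding the edge n-gram tokens of the product name.

```python
# Assumed boost factors: only their relative order follows the text.
FIELD_BOOSTS = {
    "name": 3.0,        # exact match on the product name: highest boost
    "name.ngram": 1.5,  # match on product name tokens (hypothetical sub-field)
    "class_2": 1.2,
    "class_3": 1.4,
    "class_1": 1.0,     # remaining attributes: no extra boost
    "brand": 1.0,
}


def boosted_clauses(keyword, boosts=None):
    """Return one match clause per field, each carrying its own boost factor."""
    boosts = FIELD_BOOSTS if boosts is None else boosts
    return [
        {"match": {field: {"query": keyword, "boost": boost}}}
        for field, boost in boosts.items()
    ]


clauses = boosted_clauses("pan bauletto")
print(clauses[0])
```

A list like this could serve as the `queries` of a dis_max query, so that the winning clause already reflects how important its attribute is.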

Concerning the global-level boosting, we applied boost factors slightly higher than 1.0 for the following matching criteria:

sequential match on the product name using the match_phrase query; for example: when searching for “distilled water”, items having “distilled water” in their product name are boosted over those having just “water”;
exact match on the product class names using the multi_match query; for example: when searching for “tomatoes”, which is the name of a product class 3, items belonging to that product class are boosted;
exact match on the brand using a plain match query; for example: when searching for “pasta Barilla”, items with brand “Barilla” are returned at the top of the result set, or at least alongside pasta of other brands.

The last point of this list was a key feature in our work: one of the requirements we had from our customers was to enable boosting of the home brand, so as to show their own products at the top of the result set. This functionality enables the application of marketing and business rules.
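One possible way to apply small multiplicative boosts after the base score is computed is a `function_score` query with filtered `weight` functions. The post does not say which construct the POC used, so the sketch below is an assumption throughout: the field names and the weights (1.2, 1.1, 1.1) are placeholders.

```python
def with_global_boosts(base_query, keyword):
    """Wrap base_query so that items matching a filter get a small extra multiplier."""
    class_fields = ["class_1", "class_2", "class_3"]
    return {
        "function_score": {
            "query": base_query,
            "functions": [
                # Sequential (phrase) match on the product name.
                {"filter": {"match_phrase": {"name": keyword}}, "weight": 1.2},
                # Match on any product class name.
                {"filter": {"multi_match": {"query": keyword, "fields": class_fields}}, "weight": 1.1},
                # Match on the brand.
                {"filter": {"match": {"brand": keyword}}, "weight": 1.1},
            ],
            "score_mode": "multiply",  # combine the weights of all matching functions
            "boost_mode": "multiply",  # then multiply the base score by the result
        }
    }


def boosted_score(base_score, matched_weights):
    """Plain-Python view of the multiply/multiply combination above."""
    score = base_score
    for weight in matched_weights:
        score *= weight
    return score


wrapped = with_global_boosts({"match": {"name": "pasta Barilla"}}, "pasta Barilla")
print(len(wrapped["function_score"]["functions"]))
print(boosted_score(2.0, [1.2, 1.1]))
```

A home-brand rule would fit the same pattern: one more function whose filter selects the retailer's own brand.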

📊 Score normalization

With the purpose of limiting the noise and the cardinality of the result set, we proposed an approach for score normalization. Since the score computed by Elasticsearch has a mathematical domain of (0, +∞), it is useful to project it into a limited domain, such as [0, 1], so that we can apply a threshold to retain only the most relevant items.

As far as our POC is concerned, we used the Elasticsearch script_score function to normalize the score and to put a threshold on it. In particular, we set the threshold so as to keep only the top 3% of the result set. This approach yields a result set that is narrower than that of the original solution and free from irrelevant items.
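The idea of projecting an unbounded score into a bounded interval and then cutting on it can be sketched as follows. The normalization formula (score divided by score plus a pivot), the fixed threshold of 0.6 and the sample scores are all assumptions made for illustration; the post does not give the formula or how the 3% cut was computed. The query body uses the `script_score` query with its `min_score` parameter.

```python
def normalize(score, pivot=1.0):
    """Map a non-negative score into [0, 1): 0 stays 0, large scores approach 1."""
    return score / (score + pivot)


def keep_relevant(hits, threshold, pivot=1.0):
    """Keep only the (name, score) pairs whose normalized score reaches the threshold."""
    return [(name, score) for name, score in hits if normalize(score, pivot) >= threshold]


def normalized_query(base_query, threshold=0.6):
    """Search body that normalizes the score and drops hits below the threshold."""
    return {
        "query": {
            "script_score": {
                "query": base_query,
                "script": {"source": "_score / (_score + 1)"},
                "min_score": threshold,
            }
        }
    }


# Made-up scores, for illustration only.
hits = [("distilled water", 9.0), ("mineral water", 1.0), ("ice cream", 0.25)]
kept = keep_relevant(hits, threshold=0.6)
print(kept)
```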

✂️ Filtering

In order to show the performance of the proposed solution to our customers, we built an interface using React. The front-end was extremely simple: a classic search bar for typing the search keyword and a table showing the result set.

The main features of the front-end application were suggestions on the search bar and product class filtering. Suggestions were triggered and shown on the screen while typing a keyword into the search bar, enabling the search for items matching the text typed so far. Product class filtering, instead, was implemented with suggestions like “search Water in Water, drinks, wine, alcohol”, which restrict the search for “water” to the suggested product class.

From an Elasticsearch point of view, product class filtering was implemented with the post_filter parameter of the search request, which retains only the products matching the product class name specified in the clicked suggestion.
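A search body using `post_filter` could look like the sketch below. The field name `class_1` and the use of `match_phrase` to compare the class name are assumptions. In Elasticsearch, `post_filter` is a top-level part of the search request and is applied to the hits after the query has run.

```python
def search_in_class(keyword, class_name, class_field="class_1"):
    """Search for keyword, then keep only hits belonging to the chosen product class."""
    return {
        "query": {"match": {"name": keyword}},
        "post_filter": {"match_phrase": {class_field: class_name}},
    }


# Mirrors the suggestion "search Water in Water, drinks, wine, alcohol".
body = search_in_class("water", "Water, drinks, wine, alcohol")
print(body["post_filter"])
```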

This concludes the theoretical part. We are now ready to check out the results achieved with Elasticsearch!

Results

In order to evaluate our search algorithm, we compared the original output, the output returned by our POC and the desired output, in terms of:

results on top (relevance);
results in the tail (efficacy);
result set cardinality (efficacy & filtering).
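The first criterion can be made concrete with a very simple measure: the position of the first relevant item in the result list. This helper is only an illustration and not the evaluation procedure we followed; the result lists are the original outputs quoted in the pain points section.

```python
def first_relevant_rank(results, relevant):
    """Return the 1-based position of the first relevant item, or None if there is none."""
    for position, item in enumerate(results, start=1):
        if item in relevant:
            return position
    return None


# Original engine outputs quoted earlier in the post.
original_sliced_bread = ["biscuits", "breadsticks", "whipped cream", "sliced bread"]
original_clementines = ["sandwiches", "clementine orange juice", "biscuits", "milk chocolate"]

print(first_relevant_rank(original_sliced_bread, {"sliced bread"}))
print(first_relevant_rank(original_clementines, {"clementines"}))
```

A rank of 1 is the target; a missing rank means the wanted product never appears among the listed results.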

For simplicity, I will discuss the results we achieved for the four search cases we listed at the beginning of the article. Here, I want to stress how a simple solution performs better than the original, especially by fixing all the issues we underlined before.

🍊 “Clementines”

original output (top): sandwiches, clementine orange juice, biscuits, milk chocolate
original output (tail): N/A
original result set cardinality: 10
POC output (top): clementines
POC output (tail): clementines, clementine sauce
POC result set cardinality: 52

🍞 “Sliced bread”

original output (top): biscuits (pan di stelle), sliced bread (at the end of the first page), sandwiches (the whole second page)
original output (tail): meat, ice creams, frozen foods
original result set cardinality: 786
POC output (top): sliced bread
POC output (tail): N/A
POC result set cardinality: 7

In Italy, sliced bread is known as “pan bauletto”. The original search algorithm matched biscuits named “pan di stelle” on the word “pan”.

💧 “Distilled water”

original output (top): mineral water, distilled water (third item)
original output (tail): sweets, desserts, ice creams, convenience food
original result set cardinality: 4321
POC output (top): distilled water for ironing
POC output (tail): N/A
POC result set cardinality: 3

🍹 “Orange juice”

original output (top): orange juice of different brands, Campari soda, washing machine additive
original output (tail): N/A
original result set cardinality: 19
POC output (top): orange juice of different brands
POC output (tail): orange juice of different brands
POC result set cardinality: 42

These examples show how Elasticsearch can be readily used to solve an e-commerce product search problem.

However, there’s still a lot of work to be done to further improve the efficacy of the search function.

Demo

It’s demo time!!! 🎮

Below there’s a GIF showing the final result and the performance of our search engine. For the sake of the article, I chose to search for the four products described in the previous examples.

Since our search engine works with Italian words, it is useful to keep in mind these English to Italian translations:

clementines → clementine
sliced bread → pan bauletto
distilled water → acqua distillata
orange juice → aranciata

Finally, note how the products having brand “XYZ” (i.e. the home brand) are returned on top of the result set.

Enjoy!

Final thoughts

I will conclude my blog post with a recap of the benefits of our solution.

Correct indexing and a minimal set of attributes both guarantee that irrelevant items are not returned, ensure more control on algorithm functionalities and require less effort during the maintenance phase.

A consistent result set guarantees easy identification and, consequently, the potential purchase of the retrieved product.

A result set sorted by relevance ensures minimum effort and minimum time to find the product you are looking for, and limits the scrolling between results pages.

A result set with minimal cardinality guarantees that only relevant items are retrieved, reducing the entropy of the results.

If you liked this post, clap for it, follow Quantyca on LinkedIn, and read other interesting articles on our Medium profile!