How did we grow up without ‘googling’ everything?

Gwendal Yviquel
Published in ADEO Tech Blog
Apr 15, 2021 · 12 min read

1. Why have search engines become mandatory for an ecommerce website?

As always, the technical focus of ecommerce websites moved with the business. First, the focus was on site navigation: making it clear was basically the first brick. Then the focus moved to search engines, to ease user-driven journeys, and to a new tool, product recommendations. Today, the focus is on three tools: search engines, to translate users’ needs into products or content and to build a custom journey; product recommendations, for navigation from product page to product page; and high-quality product pages, to handle the huge number of direct arrivals from Google search, where the product page needs to act as the home page of the user journey.

In the next lines, I will try to dive into the context and requirements of a search engine for an ecommerce website.

2. Wait, what is a search engine by the book?

Let’s check the Wikipedia knowledge base.

‘A search engine is a software system that is designed to carry out web searches (Internet searches), which means to search the World Wide Web in a systematic way for particular information specified in a textual web search query.[…] The information may be a mix of links to web pages, images, videos, infographics, articles, research papers, and other types of files. Some search engines also mine data available in databases or open directories.[…]’

Link: https://en.wikipedia.org/wiki/Search_engine

That seems clear: the goal of a search engine is to return all reachable content on the web that semantically matches the user’s text input.

3. From Middle School to Prep School: a growing relationship

Now let me think about the search engine use cases I grew up with… (And feel free to give me yours :))

Early Middle School Memories

As far as I can remember, my first need for a search engine was school homework about famous authors. I had to write small summaries (I mean really small, it was early middle school) about Victor Hugo, Albert Camus or Émile Zola. I would go on my parents’ computer, open the internet provider’s homepage, and type in my list of authors one by one. I was lucky enough to find a lot about these authors within the first 3 results, most of the time including Wikipedia. My summaries were almost already written.

From today’s point of view, it’s interesting that I had no clue what Google was. This search engine didn’t show me any books to buy, and I was sure that my internet provider was giving me its homemade results.

Middle school: preparing for my first exams

Last year of middle school, History exam… Not really my jam. I was not very effective in my preparation, wasting time trying to download the “most effective” music to study with, or looking for movie tickets to be ready once my preparation was done… (You feel me, right?) By then I was already using Google, since it was way quicker than anything else. I could find everything about the historical battles I wanted, I could find cinema tickets but not always in my region, and I had trouble finding mp3 songs to download (to work efficiently, once again), so in the end I was just listening to some CDs in my room.
Well, I was not so efficient, but at least I got to practice Google’s limits and rules. We can see that history and facts, with the help of well-known reference websites, are really easy to find, and most of the time these websites are top quality. When I was looking for cinema tickets, I had trouble finding information about my local cinema without further action on my side. Finally, when looking for mp3 music, it was pretty hard to find what I wanted, often with a low number of results or even no results. Maybe Google had some legal concerns?

Accepted in Prep School, time to chill

High school was almost over, and I had been accepted into the prep school I applied for. It was time to optimize my free time with the best video games and TV shows. Here I was again on Google, looking for Resistance: Fall of Man, Portal, The Orange Box, etc. I immediately got the closest retailers selling new or used games, as well as some new websites selling video games online, new products only but at an interesting price, with another search engine within the website to find my games. Everything was booked. Then it was time to look for TV show DVDs: Lost, Battlestar Galactica or Dexter. Same story with DVDs: I had every result I needed in Google, with the closest retailers selling my DVDs and online websites selling them after a few subsequent on-site searches, both new products only.

It was 10 years ago… Time flies… Funny thing: the video games I was looking for had common names, yet Google returned video games and nothing else in the top results. Much the same with Lost: only TV show results. Dexter? Not a single stat about trendy baby names, just the TV show. Maybe I was like most people searching for it at that time. One thing was for sure: I found results immediately for what I was looking for, without any effort or wrong results.

4. Let’s jump into Google’s Shoes

At this point we can see how quickly search users’ needs and the quality of the search environment evolved. We can identify 3 business cases around search engines.

Search Engines within internet providers’ offers

I started searching the internet through my internet provider, which was more or less doing the job, but with a poor user interface and poorly optimized loading of search results.

I did not know it at the time, but my internet provider had no search engine business. It was just paying for a third-party solution that returned results within its own environment. Here we find a first clue about the strange relationship between the quality of the results, which was correct semantically speaking, and the overall user experience.

A search engine was not in their business plan; it answered an identified user need, and was key to leading users to their homepage and driving traffic.

Nowadays, search engine providers such as Google, Bing or DuckDuckGo are more famous than internet providers’ search features, or than internet providers overall. As a consequence, this business case is used less and less, but it is still part of some search engines’ side business model (Bing), which simply route their solution into another environment in exchange for service billing or a commission.

Search Engines directly answering end users

Here come Google and others. Most of these actors were behind the internet providers’ previous solutions, but no longer needed them as a door to a wider audience.

From the few examples above, end-user search engines were impressive early on at finding what we were looking for, and now search engines are just considered basic. You don’t know something? Just google it. It is pretty interesting to see quality and complexity keep increasing, with no more “no results” pages and with results classified by content and audience needs, and yet all of it is now considered basic.

Search Engines on e-commerce websites (or any website, by the way)

Last business case: the website search engine. Early on, Google was not returning product pages as top results, but websites selling the products instead, so another search engine was needed within the user journey.

Plugging in a general-purpose search engine was not the right answer. If it were, Google would already guide you to your products. Indeed, most e-commerce websites have their own specifics and lexical field. At Leroy Merlin, we need to give great results for search queries on product characteristics, EANs, brands, dimensions and materials. A custom implementation of a search engine is therefore needed to fit user needs, and to avoid disappointing users with such a “basic” tool.

In that specific field of application some new competitors appeared, and some of them were based on a new trend: open source solutions.

5. Google is huge, ok, what’s the point?

Historical vision and actors

These moments of life may look far from the search tech environment, but I felt it was relevant and fun to realize that we grew up alongside the field we work in today.

To end this part, and to understand how important Google and search engines are to our daily lives, here are a few numbers.

In 2021 so far, Google represents more than 90% of search queries, with Microsoft “just” behind with Bing at 6%. Fair, but 90% of what? 90% of global search queries represents more than 2 trillion searches a year… 2 trillion is a lot, but how much is a lot? First, you write it like this: 2 000 000 000 000. Per day, that is roughly 5.5 billion searches (2 trillion divided by 365). There are about 7.5 billion of us on earth (more or less); with only one query per person, it would mean that more than 70% of earth’s population uses Google every day.

Leroy Merlin, where are we?

At this point, as Leroy Merlin, a century-old retail company, why should we care about how connected Google is to our lives?

Google is so deeply rooted in our reflexes that search behavior is applied to most fields of activity. As a retailer, how could we not deliver an equivalent “search” experience on our e-commerce website, when Google’s search quality and performance are the baseline for users? Long story short, we cannot. Moreover, this trend is amplified by mobile and voice assistant usage.

Of course we are not Google, and Google does not sell its search engine as is. So we had to optimize our chosen solution and build the best possible search experience for every user on top of it.

Optimize overall performance

First, our internal search engine basically needs to return the right results and, as much as possible, avoid zero-result pages.

Fine, that’s our goal, but how can we do it? A search engine is pretty simple to understand: it keeps in memory every important piece of information/data that a user query may look for, and quickly returns matching products or content.

I may sound dumb here: “just index everything, put everything in memory, duh!”.

It is not possible to put all of our content into the search engine’s memory. Even assuming it were feasible, putting too much in it, like every product characteristic, would at some point hurt result retrieval latency, even before considering traffic growth or query frequency. Here is the first Google effect: you can’t afford high latency when returning search results (and you don’t have Google’s servers :)).
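
To make the “keep it in memory” idea a bit more concrete, here is a minimal sketch of an in-memory inverted index in Python. The mini catalog and the `build_index` / `search` helpers are made up for illustration, and a real engine does far more (text analysis, scoring, distribution), so take it as a toy under those assumptions.

```python
from collections import defaultdict

# Hypothetical mini catalog: product id -> indexed text (illustration only).
PRODUCTS = {
    1: "small red shovel",
    2: "red hoodie with small shovel on it",
    3: "garden rake",
}

def build_index(products):
    """Map each lowercase token to the set of product ids containing it."""
    index = defaultdict(set)
    for product_id, text in products.items():
        for token in text.lower().split():
            index[token].add(product_id)
    return index

def search(index, query):
    """Return the ids of products containing every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

index = build_index(PRODUCTS)
print(search(index, "red shovel"))  # products 1 and 2
```

Every extra field pushed into such a structure makes it bigger and slower to intersect, which is the latency trade-off in a nutshell.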

Still, that is not so complicated: just put the search engine to work with different sets of indexed content, compare the results and choose the best setup for live production? OK so far, but users do not type exact characteristics. Sometimes different words are used for the same notion, and the technical product characteristic may not be the word users use. This leads us to implement synonyms at different levels of language (slang, etc.) for each indexed content.
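
As a rough illustration of the synonym idea, the sketch below rewrites a user query into the wording used by the indexed content. The dictionary entries and the `normalize_query` helper are hypothetical examples, not our production dictionaries.

```python
# Hypothetical synonym dictionary: user wording -> wording used in the index.
SYNONYMS = {
    "spade": "shovel",
    "sweater": "hoodie",
    "crimson": "red",
}

def normalize_query(query, synonyms=SYNONYMS):
    """Replace each known synonym with the term used in the indexed content."""
    return " ".join(synonyms.get(token, token) for token in query.lower().split())

print(normalize_query("Small crimson spade"))  # small red shovel
```

Whether synonyms are applied to the query, to the indexed content, or to both is a design choice with its own latency cost.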

Fair enough, not so easy but doable. Once synonym dictionaries are built, here we go again on the latency benchmark. Now we are done right? Dont u tihnk? Oh I wanted to write “don’t you think”, just a typo…

Here comes the matching part. Even with our indexed synonym dictionary, users make typos and invent their own slang or shortcuts every day, and dictionaries cannot predict that. There are different solutions. We can think about a fuzziness parameter that allows matching on the early part of words: “thin” and “think” would then be a match. Computing an edit distance can be a solution too, Levenshtein for example, where the distance is the number of character changes needed to turn one word into the target word. Both solutions have drawbacks on performance and result quality, and these drawbacks can differ depending on the word searched. You guessed it? Latency benchmark!
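
For the curious, here is a small sketch of the Levenshtein distance in plain Python, plus a hypothetical `fuzzy_match` helper built on top of it. The vocabulary and the `max_distance` threshold are assumptions for the example; real search engines use much more optimized structures than this pairwise loop.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]

def fuzzy_match(word, vocabulary, max_distance=2):
    """Return the vocabulary words within max_distance edits of word, closest first."""
    scored = sorted((levenshtein(word, candidate), candidate) for candidate in vocabulary)
    return [candidate for distance, candidate in scored if distance <= max_distance]

print(levenshtein("tihnk", "think"))                     # 2
print(fuzzy_match("tihnk", ["think", "thin", "shovel"]))  # ['think']
```

Note that a swapped pair of letters counts as two edits here, and that comparing the query to every word of the vocabulary is exactly the kind of cost that shows up in the latency benchmark.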

We are done! Right? Right?

At this point we are somewhat done with the indexing and latency part. But we kind of forgot something? Maybe search result quality from a user point of view?

Let’s take an example: “small red shovel” and “red hoodie with small shovel on it” both match the query “red small shovel” in the same way. That is where we need to add rules to weight indexed elements, query length and word order within the query differently.
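
Here is a toy scoring function showing how such rules can separate the two products. It is only a sketch: the `relevance` function, its 0.5 / 0.25 / 0.25 weights and the use of word proximity as a stand-in for word order are arbitrary assumptions for the example, not the rules we run in production.

```python
def relevance(query, title):
    """Toy relevance: query term coverage, adjusted by title length and word proximity."""
    q_tokens = list(dict.fromkeys(query.lower().split()))  # unique terms, order kept
    t_tokens = title.lower().split()
    if not q_tokens or not t_tokens:
        return 0.0
    matched = [token for token in q_tokens if token in t_tokens]
    if not matched:
        return 0.0
    coverage = len(matched) / len(q_tokens)
    # Length rule: extra words in the title dilute the match.
    brevity = len(matched) / len(t_tokens)
    # Proximity rule: matched words sitting close together score higher.
    positions = sorted(t_tokens.index(token) for token in matched)
    span = positions[-1] - positions[0] + 1
    proximity = len(matched) / span
    return round(coverage * (0.5 + 0.25 * brevity + 0.25 * proximity), 3)

print(relevance("red small shovel", "small red shovel"))
print(relevance("red small shovel", "red hoodie with small shovel on it"))
```

Both titles contain all three query words, but the shovel now scores higher than the hoodie because its title is shorter and the matched words are adjacent.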

It is benchmark and optimization time again. We are looking for overall relevance, but this time there is more than one metric to follow…

Here are some of them:

As you can see, none of them is perfect; they all need to be considered, and maybe with different weights. Furthermore, these metrics do not define what is “good” and what is “bad” out of nowhere: to be computed, they all need manual annotation/qualification, search engine history associated with clicked content, or both.

Last but not least, what is the point? We need to display the best products for most user queries, whether head terms with more reach or long-tail queries with more value. Since we are tech guys, and factual, how do we evaluate it? From search query to sold product, we can follow the number of zero-result pages displayed and the click-through rate (CTR), which tells us whether a result page led to at least one product page view. Deeper on the transactional side, we follow the add-to-cart rate and, of course, the turnover attributed to search, to tell results that are merely appealing from results that match user expectations well enough to lead to a transaction. Indeed, at the end we need to sell the product, duh. Most of these metrics are on the business side. Moreover, of all the shopping tools (including product navigation and product recommendation), search behavior is the quickest to react to trends or to any offer issue. Considering these two points, our tech goal here is to build a high-frequency monitoring system, with well-plugged triggers to ease further adaptive or corrective actions: time is money when dealing with fast trends or major offer issues.
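
To give an idea of what following these metrics can look like, here is a minimal sketch that computes them from a list of search events. The `SearchEvent` schema and the `search_kpis` helper are hypothetical; a real tracking plan will have its own fields and attribution rules.

```python
from dataclasses import dataclass

@dataclass
class SearchEvent:
    """One displayed search result page (hypothetical log schema)."""
    query: str
    result_count: int
    clicked_product: bool = False  # at least one product page view
    added_to_cart: bool = False
    turnover: float = 0.0          # turnover attributed to this search

def search_kpis(events):
    """Aggregate zero-result rate, CTR, add-to-cart rate and turnover."""
    total = len(events)
    if total == 0:
        return {"zero_result_rate": 0.0, "ctr": 0.0, "add_to_cart_rate": 0.0, "turnover": 0.0}
    return {
        "zero_result_rate": sum(e.result_count == 0 for e in events) / total,
        "ctr": sum(e.clicked_product for e in events) / total,
        "add_to_cart_rate": sum(e.added_to_cart for e in events) / total,
        "turnover": sum(e.turnover for e in events),
    }

events = [
    SearchEvent("shovel", 12, clicked_product=True, added_to_cart=True, turnover=19.9),
    SearchEvent("red hoodie", 0),
    SearchEvent("rake", 5, clicked_product=True),
    SearchEvent("3 m hose", 1),
]
print(search_kpis(events))
```

Computed per query and per time window, the same aggregates are a natural input for the monitoring and triggering part.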

After a few months, our internal search engine looks good, robust and everything. We are done! But a dear colleague comes into the room and says: “I’m sure that if someone looks for a shovel today, they are more likely to look for ‘shovel’ again tomorrow than for any other word!”

Here comes hybridization, which you may be familiar with if you already know the stakes of product recommendation. There is the product side (semantic) and the user side; merging both, according to your own possibilities and needs, can make your search engine a perfect fit for your search audience :)
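
One simple way to picture merging the product side and the user side is a weighted blend of two scores. The sketch below is only an assumption of how it could look: the `hybrid_score` and `rerank` helpers, the view-count history and the 0.3 weight are all invented for the example.

```python
def hybrid_score(semantic_score, product_id, user_history, user_weight=0.3):
    """Blend a semantic score in [0, 1] with a user affinity score in [0, 1]."""
    total_views = sum(user_history.values())
    affinity = user_history.get(product_id, 0) / total_views if total_views else 0.0
    return (1 - user_weight) * semantic_score + user_weight * affinity

def rerank(candidates, user_history, user_weight=0.3):
    """candidates: list of (product_id, semantic_score); returns product ids, best first."""
    ranked = sorted(
        candidates,
        key=lambda c: hybrid_score(c[1], c[0], user_history, user_weight),
        reverse=True,
    )
    return [product_id for product_id, _ in ranked]

candidates = [("hoodie", 0.80), ("shovel", 0.75)]
history = {"shovel": 3, "rake": 1}  # hypothetical past product views of this user
print(rerank(candidates, history))  # ['shovel', 'hoodie']
print(rerank(candidates, {}))       # ['hoodie', 'shovel']
```

With no history, the ranking falls back to the purely semantic order, which keeps the behavior sane for new visitors.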

Side search features

I’m done with the search engine overview, but we can say a few words about the global search experience.

Search on a website is there to help users find what they want, but basically so is site navigation. The difference with search is flexibility: it is here to make the user journey on the website smoother and shorter, and to enable horizontal navigation where the website’s built-in navigation is vertical.

The search engine is not the only search product that is part of the search experience. We can mention the autocompletion engine, which helps users type their query (as a developer, this feature is a deadly requirement :) ), and Related Search recommendations, which help users continue their navigation on the website when the displayed results do not match what they were looking for.

Both features will be presented, with step-by-step implementations, in upcoming articles :)

Too long, wrap it up please

Search has become a reflex for every internet user over the past 10 years and, thanks to Google, something that is not so easy now looks basic to the end user. Building a search engine for your website will cost you, and it won’t be easy, but you still need to have one no matter what: it is now a basic requirement. Have fun and be proud, you made it :)
