
Recommendations I — A scary tale about vanishing ghosts

Emilio Carrión
Aug 26 · 7 min read

Much of today's online shopping experience is enhanced by opportunities to explore new products. At MercadonaTech we have a centralized system called recommender that calculates different types of recommendations for a wide variety of areas, to give you the best purchasing experience.

Our vision is to build a platform that lets customers do their grocery shopping as fast as possible and saves their time for something more interesting. Our goal is to be able to make the purchase for you, so that the groceries just show up magically at your home ✨. Recommender is one of the first steps in that direction.

To show some of our product adventures (let’s call them tales from now on) and how we deliver the value of recommendations to the customer in a lean way, we wrote four articles about some case studies that we pulled out (they are really interesting, believe me).

In this first article we will talk about how a logistics problem that affected thousands of customers every month was solved with a simple implementation that brought quick value and greatly improved the purchasing experience.

Get comfortable, grab some popcorn 🍿 and enjoy the trip.

Ghost products 👻

Image for post

This implies that customers can have discontinued products in their active cart: products that are removed during the checkout process because of their current status. The customer added them to the cart and wanted them, but because of catalog changes they are no longer available. This flux causes cart lines to disappear between the phase of building the cart (which can last from days to weeks because of the nature of our ecommerce) and the phase of creating the checkout. These vanishing products act like spirits disappearing the morning after Halloween 🎃. We call these entities ghost products (so spooky).

This is a double-edged sword. On the one hand, we are losing potential sales; on the other (and most important), we are not delivering all the products that customers want and think they have bought. That is a big problem.

First steps 🐣

Image for post

This alert made customers aware of the issues in their carts and gave them an opportunity to manually replace the discontinued products with similar ones from the current catalog.

This quickly contributed to reducing the number of ghost products that were being cleaned out in the checkout process, bringing great value to both the customers and us (yaay!).

With this change alone we reduced the number of ghost products reaching the checkout process by an amazing 60%!

The not so lean detour 🚵

Our plan was to recommend to customers products similar to the discontinued ones, sorted by similarity and easy to swap for the troubled ones. Wasn’t that an amazing plan? 🤯

In a short break from our lean adventure, some developers (and a Product Manager 🤷🏻‍♂️) went to the BigThings conference in Madrid. The free time between talks and on the train journey back let us hold a short but intense hackathon, where we tried to tackle the problem and experiment a little with reinforcement learning.

We developed and deployed a fun website that showed us two products from the same category and asked us to confirm whether we would substitute the first with the second. The idea was that with enough data and a clever approach such as a multi-armed bandit we would be able to score and suggest products similar to a given one.

Spoiler alert: it was fun but not so useful (yet 🤔). The experiment did not give us the amount of data required to build a usable system. Having 9,000+ products in our catalog did not make it easy, but it was fun to share with our colleagues at the office and sort some products together. As an incentive, it showed a random meme from our internal office meme pool (yes, we have one) every time you interacted with it, so at least we had some laughs.

Back to work 💪

We had some clues about how to tackle it. We knew that we only had some basic info about the products (a topic that we will talk about in a couple of tales), but it seemed enough for this task. We had data available such as the name, the price and the category, so we kept it simple.

We implemented a trivial scoring system based on the Levenshtein distance (the minimum number of single-character insertions, deletions and substitutions needed to turn one text string into another). Two items would be considered similar if the Levenshtein distance between their names was lower than a threshold. Sorting all similar candidates by ascending distance got us acceptable results.
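To make the idea concrete, here is a minimal sketch of that kind of scoring in Python. It is not our production code: the function names, the lowercase normalization and the strict `<` comparison against the threshold are assumptions made for illustration, and the product names in the usage note are made up.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similar_by_name(discontinued: str, candidates: List[str], threshold: int) -> List[str]:
    """Candidates whose name distance is below the threshold, closest first.

    Assumption: names are compared in lowercase and the threshold is exclusive.
    """
    scored = [(levenshtein(discontinued.lower(), name.lower()), name)
              for name in candidates]
    return [name for distance, name in sorted(scored) if distance < threshold]
```

For example, `similar_by_name("whole milk 1l", ["skimmed milk 1l", "whole milk 2l", "orange juice"], threshold=3)` keeps only `"whole milk 2l"`, which is one edit away. A higher threshold lets more distant names through, at the cost of noisier suggestions.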

With some UI love from our wonderful designer and our beloved Product Manager, we finally had a tool that let customers substitute discontinued products with similar ones and that would do the job.

Image for post

But this was only an alpha. By the time this implementation reached production we had already outlined a stronger, more robust (and again really trivial 😅) algorithm to rule all future substitutions.

This new algorithm groups similar products by tokenizing and comparing their names. The tokenization process is really simple. It removes the brand and stopwords (irrelevant words like “the”, “and”, etc.), then converts all words to lowercase and creates a set from them. It then calculates the size of the intersection between the token sets of the different products inside the same category.

Image for post

In the example above the products’ similarity score would be 2, as the intersection of tokens has a length of 2.

By ordering the products according to the number of tokens they have in common and applying a threshold, we manage to show products similar to the discontinued ones. So awesome and simple.
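The whole process fits in a few lines. The following Python sketch is an illustration under stated assumptions, not the code we run: the stopword and brand lists are hypothetical, brands are treated as single words, words are lowercased before filtering so that the comparison is case-insensitive, and the threshold is assumed to be inclusive.

```python
from typing import List, Set, Tuple

# Hypothetical lists for illustration; the real ones are not part of this article.
STOPWORDS = {"the", "and", "of", "with"}
BRANDS = {"acme"}


def tokenize(name: str) -> Set[str]:
    """Lowercase the name, split on whitespace, drop brand and stopword tokens."""
    words = name.lower().split()
    return {word for word in words if word not in STOPWORDS and word not in BRANDS}


def score(name_a: str, name_b: str) -> int:
    """Similarity score: how many tokens the two names have in common."""
    return len(tokenize(name_a) & tokenize(name_b))


def substitutes(discontinued: str, same_category: List[str],
                threshold: int) -> List[Tuple[str, int]]:
    """Rank products of the same category by shared tokens, best first.

    Assumption: a candidate is kept when its score is >= threshold.
    """
    scored = [(name, score(discontinued, name)) for name in same_category]
    kept = [(name, s) for name, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With these made-up names, `substitutes("Acme Tomato Sauce with Basil", ["Fried Tomato Sauce", "Olive Oil", "Tomato Sauce and Basil"], threshold=2)` ranks "Tomato Sauce and Basil" first with 3 shared tokens, then "Fried Tomato Sauce" with 2, and drops "Olive Oil". Working with sets means repeated words count once and word order does not matter, which is part of what makes this more robust than comparing raw strings.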

With this trivial process and some infrastructure magic that you will read about in the fourth part we have managed to deliver great value to our customers and to our dear ecommerce.

To give some results, we currently have a 28% conversion rate. That means that more than 1 in every 4 products checked for substitution is swapped for a similar one.

Next steps 👣

Conclusions 📓

This tale about ghosts and forgotten products has taught us that we can bring real value to our customers with small changes that slowly accumulate to make our product what it is today.

Unfortunately this tale finishes here, but don’t be sad! This recommendations trip doesn’t end yet: we have three more wonderful tales to tell, and you are more than welcome to keep going with us on this amazing journey.

Coming soon 🕒
