We all know how valuable personalization is to your digital services. Given that no “one-size-fits-all” approach exists, providing your users with a homepage reflecting their navigation habits, preferences and tastes, instead of generic content, can be a make-or-break factor in user satisfaction.
In our last blog post, we talked about applying AI to your catalog of images to build a visual search engine. In this post, we will see how to apply the same strategy to personalize your app or website, that is, how to build a recommendation system based on visual similarity.
Recommendation systems: how they work
A recommendation system aims to ingest the past navigation behavior of a user, and predict the items that the user will be interested in. There are multiple kinds of recommendation systems that can be built.
The simplest recommendation system is based on most popular items: you recommend to a user the items that have been most interacted with over the last few hours or days. For instance, the journal articles that have been most read or commented on, or the items that have been bought most often. You can add a tiny bit of personalization (sometimes referred to as “contextualization”) by recommending the items most popular in the vicinity of your user.
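Such a popularity baseline is only a few lines of code. A minimal sketch (the interaction log and its field layout below are toy assumptions, not real data):

```python
from collections import Counter

def most_popular_items(interactions, n=5):
    """Return the n items with the most interactions.

    interactions: iterable of (user_id, item_id) pairs collected
    over the last few hours or days.
    """
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]

# Toy interaction log (user_id, item_id)
log = [("u1", "ball"), ("u2", "ball"), ("u3", "stick"),
       ("u1", "shoes"), ("u2", "ball")]
print(most_popular_items(log, n=2))  # → ['ball', 'stick']
```

Contextualization would simply amount to filtering the log by the user’s region before counting.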
Recommendation systems based on most popular items do not provide real personalization as they rely on aggregated information, but they represent a very efficient strategy to make your app or website “smart by default”.
More advanced recommendation systems are based on considerations of similarity between users, between items, or both. Similarities can be based on past user-item interactions, or on what we call “metadata”. The following table gives some examples of these different strategies.
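To make the item-item flavor concrete, here is a minimal sketch (toy data and dense matrices for readability; real catalogs call for sparse matrices) of deriving item similarities directly from past user-item interactions:

```python
import numpy as np

# Toy user-item interaction matrix: rows are users, columns are items
# (1 = the user interacted with the item).
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
], dtype=float)

# Cosine similarity between item columns: items clicked by the
# same users end up close to each other.
items = interactions.T.copy()
items /= np.linalg.norm(items, axis=1, keepdims=True)
item_sim = items @ items.T
print(np.round(item_sim, 2))
```

The same construction on the transposed matrix yields user-user similarities instead.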
Other classes of recommendation systems can capitalize on sequences of item interactions: for instance, there is some value in knowing that a user has bought a soccer ball after having interacted with a hockey stick, and not the other way around. These sequences can be used by training a recurrent neural network, a very interesting topic, but for another post :)
Capitalizing on your images
Whatever the items you are offering (products, activities, sports, or events), chances are that you have a catalog of images describing them. Using transfer learning, we can translate these images into a feature vector, describing the components found in the image. By calculating the distance between these vectors, we can determine, for each pair of images, how similar they are. Finally, we can rank these similarity measures and, for each item, find the items in the rest of our catalogue which look most similar.
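Sketching the last two steps in code (the tiny feature vectors here are stand-ins for the output of a pretrained network, which we covered in the visual search post):

```python
import numpy as np

def most_similar(features, item_index, k=10):
    """Rank the rest of the catalog by cosine similarity to one item.

    features: (n_items, dim) array of image feature vectors, assumed
    to come from transfer learning on a pretrained network.
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed[item_index]      # cosine similarity to the query
    ranked = np.argsort(-sims)              # most similar first
    return [int(i) for i in ranked if i != item_index][:k]

# Stand-in "catalog": 4 items with 2-dimensional features for readability
features = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(most_similar(features, item_index=0, k=2))  # → [1, 3]
```

For a real catalog, these pairwise similarities would be precomputed offline and the top-k lists stored per item.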
This metric of similarity between images is all we need to build an item-based recommendation system! Now, the question remaining is, how good is a recommendation system based on visual similarity? In other words, if we assume that users shop first and foremost with their eyes, can we safely assume that they will be interested in items visually similar to those they have interacted with in the past?
It’s all about precision, recall and coverage
Let’s assume that we want to suggest five items, personalized to each user, on our homepage. Our objective is to maximise the number of users clicking on these suggestions.
During the development of a recommendation system, we generally verify the quality of our work by hiding from our model the most recent user-item interactions: for instance, we can hide from our model the user-item interactions which have happened over the last month. These interactions thus represent the “future” for our recommendation system, and their comparison with the suggestions given by the recommendation system represents a great indicator of its quality.
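In practice this holdout is a simple temporal split. A sketch, with the record layout assumed for illustration:

```python
def temporal_split(interactions, holdout_fraction=0.1):
    """Hide the most recent interactions from the model.

    interactions: list of (timestamp, user_id, item_id) tuples.
    Returns (train, test), where test contains the newest interactions
    and plays the role of the "future" when scoring the recommender.
    """
    ordered = sorted(interactions)                    # oldest first
    cut = int(len(ordered) * (1 - holdout_fraction))
    return ordered[:cut], ordered[cut:]

log = [(3, "u2", "c"), (1, "u1", "a"), (4, "u2", "d"), (2, "u1", "b")]
train, test = temporal_split(log, holdout_fraction=0.25)
print(test)  # → [(4, 'u2', 'd')]
```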
More specifically, the quality of the recommendations is generally assessed from three main metrics: precision, recall and coverage. Precision is the fraction of recommended items that our users have indeed interacted with during the last month. Recall is, to some extent, the opposite: it is the fraction of items that users have interacted with during the last month that can be found among the recommendations. Finally, coverage is the fraction of items in our catalog which have been recommended to at least one user. For all of these metrics, the higher the better for our recommendation system!
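These three metrics are straightforward to compute once the recommendations and the held-out interactions are in hand. A sketch with toy data:

```python
def evaluate(recommended, held_out, catalog_size):
    """Compute precision, recall and coverage.

    recommended: dict mapping user -> list of recommended items
    held_out: dict mapping user -> set of items the user actually
              interacted with during the hidden period
    """
    hits = sum(len(set(recs) & held_out[u]) for u, recs in recommended.items())
    n_recommended = sum(len(recs) for recs in recommended.values())
    n_held_out = sum(len(items) for items in held_out.values())
    recommended_items = {item for recs in recommended.values() for item in recs}
    precision = hits / n_recommended
    recall = hits / n_held_out
    coverage = len(recommended_items) / catalog_size
    return precision, recall, coverage

recs = {"u1": ["a", "b"], "u2": ["b", "c"]}
future = {"u1": {"a"}, "u2": {"c", "d"}}
print(evaluate(recs, future, catalog_size=10))
```

In this toy example, precision is 0.5 (2 of the 4 suggestions were clicked), recall is 2/3, and coverage is 0.3.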
How does our recommendation system perform?
It’s now time to put our recommendation system based on visual similarity to the test! To do so, we extracted the last ~600 000 user-item interactions on decathlon.ca. In this case, a user-item interaction represents an instance where a user clicked on one of the more than 7000 products in our catalog.
As explained in the previous section, we hid the most recent 60 000 user-item interactions from the model, and used them afterward to assess the quality of our recommendations.
For each of our users, we extracted the list of items they had interacted with. For each of these items, we extracted the 10 items in the rest of the catalog that look the most similar, as described in our blog post on visual search. Finally, we used a quick and dirty scoring metric to identify the top five items that look, on “average”, most similar to the items the user has previously interacted with. An example of this scoring system is given in the following figure:
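One plausible version of such a scoring metric (an assumption on our part, not the exact production code) is to average each candidate’s similarity over the user’s history and keep the top five:

```python
from collections import defaultdict

def top_recommendations(history, similar_items, n=5):
    """Pick the n items that look, on average, most similar to the
    user's history (a hypothetical "quick and dirty" scoring scheme).

    history: list of items the user interacted with
    similar_items: dict mapping item -> list of (candidate, similarity)
                   pairs, e.g. the 10 visually closest catalog items.
    """
    scores = defaultdict(list)
    for item in history:
        for candidate, sim in similar_items.get(item, []):
            if candidate not in history:        # skip items already seen
                scores[candidate].append(sim)
    avg = {c: sum(s) / len(s) for c, s in scores.items()}
    return sorted(avg, key=avg.get, reverse=True)[:n]

sims = {
    "pillow":  [("sleeping_bag", 0.90), ("mat", 0.60)],
    "legging": [("legging_v2", 0.95), ("shorts", 0.50)],
}
print(top_recommendations(["pillow", "legging"], sims, n=3))
# → ['legging_v2', 'sleeping_bag', 'mat']
```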
We compared our recommendation system to a recommendation system based on the most popular items. The precision, recall and coverage values are as follows:
As you can see, using visual similarity in the recommendation improved all three performance metrics by a wide margin. In total, 1.9% of the items suggested by our recommendation system were indeed consulted by users over the last month, which is pretty good, considering that the catalog contains more than 7000 items. About 2800 different items have been recommended to at least one user (37% coverage). Overall, the results confirm that visual similarity is a good indicator of the future items that a user will be interested in.
To provide an example of recommendations, here are the five items that were last consulted by a user, selected randomly:
The first item is a portable pillow that can be used during hiking trips; the following two are men’s leggings, the fourth is a gym shirt and the last is a pair of running shoes. These items suggest that the user practices outdoor activities such as running and hiking, and has a preference for darker items.
The top three items recommended by visual similarity are the following:
The first item is an upscale version of the same men’s leggings previously looked at by the user, the second item is a similar gym shirt (but with a V-neck, instead of a round neck), and the last item is a portable sleeping bag, which could be a nice complement to the portable pillow previously looked at.
In comparison, the items recommended using the most popular approach are the following:
While these items are all trending online and could be of interest to our user, they are much more generic, and do not necessarily reflect the specific preferences of our user.
In the end, a recommendation system limited to visual similarity is far from perfect. At the very least, it should be complemented with some carefully selected filters to ensure that the recommended items are consistent with respect to age and gender, and are in stock. It is also expected that adding signals on top of visual similarity (similarity between users, textual description of items, modeling user-item interactions using recurrent neural networks) would significantly improve recommendations.
However, if you want a quick and scalable recommendation system that will beat recommendations by popularity, visual similarity should score high in terms of rewards vs investment for your online business.
We are hiring!
Are you interested in transfer learning and the application of AI to improve sport accessibility? Luckily for you, we are hiring! Visit https://developers.decathlon.com/careers to see the different exciting opportunities.
A special thanks to Cloé Larivière-Jeannotte and Guillaume Simo, from Décathlon Canada, for the comments and review.