This article is based on a talk given at a Women in Machine Learning and Data Science meetup on the 28th of June 2018. The full presentation slides can be found on their Medium page (https://medium.com/@WiMLDS_Paris); this article details the mechanism behind catalog customization.
The author of this article, Betty Moreschini, is a Software Engineer at vente-privee, working in the personalization product team. This team is composed of data scientists and software engineers working hand in hand to tailor the website to fit each particular customer. Betty’s role is to work with models developed by data scientists in order to ensure that those models reach their full operational potential.
On vente-privee’s website, there are about 150 flash sales on the home page on a given day. For customers to find what they want, a lot of scrolling might be involved. At the moment, home page customization is being tested.
The principle of the home page customization is that given a customer’s previous orders, preferences, and so on, we customize the order of the sales on the home page, so that the most relevant sales for this customer appear on top of the page.
So far, home page customization has shown very encouraging results, with the entry rate for sales going up by around 1.5% and the conversion rate going up by 1.4%. The personalization team has therefore decided to move forward with catalog customization.
What is catalog customization?
When a customer enters a sale, they arrive on the first section of the sale. For example, when I (a 26-year-old woman) enter the New Balance sale, I land on the ‘Men — Lifestyle’ section.
This might not be the ideal landing page. Thus, the personalization team has been working on a ‘Recommended for you’ section, which would be filled with the most relevant products for a given customer.
How does it work?
Here is an overview of how the catalog customization works.
Let’s see this step by step.
When a customer interacts with the website, an event is sent to our analytics bus. From there, depending on the type of event, several things are triggered.
When a customer interacts with an article (for example consulting the product page, or adding the product to the cart), a few things happen. One of them is that the product aggregator is triggered.
This aggregator’s goal is simple: it keeps track of interaction scores between customers and products. So, as can be seen in the image below, when ‘Member1’ clicks on ‘Product1’, the aggregator increments the interaction score between this member and this product.
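The increment logic can be sketched as follows. This is only an illustration: the class name, the event names and the per-event weights are assumptions, since the article does not say how much each type of interaction is worth.

```python
from collections import defaultdict

# Hypothetical weights per event type; the real values are not given in the article.
EVENT_WEIGHTS = {"view_product": 1.0, "add_to_cart": 3.0}


class ProductAggregator:
    """Keeps a running interaction score for each (member, product) pair."""

    def __init__(self):
        self.scores = defaultdict(float)

    def on_event(self, member_id, product_id, event_type):
        # Event types without a configured weight leave the score unchanged.
        self.scores[(member_id, product_id)] += EVENT_WEIGHTS.get(event_type, 0.0)


aggregator = ProductAggregator()
aggregator.on_event("Member1", "Product1", "view_product")
aggregator.on_event("Member1", "Product1", "add_to_cart")
print(aggregator.scores[("Member1", "Product1")])
```

Keeping the score keyed by the (member, product) pair means each incoming event is a single constant-time update.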
Again when a customer interacts with a product, the member aggregator gets triggered.
This aggregator’s goal is a little less simple: it keeps track of affinity scores between customers and demographies of products. If a customer clicks on a product that is categorized as a product for adult women, we increase the affinity score between this customer and the demography ‘adult woman’.
With this system, a user can have affinities with multiple demographies. For example, if a father usually shops for himself and for his daughter, he will have strong affinities with ‘adult man’ and ‘kid girl’ demographies. This way, products for him and for his daughter can be pushed among the relevant products for this customer.
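The father example above might look like this in code. The catalog mapping, the product identifiers and the fixed increment of 1.0 are all hypothetical; the point is only that one member can accumulate affinity with several demographies at once.

```python
from collections import defaultdict

# Hypothetical catalog: product id -> demography label.
PRODUCT_DEMOGRAPHY = {
    "shirt_42": "adult man",
    "dress_7": "kid girl",
}


def update_affinity(affinities, member_id, product_id, increment=1.0):
    """Raise the member's affinity with the demography of the product."""
    demography = PRODUCT_DEMOGRAPHY[product_id]
    affinities[member_id][demography] += increment


# member id -> {demography: affinity score}
affinities = defaultdict(lambda: defaultdict(float))
update_affinity(affinities, "father", "shirt_42")
update_affinity(affinities, "father", "dress_7")
update_affinity(affinities, "father", "shirt_42")
print(dict(affinities["father"]))
```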
Finally, when a customer interacts with a product, the popularity aggregator is triggered.
This aggregator will increase the popularity of an item when it is interacted with.
This score is not tied to any particular user; it simply tracks the popularity of an item across all users.
The next step is a collaborative filtering algorithm: it takes the table of affinity scores that the member aggregator keeps updated, and completes it.
For example here, thanks to our aggregator, we know that Member1 has a strong affinity with Product1 and Product2, but we know nothing about the other products. Using what we know about other customers and their affinities, we are able to fill in the blanks in that table. This will give us the estimated affinity score between any customer and any product.
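The article does not say which collaborative filtering algorithm is used. As one possible illustration, here is a minimal user-based variant: a missing score is estimated as the average of other members' scores for that product, weighted by how similar those members are to the target member (cosine similarity). The data and function names are made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse {product: score} dicts."""
    dot = sum(u[p] * v[p] for p in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def estimate(scores, member, product):
    """Similarity-weighted average of the other members' scores for `product`."""
    num = den = 0.0
    for other, other_scores in scores.items():
        if other == member or product not in other_scores:
            continue
        sim = cosine(scores[member], other_scores)
        num += sim * other_scores[product]
        den += sim
    return num / den if den else 0.0


def complete(scores, products):
    """Keep known scores and fill every blank with an estimate."""
    return {
        member: {
            p: known[p] if p in known else estimate(scores, member, p)
            for p in products
        }
        for member, known in scores.items()
    }


scores = {
    "Member1": {"Product1": 5.0, "Product2": 4.0},
    "Member2": {"Product1": 5.0, "Product2": 4.0, "Product3": 2.0},
    "Member3": {"Product3": 5.0, "Product4": 3.0},
}
products = ["Product1", "Product2", "Product3", "Product4"]
completed = complete(scores, products)
print(completed["Member1"])
```

Here Member1 shares tastes with Member2 but has nothing in common with Member3, so the estimate for Product3 follows Member2's score, and Product4 falls back to 0.0 because no similar member has interacted with it.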
Every 15 minutes, we update our table of current operations. This is simply to know which operations have ended, or are not beginning soon, so that we don’t compute recommendations for these sales.
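A refresh of that table could be as simple as the filter below. The field names and the 24-hour definition of "beginning soon" are assumptions made for the sketch; the article only gives the 15-minute refresh interval.

```python
from datetime import datetime, timedelta

# Hypothetical look-ahead window for "beginning soon"; not taken from the article.
STARTING_SOON = timedelta(hours=24)


def current_operations(operations, now):
    """Keep sales that have not ended and that are live or about to start."""
    return [
        op["id"]
        for op in operations
        if op["end"] > now and op["start"] <= now + STARTING_SOON
    ]


now = datetime(2018, 6, 28, 12, 0)
operations = [
    {"id": "ended", "start": datetime(2018, 6, 20), "end": datetime(2018, 6, 27)},
    {"id": "live", "start": datetime(2018, 6, 27), "end": datetime(2018, 6, 30)},
    {"id": "soon", "start": datetime(2018, 6, 29, 7), "end": datetime(2018, 7, 2)},
    {"id": "later", "start": datetime(2018, 7, 10), "end": datetime(2018, 7, 14)},
]
active = current_operations(operations, now)
print(active)
```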
When a user signs in, the recommendation aggregator is triggered.
This aggregator uses all the scores that we just saw. Since the scores are incremented when customers interact with products, not much will happen on a customer’s first visit. The following description assumes that the customer has already visited the website before signing in again.
The recommendation aggregator will use all the scores previously computed, and will produce a final affinity score between the given member and all the products for current sales.
Scores are normalized to be comparable, and we have a secret formula to choose the impact of demography, affinity and popularity on the final score.
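Since the actual formula is not disclosed, the sketch below only shows the general shape such a combination could take: each signal is rescaled to [0, 1] with min-max normalization, then the three are mixed with a weighted sum. The weights, the choice of min-max normalization and the sample data are all invented for the example.

```python
# Hypothetical weights: the real formula is not disclosed in the article.
WEIGHTS = {"demography": 0.3, "affinity": 0.5, "popularity": 0.2}


def min_max(scores):
    """Rescale a {product: score} dict to the [0, 1] range."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {product: 0.0 for product in scores}
    return {product: (s - low) / (high - low) for product, s in scores.items()}


def final_scores(raw_scores):
    """Weighted sum of normalized signals; raw_scores maps signal -> {product: score}."""
    normalized = {name: min_max(scores) for name, scores in raw_scores.items()}
    products = next(iter(raw_scores.values()))
    return {
        product: sum(WEIGHTS[name] * normalized[name][product] for name in WEIGHTS)
        for product in products
    }


raw = {
    "demography": {"a": 2.0, "b": 1.0, "c": 0.0},
    "affinity": {"a": 10.0, "b": 30.0, "c": 20.0},
    "popularity": {"a": 100.0, "b": 100.0, "c": 300.0},
}
final = final_scores(raw)
ranking = sorted(final, key=final.get, reverse=True)
print(final, ranking)
```

Without the normalization step, the popularity counts (in the hundreds here) would swamp the other two signals regardless of the weights.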
Then, we apply some diversification to this score.
To explain this diversification, let’s take the example of a customer arriving on a clothing sale. We know that this customer likes shirts a lot, pants a lot too, although just a bit less, and then she likes dresses.
If we just use the scores to suggest articles for this sale, we might recommend to this customer 50 shirts, 50 pants, and then 50 dresses. It wouldn’t necessarily be ‘wrong’ because we know that she likes shirts more than pants, but if she likes pants a lot too, it’s not ideal that she has to scroll through 50 shirts to see the first pants.
Thus, the diversification will mix things up. It creates “blocks” of articles, so we will suggest a block of shirts, then a block of pants, then a block of dresses, and so on. The blocks are not going to be blocks of 50 products, so the customer will land on pants faster. However, block sizes still depend on the affinity between the customer and the type of product, so if she likes shirts slightly more than pants, the shirt blocks will be slightly bigger than the pants blocks.
This diversification step does not change the affinity scores; it changes the order in which we want to suggest articles. Thus, after this diversification step, we have a final rank instead of a score.
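One way to build such blocks is sketched below. It is an assumption-laden illustration, not the team's algorithm: block sizes are taken proportional to the customer's affinity with each product type, relative to her favourite type, and the `base_block` parameter is hypothetical. Products inside each type are assumed to be already sorted by score.

```python
def diversify(products_by_type, type_affinity, base_block=4):
    """Interleave blocks of products; higher-affinity types get bigger blocks.

    Assumes at least one affinity is strictly positive.
    """
    top = max(type_affinity.values())
    # Visit product types from most to least liked.
    order = sorted(products_by_type, key=lambda t: type_affinity[t], reverse=True)
    block_size = {
        t: max(1, round(base_block * type_affinity[t] / top)) for t in order
    }
    queues = {t: list(products_by_type[t]) for t in order}
    ranked = []
    while any(queues.values()):
        for t in order:
            size = block_size[t]
            ranked.extend(queues[t][:size])
            queues[t] = queues[t][size:]
    return ranked


products_by_type = {
    "shirts": ["s1", "s2", "s3", "s4", "s5", "s6"],
    "pants": ["p1", "p2", "p3", "p4"],
    "dresses": ["d1", "d2", "d3"],
}
type_affinity = {"shirts": 1.0, "pants": 0.75, "dresses": 0.5}
ranked = diversify(products_by_type, type_affinity)
print(ranked)
```

With these numbers the blocks hold 4 shirts, 3 pants and 2 dresses, so the first pants appear in fifth position instead of after every shirt. The position of each product in `ranked` is the final rank.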
Having obtained the order in which we want to recommend products, we can finally produce our final database!
Now, with this mechanism, when a customer enters a sale, the navigation team will query this final table to obtain the product order that has been computed for this specific customer. The customer will arrive at a ‘recommended’ section filled with these products, instead of arriving on a random section.
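To make the lookup concrete, here is a sketch of what querying such a table could look like. The storage technology, table name and column names are not given in the article; an in-memory SQLite table is used here purely as a stand-in.

```python
import sqlite3

# Hypothetical schema: one row per (member, sale, product) with its final rank.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recommendation ("
    "member_id TEXT, sale_id TEXT, product_id TEXT, position INTEGER)"
)
conn.executemany(
    "INSERT INTO recommendation VALUES (?, ?, ?, ?)",
    [
        ("Member1", "new_balance", "shoe_b", 2),
        ("Member1", "new_balance", "shoe_a", 1),
        ("Member1", "other_sale", "hat_x", 1),
        ("Member2", "new_balance", "shoe_c", 1),
    ],
)


def recommended_products(conn, member_id, sale_id, limit=50):
    """Return the member's products for a sale, best rank first."""
    rows = conn.execute(
        "SELECT product_id FROM recommendation "
        "WHERE member_id = ? AND sale_id = ? "
        "ORDER BY position LIMIT ?",
        (member_id, sale_id, limit),
    )
    return [row[0] for row in rows]


print(recommended_products(conn, "Member1", "new_balance"))
```

Because the ranks are precomputed, the request made when the customer enters the sale is a plain ordered read with no scoring at query time.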
The obvious next step is to A/B test this catalog personalization and fine-tune it.
After that, the personalization team has several ideas to keep improving the user experience on the website. Among them:
- Recommending similar articles: using the same principle as catalog customization, when a customer is consulting a product page, we can suggest similar products that are better adapted to this specific customer;
- Personalizing order confirmation emails: when a customer has made an order, we can suggest other products that are relevant based on that order;
- Introducing travel recommendations: if we detect that a customer is about to travel, we can suggest customized travel packages on the home page and in emails.
We will be happy to share the progress of our projects with you in future articles!