We are two Data Scientists at ManoMano, the leading home improvement e-commerce platform in Europe (1M users a day and 420M€ in business volume last year). We went to Copenhagen for RecSys 2019 and had a blast! In this post, we will present the ideas we were able to connect to our current challenges and that we believe might end up in production on our website.
Sanity checks for a more robust architecture
Spotify gave a very nice talk on homepage personalization and how they tackle prediction diversity in production. When users have unique recommendations, it is impossible to manually check all of them to make sure models are not going south.
An elegant solution to this is to develop sanity checks, meaning statistical rules that need to be respected on the set of all homepages. For example, the chill piano playlist needs to be displayed more often in the evening than during the day.
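To make this concrete, here is a minimal sketch of what such a statistical sanity check could look like, using the chill piano example above. The log format (a list of display events with an `hour` and a `playlist` field) and the function names are hypothetical, purely for illustration.

```python
def display_rate(displays, playlist, start_hour, end_hour):
    """Share of homepages in [start_hour, end_hour) that show `playlist`."""
    window = [d for d in displays if start_hour <= d["hour"] < end_hour]
    if not window:
        return 0.0
    return sum(d["playlist"] == playlist for d in window) / len(window)

def check_chill_piano(displays):
    """Sanity check: the chill piano playlist must be displayed more often
    in the evening (18h-24h) than during the day (8h-18h)."""
    evening = display_rate(displays, "chill piano", 18, 24)
    daytime = display_rate(displays, "chill piano", 8, 18)
    return evening > daytime
```

A check like this runs over the whole set of generated homepages, so it can catch a model drifting even when no single recommendation can be inspected by hand.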
This talk made us realize that we have no such process. Our architecture is full of functional and load tests that keep our recommendation engines from going down. Their content, however, has no automated safeguard.
Well, things are changing, and sanity checks are on their way at ManoMano :) This project will consolidate our data architecture, and we are very happy to lead it in collaboration with our beloved Web Developers and Site Reliability Engineers!
Enrich implicit feedback
We had a big epiphany about digital content companies like Twitter and Spotify: their product IS data. As retailers, we have access to very limited feedback on our products after they are purchased (basically the occasional ratings and angry calls to our customer service). Digital companies, by contrast, can leverage information about product usage. Good examples were presented at RecSys:
- Microsoft created an after-purchase satisfaction score for their games, based on how long and how often the game is played. Despite being derived from after-sale features only, this score helps them more accurately predict which other games the player will purchase.
- Spotify considers listening to a song as positive feedback only if it is listened to for at least 30 seconds.
These talks got us thinking. Although we have very little feedback after the purchase, there is still a lot of valuable but unexploited implicit feedback in how our users interact with our website. Today, we use purchases and clicks as positive feedback in many algorithms. One promising idea is to refine these clicks with page view duration: if a user spends less than 2 seconds on the product page after a click, maybe the click was actually not that relevant!
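As a minimal sketch of this refinement, the filter below keeps only clicks whose subsequent page view lasted long enough to count as genuine positive feedback. The event format (`user_id`, `product_id`, `dwell_seconds`) is hypothetical, and the 2-second threshold is the one discussed above.

```python
DWELL_THRESHOLD_SECONDS = 2.0  # below this, the click likely was not relevant

def refine_clicks(click_events, threshold=DWELL_THRESHOLD_SECONDS):
    """Keep only (user, product) pairs whose click led to a page view of at
    least `threshold` seconds, to use as positive implicit feedback."""
    return [
        (event["user_id"], event["product_id"])
        for event in click_events
        if event["dwell_seconds"] >= threshold
    ]
```

The threshold itself is a hyperparameter worth tuning offline before feeding the refined clicks into a recommender.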
New implicit features like page view duration or the number of clicks needed to complete an order have not yet been investigated at ManoMano, and they will soon start contributing to our recommender systems :)
Remember that UX is the goal
Deploying a first-class algorithm tailored to answer a real problem and optimize the right metric is not enough to make a high impact. We also need to think about the way users interact with our algorithm.
Amazon has been suggesting search query auto-completions for years! Only recently have they improved their user interface for high-confidence queries with ghosting, i.e., “highlighting the suggested text […] within the search box” in addition to displaying suggestions in the drop-down list. Results? Suggestion acceptance rose by 6%, driving misspellings in searches down by 4.5%!
Fellow Data Scientists, please do not disregard this conclusion: an improvement in UX design can bear more fruit than a year of Data Science research and development. To reap great results, Data Scientists must work closely with User Research, Product and UX teams to think through how users will interact with their new feature.
An emerging subject, standing right at the frontier between UX and Data Science, is prediction explainability.
Imagine you go to a DIY store, and the salesperson tells you “I recommend this drill and this battery”. Maybe you trust the salesperson, so you believe what they tell you, but what if the advice came from an algorithm? Would you trust it? You’d have no reason to, and we find it stupefying that most recommender systems still do not explain why they showcase a product over another!
We think we must move towards a world where the algorithm tells you “You should buy this drill, because it suits your needs and is our best-seller, as well as this battery, that is compatible with the drill and has a great autonomy”. By doing so, the algorithm will prove and explain the validity of its suggestions, hence bringing trust to the table.
Groupon understood this and began to display the origin of their recommendations in the title of the carousel. They kick-started this idea with 58 different messages, enabling them to replace generic explanations such as “Recommended Deals for You” with contextualized messages such as “Chinese Restaurants Nearby Your Place”. Results? Clicks on recommendations soared by 50%!
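A simple way to implement such contextualized titles is a small template table keyed by the recommendation source, with a generic fallback. Here is a minimal sketch in that spirit; the source names and templates below are hypothetical, not Groupon's actual messages.

```python
# Map each recommendation source to a contextualized title template.
TEMPLATES = {
    "bestseller_in_category": "Best-sellers in {category}",
    "compatible_with": "Compatible with your {product}",
    "nearby": "{category} near your place",
}

FALLBACK = "Recommended Deals for You"  # generic message when no template applies

def carousel_title(source, **context):
    """Return a contextualized carousel title explaining the recommendation,
    falling back to a generic message for unknown sources."""
    template = TEMPLATES.get(source)
    if template is None:
        return FALLBACK
    return template.format(**context)
```

Keeping the templates in data rather than code makes it easy to grow from a handful of messages to dozens, as Groupon did.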
Once again, the conclusion is clear: users react more favourably to messages they understand and that help them navigate. To achieve great results, suggestions should not be delivered in a black-box fashion. Explanations should be part of the UX, and we are thrilled to think we will see them more and more in the future.
Overall, it was great for ManoMano to attend such a high-level event, which gave an overview of both the industrial and academic challenges surrounding recommender systems. RecSys made many questions spring to mind:
- Are our data experiments controlled by sanity checks?
- Have we really exhausted all our sources of implicit feedback?
- Do we spend enough time refining our UX instead of our models?
- Is explanation a sufficient part of our recommender system UX?
These questions will be at the root of our upcoming tests and, hopefully, will lead to improvements for ManoMano users. Thanks a lot to the RecSys organizers, we will come back!