Getting personal at ASOS

Published in ASOS Tech Blog · 6 min read · Oct 26, 2022

ASOS is the destination for fashion-loving 20-somethings. Through market-leading web and app experiences and a selection of nearly 900 brands, as well as fashion-led own-brand labels, ASOS serves 26.7 million customers in over 200 markets and in ten languages. The AI team at ASOS is a cross-functional team of Machine Learning Engineers and Scientists, Data Scientists and Product Managers that uses Machine Learning to improve the customer experience, improve retail efficiency and drive growth.

Personalisation at ASOS

With a catalogue containing around 100,000 items at any given time — and with hundreds of new items being introduced every week — our personalisation system is critical to surfacing the right product to the right customer at the right time. We leverage machine learning to help our customers find their dream product by providing high-quality personalisation through multiple touchpoints during the customers’ journey.

For a smooth shopping experience, personalisation needs to be served with sufficiently low latency. Our system also needs to be robust and scalable throughout the year, as multiple sale events drive high traffic to our website.

In this blog post, we offer you some insight into our personalisation system and how this specific design helps to fulfil the unique needs of a global online retailer.

SCENE: A WORLD WITHOUT PERSONALISATION
Hero: “… wavey garms! Where for art thou?!!!!”

Personalisation with Deep Learning

Through personalisation, our aim is to rank products optimally to increase customer engagement, which we measure through clicks and purchases. We adopt a two-step approach to do this. As a first step, we use deep learning to embed products and customers in a latent space that is meaningful for our ranking task. This process uses a huge amount of customer interaction data, which results in embeddings that capture rich user and product information. In a second step, these core embeddings are employed in multiple, separate downstream ranking tasks. Let’s have a look at each step in a bit more detail.

Step 1: Extracting meaningful embeddings

We capture interaction data from multiple sources such as customer views, purchases, add-to-bags and save-for-laters. This interaction data is passed through a neural collaborative filtering algorithm [1] to transform it into embeddings for each product and customer. A key property of these embeddings is that similarity in the embedding space (according to some similarity function) encodes either similarity between two products or a customer's preference for a product.
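As a rough illustration of this similarity property, here is a minimal sketch that assumes cosine similarity as the similarity function. The post does not say which function is used in production, and the three-dimensional vectors and product names below are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings; the names and values are illustrative only.
product_embeddings = {
    "floral_midi_dress": [0.9, 0.1, 0.0],
    "floral_maxi_dress": [0.8, 0.2, 0.1],
    "chunky_trainers": [0.0, 0.2, 0.9],
}

dress_vs_dress = cosine_similarity(
    product_embeddings["floral_midi_dress"],
    product_embeddings["floral_maxi_dress"],
)
dress_vs_trainers = cosine_similarity(
    product_embeddings["floral_midi_dress"],
    product_embeddings["chunky_trainers"],
)

# The two dresses point in nearly the same direction, so they score higher
# against each other than the dress scores against the trainers.
print(dress_vs_dress > dress_vs_trainers)
```

The same function can score a customer vector against a product vector, which is how a similarity score doubles as a preference score in this kind of set-up.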

We also want our system to be able to recommend one of the many new products that are added to ASOS’ catalogue every week. To address this cold-start problem, we generate product embeddings by passing product features such as textual descriptions and images through a deep neural network with a series of dense layers.
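To give a feel for the content-based route, the sketch below pushes concatenated text and image features through two dense layers in plain Python. Everything here is an assumption made for illustration: the feature sizes, the ReLU activation, the layer count and the hand-set weights, which stand in for weights that would normally be learned during training.

```python
def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(weights @ inputs + biases)."""
    outputs = [
        sum(w * x for w, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]
    if activation == "relu":
        outputs = [max(0.0, value) for value in outputs]
    return outputs


def embed_product(text_features, image_features, layers):
    """Concatenate the content features and pass them through each layer."""
    hidden = list(text_features) + list(image_features)
    for weights, biases, activation in layers:
        hidden = dense(hidden, weights, biases, activation)
    return hidden


# Hypothetical placeholder weights: 4 inputs -> 3 hidden units -> 2 outputs.
layers = [
    (
        [[0.5, 0.0, 1.0, 0.0], [-1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 2.0]],
        [0.0, 0.0, 0.0],
        "relu",
    ),
    (
        [[1.0, 1.0, 0.0], [0.0, 1.0, -1.0]],
        [0.0, 0.5],
        None,
    ),
]

# A brand-new product has no interactions yet, but it does have content.
embedding = embed_product([1.0, 0.0], [0.5, 0.5], layers)
print(embedding)
```

Because the embedding depends only on the product's own features, a product can be placed in the embedding space on the day it joins the catalogue.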

For both collaborative and content-based models, we build the customer embedding by aggregating the vectors of the products they have interacted with. This considerably reduces the number of model parameters during training.

Set-up for calculating similarity scores between customer and product embeddings. The product embeddings are learned either through neural collaborative filtering or content-based filtering. The customer embeddings are an aggregation of the interacted product embeddings.
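The aggregation step can be sketched as follows. This assumes a simple unweighted mean; the post only says the product vectors are aggregated, so the choice of mean, the function name and the toy data are all hypothetical.

```python
def build_customer_embedding(interacted_product_ids, product_embeddings):
    """Average the embeddings of the products a customer has interacted with.

    Returns None when the customer has no known interactions.
    """
    vectors = [
        product_embeddings[pid]
        for pid in interacted_product_ids
        if pid in product_embeddings
    ]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


# Hypothetical two-dimensional product embeddings.
product_embeddings = {
    "p1": [1.0, 0.0],
    "p2": [0.0, 1.0],
    "p3": [1.0, 1.0],
}

customer_embedding = build_customer_embedding(["p1", "p3"], product_embeddings)
print(customer_embedding)
```

Note that the customer vector is derived entirely from product vectors, so no per-customer parameters need to be learned in this sketch.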

Step 2: Putting the embeddings to work

Personalisation everywhere. From left to right: Category page, “You Might Also Like” carousel on a product page, personalised carousels on the “For You” tab on the homepage, and Search page.

With our product and customer embeddings, we can provide personalisation at multiple points in a customer's browsing journey, using our understanding of their preferences as they navigate the site.

When customers arrive at the app, our “For You” tab holds multiple personalised product carousels, such as “We found these for you in the sale” and “Similar vibes to your saved items”. Each personalised ranking is generated by ordering a particular set of candidate products by the similarity score between the customer embedding and each candidate product embedding, with each carousel having a different rule for selecting those candidates.
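A minimal sketch of this "one customer vector, many candidate rules" pattern is shown below. It assumes a dot product as the similarity score, and the catalogue fields (`on_sale`, `is_new`), the carousel names and the rules themselves are hypothetical stand-ins for whatever rules the real carousels use.

```python
def dot(a, b):
    """Dot product, used here as the (assumed) similarity score."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(customer_embedding, candidate_ids, catalogue):
    """Order candidate product ids from most to least similar to the customer."""
    return sorted(
        candidate_ids,
        key=lambda pid: dot(customer_embedding, catalogue[pid]["embedding"]),
        reverse=True,
    )


# Hypothetical catalogue with toy embeddings and attributes.
catalogue = {
    "p1": {"embedding": [0.9, 0.1], "on_sale": True, "is_new": False},
    "p2": {"embedding": [0.1, 0.9], "on_sale": True, "is_new": True},
    "p3": {"embedding": [0.8, 0.3], "on_sale": False, "is_new": True},
}

# Each carousel differs only in how it selects its candidate set.
carousel_rules = {
    "sale": lambda product: product["on_sale"],
    "new_in": lambda product: product["is_new"],
}

customer_embedding = [1.0, 0.2]

carousels = {}
for name, rule in carousel_rules.items():
    candidate_ids = [pid for pid, product in catalogue.items() if rule(product)]
    carousels[name] = rank_by_similarity(customer_embedding, candidate_ids, catalogue)

print(carousels)
```

Adding a new carousel in this sketch means adding one more rule; the ranking function and the embeddings stay unchanged.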

On category and search pages, customers view a set of products that match some criteria (the category or search term). This set is again ranked according to the similarity with the customer vector.

When a customer views a product page, we take another opportunity to recommend products with our “You Might Also Like” carousel. Here we take a weighted average of the customer embedding and the embedding of the hero product (the product being viewed), then calculate the similarity of this average embedding with a set of candidate product embeddings. (The candidate set is generated with business rules; in this case, products of the same category are selected.)
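The blending step can be sketched like this. The weight value, the dot-product score, the exclusion of the hero product from its own carousel and the toy catalogue are assumptions made for the example; only the weighted average, the same-category rule and the 16-slot limit come from the post.

```python
def dot(a, b):
    """Dot product, used here as the (assumed) similarity score."""
    return sum(x * y for x, y in zip(a, b))


def blend(customer_embedding, hero_embedding, hero_weight):
    """Weighted average of the customer and hero product embeddings."""
    return [
        hero_weight * h + (1.0 - hero_weight) * c
        for c, h in zip(customer_embedding, hero_embedding)
    ]


def you_might_also_like(customer_embedding, hero_id, catalogue,
                        hero_weight=0.75, slots=16):
    """Rank same-category products against the blended query vector."""
    hero = catalogue[hero_id]
    query = blend(customer_embedding, hero["embedding"], hero_weight)
    # Business rule from the post: same category as the hero product.
    # Skipping the hero itself is an assumption of this sketch.
    candidates = [
        pid
        for pid, product in catalogue.items()
        if product["category"] == hero["category"] and pid != hero_id
    ]
    candidates.sort(
        key=lambda pid: dot(query, catalogue[pid]["embedding"]), reverse=True
    )
    return candidates[:slots]


# Hypothetical catalogue with toy embeddings.
catalogue = {
    "dress_a": {"embedding": [1.0, 0.0], "category": "dresses"},
    "dress_b": {"embedding": [0.9, 0.1], "category": "dresses"},
    "dress_c": {"embedding": [0.1, 0.9], "category": "dresses"},
    "trainer_a": {"embedding": [1.0, 0.0], "category": "shoes"},
}
customer_embedding = [0.0, 1.0]

hero_led = you_might_also_like(customer_embedding, "dress_a", catalogue)
customer_led = you_might_also_like(
    customer_embedding, "dress_a", catalogue, hero_weight=0.0
)
print(hero_led, customer_led)
```

In this toy data, a high hero weight favours the product closest to the one being viewed, while a weight of zero falls back to a purely customer-driven ranking.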

Personalisation is key to making these browsing experiences engaging and efficient for the customer. On category and search pages, which can contain thousands of products, personalisation reduces the scrolling or filtering effort required by our users. In our “You Might Also Like” carousel, we only have 16 slots to make a quality recommendation; again, this is only achieved via personalisation.

Our Tech Stack

To support the multiple personalisation touchpoints across our web and apps, we have built our personalisation system using state-of-the-art technology. A high-level overview of this tech stack is shown below. The customer interactions are processed on a Databricks cluster using PySpark and stored in Azure Blob Storage. We use TensorFlow 2 and Azure Machine Learning to set up the training pipelines for the neural collaborative model and the content-based neural network. The learnt customer and product embeddings are again stored in Azure Blob Storage. Each of our downstream personalisation tasks fetches these embeddings and builds serving models with TensorFlow Serving and NVIDIA Triton.

Fast iteration cycles and high scalability

Our personalisation system fulfils the unique requirements we have as a fast-paced online fashion player. We have built our downstream personalisation touchpoints on a shared set of customer and product embeddings. This makes these downstream tasks significantly more lightweight. For us as machine learning scientists and engineers, this enables faster iterations, aids our culture of experimentation [2] and results in faster growth. Having a shared set of embeddings also makes it straightforward to introduce additional downstream touchpoints, supporting the scalability of our personalisation efforts.

With a large customer base and multiple sale events driving high traffic to our website, our personalisation system is built so that generating optimal rankings is highly scalable and low-latency. At inference time, the similarity score between a customer and/or product vector and a candidate product set can be calculated very quickly, especially compared to passing inputs through an ML model at call time. This meets the stringent response time requirements placed on us by the business to facilitate a seamless shopping experience.

Tackling challenges and levelling-up

In our efforts to tailor the customers’ browsing journey as much as possible, we face many challenges and have fun trying to find cutting-edge solutions. These include biases in the data (for example, presentation or positional biases in the interaction data) and matching offline and online evaluation results. To deal with these challenges and to increase personalisation performance, we keep up to date with the current state of the art via literature reviews and by attending and contributing to academic conferences. Some active research areas we are digging into are:

Enriching our embeddings by including side information or using graph neural networks to exploit customer-product interactions in a novel way.
Improving the downstream task performance with session-aware models, size and fit-aware models and fine-tuning text models for search information retrieval.
Increasing the footprint of our personalisation offering, for instance using reinforcement learning for personalised discount codes.

FINALE: PERSONALISATION HAS BEEN DEPLOYED AND ALL IS WELL
Bystander: “Wow! Those are some really wavey garms!”
Hero: “Why thank you.”
CURTAIN

References

[1] He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T. S. (2017, April). Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web (pp. 173–182).

[2] ASOS wins the organization-wide award at the 2022 Experimentation Culture Awards

This article was predominantly written by Sofie De Cnudde — a former Machine Learning Scientist at ASOS.com. In her spare time she enjoys hiking and board games. This article was co-authored by Jacob Lang — a Machine Learning Scientist at ASOS.com. In his spare time he enjoys cycle touring and community choirs. Thank you to Duncan Little, Eleanor Loh and Dawn Rollocks.