Profile Builder | Machine learning and fashion in 36 items

Kallirroi Dogani
Published in ASOS Tech Blog
7 min read · Jun 10, 2019

The Profile Builder is an AI-based solution from ASOS allowing customers to provide explicit information on their style preferences. This information enables us to create a personalised experience for our customers and provide more relevant recommendations.

The main idea behind Profile Builder is that we present customers with an array of products from our catalogue and ask them to select those that they like. Clearly, customers will not be willing to look at thousands of individual items as part of this process, so we must select a small representative sample to be used in the Profile Builder. So, which products should we show to customers to maximise what we learn about their preferences?

Why we need AI

Our aim is to build an automated system to select the items we show to customers as part of Profile Builder. We want to regularly update these items as our catalogue changes, so that they reflect our latest offerings as well as seasonality. Manually updating these items every few weeks would be very time-consuming, so it makes sense in the long term to build an algorithm that does this for us. We base our algorithm on our pre-existing recommendations system. This system is designed to predict which items in our catalogue a customer will like, based on what they have viewed and purchased in the past.

Getting it done

We decided to group most of our catalogue into five categories each for womenswear and menswear. Four are shared (Tops, Bottoms, Outerwear, Shoes & Accessories); the fifth is Dresses & Jumpsuits for womenswear and Polos & T-shirts for menswear. We show customers a selection of 36 products per category. The goal is to cover as many different styles as possible while ensuring that the items shown are informative enough to improve our recommendations.

Methodology

There are multiple ways two products may be similar: in terms of colour, style, occasion, season, shape, price etc. Finding a similarity metric that covers all these possible aspects is not an easy task. In the context of Profile Builder, it’s sufficient to argue that the more shared attributes a pair of products have, the more similar they are.
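As a rough illustration of the "more shared attributes, more similar" idea, here is a minimal sketch that counts the attributes two products have in common. The attribute names, the values and the `shared_attributes` helper are all made up for this example and are not taken from the ASOS attribute database.

```python
def shared_attributes(product_a, product_b):
    """Count the attributes on which two products have the same value."""
    return sum(
        1
        for name, value in product_a.items()
        if product_b.get(name) == value
    )


# Hypothetical products described as attribute dictionaries.
skinny_jeans = {"type": "jeans", "colour": "blue", "fit": "skinny", "season": "all"}
wide_leg_jeans = {"type": "jeans", "colour": "blue", "fit": "wide-leg", "season": "all"}
floral_dress = {"type": "dress", "colour": "multi", "fit": "regular", "season": "summer"}

jeans_vs_jeans = shared_attributes(skinny_jeans, wide_leg_jeans)
jeans_vs_dress = shared_attributes(skinny_jeans, floral_dress)
print(jeans_vs_jeans, jeans_vs_dress)
```

Under this toy measure the two pairs of jeans score higher than the jeans and the dress, which matches the intuition described above.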

In order to find the 36 products for each category, we use a machine learning technique known as clustering. Clustering groups a collection of products so that products in the same group (called a cluster) are more similar to each other than to those in other clusters. By creating 36 clusters and selecting one product per cluster, we produce our desired list of 36 products.
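To make the clustering step concrete, here is a minimal sketch using k-means on toy 2D vectors. The post does not name the clustering algorithm, so k-means is an assumption (chosen because it produces centroids), as is the deterministic farthest-point seeding. The example uses 3 clusters on 9 made-up points instead of 36 clusters on real product vectors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def farthest_point_init(vectors, k):
    """Deterministic seeding (an assumption): spread the initial centroids out."""
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))
    return centroids


def assign(vectors, centroids):
    """Index of the nearest centroid for every vector."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))
        for v in vectors
    ]


def kmeans(vectors, k, iterations=20):
    """Alternate between assigning vectors and recomputing centroids."""
    centroids = farthest_point_init(vectors, k)
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = mean_vector(members)
    return assign(vectors, centroids), centroids


# Toy stand-ins for product vectors: three well-separated groups.
toy_vectors = [
    [0, 0], [0, 1], [1, 0],
    [10, 10], [10, 11], [11, 10],
    [0, 10], [1, 10], [0, 11],
]
labels, centroids = kmeans(toy_vectors, k=3)
print(labels)
```

With real data, `k` would be 36 and each vector would have many more dimensions; a library implementation would normally replace this hand-written loop.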

But what do we mean by similar products? In other words, how can we distinguish a pair of jeans from a skirt, or one shirt from another? ASOS maintains a database of product attributes, e.g. product type, brand, colour etc. You can imagine a product as a list of attributes (or features), a representation that makes complete sense and is very common in traditional machine learning (ML). However, attributes alone are not enough for Profile Builder. Our goal is not just to capture a diverse selection of products, but to discover which items are representative of our customers’ tastes. What do customers buy? For example, we may have a great selection of products with different colours, cuts and designs, but we don’t know which of them best capture customers’ needs. Do customers prefer plain T-shirts or patterned clothing? Product attributes alone are not informative enough, as they give us no insight into customers’ preferences. For that reason, we decided to represent each product as a vector learnt by an algorithm called collaborative filtering.

Collaborative filtering learns a vector for each product based only on customers’ interactions with products. The intuition is that customers who purchase the same items tend to have similar taste. Besides purchases, it also takes into account views, products added to bag and any other signal that has high predictive power. Although there’s no clear interpretation for the individual product vectors, previous studies have shown that they do capture information about attributes of the data (in our case colour, shape, price etc.). Therefore, products whose vectors are close to each other in the learnt space can be interpreted as having similar attributes, which implies that they are similar products.
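To illustrate what "close to each other in the learnt space" can mean, here is a minimal sketch that finds the nearest neighbour of a product by cosine similarity. The three-dimensional vectors and product names are invented for the example, and cosine similarity is an assumed choice of closeness measure; the post does not state which one is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(product_id, vectors):
    """Return the id of the other product whose vector is closest in angle."""
    query = vectors[product_id]
    others = (pid for pid in vectors if pid != product_id)
    return max(others, key=lambda pid: cosine_similarity(query, vectors[pid]))


# Hypothetical product vectors, as if learnt by collaborative filtering.
product_vectors = {
    "skinny_jeans": [0.9, 0.1, 0.0],
    "wide_leg_jeans": [0.8, 0.3, 0.1],
    "floral_dress": [0.0, 0.2, 0.9],
}

neighbour = most_similar("skinny_jeans", product_vectors)
print(neighbour)
```

In this toy space the two pairs of jeans point in nearly the same direction, so each is the other's nearest neighbour, while the dress sits far from both.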

Clusters of womenswear Tops in 2D. Products plotted in the same colour belong to the same cluster.

Does it really work?

In order to confirm the assumption that similar products are grouped together, we looked at the products of different clusters and tried to understand the connection between them. What we discovered was that clustering has worked impressively well, with the majority of the clusters being quite easy to interpret. Below you can see the products closest to the centroid for different clusters and also their semantic interpretation.

Black leather jackets

Blazers

Parkas

Leggings

Floral dresses

Dungarees

Most of the above groups are big categories and clustering has worked as expected. However, what’s really impressive is that the model was able to capture subtle details in different sub-categories (e.g. denim shorts vs high-waisted shorts). We’ve also noticed that product vectors can even capture information about the brand (e.g. adidas activewear vs Nike activewear).

Denim shorts vs high-waisted shorts

Skinny jeans vs wide-leg cropped jeans

Cropped cami tops vs long cami tops

adidas activewear vs Nike activewear

Maximise information gain and user experience (UX)

Having confirmed that clusters represent different styles, the next challenge was to find which product to pick from each cluster. Trying to maximise the information gain from the selected items, our first intuitive idea was to identify the most popular item per cluster. Our evaluation showed that picking popular items was the most efficient way to boost the performance of recommendations for new customers. However, the visual outcome was certainly not what customers would expect to see. Most of the items were just black or white, offering a very poor UX.

Most popular item in cluster

Selecting the item closest to the centroid offers a much more desirable outcome in terms of UX, but the quality of our recommendations drops significantly.

Item closest to the centroid

Trying to tackle the trade-off between information gain and UX, we came up with a simple solution that sits in the middle of these two extremes. We still favour popular items, but introduce some variety by first narrowing the candidates down to the items close to the centroid and then picking the most popular of those.

Most popular item close to the centroid

This hybrid method creates a much more colourful and diverse outcome and keeps the performance of our recommendations for new customers relatively high.
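The three selection strategies discussed in this section can be sketched side by side. This is only an illustration under stated assumptions: the item ids, vectors and popularity counts are made up, and the `n_nearest` parameter (how many items count as "close to the centroid") is hypothetical, since the post does not say how closeness is thresholded.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_of(cluster):
    """Component-wise mean of the item vectors in a cluster."""
    vectors = [item["vector"] for item in cluster]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def most_popular(cluster):
    """Strategy 1: the most popular item in the cluster."""
    return max(cluster, key=lambda item: item["popularity"])["id"]


def closest_to_centroid(cluster):
    """Strategy 2: the item whose vector is nearest the centroid."""
    centre = centroid_of(cluster)
    return min(cluster, key=lambda item: squared_distance(item["vector"], centre))["id"]


def most_popular_near_centroid(cluster, n_nearest=3):
    """Hybrid: the most popular among the n_nearest items to the centroid."""
    centre = centroid_of(cluster)
    nearest = sorted(
        cluster, key=lambda item: squared_distance(item["vector"], centre)
    )[:n_nearest]
    return max(nearest, key=lambda item: item["popularity"])["id"]


# One hypothetical cluster of T-shirts.
cluster = [
    {"id": "black_tee", "vector": [3.0, 0.0], "popularity": 900},
    {"id": "white_tee", "vector": [-3.0, 0.5], "popularity": 800},
    {"id": "yellow_tee", "vector": [-0.5, 0.5], "popularity": 300},
    {"id": "green_tee", "vector": [0.4, -1.0], "popularity": 120},
    {"id": "striped_tee", "vector": [0.1, 0.0], "popularity": 40},
]

picks = (
    most_popular(cluster),
    closest_to_centroid(cluster),
    most_popular_near_centroid(cluster, n_nearest=3),
)
print(picks)
```

In this toy cluster the popularity rule picks the black tee, the centroid rule picks a little-bought striped tee, and the hybrid rule lands on the yellow tee, which is both reasonably central and reasonably popular. A larger `n_nearest` moves the hybrid towards the pure popularity rule, and a value of 1 reduces it to the centroid rule.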

What’s next?

The products that are currently shown in Profile Builder are static and customers interact with them once. To fully exploit all the capabilities Profile Builder can offer, we want to transform Profile Builder into a dynamic tool where the products are automatically updated following the changes in seasonality and fashion trends. Finally, collecting other types of information (e.g. sizes, favourite colours) can help us build an improved personalised experience, increasing customer engagement and satisfaction.

This is a shared piece from Kallirroi Dogani and Stephen Hardwick, Machine Learning Scientists at ASOS.
