A List on the Homepage

Utkarsh Gupta
Tata 1mg Technology
6 min read · Nov 22, 2018

It began with the paper titled “Two Decades of Recommender Systems at Amazon.com”. The article said, “Each person who comes to Amazon.com sees it differently, because it’s individually personalised based on their interests. It’s as if you walked into a store and the shelves started rearranging themselves, with what you might want moving to the front, and what you’re unlikely to be interested in shuffling further away”. Why couldn’t we do the same for our users?

With that, we set out with a vision to personalise the list of products shown to each user on the 1mg mobile app homepage. We wanted to build a recommendation system that could leverage every user's short- and long-term view and purchase history to predict what they would like in the future.

The Idea was ‘Collaboration’

Collaborative filtering (CF) models are fundamental to recommendation systems. They are based on the assumption that people like things similar to other things they already like, and things that are liked by other people with similar taste.

The underlying assumption of the collaborative filtering approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B's opinion on a different issue than that of a randomly chosen person.

- Wikipedia

CF is a general concept and there are several algorithms to implement it. One popular approach is based on low-dimensional factor models (model-based matrix factorisation).

The idea behind matrix factorisation models is that the preferences of a user can be determined by a small set of hidden factors. We can call these factors embeddings. Intuitively, embeddings are low-dimensional hidden factors for products and users. Each element in a product's embedding vector could represent a different characteristic of the product, and likewise the elements of a user's embedding vector might represent different characteristics of the user. A higher dot product between user X's embedding and product A's embedding means that product A is a good recommendation for user X.
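
A minimal illustration of that last point, with made-up numbers; the embedding dimensionality and values here are purely hypothetical:

```python
import numpy as np

# Hypothetical 5-dimensional embeddings learnt by matrix factorisation.
user_x    = np.array([0.9, 0.1, 0.4, 0.0, 0.7])  # user X's latent factors
product_a = np.array([0.8, 0.2, 0.5, 0.1, 0.6])  # product A's latent factors
product_b = np.array([0.1, 0.9, 0.0, 0.8, 0.1])  # product B's latent factors

# The predicted preference is the dot product of the user and product vectors.
score_a = user_x @ product_a
score_b = user_x @ product_b

print(score_a, score_b)  # the product with the higher score is the better recommendation
```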

Collecting Clickstream Data

Our internal data pipeline collects and stores request-response logs for all API calls as compressed hourly files in our data warehouse on Amazon S3. This is how we have been able to gather users' clickstream activity on Android and the Web. These files are read and processed using Spark deployed on a small cluster, and then stored back in a columnar data format which serves as our Tracking Data Store.
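
A rough sketch of this stage; the S3 paths, log schema and field names below are illustrative, not our actual ones:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("clickstream-etl").getOrCreate()

# Hourly, gzip-compressed request-response logs (paths are hypothetical).
raw = spark.read.json("s3://example-logs/api/2018/11/22/*/*.gz")

# Keep only the fields needed for tracking and write them back in a
# columnar format (Parquet), partitioned by date, as the Tracking Data Store.
clickstream = (
    raw.select("user_id", "product_id", "event_type", "timestamp")
       .withColumn("dt", F.to_date("timestamp"))
)

(clickstream.write
            .mode("append")
            .partitionBy("dt")
            .parquet("s3://example-tracking-store/clickstream/"))
```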

Matrix Factorisation — Implicit Preferences and ALS

A typical user might express their preference for a product by giving it a high rating if they like it, or a low rating if they dislike some aspect of it. Collaborative filtering models are typically built using explicit preferences, where available. We had no explicit ratings for products, so we had to look for alternative ways to assess a user's preference for a product. A user's purchase of a product can be seen as an implicit indicator of their preference. We started with a fairly simple implicit-preference function (a weighted sum of a user's views, add-to-carts and purchases of a product), which has become more nuanced over time. A discerning reader can tell that such preference functions can be constantly tweaked and improved.
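
A minimal sketch of such a preference function on top of the tracking store; the event names and weights below are illustrative, not the ones we actually use:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Illustrative weights: a purchase signals far more intent than a view.
WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

events = spark.read.parquet("s3://example-tracking-store/clickstream/")  # hypothetical path

# Map each event type to its weight (unknown events get weight 0).
weight_col = F.coalesce(
    *[F.when(F.col("event_type") == event, F.lit(w)) for event, w in WEIGHTS.items()],
    F.lit(0.0),
)

# Aggregate a weighted sum of interactions for every user-product pair.
preferences = (
    events.withColumn("weight", weight_col)
          .groupBy("user_id", "product_id")
          .agg(F.sum("weight").alias("preference"))
)
```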

Matrix factorisation works on a dataset of users, products and each user's preference for each product. We used the data pipeline described above to find which user interacted with which product, and extracted aggregated interactions for each user-product combination over a period of nine months.

The out-of-the-box implementation of the Alternating Least Squares (ALS) technique in Apache Spark's MLlib library is widely used for solving CF problems. Using a three-instance Spark cluster and Spark's high-level Python API, we trained an ALS model with implicit ratings. We evaluated the model by measuring the root-mean-square error (RMSE) of the rating predictions.
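
Roughly, the training and evaluation looked like the snippet below; the hyperparameters and column names are illustrative (ALS in MLlib also expects integer user and product ids):

```python
from pyspark.ml.recommendation import ALS
from pyspark.ml.evaluation import RegressionEvaluator

# `preferences` is the user-product preference DataFrame from the previous step.
train, test = preferences.randomSplit([0.8, 0.2], seed=42)

als = ALS(
    userCol="user_id",
    itemCol="product_id",
    ratingCol="preference",
    implicitPrefs=True,        # treat preferences as implicit, not explicit ratings
    rank=32,                   # dimensionality of the latent factors (illustrative)
    regParam=0.1,
    maxIter=10,
    coldStartStrategy="drop",  # avoid NaN predictions for unseen users/products
)
model = als.fit(train)

# Evaluate with root-mean-square error on held-out preferences.
predictions = model.transform(test)
rmse = RegressionEvaluator(
    metricName="rmse", labelCol="preference", predictionCol="prediction"
).evaluate(predictions)
print(f"RMSE: {rmse:.4f}")
```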

Ensemble Models

ALS has a number of hyperparameters, such as the dimensionality of the latent factors, the number of iterations and the regularisation strength, which can be tuned to improve the evaluation metric. Through our experiments, we found that an ensemble of matrix factorisation models (labelled MF — Bagging) performed better than individual models.
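
One simple way to build such an ensemble is sketched below: train a few ALS models with different ranks and seeds and average their predicted scores. This is only an illustration of the bagging idea; the exact recipe we used differed.

```python
from functools import reduce
from pyspark.ml.recommendation import ALS
from pyspark.sql import functions as F

# `train` and `test` are the splits from the previous snippet.
configs = [{"rank": 16, "seed": 1}, {"rank": 32, "seed": 2}, {"rank": 64, "seed": 3}]

members = []
for i, cfg in enumerate(configs):
    als = ALS(userCol="user_id", itemCol="product_id", ratingCol="preference",
              implicitPrefs=True, coldStartStrategy="drop", **cfg)
    preds = (als.fit(train)
                .transform(test)
                .withColumnRenamed("prediction", f"prediction_{i}")
                .select("user_id", "product_id", f"prediction_{i}"))
    members.append(preds)

# Join the member predictions on (user, product) and average their scores.
joined = reduce(lambda a, b: a.join(b, ["user_id", "product_id"]), members)
avg_score = reduce(lambda a, b: a + b,
                   [F.col(f"prediction_{i}") for i in range(len(configs))]) / len(configs)
ensemble = joined.withColumn("prediction", avg_score)
```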

Deep Learning

The exploration of deep neural networks for recommender systems has received relatively little attention. We came across Neural Collaborative Filtering (NCF), in which a Generalised Matrix Factorisation (GMF) model and a Multi-Layer Perceptron (MLP) learn separate embeddings and are combined by concatenating their last hidden layers. This model combines the linearity of MF with the non-linearity of DNNs for modelling user-item latent structures. We evaluated the NCF model using Hit Ratio (no. of correct predictions / no. of total predictions) and Normalised Discounted Cumulative Gain (nDCG).
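
For the curious, here is a compact sketch of that architecture (called NeuMF in the NCF paper) in Keras; the layer sizes and embedding dimensions are illustrative, not our production settings:

```python
from tensorflow.keras import layers, Model

def build_neumf(n_users, n_items, mf_dim=8, mlp_dims=(64, 32, 16)):
    user_in = layers.Input(shape=(1,), name="user")
    item_in = layers.Input(shape=(1,), name="item")

    # GMF branch: element-wise product of its own user/item embeddings.
    gmf_u = layers.Flatten()(layers.Embedding(n_users, mf_dim)(user_in))
    gmf_i = layers.Flatten()(layers.Embedding(n_items, mf_dim)(item_in))
    gmf_vec = layers.Multiply()([gmf_u, gmf_i])

    # MLP branch: separate embeddings, concatenated and passed through dense layers.
    mlp_u = layers.Flatten()(layers.Embedding(n_users, mlp_dims[0] // 2)(user_in))
    mlp_i = layers.Flatten()(layers.Embedding(n_items, mlp_dims[0] // 2)(item_in))
    mlp_vec = layers.Concatenate()([mlp_u, mlp_i])
    for dim in mlp_dims:
        mlp_vec = layers.Dense(dim, activation="relu")(mlp_vec)

    # Fuse the two branches by concatenating their last hidden layers.
    fused = layers.Concatenate()([gmf_vec, mlp_vec])
    out = layers.Dense(1, activation="sigmoid")(fused)

    model = Model([user_in, item_in], out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```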

Personalisation in Production

The final output of the modelling exercise was 20 product recommendations for every user in the dataset. We pushed the new list into production using an A/B framework, with appropriate fallbacks to the static list. Both of the metrics we tracked, click-through rate (CTR) and same-session conversion, jumped up. These initial results were very promising and have pushed us to further improve the personalisation we have been able to achieve.

Growth of Orders through the List from Oct ’17 to Feb ‘18

As we moved ahead, some new questions came up:

  • Were users satisfied with the list, or in other words, were users clicking the recommendations that appear first in the list more often now?
  • Were users able to discover more items using this list?
  • Was the list decaying with time?
  • Were the recommendations changing when we periodically recomputed the recommendations?

Deep Diving into Evaluation

These questions forced us to identify better, more appropriate metrics that could answer them and more. Below, we describe some of the metrics we were able to calculate.

Positions of Products in the List

We took a course on metrics for recommender systems and came up with two metrics which could tell us more about the positions at which the clicks were happening:

  • Mean Reciprocal Rank (MRR) — the reciprocal rank is the inverse of the rank of the first item in the list with which the user interacts (click/purchase); MRR is the mean of this value across users.
  • Discounted Cumulative Gain (DCG) — looks at the utility, or gain, of the item at each position in the list and discounts this gain by the position, so more emphasis is placed on items at the top of the list. We then normalise by comparing the DCG of the list the recommender produced with the DCG of a perfect recommendation list (see the sketch below).
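
A small sketch of how both metrics can be computed for a single rendering of the list, given the (1-indexed) positions the user interacted with; MRR is then the average of the reciprocal rank across users:

```python
import math

def reciprocal_rank(clicked_positions):
    """Inverse of the rank of the first item the user interacted with."""
    return 1.0 / min(clicked_positions) if clicked_positions else 0.0

def ndcg(clicked_positions, list_length):
    """DCG of the shown list, normalised by the DCG of an ideal ordering."""
    relevance = [1 if pos in clicked_positions else 0
                 for pos in range(1, list_length + 1)]
    dcg = sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevance, start=1))
    ideal = sorted(relevance, reverse=True)
    idcg = sum(rel / math.log2(pos + 1) for pos, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0

# Example: a 20-item list where the user clicked positions 3 and 7.
print(reciprocal_rank({3, 7}))       # 0.333...
print(ndcg({3, 7}, list_length=20))  # well below 1.0, since the clicks were not at the top
```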

Do the Recommendations Change with Time?

Lathia's work on temporal diversity argues that diversity over time is an important factor in recommender systems: how much do the recommendations change as new data is introduced over time?

Lathia's diversity histogram for ~500K users, over two recommendation lists computed one month apart (0 means no change in the list; 1 means a completely new list)
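
As a rough illustration, the diversity for one user can be computed as the fraction of the newer list that did not appear in the older one, which matches the 0-to-1 scale in the histogram above:

```python
def temporal_diversity(old_list, new_list):
    """Fraction of the new list that was not in the previous list.

    0.0 means the list is unchanged; 1.0 means it is completely new.
    """
    new_items = set(new_list) - set(old_list)
    return len(new_items) / len(new_list)

# Example: 3 of the 5 recommended products changed between the two runs.
october = ["p1", "p2", "p3", "p4", "p5"]
november = ["p1", "p2", "p6", "p7", "p8"]
print(temporal_diversity(october, november))  # 0.6
```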

From a Business Standpoint

  • Revenue Per Mille (RPM) could tell us how the monetary value of items converted from the list is varying
  • Conversions Per Mille (CPM) could tell us how the number of items converted from the list is varying (see the sketch after this list)
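
Assuming the usual per-mille convention (per 1,000 impressions of the list), these two can be computed as below; the numbers in the example are made up:

```python
def revenue_per_mille(revenue_from_list, list_impressions):
    """Revenue attributed to the list per 1,000 times the list was shown."""
    return 1000.0 * revenue_from_list / list_impressions

def conversions_per_mille(conversions_from_list, list_impressions):
    """Items converted from the list per 1,000 times the list was shown."""
    return 1000.0 * conversions_from_list / list_impressions

print(revenue_per_mille(50_000.0, 200_000))   # 250.0 currency units per mille
print(conversions_per_mille(1_200, 200_000))  # 6.0 conversions per mille
```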

Do the Clicks on the List Decay with Time?

We analysed new users vs. old users clicking on the list on a daily basis. We observed that the list was not decaying: each day there was a fresh set of users, who had never interacted with the list before, clicking on the products in it.

What about Cold Start?

Collaborative filtering does not provide recommendations for new users with little or no tracking history, nor does it include new products in the recommendations for existing users. In our case, we show a static list of products wherever we do not have recommendations from the CF model. This problem is popularly known as the Cold Start Problem. One way to address it is to rely on content-based vectors for products and profile-based vectors for users, and to use them alongside the embedding vectors in the model to learn the preferences of all users for all products.

The Future still holds promise

As we delve further into personalising product recommendations for our users, we have many exciting paths to tread. Working out better ways of deriving implicit ratings for products, addressing cold start for products and users, and updating recommendations in real time could lead to better click-through and conversion rates, and thereby better personalisation.
