Product recommendation with implicit ratings and anonymous users.

Chang Xiao
Published in Siggy Recommender · Dec 22, 2021 · 7 min read
(Product recommendation)

The classic collaborative filtering algorithm is useful when you have a group of specific users that explicitly provide ratings to many items. This is challenging for many online stores because:

  1. A user may shop anonymously without creating a user account.
  2. Users do not provide ratings or reviews of products.

This post will discuss how you can apply the collaborative filtering algorithm under these constraints. If you are interested, read on!

Our machine learning pipeline

(The machine learning pipeline)

The overview above shows the key parts of the machine learning pipeline.

  1. Data processing — Run frequently to stream and process the user visit data and store it as rating data.
  2. Machine learning — Run periodically (e.g. once a week) to (re)train the model on the rating data.
  3. App/Presentation — The frontend application uses the nearest neighbors generated by the most recently trained model.

Generate implicit ratings

When users do not provide explicit ratings for products in an online shop, we have to derive ratings from user behavior. Some key user behaviors include:

  • Time spent on a product page — Longer time spent suggests the user is interested in the product (reading the product information, etc.).
  • Secondary actions — Some actions such as adding the product to a shopping wishlist or watching the product when it’s out of stock can also signify users’ interest level.
  • Primary actions — Actions such as adding the product to the cart or purchasing the product (checkout) can be very good indicators of the user’s interest level in the product.

The next step is to create some rules based on selected user behaviors and generate an implicit rating for the user/product.

For example, a very simple rule can be (for a rating scale of 1 to 5):

If the user spends more than 2 minutes on a product page, we give them a rating of 5; between 1 and 2 minutes a rating of 4, less than 1 minute a rating of 3 (to denote an average/neutral rating).

The time-spent thresholds should be based on the overall distribution of time spent on product pages (e.g. 2 minutes is 2 standard deviations above the mean).

You can also create complex rules to generate implicit ratings based on a combination of user behaviors. The best approach is to test different rules and measure the effectiveness of the recommendation.
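As a sketch, the simple time-based rule above, extended with a hypothetical boost for primary and secondary actions, could look like the function below. The thresholds and boost sizes are illustrative assumptions, not values from the original pipeline; they should be derived from your own traffic data and A/B tests.

```python
def implicit_rating(seconds_on_page, added_to_cart=False, wishlisted=False):
    """Hypothetical rule: time-based rating (1-5 scale) plus boosts for
    primary/secondary actions, capped at 5. Thresholds and boost sizes
    are illustrative; derive them from your own traffic distribution."""
    if seconds_on_page > 120:        # more than 2 minutes
        rating = 5
    elif seconds_on_page >= 60:      # between 1 and 2 minutes
        rating = 4
    else:                            # less than 1 minute: neutral
        rating = 3
    if added_to_cart:                # primary action: strong signal
        rating += 2
    elif wishlisted:                 # secondary action: moderate signal
        rating += 1
    return min(rating, 5)

print(implicit_rating(150))                  # → 5
print(implicit_rating(90))                   # → 4
print(implicit_rating(30, wishlisted=True))  # → 4
```

Whatever rule you settle on, keeping it a single pure function like this makes it easy to swap rules in and out when measuring recommendation effectiveness.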

For technical implementation, anonymous user behavior data can be obtained from analytics software such as Google Analytics, Adobe Analytics, or Matomo.

Process ratings

Depending on the amount of traffic your online store receives, you can stream the visitor traffic and generate ratings in real-time or fetch them periodically (e.g. every 30 minutes).

If your store is seasonal (e.g. Holiday products, the Spring collection, etc), you can also consider pruning the rating data periodically. This can potentially make the recommendation more dynamic and reflect the seasonality of the user behavior.

Data pruning is only worth considering in this example because we will not be predicting top product recommendations for individual users; for that use case, more data helps improve the accuracy of the algorithm.
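Assuming each rating row carries a timestamp, pruning could be as simple as dropping rows older than a retention window. The 90-day window and the dict-based row format below are illustrative assumptions:

```python
from datetime import datetime, timedelta

def prune_ratings(ratings, max_age_days=90):
    """Keep only ratings newer than the retention window.

    `ratings` is assumed to be a list of dicts with a 'timestamp'
    datetime field; the 90-day window is an arbitrary example and
    should match your store's seasonality.
    """
    cutoff = datetime.now() - timedelta(days=max_age_days)
    return [r for r in ratings if r['timestamp'] >= cutoff]

ratings = [
    {'user': 'u1', 'item': 'p1', 'rating': 5,
     'timestamp': datetime.now() - timedelta(days=10)},
    {'user': 'u2', 'item': 'p2', 'rating': 4,
     'timestamp': datetime.now() - timedelta(days=200)},
]
print(len(prune_ratings(ratings)))  # → 1 (the 200-day-old row is dropped)
```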

Algorithm and Data training

At this point, we should have a rating table consisting of at least a user id, a product id, and a rating (1–5).

Sometimes it is helpful to keep additional data columns (such as human-understandable product names) to quickly experiment and do sanity checks on the recommendations.
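For illustration, a minimal ratings file might look like this (the ids are made up). Note that the `user item rating` reader format used for training expects exactly these three columns, so human-readable product names are best kept in a separate lookup table rather than in this file:

```
a1b2c3,123456,5
a1b2c3,789012,4
d4e5f6,123456,3
```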

To train the model, we can use the Python Surprise library. It is a very simple, easy-to-use library with many collaborative-filtering algorithms.

Let’s look at a simple example:

from surprise import Dataset, Reader, KNNWithMeans

reader = Reader(line_format='user item rating', sep=',', rating_scale=(1, 5))
data = Dataset.load_from_file('/path/to/ratings.csv', reader=reader)
trainingSet = data.build_full_trainset()
sim_options = {
    "name": "cosine",
    "min_support": 3,
    "user_based": False,
}
algo = KNNWithMeans(sim_options=sim_options)
algo.fit(trainingSet)

Here we are essentially doing the following:

  1. Load our rating data from a CSV file into the training set. In a production environment, you can also load the data from a Pandas dataframe.
  2. We build our TrainingSet using the entire dataset. We do not need to split the dataset into train/test since we do not need to run prediction and measure our algorithm’s accuracy.
  3. We specify reasonable options for our algorithm.
    For example, use cosine (instead of the default msd) to measure the similarity between the products.
    min_support is the minimum number of common users two items must share before we compute their similarity (otherwise it is set to zero).
    user_based is False to denote we will be using item-based collaborative filtering.

We chose item-based collaborative filtering because a user will likely only rate a few items. Since we view each visitor as a unique user (anonymous), we will end up with a lot of users that rate a small part of the product catalog. On the flip side, an item can be rated by a lot of users. The item-based approach means working with a more densely populated matrix.

Generally speaking, item-based collaborative filtering works better when you have more users than items.

Here we use the KNNWithMeans algorithm, a memory-based collaborative filtering algorithm that is fast to train. SVD and other model-based algorithms are usually more accurate if you are running predictions.

Find KNN and present recommendations

After training the data, the algorithm will calculate a similarity matrix. This will be a matrix with a shape of (n, n) where n = the number of items (products) in the training set. It will contain the cosine distance of all products to all other products. (The matrix should also have a diagonal of zeros since a product’s distance to itself is 0.)

(Example of a similarity matrix subsection with the diagonal of zeros)

In some implementations, the diagonal of the similarity matrix will be ones (1); those store the similarity factor (1 − the cosine distance) instead.

Our goal will be “given a product, find its KNN (K Nearest Neighbors).” Another way to say this is: give me the products most similar to a given product (based on user ratings).

k = 4  # We are looking for the top 4 similar products
sim_list = []
knn = []  # Will contain the product ids of the k similar products
product_id = '123456'
innerId = trainingSet.to_inner_iid(product_id)
# The exclusion id sequence: if the KNN is simply 0 to k-1,
# we are not getting a valid KNN
ex_id_seq = {0: 0, 1: 1, 2: 2, 3: 3}
if innerId in ex_id_seq:
    del ex_id_seq[innerId]
# Finding the k nearest neighbors using the similarity matrix
neighbors = algo.get_neighbors(innerId, k)
for iid in neighbors:
    sim_list.append(iid)
# Check to see if sim_list contains the exclusion id sequence
check_default_ids = set(ex_id_seq.keys()).issubset(sim_list)
if check_default_ids:
    # Find the position (index) of the first excluded id within sim_list
    first_ex_id = list(ex_id_seq.values())[0]
    sim_list = sim_list[:sim_list.index(first_ex_id)]
for iid in sim_list:
    knn.append(trainingSet.to_raw_iid(iid))

There are a few quirks in finding the KNN:

The algorithm stores the user_ids and product_ids as innerIDs so you will always need to perform a conversion such as

trainingSet.to_inner_iid(product_id)

When the algorithm fails to find the k nearest neighbors for a product, it will always return the first k items from the dataset (0 to k-1). We look for this sequence and make sure to not include it in our list of recommendations.

Finally, an example recommendation using this algorithm looks like this:

(Example recommendation)

Other observations

The algorithm is highly dependent on having an adequate amount of traffic to the online store, therefore it will not be effective if the online store receives little traffic (e.g. new stores).

The algorithm will likely only provide recommendations for a subset of the product catalog. This means less popular products or new products that receive very few ratings will not show up in the recommendations.

It’s always better to build a recommendation system using a combination of algorithms. For example, using product data and content filtering can help complement the recommendations from collaborative filtering alone.

Next steps

The approach we have taken so far focused on the recommendation scenario of “given a product, show me N related products based on users’ implicit ratings”.

We can better take advantage of collaborative filtering algorithms by making predictions of the top product recommendation for a given user. This means “given a user, show me the top N products recommended for this user.”

To accomplish this, we will need to:

  • Capture and generate live ratings for a user currently browsing the online store.
  • Find the most similar user in the previously trained model based on the current user’s implicit ratings.
  • Make the top N prediction for that “most similar user” and present it to the current user.

In this scenario, it is worth experimenting with different machine learning algorithms such as SVD and others to optimize the prediction accuracy.
