TDx Recommender System

Flavio Silva
TDx: Technology & Design
Aug 1, 2016
Finding the right products to suggest to a user is no easy task

Over the past few weeks, we worked hard on developing an Alpha version of a Multi-Modal Recommender System.

This new system should recommend products that:

  • Are relevant based on products you previously bought;
  • Are relevant based on your product view history;
  • Are relevant based on your favorite topics;
  • Are relevant based on product tags, such as topics, roles, and subjects;
  • Are relevant based on your product type history (are you more of a publications buyer? Print or digital?);
  • Are relevant based on the price range of the item;
  • Are relevant based on your location, when it comes to meetings;
  • Are relevant based on the product page you’re currently visiting;
  • Have not already been purchased by you.

If a customer is not authenticated, the system should return popular items or use other inputs, such as the page the customer is currently viewing.
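The fallback for anonymous visitors can be pictured as a small decision function. The following is a minimal sketch, not the actual service code: the `Request` fields, the `Strategy` names, and the choice to prefer page context over plain popularity are all assumptions made for illustration.

```go
package main

import "fmt"

// Request carries the inputs available for one recommendation call.
// All field names here are hypothetical.
type Request struct {
	CustomerID    string // empty when the visitor is not authenticated
	CurrentPageID string // product page being viewed, if any
}

// Strategy names the source of recommendations chosen for a request.
type Strategy string

const (
	Personalized  Strategy = "personalized"
	SimilarToPage Strategy = "similar-to-page"
	Popular       Strategy = "popular"
)

// ChooseStrategy picks a strategy from whatever the request provides.
// Assumption: for anonymous visitors, page context is preferred over
// plain popularity; the order is an illustrative choice.
func ChooseStrategy(r Request) Strategy {
	switch {
	case r.CustomerID != "":
		return Personalized
	case r.CurrentPageID != "":
		return SimilarToPage
	default:
		return Popular
	}
}

func main() {
	fmt.Println(ChooseStrategy(Request{CustomerID: "c-42"}))
	fmt.Println(ChooseStrategy(Request{CurrentPageID: "p-7"}))
	fmt.Println(ChooseStrategy(Request{}))
}
```

Keeping the decision in one function makes it easy to see, and to test, which inputs win when several are present.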

Team TDx added granular control over the impact of each of these biases, so we can A/B test them further and decide the best mix of weights.
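One simple way to think about per-signal weights is a weighted sum of signal scores, with a different weight mix for each A/B variant. The sketch below assumes each signal produces a score between 0 and 1; the signal names (`purchase_history`, `view_history`), the types, and the weight values are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"sort"
)

// Weights maps a signal name to its relative impact on the final score.
// The signal names used in main are hypothetical labels.
type Weights map[string]float64

// Candidate holds per-signal relevance scores (assumed to be in [0, 1]).
type Candidate struct {
	ProductID string
	Signals   map[string]float64
}

// Score returns the weighted sum of a candidate's signals.
// Signals without a configured weight contribute nothing.
func Score(c Candidate, w Weights) float64 {
	total := 0.0
	for name, value := range c.Signals {
		total += w[name] * value
	}
	return total
}

// Rank returns a copy of the candidates sorted by descending score.
func Rank(cs []Candidate, w Weights) []Candidate {
	ranked := append([]Candidate(nil), cs...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return Score(ranked[i], w) > Score(ranked[j], w)
	})
	return ranked
}

func main() {
	candidates := []Candidate{
		{ProductID: "book", Signals: map[string]float64{"purchase_history": 1, "view_history": 0}},
		{ProductID: "webinar", Signals: map[string]float64{"purchase_history": 0, "view_history": 1}},
	}
	variantA := Weights{"purchase_history": 2, "view_history": 1}
	variantB := Weights{"purchase_history": 1, "view_history": 2}

	fmt.Println("A:", Rank(candidates, variantA)[0].ProductID)
	fmt.Println("B:", Rank(candidates, variantB)[0].ProductID)
}
```

The two variants rank the same candidates differently, which is exactly the kind of difference an A/B test would then measure.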

In addition, we now allow stakeholders and developers to boost or filter recommendations to certain product types, supporting different recommendation use-cases.
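Boosting and filtering by product type can be sketched as a post-processing step over already scored items. This is an illustration under assumptions, not the real implementation: the `Item` and `Rule` types, the multiplier-style boost, and the `course` product type are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Item is a scored recommendation candidate.
type Item struct {
	ID          string
	ProductType string
	Score       float64
}

// Rule describes how one use-case treats product types (hypothetical shape).
type Rule struct {
	Allow map[string]bool    // if non-empty, only these types are kept
	Boost map[string]float64 // score multiplier per type; others unchanged
}

// Apply filters and boosts items, then re-sorts by descending score.
// The input slice is left unmodified.
func Apply(items []Item, r Rule) []Item {
	out := make([]Item, 0, len(items))
	for _, it := range items {
		if len(r.Allow) > 0 && !r.Allow[it.ProductType] {
			continue // filtered out for this use-case
		}
		if m, ok := r.Boost[it.ProductType]; ok {
			it.Score *= m
		}
		out = append(out, it)
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	items := []Item{
		{"p1", "publication", 0.5},
		{"m1", "meeting", 0.25},
		{"c1", "course", 0.75},
	}
	rule := Rule{
		Allow: map[string]bool{"publication": true, "meeting": true},
		Boost: map[string]float64{"meeting": 4},
	}
	for _, it := range Apply(items, rule) {
		fmt.Println(it.ID, it.ProductType, it.Score)
	}
}
```

With this rule the course is dropped and the boosted meeting moves ahead of the publication.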

With this data, we were able to build an Alpha version of a Recommended For You page.

Our next step, later in the year, will be to incorporate similar principles in a “Content Recommender”. A more granular taxonomy should help us achieve a high level of precision.

In techno-speak, we’re aiming for high precision and high recall. Measuring recall is somewhat challenging, since we can never observe the full set of products a customer would have found relevant, but we should be able to infer and measure precision.

A good definition of precision and recall, in the context of recommender systems, is given below:

Precision / Recall of Recommender Systems

“The precision is the proportion of recommendations that are good recommendations, and recall is the proportion of good recommendations that appear in top recommendations.” Reference
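To make the definition concrete, here is a minimal sketch that computes both metrics over the top k recommendations. It assumes we already have a set of items known to be good for a user (for example, from later purchases); the function name and the sample item IDs are made up for illustration.

```go
package main

import "fmt"

// PrecisionRecallAtK scores the top k recommendations against the set of
// items known to be good for a user. It assumes recommended has no
// duplicates and that every entry in relevant is set to true.
func PrecisionRecallAtK(recommended []string, relevant map[string]bool, k int) (precision, recall float64) {
	if k > len(recommended) {
		k = len(recommended)
	}
	if k <= 0 {
		return 0, 0
	}
	hits := 0
	for _, id := range recommended[:k] {
		if relevant[id] {
			hits++
		}
	}
	// Precision: share of the shown recommendations that are good.
	precision = float64(hits) / float64(k)
	// Recall: share of all good items that made it into the top k.
	if len(relevant) > 0 {
		recall = float64(hits) / float64(len(relevant))
	}
	return precision, recall
}

func main() {
	recommended := []string{"a", "b", "c", "d"}
	relevant := map[string]bool{"a": true, "c": true, "e": true}
	p, r := PrecisionRecallAtK(recommended, relevant, 4)
	fmt.Printf("precision@4=%.2f recall@4=%.2f\n", p, r)
}
```

Here two of the four recommendations are good (precision 0.50) and two of the three good items were recommended (recall about 0.67). The `relevant` set is the hard part in practice, which is why recall is harder to measure than precision.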

Technology Stack:

  • T-SQL and Go (data munging)
  • Mahout (both the Spark Item Similarity and Spark Row Similarity algorithms)
  • Go (web service code and orchestration)
  • Elasticsearch on AWS (repository for the processed data)
  • Heroku (hosting)
