Photo by Ben White on Unsplash

How to Build a Business-Oriented Recommendation System

Marco Andreoni
Jan 13 · 8 min read

A recommendation system doesn’t have to be a complex black-box to work well. It is possible to build an effective and explainable tool combining just a couple of basic statistical principles and business knowledge.

📋 Article Contents 📋

📍 1. An Overview on Recommendation Systems

  • Deep Learning for Collaborative Filtering
  • Graph-Based Recommendation
  • The Need for a Business-Oriented Approach

📍 2. A Flexible and Explainable Approach to Recommendation

  • Customer Fingerprint
  • Product Fingerprint
  • Customers Clusters
  • Top Products List

📍 3. Final Thoughts and Improvements

1. An Overview on Recommendation Systems

Why Recommendations?

“35% of Amazon’s revenue is generated by its recommendation engine”

You have very likely heard this claim, and it is one of the things that fuelled the hype around recommendation systems: the idea of a “magic” tool that can improve sales by up to 35% is so exciting!

Photo by Ramón Salinero on Unsplash

One could easily understand why many decision makers, excited by this figure, wish to take advantage of this powerful instrument in their companies. It totally makes sense, but when the time comes to take action, several concerns hold them back.

To understand their doubts, consider that many different recommendation algorithms exist, and the choice of the most suitable one mainly depends on four elements:

  • 💡 Clarity of the result for non-technical people
  • 🔮 Effectiveness in finding valuable rules
  • ⚙️ Computational effort
  • 💼 Possibility to add business-specific logic to the algorithm

Deep Learning for Collaborative Filtering

Graph-Based Recommendation

The Need for a Business-Oriented Approach

  • The lack of control over a black-box tool and the difficulty in understanding why it is performing well or badly
  • The impossibility of adding their business experience into the algorithm to improve the results
  • The need to find hidden rules, or statistically confirm their feelings and intuitions.

Luckily, an interesting solution is offered by a third kind of recommendation system, one that balances statistics and business knowledge. This approach requires daily pre-processing of the data, which makes it the most intensive in terms of computational effort. On the other hand, it allows complex business-based heuristics to be defined and creates an embedding vector with a clear, pre-defined business meaning.

2. A Flexible and Explainable Approach to Recommendation

As with any classical recommendation system, the final goal of the algorithm is to provide a list of “suggested products” to users, in order to increase their purchases and satisfaction.

Photo by frankie cordoba on Unsplash

Generally speaking, the algorithm can be split into four logical steps:

  • 👤 Creation of a Customer Fingerprint
  • 📦 Creation of a Product Fingerprint
  • 👪 Creation of Customers Clusters
  • 🧾 Creation of the Top Products List to recommend for each Customer

In a nutshell, the products list for each customer contains the top-selling products of her cluster whose fingerprints are most similar to hers.

The Scenario

Photo by Rosangela Taylor on Unsplash

The use case discussed in the article refers to a pilot project implemented by a supermarket retailer in the UK.

The goal is to create a list of suggested products for each customer enrolled in the project. The list should increase customer spending without proposing “sensitive items” (e.g. tobacco, health products…) or items the customer has previously purchased.

The method described below is easily generalizable to other retail contexts, since its only prerequisite is a hierarchical product taxonomy. In this case, products are divided across G=99 classes and each class is subdivided into fewer than 100 subclasses, generating a total of S=2302 product subclasses (e.g. Petfoods [Class] → Canned Cat Food [Subclass] → Friskies Liver 250g [Product]).

Customer Fingerprint 👤

Photo by Sharon McCutcheon on Unsplash

The first step of the algorithm builds a fingerprint for each customer, based on her spending habits.

The absolute spending Cₘₛ of customer “m” across all products contained in the s-th subclass is simply obtained by aggregating raw transactions over a fixed previous period (e.g. the last three months).

This value should be normalized to obtain a standard measure of the customer’s interest in each subclass relative to the other subclasses. With C*ₘ = ∑ₛCₘₛ denoting her total spending over the period, the fractional spending is:

C′ₘₛ = Cₘₛ / C*ₘ

This value still needs further processing, since commonly purchased subclasses (such as water or fresh vegetables) tend to dominate the fractional spending. The solution is to take the ratio of the individual customer’s fractional spending in a subclass to the mean value for this subclass taken over all other customers:

C′′ₘₛ = C′ₘₛ / ⟨C′ₙₛ⟩, where ⟨C′ₙₛ⟩ is the mean of C′ₙₛ over all customers n ≠ m

So, finally, each customer “m” gets a vector C′′ₘ of S entries, where the s-th element measures the strength of her interest in the s-th product subclass. That’s the customer fingerprint.
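The three steps above (aggregate, normalize by total spending, divide by the mean over the other customers) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name, the `(customer, subclass, amount)` input layout and the convention of returning 0.0 when no other customer buys a subclass are assumptions, not something taken from the paper.

```python
from collections import defaultdict


def customer_fingerprints(transactions, subclasses):
    """Return {customer: [C''_ms for s in subclasses]}.

    transactions: iterable of (customer, subclass, amount) tuples covering
    the chosen period (hypothetical input layout). Every subclass that
    appears in the transactions is assumed to be listed in `subclasses`.
    """
    # Absolute spending C_ms, aggregated from raw transactions.
    spend = defaultdict(lambda: defaultdict(float))
    for customer, subclass, amount in transactions:
        spend[customer][subclass] += amount

    # Fractional spending C'_ms = C_ms / C*_m.
    frac = {}
    for customer, by_subclass in spend.items():
        total = sum(by_subclass.values())
        if total > 0:  # skip customers with no spending in the period
            frac[customer] = {s: by_subclass.get(s, 0.0) / total for s in subclasses}

    if len(frac) < 2:
        raise ValueError("need at least two customers to compare against")

    # C''_ms = C'_ms divided by the mean of C'_ns over all other customers n.
    column_sum = {s: sum(f[s] for f in frac.values()) for s in subclasses}
    fingerprints = {}
    for customer, f in frac.items():
        vector = []
        for s in subclasses:
            others_mean = (column_sum[s] - f[s]) / (len(frac) - 1)
            # Assumed convention: 0.0 when no other customer buys the subclass.
            vector.append(f[s] / others_mean if others_mean > 0 else 0.0)
        fingerprints[customer] = vector
    return fingerprints


if __name__ == "__main__":
    toy_transactions = [
        ("alice", "water", 8.0), ("alice", "cat_food", 2.0),
        ("bob", "water", 9.0), ("bob", "wine", 1.0),
        ("carol", "water", 5.0), ("carol", "cat_food", 5.0),
    ]
    print(customer_fingerprints(toy_transactions, ["water", "cat_food", "wine"]))
```

In the toy data every customer spends most of her budget on water, yet the water entries stay close to 1, while Carol's unusually high share of cat food stands out. This is the effect the second normalization is meant to achieve.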

Product Fingerprint 📦

Photo by Raymond Rasmusson on Unsplash

The second step of the algorithm constructs the product fingerprint. The main difference with respect to the embeddings learned by a neural network is that these vectors are business-meaningful by design.

As for the customers, the result is a vector Pⁱ for each product “i”, where each entry Pₛⁱ reflects the “affinity” between product “i” and subclass “s”. Customer and product fingerprints have the same dimension so that they can be compared directly using standard similarity measures (such as cosine projection, the one chosen by the authors).
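To make the comparison concrete, here is a minimal sketch of a cosine measure between a customer fingerprint and a product fingerprint. It assumes that the “cosine projection” mentioned above is the usual cosine similarity; the example vectors are made up for illustration and are not the values used in the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("fingerprints must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention for an all-zero fingerprint
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical fingerprints over three subclasses.
    customer = [2.0, 0.5, 0.0]
    product = [0.5, 1.0, 0.0]
    print(cosine_similarity(customer, product))
```

Because the cosine only depends on the direction of the vectors, a customer with large entries and a product with small ones can still be a close match.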

Now you are probably thinking “Ok, this makes sense, but how do I choose the right value for each Pₛⁱ?”. The solution proposed by the paper is the following:

The last step of the algorithm will clarify the reason behind these values, but first the term ‘strong association’ deserves an explanation.

The paper uses the Association Rules method to measure relations between product classes or subclasses. More precisely, only simple associations are computed (with a single item in both the body and the head of the rule), so that a subclass S₁ is said to be ‘strongly associated’ with the subclass S₂ if the rule S₁ ⇒ S₂ is ‘pretty relevant’.

Again, a clarification is necessary: there is no sharp distinction between ‘good’ and ‘bad’ associations. In the paper you will find the combination of support, lift and confidence thresholds chosen by the authors, but the right choice depends on the case.
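For readers who have not met these three measures, the sketch below computes support, confidence and lift for a simple rule S₁ ⇒ S₂ over a list of baskets, using their standard definitions. The basket layout, the function names and especially the default thresholds are hypothetical: they are placeholders to tune, not the values chosen by the authors.

```python
def rule_metrics(baskets, body, head):
    """Support, confidence and lift of the single-item rule body => head.

    baskets: list of sets, each holding the subclasses bought together.
    """
    n = len(baskets)
    if n == 0:
        raise ValueError("no baskets to evaluate")
    n_body = sum(1 for b in baskets if body in b)
    n_head = sum(1 for b in baskets if head in b)
    n_both = sum(1 for b in baskets if body in b and head in b)

    support = n_both / n                                  # P(body and head)
    confidence = n_both / n_body if n_body else 0.0       # P(head | body)
    lift = confidence / (n_head / n) if n_head else 0.0   # P(head | body) / P(head)
    return support, confidence, lift


def is_strongly_associated(baskets, body, head,
                           min_support=0.01, min_confidence=0.3, min_lift=1.0):
    """True if the rule clears all three thresholds (placeholder defaults)."""
    support, confidence, lift = rule_metrics(baskets, body, head)
    return (support >= min_support
            and confidence >= min_confidence
            and lift > min_lift)


if __name__ == "__main__":
    toy_baskets = [
        {"pasta", "tomato_sauce"},
        {"pasta", "tomato_sauce", "wine"},
        {"pasta"},
        {"wine"},
        {"tomato_sauce"},
    ]
    print(rule_metrics(toy_baskets, "pasta", "tomato_sauce"))
```

A lift above 1 means the head is bought more often together with the body than it is bought on average, which is why it is commonly used alongside support and confidence when filtering rules.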

Customers Clusters 👪

Photo by Mike Scheid on Unsplash

Thanks to the fingerprints extracted in the first step of the algorithm, it is now possible to group customers into clusters.

The authors stress that these groups are much more useful for recommendation than the previously used clusters (based on purely demographic information derived from questionnaires).

Indeed, thanks to the meaning of the customer fingerprints, the result should always be groups of customers with similar spending habits, no matter which clustering algorithm is chosen.
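Since the choice of clustering algorithm is left open, the sketch below uses a bare-bones k-means purely as an example. The function name, the number of clusters, the iteration count and the random seed are all assumptions; in practice a library implementation would be a more sensible choice.

```python
import math
import random


def kmeans_labels(vectors, k, iterations=20, seed=0):
    """Assign each fingerprint to one of k clusters with plain k-means."""
    rng = random.Random(seed)
    # Start from k distinct fingerprints picked at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Hypothetical two-subclass fingerprints: two customers per spending habit.
    toy_fingerprints = [[5.0, 0.0], [4.8, 0.1], [0.0, 5.0], [0.2, 4.9]]
    print(kmeans_labels(toy_fingerprints, k=2))
```

Whatever algorithm is used, the input is only the fingerprint vectors, so the resulting groups reflect spending habits rather than demographic attributes.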

Top Products List 🧾

Photo by Glenn Carstens-Peters on Unsplash

The last step of the algorithm is the one that actually outputs the products list, and can be summarized as follows:

Consider a customer Alice, assigned to “Cluster 1”. First of all, a list of candidate products for Alice is built by taking the most popular products among the other members of Cluster 1.

Then, after excluding the products Alice has already bought, each product in the list is scored by its affinity to Alice’s interests (i.e. the similarity between the two fingerprints). Because of the choice of Pₛⁱ values (using association rules), Alice may turn out to be similar to products in categories she has never considered before, and this is exactly one of the keys to recommendation.

Finally, only top-ranked products are suggested to Alice, and if needed we can break the ties using some heuristics (such as ‘always favour the product which ensures the greatest margin’).
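Putting the last step together, a minimal sketch could look like the following. Everything about the data layout is an assumption made for illustration: the dictionaries, the parameter names, the way popularity is counted (number of cluster members who bought the product) and the product fingerprint values in the example, which are not the ones defined in the paper. Sensitive items are filtered out together with the already purchased ones, and ties are broken by margin as in the heuristic mentioned above.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity, returning 0.0 for an all-zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(customer, clusters, purchases, customer_fp, product_fp,
              sensitive=frozenset(), margin=None, n_candidates=50, top_n=5):
    """Rank popular products of the customer's cluster by fingerprint affinity.

    clusters:  {customer: cluster id}
    purchases: {customer: set of products bought in the period}
    margin:    optional {product: margin}, used only to break ties
    """
    margin = margin or {}
    own_cluster = clusters[customer]

    # Popularity among the OTHER members of the same cluster.
    popularity = Counter()
    for other, cluster in clusters.items():
        if other != customer and cluster == own_cluster:
            popularity.update(purchases.get(other, ()))

    # Drop products already bought by the customer and sensitive items.
    already_bought = set(purchases.get(customer, ()))
    candidates = [
        product for product, _ in popularity.most_common(n_candidates)
        if product not in already_bought and product not in sensitive
    ]

    # Score by similarity; equal scores are ordered by greater margin.
    ranked = sorted(
        candidates,
        key=lambda p: (cosine_similarity(customer_fp[customer], product_fp[p]),
                       margin.get(p, 0.0)),
        reverse=True,
    )
    return ranked[:top_n]


if __name__ == "__main__":
    clusters = {"alice": 1, "bob": 1, "carol": 1, "dave": 2}
    purchases = {
        "alice": {"pasta"},
        "bob": {"pasta", "tomato_sauce", "cigarettes"},
        "carol": {"tomato_sauce", "wine"},
        "dave": {"cat_food"},
    }
    # Subclass order: pasta, sauces, wine, tobacco (hypothetical values).
    customer_fp = {"alice": [2.0, 0.5, 0.0, 0.0]}
    product_fp = {
        "pasta": [1.0, 0.5, 0.0, 0.0],
        "tomato_sauce": [0.5, 1.0, 0.0, 0.0],
        "wine": [0.0, 0.0, 1.0, 0.0],
        "cigarettes": [0.0, 0.0, 0.0, 1.0],
    }
    print(recommend("alice", clusters, purchases, customer_fp, product_fp,
                    sensitive={"cigarettes"}))
```

In this toy example Alice has never bought tomato sauce, but its fingerprint has a non-zero entry on the pasta subclass, so it scores well against her interests.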

3. Final Thoughts and Improvements

At Quantyca, we started from this baseline to propose custom algorithms, with a much more refined customer fingerprint and the use of business heuristics to break ties and to expand the product fingerprint.

I believe this can be a really effective solution, both because of its explainability and because it can be tuned as you like.

I hope you’ve found it interesting, so let me know what you think and feel free to get in touch on LinkedIn! 😄


Quantyca — Data at Core
