Recommender Systems: a math-less, code-less guide for the curious
You know that Netflix can keep you watching for hours by capturing your preferences and cleverly updating your personal recommendations. And, more than once, you’ve been upsold on Amazon, because you actually liked the look of the items that other customers bought.
The purpose, and ROI, of building systems that keep customers happy and engaged is obvious and immediate. And that’s certainly not something that can be said of every machine learning project. But where do you start?
Table of Contents
- The Three Approaches
- Solving the Ranking Problem
- Solving the Similarity Problem
- Solving the Collaborative Filtering Problem
- Adding Learning
- Implementing a Recommender System
The Three Approaches
Thinking about the ways you’ve been given recommendations in real life can help explain the three most common ways that recommendations can be ‘generated’ and the distinct scenarios in which they are useful.
The three scenarios are as follows:
- You’re actively looking for something, but have limited experience with that category of thing. Maybe you’re looking for construction companies that specialise in conservatories or a wedding photographer for your big day.
- You have experience with a product or category of products and are looking for similar things. For example, if you’re a jazz enthusiast and are looking to buy albums featuring your favourite players.
- You’re not looking for anything, but something that you’re very likely to enjoy is released. Like when a friend, with whom you share a specific taste, sees a great movie and implores you to see it.
So, what are the parallels to these scenarios in your product or business?
- A customer is browsing, maybe aimlessly. Perhaps they’re brand new to your site or they’re an existing customer looking to broaden their horizons and catch up with what’s new. These people are typically visiting the home pages of a store, or specific product category pages.
- A customer with a history of purchases, views, likes, favourites, (and whatever else) on your site has plateaued in their engagement and you’d like to show them things they might like.
- Your business doesn’t measure engagement by eyeball-minutes, but you do need to let customers know when something they may be interested in is released or changed, so they make a return visit.
The things that matter (the features, in ML-speak) in each of these scenarios are also distinct.
In the first, the features are all centred around how (objectively) good a thing is. How many conservatories has that company built? What do other people think about this photographer?
This is an example of a ranking problem.
In the second scenario, the features that matter are the similarities between one thing and another. Roy Hargrove played on both of these records. These arrangements were both done by Gil Evans.
The problem here is how to define similarity.
In the third scenario, all of the features that matter are about how similar one person is to another. If you think someone’s taste in movies is trash, any proclamations from them about the Best Movie They’ve Ever Seen will likely fall on deaf ears.
This is a collaborative filtering problem, which is a fancy way of saying that it’s a similarity problem but with people involved.
Solving the Ranking Problem
Ranking problems are super common. And one particular ranking algorithm, PageRank, is responsible for a world-wide change in how people access information.
It’s arguable that ranking, in and of itself, doesn’t need machine learning. And that’s probably true in most cases. But I know that I still get frustrated by the seemingly nonsensical ordering of star-ratings on a lot of websites. So maybe a more intelligent approach can be helpful.
Anyone who’s ever tried to build a ranking system knows that deciding on the right balance between, for example, average star ratings and the quantity of those ratings, is more difficult than it initially seems. And this is with only two factors! What do you do when you want to rank products by rating, purchases, favourites, views and maybe the semantics of its reviews?
The answer, of course, will depend on your specific needs, but I’ve had success using the IMDB weighted ranking formula. In its rawest form, it provides a nice balance between the average rating of an item and how many ratings it has. But it can be extended to deal with the age of a rating (for example), and multiple weighted rankings can be combined, with coefficients controlling how much you’d like to favour one metric over another.
Many clients of mine have been pleased with the results of implementing this kind of formula as IMDB have done a great job of coming up with an equation that ‘feels’ right when you see the final rankings.
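For the curious, the core of that formula is small enough to sketch in a few lines of Python. It blends an item’s own average rating with the catalogue-wide average, weighted by how many ratings the item has (the parameter names here are my own labels, not anything official):

```python
def weighted_rating(avg_rating, num_ratings, min_ratings, global_mean):
    """IMDB-style weighted rating: items with few ratings are pulled
    towards the catalogue-wide average, so a lone 5-star review
    can't catapult a product to the top."""
    v, m = num_ratings, min_ratings
    return (v / (v + m)) * avg_rating + (m / (v + m)) * global_mean

# A 5.0-star item with only 3 ratings ranks below a 4.6-star item
# with 400 ratings (using a threshold of 50 and a site-wide mean of 4.0)
a = weighted_rating(5.0, 3, 50, 4.0)
b = weighted_rating(4.6, 400, 50, 4.0)
```

The `min_ratings` threshold is the main tuning knob: the higher it is, the more ratings an item needs before its own average dominates.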
Solving the Similarity Problem
Just how similar are apples and oranges anyway? If all we had to go on was fruit, we’d probably say that they were pretty different (oranges are closer to lemons, limes and grapefruits while apples are closer to … erm … pears?)
On the other hand, if we were comparing them to all foods, it’s pretty obvious they’re in the same ballpark; they’re both sweet, grow on trees etc.
What’s important here are the features of a thing and the domain that it’s in. But how do you quantify the various features of a product?
If you run an apparel store, you might think that one obvious feature of an item in your store is its category, for example whether it’s a hat or a t-shirt. This would certainly be the right kind of idea if you know that most customers are making purchase decisions by functionality rather than for more abstract fashion reasons.
And couldn’t you just suggest the top products (by sales, or margin) within those categories to the customer using the weighted ranking from above? Well, yes you can!
But what if the reason that a user clicked on that product was because of its colour swatch, its brand or the fact that it was in a sale? Or maybe a combination of all of these? Maybe it had an enticing written description that contained lots of keywords related to the user’s initial search term.
What if the customer is really familiar with your store, has seen your top products over and over again, and even already owns a few of them?
It’s clear that in these cases, an ordered list based on weighted ratings wouldn’t get you optimal results. So what can you do instead?
Feature engineering techniques like one-hot encoding, tf-idf, and word embeddings can help represent a product as a series of numbers (a vector) that you can use to compare one product to another mathematically.
Basically, the goal is to construct a dataset that has one row per product, and has columns that are numerical values that represent the various attributes of the product.
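As a rough sketch of what that dataset construction might look like, here’s a tiny made-up three-product catalogue run through one-hot encoding (for the categorical attributes) and tf-idf (for the free-text description), then glued together into one numeric matrix with a row per product:

```python
import pandas as pd
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical catalogue — the columns and values are made up for illustration
products = pd.DataFrame({
    "category": ["hat", "t-shirt", "t-shirt"],
    "colour":   ["red", "red", "blue"],
    "description": [
        "wool beanie for winter",
        "red cotton tee with logo",
        "blue cotton tee, plain",
    ],
})

# One-hot encode the categorical attributes (one 0/1 column per value)
onehot = pd.get_dummies(products[["category", "colour"]])

# tf-idf turns each description into a weighted bag-of-words vector
tfidf = TfidfVectorizer().fit_transform(products["description"])

# Final feature matrix: one row per product, numeric columns only
X = hstack([csr_matrix(onehot.values.astype(float)), tfidf])
```

In a real store you’d likely add price bands, brand, sale status and so on as extra columns; the point is only that everything ends up numeric.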
Once you have this, you can use the cosine distance to find the angle between two ‘products’ as if they were vectors in space. Simply calculating this distance between each product and every other, and storing the ranked positions in a table, means that you can quickly serve excellent similarity recommendations to your customers.
(P.S. if you think that comparing the cosine distance between every pair of products could take a long time, you’re right. But luckily for us, the pairwise metrics module of the Scikit-Learn library (`sklearn.metrics.pairwise`) contains a really efficient implementation for getting it done quickly.)
Solving the Collaborative Filtering Problem
This part of the article is thankfully short, as we’ve already discussed all the methods needed to implement collaborative filtering!
There is some nuance, though. But before getting to that, let’s review the steps necessary to implement a recommendation system that matches one user to another.
- Feature engineering — you know that customers behave differently from one another, and that they have different innate attributes (like age, location, taste and so on). So now you just have to model each customer the way you did with products when you solved the similarity problem. Don’t worry if you don’t store personal information about your customers; you can get creative and think of ways to combine their purchase history, their viewing history and any other information you have about what they do on your site into meaningful features.
- Cosine similarity — just like before, once every customer has been successfully abstracted into a vector of numbers, you can compute the cosine similarity for each of them with every other customer and store them somewhere.
Here’s where the nuance comes in.
It’s no good simply storing a list of similar users. You’re not trying to sell users, you’re trying to sell products.
Once you’ve found the most similar users, you should probably do something along the lines of: get the best-matching users’ purchase histories, remove any products that the target user (the person you’re upselling to) has already bought, and then sort what’s left somehow. (Somehow!? Sorting was the first problem we solved!)
And there we have the collaborative filtering problem solved!
Clever techniques for ranking and matching do not a machine learning system make.
What if products and users are similar or dissimilar in a way your very creative feature engineering process didn’t account for? What if you’re not capturing the really crucial information about your users that would let you neatly segment them to boost your sales?
There are probably as many answers to this as there are people who’ve implemented recommender systems, and they’re probably all good answers, from recursive feature elimination to backtesting your solutions with different random subsets of features.
My personal favourite (and the simplest) is to keep track of how well each recommendation source performs and penalise it when it doesn’t convert the way you’d expect.
What this means is, if your initial feature engineering was no good, the recommendations will likely take some time to come up to scratch (though hopefully a lot quicker than if you’d started with random recommendations!)
However, if your head, heart and gut were all in the right place when you were designing your features, you’ll quickly get a system that knows what to recommend based on both the metrics that seemed important to you, a squishy human, and the real-life conversion data taken into account by the computer.
Both of these are great options if, right now, the way you generate recommendations can best be described as borderline-suspect. However, if you need excellent recommendations and you need them quickly, it’s probably worth going back to the feature drawing board first.
Implementing a Recommender System
So, is that it? A few Python scripts, some tables in a database, and maybe some industrial-strength compute to compare each product with every other?
Well, yes and no. It’s true that you don’t need a lot to get started generating better recommendations, but there are also versions of these systems that will keep data engineers busy for months.
If you do have millions of products and billions of customers, generating the cosine similarities will take a long time.
And even ranking all of your products across all of your important metrics isn’t the kind of thing that you can do on the fly.
Some thought has to go into the architecture of these systems, how they interact, and how you actually serve recommendations to your customers so they aren’t stuck watching a spinny-wheel when they’re desperately looking for similar products.
And the whole API-based system gets even more complex when you want to track recommendations and their conversions for penalising the poor performers and learning the good ones.
But these are all solvable problems.
Combine the techniques discussed in this article with good ETL practices, nightly re-training and some cascading fallbacks (I suggest Collaborative Filtering -> Product Similarity -> Ranking) for when things go wrong, and you’re on your way to increasing sales, engagement time, views, reads, likes, clicks and whatever else matters to your business.