Recommenders: the devil is in the details

The devil is in the details. It’s an expression that is often used to say that some things need to be done thoroughly, that details can be important. This is of course true for many things that people do every day, but it is particularly the case in the fields of data science and advanced analytics.

Now when it comes to data science I am the first one to admit that I am a stickler for the nitty-gritty, that I need to thoroughly understand every model or algorithm that I work with. But I have noticed that I am more the exception than the rule in this, and that data science often has a large gap between theory and practice.

Compare it to automobiles if you will. Lots of people know how to drive a car, but few people know exactly what goes on under the hood. Data science is moving in this same direction, with many models, methods and algorithms packaged and ready to use for anyone who can handle the proverbial steering wheel. But how many users actually know what happens under the hood of their freshly obtained machine learning algorithm?

Earlier this year I supervised a Master’s student, Sardana Nazarova, a clever young computational scientist whom I knew from my lectures at the University of Amsterdam. She had been tasked with taking a very close look under the hood of a recommender system. (See her Master’s thesis here.)

A recommender system is a machine learning algorithm that in essence recommends items to users. For example, it can recommend movies to Netflix subscribers, or articles to visitors of an online store. If this is what you are looking for, perhaps to increase sales or to cross-sell, then a recommender system may look like a solution. And it very well may be, but the devil is in the details.

I encountered the recommender when I had just joined ING Wholesale Banking, where it was being tried for cross-selling, as a way to identify additional financial markets products that a client could be interested in. Think of options, bonds, commodities and the like. The recommender’s job was to look at how much clients had used certain products in the past, and then predict how much they would use other products if these were recommended to them. The particular type of recommender was based on matrix factorization, a method made popular by its contribution to winning the Netflix Grand Prize, and so easily seen as a solution to our recommendation problem.
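
As a rough illustration of the idea, and not the recommender that was actually tried at ING, the sketch below factorizes a tiny, made-up usage table with plain stochastic gradient descent and then fills in one hidden entry. The function names, the numbers and the hyperparameters (k, steps, lr, reg) are all assumptions chosen for the example.

```python
import random

def factorize(observed, n_clients, n_products, k=1, steps=3000, lr=0.01, reg=0.02, seed=0):
    """Fit client and product factors to (client, product, usage) triples with SGD."""
    rng = random.Random(seed)
    # Small positive starting values; every number here is illustrative.
    P = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_clients)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_products)]
    for _ in range(steps):
        for c, p, usage in observed:
            # Error between the known usage and the current prediction.
            err = usage - sum(P[c][f] * Q[p][f] for f in range(k))
            for f in range(k):
                pc, qp = P[c][f], Q[p][f]
                # Gradient step with a small L2 penalty on both factors.
                P[c][f] += lr * (err * qp - reg * pc)
                Q[p][f] += lr * (err * pc - reg * qp)
    return P, Q

def predict(P, Q, client, product):
    """Predicted usage is the dot product of client and product factors."""
    return sum(a * b for a, b in zip(P[client], Q[product]))

# Made-up usage table: rows are clients, columns are products.
usage = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, None],   # hidden entry; the underlying pattern would give 8.0
    [3.0, 6.0, 12.0],
]
observed = [(c, p, v) for c, row in enumerate(usage)
            for p, v in enumerate(row) if v is not None]

P, Q = factorize(observed, n_clients=3, n_products=3)
estimate = predict(P, Q, client=1, product=2)
print(round(estimate, 2))
```

The toy table was built so that a single latent factor explains it, which is why the hidden entry can be recovered from the other eight. Real usage data is rarely this tidy.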

Figure: Matrix factorization example

However, testing the recommender turned out to be rather hard. Our data were sparse and hard to interpret, and above all it was difficult to judge recommendations when we hadn’t actually, actively recommended any products to clients. But instead of setting up a laborious trial period, with little reason to expect success, we took a different approach. Sardana was to study the dirty details of our particular recommender system, and find out whether or not it could be used for financial markets products.

The result was unfortunately negative for us: we couldn’t use this type of recommender for financial markets products. The reason lies in how this recommender works, and how market trading seemingly does not.

The recommender, like any model, method, or algorithm, is based on a premise: a mathematical principle that makes it possible to do what it does, but that at the same time limits what it can be used for.

Compare it to automobiles again. A car’s engine is based on internal combustion and can be used to make things move. It doesn’t lend itself well to microwaving food, for example. This needs a very different type of engine, a magnetron. For our recommender the principle is matrix factorization, and this limits its usability to clients and products that are in some way “factorizable”.

Factorizability can be understood as clients always using products in the same ratio. For example, client number one always using products twice as much as client number two. Or to do a bit more justice to the recommender, client number one always using products twice as much as client number two in some situations, three times as much in other situations, and perhaps four times as much in yet another.
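
The ratio picture can be sketched in a few lines. In this made-up example, each "situation" has its own product profile, and each client has a weight per situation; usage is the sum over situations of weight times profile. The helper name `usage_matrix` and all numbers are assumptions for illustration only.

```python
def usage_matrix(client_weights, product_profiles):
    """usage[c][p] = sum over situations s of client_weights[c][s] * product_profiles[s][p]."""
    n_situations = len(product_profiles)
    n_products = len(product_profiles[0])
    return [
        [sum(w[s] * product_profiles[s][p] for s in range(n_situations))
         for p in range(n_products)]
        for w in client_weights
    ]

# One situation: client one always uses twice as much as client two.
one_factor = usage_matrix([[2.0], [1.0]], [[5.0, 3.0, 1.0]])
ratios_one = [a / b for a, b in zip(*one_factor)]
print(ratios_one)   # [2.0, 2.0, 2.0]

# Two situations: twice as much in situation A, three times as much in situation B.
# Product 0 is only used in A, product 2 only in B, product 1 in both.
two_factor = usage_matrix(
    [[2.0, 3.0], [1.0, 1.0]],
    [[5.0, 3.0, 0.0], [0.0, 1.0, 4.0]],
)
ratios_two = [a / b for a, b in zip(*two_factor)]
print(ratios_two)   # [2.0, 2.25, 3.0]
```

With one situation the ratio between the two clients is the same for every product. With two situations the ratio varies per product, but only between the per-situation ratios, which is the kind of structure a matrix factorization can pick up.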

These situations are latent factors in the recommender. If you are unfamiliar with latent variables, they are a data scientist’s way of dealing with the unknown: basically saying that we don’t know what they are, but if we include them, then hopefully the algorithm can fill them in. But what Sardana’s research showed, in a nutshell, is that we need more latent factors than the available data can determine; there are seemingly more of these latent situations in market trading than the recommender can fill in from our data.
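
One rough way to see why the amount of data caps the number of latent factors is a simple counting argument, offered here as a back-of-the-envelope sketch and not as the analysis from the thesis. A rank-k description of m clients and n products has k * (m + n - k) free numbers, so at least that many observed entries are needed before k factors can even in principle be pinned down. The client and product counts below are invented for illustration.

```python
def free_parameters(n_clients, n_products, k):
    """Degrees of freedom of a rank-k matrix of this shape: k * (m + n - k)."""
    return k * (n_clients + n_products - k)

def max_factors_supported(n_clients, n_products, n_observed):
    """Largest k whose parameter count does not exceed the number of observed entries."""
    k = 0
    while (k < min(n_clients, n_products)
           and free_parameters(n_clients, n_products, k + 1) <= n_observed):
        k += 1
    return k

# Hypothetical portfolio: 1000 clients, 50 products, so 50000 possible entries.
n_clients, n_products = 1000, 50

sparse = max_factors_supported(n_clients, n_products, n_observed=2500)    # 5% filled
dense = max_factors_supported(n_clients, n_products, n_observed=25000)    # 50% filled
print(sparse, dense)   # 2 24
```

This is only a necessary condition: having enough entries does not guarantee that the factors can be recovered, since it also matters which entries are observed and how noisy they are. It does show how quickly sparsity eats into the number of "situations" a model can hope to learn.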

Figure: Normalised deviation of the residuals as a function of sparsity. For high sparsity the model fails to find the unknown ratings.

It thus appears that market trading behaviour is more volatile than our particular recommender system could deal with. And if we are to pursue solving the financial markets recommendation problem, then we need a different engine, an engine with different details than the one we have.

But don’t rest easy if you are not in the business of market trading. The problem may arise for any dataset, whether you are recommending bonds, loans, movies, articles, or hotels with this recommender. The more volatile your users’ behaviour, the more data you need to capture it.

There is a lesson to be learned here, even for the seasoned data scientist.
Data science is a complicated business, and not only because it requires advanced knowledge of statistics and programming, as is so often mentioned, but because machine learning and advanced algorithms are intricate mathematical machines that take skill, effort and a love for the nitty-gritty to truly understand.

You cannot take every popular machine learning or advanced algorithm at face value, or you may not get the right engine to power your solution.

Fabian Jansen 
Senior data scientist
ING Wholesale Banking Advanced Analytics