Recommender systems behind the scenes

Norbert Kozlowski
Published in ARC Software
Jun 10, 2018

I love new tasks. This time it was about building a recommendation system for matching users to projects.

I had never worked with this kind of problem before. After a few months of delving into articles and scientific papers, I knew it wouldn’t be that easy.

This post mentions a few issues you will likely encounter when doing similar work. Prepare for them upfront and save lots of time later on 🤠.

I assume you have some basic knowledge of the topic. If not, there are plenty of great hands-on resources; you can gain a basic understanding and a working application in a few hours. Do that first.

What are you optimizing for?

Look at Netflix.

At Netflix, recommendations are almost everywhere. The whole product is built with them in mind.

Personalization starts on the homepage. Each user sees their own set of rows representing different genres, and within each row the set of movies is also personalized and sorted.

But how do they know what to show? There is a deeper question — what to optimize for?

Maybe it’s not all about maximizing revenue or achieving the greatest offline accuracy?

Every choice they make must improve the overall member experience. The idea is that if people feel good using the product, monetization will follow.

Consider:

  • How would users like to interact with the system? What are the possible workflows? Where can the experience be personalized?
  • Which ranking method should you use? What kind of results will your users favour (diversity, freshness, similarity, …)? One common diversity technique is sketched right after this list.
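To make the diversity question concrete, one standard way to trade relevance for diversity is maximal marginal relevance (MMR) re-ranking. A minimal sketch, with random placeholder scores and embeddings standing in for real model output:

```python
import numpy as np

def mmr_rerank(scores, item_vecs, k=5, lambda_=0.7):
    """Pick k items balancing relevance against redundancy (MMR)."""
    # Cosine similarity between every pair of items.
    unit = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    sim = unit @ unit.T

    selected, candidates = [], list(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Penalize items too similar to what is already picked.
            redundancy = max((sim[i, j] for j in selected), default=0.0)
            return lambda_ * scores[i] - (1 - lambda_) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected

rng = np.random.default_rng(0)
print(mmr_rerank(rng.random(20), rng.normal(size=(20, 8))))
```

Lowering lambda_ pushes the list towards diversity; lambda_ = 1.0 is a plain top-k by relevance.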

Which method to use?

“Go with collaborative filtering”, they said.

But unless you have already gathered a big pile of interaction data, this approach probably won’t work, and you will waste time at the beginning.

Pros/cons of some other approaches:

  • Non-personalized (NP). Simple: use statistics computed across all the data (no deep personalization). Services like Reddit and Digg have implemented this model. Features like market basket analysis (the “Customers also bought” feature) are also fairly straightforward.
  • Knowledge-based (KB). Extend the previous approach with item descriptions (metadata), then query the database for the most relevant ones. This introduces some level of personalization, e.g. pre-filtering data using information decoded from the user’s IP address. When dealing with rare or high-value items (luxury vehicles, real estate, …), it is often impossible to gather enough interaction data, and KB might be the only feasible approach. You can also build the query in myriad ways: take a look at MyProductAdvisor, which helps you pick the best car through a clever interview-based UI. Also a simple one to get started with.
  • Content-based (CB). Here we add another signal: user ratings. Knowing a user’s preferences (explicit or implicit), we build a distinct user profile. Recommendations are delivered based on a chosen similarity metric: every item is compared against the user profile. There is a huge problem with this: suggestions lack novelty, since all results are somehow similar to items the user has already interacted with. Some clever way of calculating user-profile vectors might also be necessary.
  • Collaborative filtering (CF). The most popular approach; it does not take item features into account at all, relying only on historical user-interaction data. With huge amounts of it, expect novel and surprising results.
  • Maybe a hybrid? The LightFM library integrates the features of CB and CF. Worth checking out; a minimal sketch follows below.
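To give a feel for how little code a first model takes, here is a minimal LightFM sketch on the MovieLens 100k data the library can download for you (the hyperparameters are arbitrary placeholders, not tuned recommendations):

```python
import numpy as np
from lightfm import LightFM
from lightfm.datasets import fetch_movielens
from lightfm.evaluation import precision_at_k

# Downloads MovieLens 100k; keep only ratings >= 4 as positives.
data = fetch_movielens(min_rating=4.0)

# WARP loss optimizes the top of the ranking, a common default.
model = LightFM(loss="warp")
model.fit(data["train"], epochs=10, num_threads=2)

print("precision@5 on test:",
      precision_at_k(model, data["test"], k=5).mean())

# Score every item for user 0 and print the three best.
scores = model.predict(0, np.arange(data["train"].shape[1]))
print(data["item_labels"][np.argsort(-scores)[:3]])
```

What makes LightFM a hybrid rather than pure CF are the optional item_features and user_features arguments to fit and predict, which let you feed item metadata into the factorization.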

But it’s also related to the issue of …

Cold start

Let’s say you have built and validated the model offline. It works great, so … let’s hit production.

It probably won’t start. (Image from https://www.youtube.com/watch?v=4wcAJH0XAz8)

But what happens when a new user registers? Odds are you know nothing about them. The CB and CF methods won’t work until the user provides some data, either explicitly or through interactions with the system. Without that prior knowledge, the “personalized output” from our fancy system might look, well … weird.

We don’t want to hear that!

Consider:

  • Coming up with a mixed architecture. Serve items from the non-personalized model to unknown or uncertain users; a popup might ask them for an explicit opinion from time to time. Once there is enough data, switch gracefully: rebuild the model using all available data and start serving personalized items (a minimal sketch follows after this list).
  • Rebuilding the onboarding process. Get some baseline data as soon as possible; look at how Twitter handles this in the last step of registration.
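Here is a minimal sketch of that mixed architecture; the threshold, data stores, and item names are all made-up stand-ins for whatever your system actually uses:

```python
# Toy in-memory stand-ins for real data stores.
interaction_counts = {"alice": 42, "bob": 3}       # user -> #interactions
popular_items = ["item_1", "item_2", "item_3"]     # non-personalized model
personalized = {"alice": ["item_9", "item_7", "item_4"]}

MIN_INTERACTIONS = 20  # assumed threshold for "enough data"

def recommend(user_id, n=2):
    """Serve popular items to cold users, personalized ones otherwise."""
    if interaction_counts.get(user_id, 0) < MIN_INTERACTIONS:
        return popular_items[:n]        # cold start: graceful fallback
    return personalized[user_id][:n]    # warm user: personalized output

print(recommend("bob"))    # ['item_1', 'item_2']  (cold)
print(recommend("alice"))  # ['item_9', 'item_7']  (warm)
```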

Presenting the outcome

This might seem pretty simple: take the direct output from the model and present the top-N items.

But there are also some caveats here:

  • You need some post-processing! The ability to inject custom items will probably be crucial. Sponsored content? Favouring some niche? Getting out of local minima (more on that later)? See the sketch after this list.
  • The psychological aspect. Users might feel concerned when seeing recommendations without knowing how they were produced. Educate them about what data is collected and how it is used, and provide explanations for particular picks (“we suggest X because you liked Y”). Keep them calm and safe by respecting their privacy.
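A sketch of such a post-processing step (the slot positions and item names are arbitrary assumptions): take the model’s ranked output and splice custom items into fixed slots.

```python
def postprocess(ranked, injected, slots=(2, 5), n=10):
    """Inject custom items (e.g. sponsored content) into fixed slots."""
    result = [item for item in ranked if item not in injected]
    for slot, item in zip(slots, injected):
        result.insert(min(slot, len(result)), item)
    return result[:n]

ranked = [f"item_{i}" for i in range(10)]
print(postprocess(ranked, ["sponsored_a", "sponsored_b"]))
# ['item_0', 'item_1', 'sponsored_a', 'item_2', 'item_3', 'sponsored_b', ...]
```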

Deployment

It’s easy to build an offline PoC, but going online isn’t that trivial:

  • Fast answers: think about communication latency. It might be beneficial to put the algorithm as close to the data source as possible. This is particularly important for dynamic recommendations, where calculations should happen almost instantly. You can use databases’ native features; some of them allow writing custom extensions (in Postgres, for example, you can write your own highly efficient C code). Other issues like data serialization and deserialization can also have a significant impact.
  • Retraining: after presenting items as recommendations, the dataset becomes biased. Be very careful when retraining your model, as the error from the feedback loop can compound quickly. A possible solution is to label everything that was presented to the user and exclude those examples from further training (sketched after this list).
  • Monitoring performance: how do you know if your system is doing well? Be aware of concept drift. Seasonal changes or trends might also hurt performance, and malicious users might try to alter your model with crafted interactions. To react instantly, think about a real-time dashboard presenting the current state; capabilities like alerting or instant algorithm roll-backs might also be useful.
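One way to implement that labeling (a sketch; the logs are toy in-memory structures standing in for real event stores):

```python
impressions = set()   # (user_id, item_id) pairs we have recommended
training_log = []     # interactions still eligible for retraining

def log_impression(user_id, item_id):
    """Record that we showed this item to this user."""
    impressions.add((user_id, item_id))

def log_interaction(user_id, item_id, event):
    """Keep only organic interactions for the next training run,
    so the model is not fed its own recommendations back."""
    if (user_id, item_id) in impressions:
        return  # presented by us: excluded from further training
    training_log.append((user_id, item_id, event))
```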

In summary, there are lots of nuances to consider. Each topic deserves its own special treatment and a dedicated approach. Most of them are, after all, business decisions; people should be aware of the possibilities and limitations of each choice.

If you have some questions or need further help feel free to get in touch.

I’m hopeful that one day, when machines rule the world, I will be their best friend.