Machine Learning: Trying to make recommendations

Stacey Ronaghan
5 min read · Jul 31, 2018

This post is part of a series introducing Algorithm Explorer: a framework for exploring which data science methods relate to your business needs.

The introductory post “Machine Learning: Where to begin…” can be found here and Algorithm Explorer here

If you are looking to use machine learning to help make recommendations, you should look to Recommendation Engine Techniques.

Recommendation Engine Techniques

Recommendation Engines are created to predict a preference or rating that indicates a user’s interest in an item/product. The algorithms used to create these systems find similarities between the users, the items, or both.

Use-Cases

  • Recommend clothing to a customer based on brands, colors, and price of previously purchased clothing
  • Recommend a medical treatment for a patient based on treatments that were successful for similar patients, judging similarity by condition, diagnosis and previous treatment information

Most Common Recommendation Engine Methods

Below are introductions to the most common methods for making recommendations: Content-based Recommenders, Memory-based Collaborative Filtering, and Model-based Collaborative Filtering.

Content-based Recommenders

Content-based Recommenders suggest items similar to those the customer already likes, whether that preference was expressed explicitly (e.g. by rating) or implicitly (e.g. by purchasing). This type of system uses metadata describing the item.

Each item is represented as a vector and a distance metric compares the items’ vectors to find the most similar.
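As a rough illustration of the idea above, here is a minimal sketch of a content-based lookup. The movie names, the feature columns and the helper functions are all made up for this example, and cosine similarity is just one possible choice of metric.

```python
import numpy as np

# Hypothetical item metadata encoded as feature vectors.
# Assumed columns: is_action, is_comedy, is_romance, duration in hours.
items = {
    "Movie A": np.array([1.0, 0.0, 0.0, 2.0]),
    "Movie B": np.array([1.0, 0.0, 0.0, 2.5]),
    "Movie C": np.array([0.0, 1.0, 1.0, 1.5]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(liked, items, top_n=1):
    """Rank every other item by its similarity to the item the user liked."""
    scores = {
        name: cosine_similarity(items[liked], vec)
        for name, vec in items.items()
        if name != liked
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# The user liked "Movie A"; the other action movie comes out as the closest match.
recommendations = most_similar("Movie A", items)
```

In practice the features would usually be scaled first so that one column (such as duration) does not dominate the comparison.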

Pros

  • Quick to implement
  • No popularity bias — can recommend new items
  • Results are interpretable

Cons

  • Meaningful metadata is essential, and tagging items can be tiresome
  • “Cold start problem” for new users without a history of liking items

Vocabulary

Metadata — Data describing an item, its features. E.g. for a movie, the metadata is its genre, duration, actors, etc.

Vector — A feature vector is a series of numbers describing the observation’s characteristics, e.g. brand, words included in the description, price, etc.

Distance Metric — A distance metric is a function that calculates the distance between elements; examples include Euclidean distance and cosine distance.
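To make the two named metrics concrete, the sketch below computes both for a pair of made-up vectors. The function names and the example values are assumptions for illustration only.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Two hypothetical items pointing in the same direction, one twice as "large".
short_item = [1.0, 2.0]
long_item = [2.0, 4.0]

d_euclid = euclidean_distance(short_item, long_item)  # clearly non-zero
d_cosine = cosine_distance(short_item, long_item)     # approximately zero
```

The two metrics can disagree: Euclidean distance is sensitive to magnitude, while cosine distance only looks at direction, so the choice depends on what "similar" should mean for the items.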

Cold start problem — The term ‘cold start problem’ comes from cars not running well when they’ve been left in the cold. Likewise, when the recommendation engine doesn’t have sufficient data on the user, it doesn’t perform very well.

Memory-based Collaborative Filtering

Collaborative filtering is a method of predicting a user’s interest by analysing the preferences of other users. There are two types: user-based filtering and item-based filtering.

Memory-based filtering computes similarity between users, or items, to make a prediction. A typical approach is a neighborhood-based algorithm: a similarity measure identifies the users most similar to the target user, or the items most similar to those the user has already rated. The predicted rating for an item can then be calculated by pooling the neighbors’ ratings, possibly weighting each by its similarity value.
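The neighborhood approach described above could be sketched roughly as follows for the user-based case. The ratings table, the choice of cosine similarity, the value of k and the function names are all assumptions made for this example rather than a reference implementation.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 4.0, 0.0, 1.0],   # target user, who has not rated item 2
    [5.0, 5.0, 4.0, 1.0],   # similar tastes
    [4.0, 4.0, 5.0, 2.0],   # similar tastes
    [1.0, 1.0, 2.0, 5.0],   # opposite tastes
])

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict_rating(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    # Compare users on the other items only, so the target item is excluded.
    # Simplification: an unrated cell among those items would count as a 0.
    other_items = np.arange(ratings.shape[1]) != item
    candidates = []
    for other in range(ratings.shape[0]):
        if other == user or ratings[other, item] == 0:
            continue
        sim = cosine_similarity(ratings[user, other_items],
                                ratings[other, other_items])
        candidates.append((sim, ratings[other, item]))
    # Keep the k nearest neighbours, then pool with a similarity-weighted mean.
    neighbours = sorted(candidates, reverse=True)[:k]
    total_weight = sum(sim for sim, _ in neighbours)
    return sum(sim * rating for sim, rating in neighbours) / total_weight

predicted = predict_rating(ratings, user=0, item=2)
```

Here the two like-minded users rated the item 4 and 5, so the prediction lands between those values, while the dissimilar user is left out of the neighbourhood. Note that every prediction scans the full ratings table, which is why this family of methods is described as memory-based.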

Pros

  • Quick to implement
  • Results are interpretable
  • User-based suggestions can result in a diverse set of suggestions across domains

Cons

  • Data sparsity (most users rate only a few items) can result in poor predictions
  • Slow & computationally expensive — requires the whole dataset to make a prediction
  • “Cold start problem” — new items struggle to be recommended (popularity bias), and it’s hard to make recommendations for new users with little history

Vocabulary

User-based filtering — User-based filtering recommends products to a user that similar users have liked.

Item-based filtering — Item-based filtering identifies items similar to those the user has previously liked.
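As a small sketch of the item-based variant, the example below compares items by the ratings users gave them rather than by metadata. The clothing items, the ratings and the helper names are invented for illustration.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated".
item_names = ["jeans", "chinos", "sandals"]
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 0.0],
    [1.0, 2.0, 5.0],
    [5.0, 5.0, 2.0],
])

def item_similarities(ratings):
    """Cosine similarity between every pair of item columns."""
    norms = np.linalg.norm(ratings, axis=0)
    return (ratings.T @ ratings) / np.outer(norms, norms)

def similar_items(liked_item, ratings, item_names):
    """Rank the other items by how similarly users rated them."""
    sims = item_similarities(ratings)[item_names.index(liked_item)]
    ranked = np.argsort(-sims)  # highest similarity first
    return [item_names[i] for i in ranked if item_names[i] != liked_item]

# A user who liked jeans is pointed towards chinos before sandals.
suggestions = similar_items("jeans", ratings, item_names)
```

User-based filtering would apply the same comparison to the rows of the table instead of the columns.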

Neighborhood-based algorithm — A similarity measure identifies the users most similar to the target user, or the items most similar to those the user has already rated. These are referred to as a ‘neighborhood’ because, if you were to plot the data points, they would be the closest ones.

Similarity measure — A function that quantifies the similarity between objects, e.g. cosine similarity.

Pooling — This is a way to combine data and is usually done by taking the mean.

Weighting — If you weight a value, you are assigning an adjustment to it based upon its importance. When pooling, rather than take a plain average of the values, you can multiply each value by a weight that reflects how close it is to the item of interest, so that closer neighbors count for more.
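A tiny worked example may help show the difference between plain pooling and weighted pooling. The ratings and similarity values below are made up, and normalising by the sum of the weights is one common convention rather than the only option.

```python
# Hypothetical ratings from three neighbours and their similarity to the target.
neighbour_ratings = [4.0, 5.0, 2.0]
similarities = [0.9, 0.8, 0.1]

# Pooling: every neighbour counts equally.
simple_mean = sum(neighbour_ratings) / len(neighbour_ratings)

# Weighting: each rating is multiplied by its similarity, then normalised
# by the total weight so the result stays on the rating scale.
weighted_mean = (
    sum(s * r for s, r in zip(similarities, neighbour_ratings))
    / sum(similarities)
)
```

The weighted result is higher than the plain mean here because the low rating comes from the least similar neighbour and so contributes very little.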

Cold start problem — The term ‘cold start problem’ comes from cars not running well when they’ve been left in the cold. Likewise, when the recommendation engine doesn’t have sufficient data on a user or an item, it doesn’t perform very well.

Model-based Collaborative Filtering

Collaborative filtering is a method of predicting a user’s interest by analysing the preferences of other users. There are two types: user-based filtering and item-based filtering.

Model-based filtering uses training data of users, items and ratings to build a predictive model. There are many algorithms to use, including neural networks, Bayesian networks and matrix factorization.

Pros

  • Fast & scalable; doesn't require the full dataset each time
  • User-based suggestions can result in a diverse set of suggestions across domains
  • User-based suggestions do not require metadata

Cons

  • Data sparsity (most users rate only a few items) can result in poor predictions
  • Models can be complex and slow to train
  • “Cold start problem” — new items struggle to be recommended (popularity bias), and it’s hard to make recommendations for new users with little history

Vocabulary

User-based filtering — User-based filtering recommends products to a user that similar users have liked.

Item-based filtering — Item-based filtering identifies items similar to those the user has previously liked.

Model — After training, a machine learning algorithm produces a model: a mathematical function that takes a new observation and calculates an appropriate prediction.

Neural networks — Neural networks can learn complex patterns using “hidden layers” between inputs and the output. These layers are made of neurons which mathematically transform the data.

Bayesian networks — A Bayesian network is a graphical model in which nodes are variables and edges represent the conditional dependencies between them.

Matrix Factorization — In the context of collaborative filtering, matrix factorization tries to find a matrix for users and a matrix for items that, when multiplied together, approximate the original rating table.
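One way to picture matrix factorization is the sketch below, which learns the two matrices with plain gradient descent on the observed ratings only. The rating table, the number of latent factors, the learning rate and the regularisation strength are all assumed values chosen for illustration; production systems typically rely on dedicated libraries and tuned settings.

```python
import numpy as np

# Hypothetical rating table: rows are users, columns are items, 0 means "not rated".
ratings = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])
observed = ratings > 0

def factorize(ratings, observed, n_factors=2, lr=0.01, reg=0.02,
              epochs=5000, seed=0):
    """Learn user and item matrices whose product approximates the known ratings."""
    rng = np.random.default_rng(seed)
    users = rng.normal(scale=0.1, size=(ratings.shape[0], n_factors))
    items = rng.normal(scale=0.1, size=(ratings.shape[1], n_factors))
    for _ in range(epochs):
        # Error on observed cells only; unrated cells contribute nothing.
        error = np.where(observed, ratings - users @ items.T, 0.0)
        # Compute both update directions before changing either matrix.
        users_step = error @ items - reg * users
        items_step = error.T @ users - reg * items
        users += lr * users_step
        items += lr * items_step
    return users, items

users, items = factorize(ratings, observed)
approx = users @ items.T  # filled-in table, including the unrated cells
rmse = float(np.sqrt(np.mean((ratings - approx)[observed] ** 2)))
```

Once trained, the cells of `approx` that were unrated in the original table serve as the predicted ratings, and only the two small matrices are needed at prediction time rather than the full dataset.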


Many Thanks

I wish to thank Sam Rose for his great front end development work (and patience!), converting my raw idea into something much more consumable, streamlined and aesthetically pleasing.

Similarly, my drawing skills leave much to be desired, so thank you to Mary Kim for adding an artistic flair to this work!

