Intuitive explanation of Learning to Rank (and RankNet, LambdaRank and LambdaMART)
RankNet, LambdaRank and LambdaMART are all what we call Learning to Rank algorithms.
What is Learning to Rank?
Learning to Rank (LTR) is a class of techniques that apply supervised machine learning (ML) to solve ranking problems. The main difference between LTR and traditional supervised ML is this:
- Traditional ML solves a prediction problem (classification or regression) on a single instance at a time. E.g. if you are doing spam detection on email, you will look at all the features associated with that email and classify it as spam or not. The aim of traditional ML is to come up with a class (spam or no-spam) or a single numerical score for that instance.
- LTR solves a ranking problem on a list of items. The aim of LTR is to come up with optimal ordering of those items. As such, LTR doesn’t care much about the exact score that each item gets, but cares more about the relative ordering among all the items.
The most common application of LTR is search engine ranking, but it’s useful anywhere you need to produce a ranked list of items.
The training data for a LTR model consists of a list of items and a “ground truth” score for each of those items. For search engine ranking, this translates to a list of results for a query and a relevance rating for each of those results with respect to the query. The most common way used by major search engines to generate these relevance ratings is to ask human raters to rate results for a set of queries. In case you are interested, I have written in detail on human rating systems here: Nikhil Dandekar’s answer to How does Google measure the quality of their search results?
For a more technical explanation of Learning to Rank check this paper by Microsoft Research: A Short Introduction to Learning to Rank
What is RankNet, LambdaRank and LambdaMART?
RankNet, LambdaRank and LambdaMART are all LTR algorithms developed by Chris Burges and his colleagues at Microsoft Research. RankNet was the first one to be developed, followed by LambdaRank and then LambdaMART.
In all three techniques, ranking is transformed into a pairwise classification or regression problem. That means you look at pairs of items at a time, come up with the optimal ordering for that pair of items, and then use it to come up with the final ranking for all the results.
Here are some high-level details for each of the algorithms:
RankNet was originally developed using neural nets, but the underlying model can be different and is not constrained to just neural nets. The cost function for RankNet aims to minimize the number of inversions in ranking. Here an inversion means an incorrect order among a pair of results, i.e. when we rank a lower rated result above a higher rated result in a ranked list. RankNet optimizes the cost function using Stochastic Gradient Descent.
Burgess et. al. found that during RankNet training procedure, you don’t need the costs, only need the gradients (λ) of the cost with respect to the model score. You can think of these gradients as little arrows attached to each document in the ranked list, indicating the direction we’d like those documents to move.
Further they found that scaling the gradients by the change in NDCG found by swapping each pair of documents gave good results. The core idea of LambdaRank is to use this new cost function for training a RankNet. On experimental datasets, this shows both speed and accuracy improvements over the original RankNet.
LambdaMART combines LambdaRank and MART (Multiple Additive Regression Trees). While MART uses gradient boosted decision trees for prediction tasks, LambdaMART uses gradient boosted decision trees using a cost function derived from LambdaRank for solving a ranking task. On experimental datasets, LambdaMART has shown better results than LambdaRank and the original RankNet.
If you are interested, Chris Burges has a single paper that details the evolution from RankNet to LambdaRank to LambdaMART here: From RankNet to LambdaRank to LambdaMART: An Overview
(Answered originally at Quora: What is the intuitive explanation of RankNet, LambdaRank and LambdaMART?)