MRR vs MAP vs NDCG: Rank-Aware Evaluation Metrics And When To Use Them

Moussa Taifi PhD · Published in The Startup · 13 min read · Nov 25, 2019

Robert Delaunay, 1913, “Premier Disque”.

The ML Metrics Trap

Reporting small improvements on inadequate metrics is a well-known machine learning (ML) trap. Understanding the pros and cons of ML metrics helps practitioners build credibility and avoid prematurely proclaiming victory. This matters because teams invest significant budgets to move prototypes from research to production, with the central goal of extracting value from prediction systems, and offline metrics are the crucial indicators for deciding whether a new model gets promoted to production.

In this post, we look at three ranking metrics. Ranking is a fundamental task in machine learning, recommendation systems, and information retrieval. I recently had the pleasure of finishing an excellent recommender systems specialization, the University of Minnesota Recommendation System Specialization: a five-course recsys quest that I recommend. I wanted to share how I learned to think about evaluating recommender systems, especially when the task at hand is a ranking task.

Without too much loss of generality, most recommenders do one of two things: they either attempt to predict the rating a user would give an item, or they generate a ranked list of items for each user.
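
To make that distinction concrete, here is a minimal Python sketch (the item IDs and scores are made up for illustration, not taken from the article or the specialization). It shows the two output shapes side by side: a single predicted rating versus a ranked list, the latter being the object that rank-aware metrics like MRR, MAP, and NDCG evaluate.

# Illustrative sketch only: toy scores for a hypothetical user "u1".
toy_scores = {"item_a": 4.6, "item_b": 3.1, "item_c": 4.9, "item_d": 2.2}

# Task 1 -- rating prediction: estimate how "u1" would rate a single item.
predicted_rating = toy_scores["item_b"]  # e.g. 3.1 on a 1-to-5 scale

# Task 2 -- ranking: order the candidate items by score, best first.
ranked_items = sorted(toy_scores, key=toy_scores.get, reverse=True)
# -> ["item_c", "item_a", "item_b", "item_d"]

# Rank-aware metrics such as MRR, MAP, and NDCG are computed on this
# ordered list, rewarding relevant items that land near the top.
print(predicted_rating, ranked_items)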
