Measuring the Performance of a Recommendation System: Metrics [Practical Guide], Part IV

gaurav.product · product.design.data
Jan 15, 2021

Series Structure

In Part 3, I discussed developing a personalized recommendation system with collaborative filtering for eLearning. In Part 4, I will focus on measuring the performance of the recommendation system we built.

Introduction

Now that we have understood how to build a recommendation engine, it is essential to measure the performance of the system. The best way to measure performance is via metrics. Tracking metrics lets you improve overall results and align your people and processes with your organizational objectives.

There are various technical metrics you can measure to improve the recall, accuracy, and precision of a recommendation system; but these technical metrics ultimately map to the business value of the system.

In this blog, we will focus more on the business side of metrics. The sole purpose of building a recommendation system in this case study was to help students learn faster. Hence, we will be measuring these two core business metrics:

  • User goal: improve learning outcomes, measured via users’ test scores
  • Business goal: improve student engagement time on the platform

Business Metrics

I have broken the business metrics listed below into three categories: impact, interaction, and conversion.

Impact

Drawing an analogy from e-commerce: recommendations are usually shown in multiple places in an online store, such as the landing page, product page, and checkout page. The number of different places showing recommendations also affects their effectiveness. This is an important metric that can be combined with an interaction metric (say, CTR) or a conversion metric to derive actionable insights, such as which location performs well and how many lists should be used for the desired results.

For our case study, we will measure the visibility of recommendations being shown to the user on various pages.

Metrics:

  1. LOC — The number of locations where recommendation lists are being displayed
  2. REC — Total number of recommendations served
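As a minimal sketch, both impact metrics can be tallied from an impression log. The log format and names here are illustrative assumptions, not from the article:

```python
from collections import defaultdict

# Hypothetical impression log: each entry is (page shown on, recommended item).
impressions = [
    ("landing", "course_101"), ("landing", "course_202"),
    ("dashboard", "quiz_7"), ("results", "notes_3"),
]

per_location = defaultdict(int)
for page, _item in impressions:
    per_location[page] += 1

LOC = len(per_location)  # distinct locations displaying recommendation lists
REC = len(impressions)   # total recommendations served
print(LOC, REC)          # 3 4
```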

Interaction

Interaction metrics define the level of interest and engagement users are showing in the recommended items. We can measure interaction by the number of clicks on recommended items, and this click count is used to calculate the popular metric CTR (Click-Through Rate).

Metrics:

  1. CIR (Clicks in Recommendations) — the total number of clicks on recommended items
  2. CTR (Click-Through Rate) — CIR/REC (number of clicks on recommendations / total number of recommendations served)

Conversion

This is the most important and relevant group of metrics that measures the conversions of a recommendation engine and defines its true business value.

Metrics:

  1. ROR = items_rec / items (Number of activities consumed using recommendation suggestions/Total number of activities consumed)
  2. Ts_rec = Time spent while consuming recommended activities (articles, tests, notes, videos etc)/ Total time spent
  3. avg_test_score = average test score after completing activities from the recommendation system (trend line chart)
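The three conversion metrics are simple ratios plus a trend line. A hedged sketch, using the article's formulas directly; the function names, the rolling-mean choice for the trend, and the sample numbers are my assumptions:

```python
def ror(items_rec: int, items: int) -> float:
    """ROR = items_rec / items: share of consumed activities that came via recommendations."""
    return items_rec / items if items else 0.0

def ts_rec(rec_time: float, total_time: float) -> float:
    """Ts_rec = time spent on recommended activities / total time spent."""
    return rec_time / total_time if total_time else 0.0

def rolling_avg(scores, window=3):
    """One way to chart avg_test_score as a trend line: a rolling mean."""
    out = []
    for i in range(len(scores)):
        chunk = scores[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

print(ror(30, 120))                             # 0.25
print(rolling_avg([60, 70, 80, 90], window=2))  # [60.0, 65.0, 75.0, 85.0]
```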

Increasing Revenue

Later, we built a course recommendation engine to sell personalized courses to students. This gave us an important metric to track: how much revenue the recommendation engine adds to the store. This metric can be combined with the number of locations displaying recommendations to derive the most efficient number of locations.

  • IR = REV_REC / (REV - REV_REC)
    (Revenue generated by recommended items / (Total revenue of the store - revenue generated by recommended items))
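The IR formula above translates directly into code. A minimal sketch; the guard for a zero denominator and the sample figures are my additions:

```python
def ir(rev_rec: float, rev_total: float) -> float:
    """IR = REV_REC / (REV - REV_REC)."""
    base = rev_total - rev_rec
    return rev_rec / base if base else float("inf")

# e.g. $2,000 of a $10,000 store attributed to recommendations:
print(ir(2000.0, 10000.0))  # 0.25 -> recommendations added 25% on top of baseline revenue
```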

Technical Metrics

Prediction accuracy metrics (MAE, RMSE): the two most popular metrics in this group are MAE (mean absolute error) and RMSE (root mean squared error). The goal of these metrics is to measure how numerically close your predictions are to the real values. MAE penalizes every error equally, while RMSE penalizes larger errors more. As you have probably heard a lot about these metrics, I won't cover them in depth; all you need to know is that the lower these values are, the better the model.
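Both metrics are a few lines each. A self-contained sketch with made-up rating predictions (the numbers are illustrative only):

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: squaring penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

actual    = [4.0, 3.5, 5.0, 2.0]  # hypothetical true ratings
predicted = [3.5, 3.0, 4.0, 3.0]  # model predictions
print(mae(actual, predicted))   # 0.75
print(rmse(actual, predicted))  # ~0.79 (larger than MAE because of the two 1.0 errors)
```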

For more technical metrics, see link1, link2.

Offline & Online Experiments

There are various marketing and customer-experience experiments that have proven extremely helpful. I will not go deep into them here, but they include A/B tests, focus groups, surveys, usage logs, etc.
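For the A/B tests mentioned above, a common way to decide whether a new recommender's CTR is significantly better is a two-proportion z-test. A sketch under stated assumptions: the click counts are invented, and the pooled-proportion z-test is my choice of method, not something the article prescribes:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic comparing the CTRs of two variants, using a pooled proportion."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical experiment: 120/5000 clicks on the old ranking vs 160/5000 on the new one.
z = two_proportion_z(120, 5000, 160, 5000)
print(abs(z) > 1.96)  # True -> the CTR difference is significant at the 5% level
```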

You can find more details here: link1, link2.

I hope this was useful in helping you understand how to measure the performance of a recommendation system. With that, we end this series.

If you learned something new or enjoyed reading this article, please clap it up 👏 and share it so that others will see it. Feel free to leave a comment too.
