Measuring the performance of recommendation systems: Metrics [Practical Guide] — Part IV
Series Structure
In Part 3, I discussed developing a personalized Recommendation System with Collaborative Filtering for eLearning. In Part 4, I will focus on measuring the performance of the recommendation system we built. The structure of the series will be as follows:
Introduction
Now that we understand how to build a recommendation engine, it is essential to measure the performance of the system. And the best way to measure performance is via metrics. Tracking metrics lets you improve overall results and align your people and processes with your organizational objectives.
There are various technical metrics you can measure to improve the recall, accuracy, and precision of a recommendation system, but these technical metrics ultimately map to the business value of the system.
In this blog, we will focus more on the business side of metrics. The sole purpose of building a recommendation system in this case study was to help students learn faster. Hence, we will be measuring these two core business metrics:
- User goal: Improve Learning outcome- using user’s test scores
- Business goal: Improve student engagement time on the platform
Business Metrics
I have broken down the business metrics below into three categories: impact, interaction, and conversion.
Impact
Drawing an analogy from e-commerce: an online store typically shows recommendations in multiple places, such as the landing page, product page, and checkout page. The number of different places showing recommendations also affects their effectiveness. This is an important metric that can be linked with an interaction metric (say, CTR) or a conversion metric to derive actionable insights, such as which location performs well and how many recommendation lists should be used for the desired results.
For our case study, we will measure the visibility of recommendations being shown to the user on various pages.
Metrics:
- LOC — The number of locations where recommendation lists are being displayed
- REC — Total number of recommendations served
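As a minimal sketch, both impact metrics can be derived from a log of served recommendations. The event schema below (`page`, `rec_id` fields) is an assumption for illustration, not the format used in the case study:

```python
# Hypothetical impression log; each entry is one recommendation served.
impressions = [
    {"page": "landing", "rec_id": "r1"},
    {"page": "landing", "rec_id": "r2"},
    {"page": "course", "rec_id": "r3"},
    {"page": "checkout", "rec_id": "r4"},
]

# LOC: number of distinct locations where recommendation lists are displayed
loc = len({event["page"] for event in impressions})

# REC: total number of recommendations served
rec = len(impressions)

print(loc, rec)  # 3 4
```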
Interaction
Interaction metrics define the level of interest and engagement users show in the recommended items. We can measure interaction by the number of clicks on recommended items, and this click count is used to calculate the popular CTR (Click-Through Rate) metric.
Metrics:
- CIR (Clicks in Recommendations) — The total number of clicks on recommended items
- CTR (Click Through Rate) — CIR/REC (Number of clicks in recommendations/total number of recommendations served)
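The CTR calculation above is a simple ratio; a minimal sketch (the counts are made up, and the zero-impression guard is a defensive assumption):

```python
def click_through_rate(cir: int, rec: int) -> float:
    """CTR = clicks in recommendations (CIR) / recommendations served (REC)."""
    if rec == 0:
        # No impressions served yet; avoid dividing by zero.
        return 0.0
    return cir / rec

print(click_through_rate(cir=42, rec=1000))  # 0.042, i.e. a 4.2% CTR
```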
Conversion
This is the most important and relevant group of metrics that measures the conversions of a recommendation engine and defines its true business value.
Metrics:
- ROR = items_rec / items (Number of activities consumed using recommendation suggestions/Total number of activities consumed)
- Ts_rec = Time spent while consuming recommended activities (articles, tests, notes, videos etc)/ Total time spent
- avg_test_score = average test score after completing activities from the recommendation system (trend line chart)
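The three conversion metrics can all be computed from one activity-consumption log. A hedged sketch, where the field names (`via_rec`, `seconds`, `score`) and sample values are assumptions for illustration:

```python
# Hypothetical consumption log: one entry per activity a student completed.
activities = [
    {"via_rec": True,  "seconds": 300, "score": 85},
    {"via_rec": False, "seconds": 200, "score": 70},
    {"via_rec": True,  "seconds": 500, "score": 90},
    {"via_rec": False, "seconds": 250, "score": 75},
]

# ROR: activities consumed via recommendations / all activities consumed
items_rec = sum(1 for a in activities if a["via_rec"])
ror = items_rec / len(activities)

# Ts_rec: time spent on recommended activities / total time spent
ts_rec = (sum(a["seconds"] for a in activities if a["via_rec"])
          / sum(a["seconds"] for a in activities))

# avg_test_score: average score on activities reached via recommendations
rec_scores = [a["score"] for a in activities if a["via_rec"]]
avg_test_score = sum(rec_scores) / len(rec_scores)

print(ror, ts_rec, avg_test_score)  # 0.5 0.64 87.5
```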
Increasing Revenue
Later, we built a course recommendation engine to sell personalized courses to students. Revenue increase then became an important metric for us to track: it tells us how much revenue the recommendation engine adds to the store. This metric can be linked with the number of locations displaying recommendations to derive the most efficient number of locations.
- IR = REV_REC / (REV - REV_REC)
(Revenue generated by recommended items / (Total revenue of the store - revenue generated by recommended items))
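A minimal sketch of that ratio, with made-up revenue figures (the function name and the zero-denominator guard are my own additions):

```python
def revenue_increase_ratio(rev_rec: float, rev_total: float) -> float:
    """IR = REV_REC / (REV - REV_REC)."""
    baseline = rev_total - rev_rec  # revenue not attributable to recommendations
    if baseline == 0:
        raise ValueError("no non-recommendation revenue to compare against")
    return rev_rec / baseline

# E.g. 20k of a 120k total came via recommendations:
print(revenue_increase_ratio(rev_rec=20_000, rev_total=120_000))  # 0.2
```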
Technical Metrics
Prediction accuracy metrics (MAE, RMSE): the two most popular metrics in this group are MAE (mean absolute error) and RMSE (root mean squared error). Both measure how numerically close your predictions are to the real values. MAE weighs every error equally, while RMSE penalizes larger errors more heavily. As these metrics are widely covered elsewhere, I won't go into detail here; all you need to know is that the lower these values, the better the model.
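Both metrics are one-liners over paired predicted and actual ratings. A self-contained sketch with illustrative values:

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: squaring penalizes larger errors more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual    = [4.0, 3.0, 5.0, 2.0]  # e.g. true ratings
predicted = [3.5, 3.0, 4.0, 3.0]  # model's predicted ratings

print(mae(actual, predicted), rmse(actual, predicted))  # 0.625 0.75
```

Note that RMSE (0.75) exceeds MAE (0.625) here precisely because the two larger errors dominate after squaring.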
For more technical metrics, see link1 and link2.
Offline & Online Experiments
There are various marketing-related experiments that have proven extremely helpful for understanding customer experience. I will not go deep into them here, but they include A/B tests, focus groups, surveys, usage logs, etc.
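As one concrete example of an online experiment, an A/B test on CTR can be evaluated with a two-proportion z-test. This is a generic statistical sketch, not the procedure used in the case study, and the click counts below are made up:

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for the difference between two click-through rates."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)  # pooled click rate
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

z = two_proportion_z(clicks_a=400, n_a=10_000,   # control: 4.0% CTR
                     clicks_b=480, n_b=10_000)   # variant: 4.8% CTR
# |z| > 1.96 means the CTR lift is significant at the 5% level.
print(round(z, 2))
```

With these illustrative numbers the z-statistic is about 2.76, so the variant's lift would be statistically significant.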
I hope this was useful in helping you understand how to measure the performance of the recommendation system. With that note, we end this series.