Build a Recommendation System with Collaborative Filtering for eLearning [Practical Guide] — Part III

Published in

product.design.data

7 min read · Jan 16, 2021

Series Structure

In Part 2, I provided the building blocks required to build a recommendation system. In Part 3, I will be taking a deeper dive into developing a personalized Recommendation System with Collaborative Filtering for eLearning. It uses some of the concepts discussed in the 2nd part of the series, so please check it out if you want to get more details. The structure of the series will be as follows:

3. Recommendation System

Fig. 1: Functional Model of Personalized Recommendation System

The proposed recommender model, as shown in figure 1, is based on two approaches:

Classifying a new learner using an initial test [remove cold-start problem]
Recommendations using learner’s activities [increase engagement]

Classifying a new learner using an initial test

When a user registers, the system asks the learner to fill in the registration form and take the initial level test in order to build a learner profile based on learning activities. It also helps solve the ‘cold-start’ problem for a new user (to read more about this, click here).
After the learner completes the initial test, the results are used to classify students into homogeneous subclasses according to their knowledge level, and the classification is saved in the learner model. After this, the student can start learning.

Recommendations using learner’s activities

The recommender module generates suitable recommendations for learners based on their learning preferences and activities.
The learner model is updated dynamically from the student’s interactions with the system: user interests are extracted from log files to refresh the student’s current preferences and produce the most suitable list of recommendations.
The data mining techniques use the collected information about the learner’s interactions with various entities, such as lecture notes, videos, activities, presentations, and questions, to build the learner’s profile and produce recommendations.
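
As a minimal sketch of the profile-building idea above, the snippet below counts a learner's interactions per entity type from a log. The log layout (learner id, entity type, entity id) and the sample records are hypothetical; a real system would read them from its own log files.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (learner_id, entity_type, entity_id).
# These fields and values are illustrative, not a real log schema.
LOG = [
    ("l1", "video", "v10"),
    ("l1", "video", "v11"),
    ("l1", "lecture_notes", "n3"),
    ("l2", "question", "q7"),
    ("l2", "video", "v10"),
]


def build_profiles(log):
    """Count interactions per entity type for each learner."""
    profiles = defaultdict(Counter)
    for learner_id, entity_type, _entity_id in log:
        profiles[learner_id][entity_type] += 1
    return profiles


def preferred_types(profile):
    """Return entity types ordered from most to least used."""
    return [entity_type for entity_type, _count in profile.most_common()]


profiles = build_profiles(LOG)
print(preferred_types(profiles["l1"]))  # ['video', 'lecture_notes']
```

Interaction counts are only one possible signal; time spent or completion status could be added to the same structure.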

Students’ classification Algorithm

The student classification algorithm uses educational data mining to predict the homogeneous subclasses of students from their previous results in several assessments, which are designed to be relevant and pedagogically simple. Figure 2 below shows the detailed schema of our implemented student classifier, which is based on decision trees.

Fig. 2: schema of student classification algorithm based on decision trees
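
To make the decision-tree idea concrete, here is a small hand-written tree over initial-test scores. It is only a sketch: the assessment names, thresholds, and class labels are hypothetical, and in practice the tree would be learned from historical assessment results rather than written by hand.

```python
def classify_learner(scores):
    """Walk a small hand-written decision tree over initial-test scores.

    `scores` maps an assessment name to a percentage (0-100).
    Assessment names, thresholds, and labels are hypothetical.
    """
    # Root node: weak fundamentals place the learner in the lowest class.
    if scores["fundamentals"] < 50:
        return "beginner"
    # Second level: applied skills separate intermediate from the rest.
    if scores["applied"] < 60:
        return "intermediate"
    # Leaf decision: only strong advanced results reach the top class.
    if scores["advanced"] < 70:
        return "intermediate"
    return "advanced"


print(classify_learner({"fundamentals": 80, "applied": 75, "advanced": 90}))
```

The returned label is what would be stored in the learner model as the learner's subclass.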

Recommendation Model

The recommender module decides whether a given learning scenario is suitable for a specific learner’s preferences. This module uses collaborative filtering to classify a learning strategy as “suitable” or “not suitable” for the learner. The learning scenario is produced in four steps.

Step 1: Data cleaning & Data processing

The first step of the recommendation system is “Data Cleaning and Processing”. Data preparation matters for every method used in data mining, because real-world data tends to have missing values, errors, or outliers that deviate from the expected data.
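
A minimal cleaning pass might look like the sketch below. It assumes ratings arrive as (learner id, learning object id, rating) tuples, drops records with missing or non-numeric fields, and clamps out-of-range ratings into the 1-to-5 range. Clamping rather than discarding outliers is my assumption here, not a rule from the system described.

```python
import math


def clean_ratings(raw, low=1.0, high=5.0):
    """Drop unusable records and clamp out-of-range ratings into [low, high]."""
    cleaned = []
    for learner_id, lo_id, rating in raw:
        # Skip records with a missing learner, learning object, or rating.
        if learner_id is None or lo_id is None or rating is None:
            continue
        try:
            value = float(rating)
        except (TypeError, ValueError):
            continue  # Not a number, e.g. "n/a".
        if math.isnan(value):
            continue
        # Clamp outliers instead of dropping them (an assumed policy).
        cleaned.append((learner_id, lo_id, min(max(value, low), high)))
    return cleaned


# Hypothetical raw records with typical defects.
raw = [
    ("l1", "o1", 4),
    ("l1", None, 3),             # missing learning object
    ("l2", "o1", "9"),           # out of range, clamped to 5.0
    ("l2", "o2", "n/a"),         # not numeric
    ("l3", "o1", float("nan")),  # missing value encoded as NaN
]
print(clean_ratings(raw))  # [('l1', 'o1', 4.0), ('l2', 'o1', 5.0)]
```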

Step 2: Normalization

The second step is “Normalization”, in which the data is transformed or combined into forms appropriate for mining. The learning object (LO) recommendation sequence is based on the learner's rating.

To score a piece of content, the LO sequence considers its evaluation, the number of stars voted for it, the learner’s reputation, and the number of likes and dislikes.

Equation 1: Evaluation of Learning Object

Where Σ(𝐿) represents the total number of evaluations of a learner and Σ(𝑂) represents the total number of evaluations of the contents of this learner. After weighting the learning resources, the reference model for each learner is defined as a Learner-Learning Object rating matrix with N rows, where N denotes the number of learners L = {l1, l2, …, lN}, and M columns, where M denotes the number of learning objects O = {O1, O2, …, OM}. This matrix uses a 0-to-5 rating scale: 5 means that the learner is strongly satisfied with the selected learning object, 1 means that the learner is not at all satisfied with the learning object, and 0 indicates that the learning object has not yet been explicitly rated or used at all.
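
The Learner-Learning Object rating matrix described above can be sketched as a plain list of lists, with 0 marking objects a learner has not rated. The learner ids, object ids, and ratings below are hypothetical sample data.

```python
def build_rating_matrix(ratings, learners, objects):
    """Build an N x M matrix of ratings; 0 means 'not yet rated or used'."""
    row = {learner: i for i, learner in enumerate(learners)}
    col = {obj: j for j, obj in enumerate(objects)}
    matrix = [[0] * len(objects) for _ in learners]
    for learner, obj, rating in ratings:
        matrix[row[learner]][col[obj]] = rating
    return matrix


# Hypothetical sample: 3 learners (N = 3), 4 learning objects (M = 4).
learners = ["l1", "l2", "l3"]
objects = ["o1", "o2", "o3", "o4"]
ratings = [
    ("l1", "o1", 5), ("l1", "o3", 2),
    ("l2", "o2", 4),
    ("l3", "o1", 1), ("l3", "o4", 3),
]

matrix = build_rating_matrix(ratings, learners, objects)
for line in matrix:
    print(line)
```

For a large user base most entries are 0, so a sparse representation would be a better fit than a dense list of lists.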

Step 3: Similarity computation

The third step is “Similarity computation”. Once the learner’s model is identified, we apply collaborative filtering to create virtual communities of interest. This step relies on an improved version of K-Nearest Neighbors (K-NN), one of the best-known classification algorithms across several domains. The critical step in collaborative filtering algorithms is computing the similarity between users or items. There are various ways to calculate similarity; the most commonly used measure is cosine similarity. The similarity between two learners x and y with cosine similarity is calculated as follows:

Equation 2: Cosine Similarity between two learners x and y
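
Written out in the notation of this section, the textbook form of cosine similarity over the learning objects that both learners have rated is as follows. The symbol I_{xy} for the set of co-rated learning objects is my own shorthand, and I am assuming the plain (not mean-centered) variant.

```latex
w(x, y) =
  \frac{\sum_{j \in I_{xy}} R_{x,j} \, R_{y,j}}
       {\sqrt{\sum_{j \in I_{xy}} R_{x,j}^{2}} \;
        \sqrt{\sum_{j \in I_{xy}} R_{y,j}^{2}}}
```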

In the above equation, Rx,j and Ry,j are learner x’s and learner y’s ratings for learning object j.

If learners x and y have similar ratings for a learning object, w(x, y) > 0, and |w(x, y)| indicates how strongly learner x tends to agree with learner y on the learning objects that both have previously rated. If they have different ratings, w(x, y) < 0, and |w(x, y)| indicates how strongly they tend to disagree on the learning objects that both have rated. Overall, w(x, y) lies between -1 and 1. Note that negative values can only occur when ratings are mean-centered before the similarity is computed; on raw, non-negative ratings, cosine similarity stays between 0 and 1.
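
Here is a minimal sketch of cosine similarity between two learners' rating rows, restricted to the learning objects both have rated. It assumes the convention that 0 means "not rated" and returns 0.0 when the two learners share no rated objects; both choices are assumptions for illustration.

```python
import math


def cosine_similarity(ratings_x, ratings_y):
    """Cosine similarity over learning objects rated by both learners.

    Each argument is a list of ratings aligned by learning object;
    0 means the learner has not rated that object.
    """
    co_rated = [(a, b) for a, b in zip(ratings_x, ratings_y) if a > 0 and b > 0]
    if not co_rated:
        return 0.0  # No overlap: treat as "no evidence of similarity".
    dot = sum(a * b for a, b in co_rated)
    norm_x = math.sqrt(sum(a * a for a, _ in co_rated))
    norm_y = math.sqrt(sum(b * b for _, b in co_rated))
    return dot / (norm_x * norm_y)


# Hypothetical rating rows for three learners over the same objects.
print(round(cosine_similarity([5, 3, 0], [5, 3, 4]), 4))  # identical on co-rated objects
print(round(cosine_similarity([1, 5], [5, 1]), 4))        # opposite tastes
```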

After calculating the similarity between learners, an N×N similarity matrix is generated, where N is the number of learners. Then, to predict the active learner x’s rating for an unrated learning object j in the rating matrix, the K learners with the highest similarity to x are selected and used as input to compute the prediction for x on j.
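
Selecting the K most similar learners from the similarity matrix can be sketched as follows. The matrix values are hypothetical, and the function simply ranks every other learner by similarity to x; it does not include any of the K-NN improvements mentioned earlier.

```python
def top_k_neighbors(similarity, x, k):
    """Return the indices of the K learners most similar to learner x.

    `similarity` is an N x N list of lists; learner x itself is excluded.
    """
    candidates = [(similarity[x][y], y) for y in range(len(similarity)) if y != x]
    # Highest similarity first; the sort is stable, so ties keep index order.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [y for _weight, y in candidates[:k]]


# Hypothetical symmetric similarity matrix for 4 learners.
similarity = [
    [1.0, 0.9, 0.2, 0.5],
    [0.9, 1.0, 0.4, 0.1],
    [0.2, 0.4, 1.0, 0.7],
    [0.5, 0.1, 0.7, 1.0],
]
print(top_k_neighbors(similarity, 0, 2))  # [1, 3]
```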

Step 4: Recommendation

The last step is “Recommendation”. In this step, we compute a prediction for each learning object not yet selected by the target learner. The learning objects are then ranked in descending order of predicted rating, and those with the highest predicted ratings are recommended. To make a prediction for the active learner x on a certain learning object j, we take a weighted average of all the ratings on that learning object according to the following formula:

Equation 3: Weighted average of all ratings of active learner (x) on learning objects
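
A common way to write this similarity-weighted average is given below. The symbols P_{x,j} for the predicted rating and N_K(x) for the K nearest neighbors of x are my own notation, and the simple (not mean-offset) form is an assumption.

```latex
P_{x,j} =
  \frac{\sum_{y \in N_K(x)} w(x, y) \, R_{y,j}}
       {\sum_{y \in N_K(x)} \lvert w(x, y) \rvert}
```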

In equation (3), Ry,j denotes the rating of learning object j by learner y.
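
The prediction and ranking step can be sketched like this, assuming each neighbor is given as a (similarity, rating row) pair and that 0 means "not rated". Neighbors who have not rated an object are skipped for that object, which is my assumption rather than something the formula spells out.

```python
def predict_rating(neighbors, j):
    """Similarity-weighted average of the neighbors' ratings for object j."""
    numerator = 0.0
    denominator = 0.0
    for weight, ratings in neighbors:
        if ratings[j] == 0:
            continue  # This neighbor has not rated object j.
        numerator += weight * ratings[j]
        denominator += abs(weight)
    return numerator / denominator if denominator else 0.0


def recommend(active_ratings, neighbors, top_n):
    """Rank the active learner's unrated objects by predicted rating."""
    predictions = [
        (predict_rating(neighbors, j), j)
        for j, rating in enumerate(active_ratings)
        if rating == 0
    ]
    predictions.sort(key=lambda pair: pair[0], reverse=True)
    return [(j, score) for score, j in predictions[:top_n]]


# Hypothetical active learner and two neighbors: (similarity, rating row).
active = [5, 0, 0, 3]
neighbors = [(0.9, [5, 4, 0, 3]), (0.5, [4, 2, 5, 0])]
for obj_index, score in recommend(active, neighbors, top_n=2):
    print(obj_index, round(score, 2))
```

With this sample data, object 2 is ranked ahead of object 1, because the only neighbor who rated it gave it a 5.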

Implementation

We implemented the proposed solution in an e-Learning personalization system that takes the learner’s learning activities into account and applies content-based filtering, collaborative filtering, and educational data mining methods to produce recommendations. We address the cold-start problem by introducing an initial level test that defines the initial profile of a new learner. In this project, the system evaluates the learner’s level of knowledge, learning activities, and performance. Then the system presents the recommendation list according to the learner’s evaluation results and profile.

Results

We deployed the recommendation system to a focus group of 10,000 students to train the model and observed its performance over a period of 4 weeks before rolling it out to the complete user base.

Metric used to enhance recommendation engine performance — Improvement in user’s test scores over time

We compared the average test score of users who completed activities (LOs) suggested by the recommendation system with that of students who completed the same activities (LOs) without recommendations, and plotted the trend line chart. We noticed an increase of 16% in test scores.
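
For clarity on how such a percentage is computed, the sketch below shows the relative improvement in mean test score between two groups. The scores are made-up values chosen only to exercise the arithmetic; they are not the study data.

```python
from statistics import mean


def relative_improvement(baseline_scores, treated_scores):
    """Percentage change in mean score relative to the baseline group."""
    baseline = mean(baseline_scores)
    return (mean(treated_scores) - baseline) / baseline * 100.0


# Made-up scores for illustration only.
without_recommendation = [50, 60, 70]
with_recommendation = [55, 66, 77]
change = relative_improvement(without_recommendation, with_recommendation)
print(f"{change:.1f}%")  # 10.0%
```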

Future Work

We also measured a few other technical metrics (covered in Part IV of the series) to improve prediction accuracy. The next focus area of my work will be incorporating test score results (metadata) to suggest lecture notes, videos, examples, exercises, etc., based on the learner’s strengths and weaknesses for a concept.

I hope this case study was useful in helping you understand how to build collaborative filtering for a personalized recommendation. Now continue on to Part IV: Measuring the performance of recommendation system: Metrics
