Final Week — MOOC Recommendation

Arif Enes Aydın
AIN311 Fall 2022 Projects
4 min read · Jan 1, 2023

Hello everyone. Welcome to the last blog post about our course project. Last week, we preprocessed the comment data so that we could derive ratings from it. This week, we finalized everything we have done so far and combined collaborative filtering with comment analysis.

To decide on our final model for collaborative filtering, we tested different algorithms to see which performed best. These methods come from the Surprise library, a Python scikit for recommender systems; some of them are memory-based and some are model-based.
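Here is a minimal sketch of how such a comparison can be run with Surprise. The file name, column names, and rating scale are illustrative assumptions, not our exact setup:

```python
import pandas as pd
from surprise import (BaselineOnly, CoClustering, Dataset, KNNBaseline,
                      NMF, NormalPredictor, Reader, SlopeOne, SVD)
from surprise.model_selection import cross_validate

# Hypothetical ratings file: one row per (user, course, rating)
df = pd.read_csv("ratings.csv")

reader = Reader(rating_scale=(0, 5))
data = Dataset.load_from_df(df[["user_id", "course_id", "rating"]], reader)

# Benchmark memory- and model-based algorithms by mean 5-fold RMSE
for algo in [NormalPredictor(), BaselineOnly(), KNNBaseline(),
             SVD(), NMF(), SlopeOne(), CoClustering()]:
    results = cross_validate(algo, data, measures=["RMSE"], cv=5, verbose=False)
    print(f"{type(algo).__name__}: {results['test_rmse'].mean():.3f}")
```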

Here are the root-mean-squared-error values of the different methods we tested:

RMSE performance of each tested algorithm

The best-performing ones were the BaselineOnly, KNNBaseline, and SVD algorithms. We tried to fine-tune these three to see whether we could improve their performance further. Parameter searching with GridSearchCV showed that we could not improve performance with different parameters, so we chose SVD as our final model. A sketch of the search is below.
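For reference, a grid search over SVD with Surprise looks roughly like this. The parameter values are illustrative assumptions, and `data` is the dataset from the previous snippet:

```python
from surprise import SVD
from surprise.model_selection import GridSearchCV

# Candidate values for a few SVD hyperparameters (assumed, not our exact grid)
param_grid = {
    "n_factors": [50, 100],
    "lr_all": [0.002, 0.005],
    "reg_all": [0.02, 0.1],
}

gs = GridSearchCV(SVD, param_grid, measures=["rmse"], cv=3)
gs.fit(data)

print(gs.best_score["rmse"])   # best cross-validated RMSE
print(gs.best_params["rmse"])  # parameters that achieved it
```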

The previous week, we preprocessed the data for modeling. A few steps are needed before the data is ready, so let's dive in.

We decided to use sentiment analysis scores to predict the rating, so we used the vaderSentiment library to analyze the comments. Its SentimentIntensityAnalyzer takes a comment and returns a dict containing the following scores (a minimal usage sketch follows the list):

  • Negative (neg): negativity value of the comment
  • Neutral (neu): neutrality value of the comment
  • Positive (pos): positivity value of the comment
  • Compound (compound): compound score of the comment
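The example comment here is made up:

```python
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
scores = analyzer.polarity_scores("Great course, the lectures were very clear!")

# `scores` is a dict with the keys 'neg', 'neu', 'pos', and 'compound'
print(scores)
```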

After the process, the data looked like this:

First five comments with their sentiment scores and ratings

Since all of the features are now numeric, the problem calls for a regression model. We decided to try four different algorithms and select the best one by its RMSE score. We also tried variations such as normalizing and standardizing the data, and evaluated the effects of these operations; a sketch of the comparison follows the figure below.

RMSE for each algorithm and scaling variant
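One way to run this comparison, assuming the sentiment scores are in a feature matrix `X` and the ratings in `y`. SVR is shown as the representative candidate; the other three algorithms would be swapped in the same way:

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.svm import SVR

# X: sentiment-score features (neg, neu, pos, compound); y: ratings
# Compare no scaling, normalization, and standardization by 5-fold RMSE
scalers = {"no scaling": None,
           "normalized": MinMaxScaler(),
           "standardized": StandardScaler()}
for name, scaler in scalers.items():
    model = SVR() if scaler is None else make_pipeline(scaler, SVR())
    rmse = -cross_val_score(model, X, y,
                            scoring="neg_root_mean_squared_error", cv=5).mean()
    print(f"SVR ({name}): RMSE = {rmse:.3f}")
```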

As you may guess, we chose SVR with no scaling. We listed candidate values for three SVR hyperparameters and determined the optimal combination with GridSearchCV. We then fitted a new SVR model with the tuned hyperparameters, but the result was not impressive: the RMSE decreased by only 0.001.
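The search itself, sketched with scikit-learn's GridSearchCV; the three hyperparameters and their candidate values here are assumptions, not our exact grid:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Illustrative grid over three SVR hyperparameters
param_grid = {
    "C": [0.1, 1, 10],
    "epsilon": [0.01, 0.1, 0.5],
    "gamma": ["scale", 0.1, 1],
}

search = GridSearchCV(SVR(), param_grid,
                      scoring="neg_root_mean_squared_error", cv=5)
search.fit(X, y)  # X, y as in the previous snippet

print(search.best_params_)
print(-search.best_score_)  # best cross-validated RMSE
```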

Disappointed with this result, we suspected something was wrong with the data. So we split the rating range into sub-intervals and evaluated how the model predicts each one.

Average ratings per sub-interval, actual vs. predicted
  • In the 0–1 interval, the predictions are higher than the actual ratings
  • In the 4–5 interval, the predictions are lower than the actual ratings
  • For the other intervals, the differences are small

Why is our model unsuccessful at predicting the 0–1 and 4–5 cases? We think some users are generous or ungenerous raters: their comments read as neutral, yet they give a very high or very low rating. This situation affects the results significantly, so let's detect these inconsistencies (a filtering sketch follows the list):

  • Comments rated below 3 whose positive score is higher than their negative score are considered inconsistent
  • Comments rated above 3 whose negative score is higher than their positive score are considered inconsistent as well
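In pandas, this filter is a pair of boolean masks. The column names `rating`, `pos`, and `neg` are assumptions based on the VADER output above:

```python
# A comment is inconsistent when its rating and its sentiment
# point in opposite directions
inconsistent = ((df["rating"] < 3) & (df["pos"] > df["neg"])) | \
               ((df["rating"] > 3) & (df["neg"] > df["pos"]))

print(f"Inconsistent comments: {inconsistent.mean():.1%}")
consistent_df = df[~inconsistent]
```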

Applying these rules, we found that approximately 26% of the comments are inconsistent; 94% of the inconsistencies came from positive comments rated very low. Without the inconsistent comments, the RMSE is 0.853, which means this further analysis decreased the RMSE by 0.235. That improvement is quite large compared with the 0.001 gained from hyperparameter tuning.

Finally, we applied the comment-analysis model to our dataset to produce predicted ratings in place of the true ratings given by the users. We then fitted our collaborative-filtering model on these ratings to see whether the results improved or worsened (a sketch follows the list):

  • With the actual ratings, the RMSE was ~0.70
  • With the predicted ratings, the RMSE was ~0.95
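Sketched with Surprise, this amounts to swapping the true ratings for the sentiment-predicted ones; the `predicted_rating` column name is an assumption:

```python
from surprise import Dataset, Reader, SVD
from surprise.model_selection import cross_validate

reader = Reader(rating_scale=(0, 5))

# Same user/course pairs, but rated by the comment-analysis model
pred_data = Dataset.load_from_df(
    df[["user_id", "course_id", "predicted_rating"]], reader)
cross_validate(SVD(), pred_data, measures=["RMSE"], cv=5, verbose=True)
```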

The results with the predicted ratings are worse than with the true ratings, but the model is still capable of making recommendations.

Finally, we built two demos to show example recommendations and comment analyses.

With this, both our blog series and our course project come to an end. We learned many different things during this period. Thank you for your interest in our project journey, and we wish you pleasant days.

Contributors:
