Paper learning(1) — Addressing Cold Start in Recommender Systems: A Semi-supervised Co-training Algorithm
1. Background
To better handle the cold-start problem in recommender systems, the paper proposes a context-aware semi-supervised co-training method named CSEL.
Tasks for the recommendation engine:
(1) understand users' personalized preferences from their historical rating behavior
(2) improve recommendation accuracy for new items and new users
Defects of prior algorithms
Schein et al. proposed combining content and collaborative data under a single probabilistic framework. However, their method is based on a Bayes classifier and cannot accurately model users' fine-grained preferences.
Questions
(1) How to build models using information from different sources?
(2) How can the different models help each other?
Solutions and Contributions
(1) a fine-grained context-aware approach that incorporates additional sources of information about users and items, rather than only users' rating information
(2)a semi-supervised co-training method to build the recommendation model
2. Context-aware Factorization
(1) Start from a model-based collaborative filtering (CF) algorithm. The recommendation error increases quickly when the standard approach is applied directly to unpopular items.
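For reference, the standard biased matrix factorization predictor (a sketch following Koren's well-known formulation; the paper's notation may differ):

$$\hat{r}_{ui} = \mu + b_u + b_i + q_i^{\top} p_u$$

where $\mu$ is the global mean rating, $b_u$ and $b_i$ are the user and item biases, and $p_u, q_i$ are latent factor vectors. For an unpopular item, $b_i$ and $q_i$ are estimated from very few ratings, which is where the error comes from.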

(2) Enhance the model by letting item biases share components across items linked by genres.
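A plausible form of the shared component (my reconstruction, not the paper's verbatim equation):

$$b_i \;\leftarrow\; b_i + \frac{1}{|G(i)|}\sum_{g \in G(i)} b_g$$

where $G(i)$ is the set of genres of item $i$ and $b_g$ is a bias shared by all items of genre $g$, so even a brand-new item inherits signal from its genres.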

(3) Incorporate user context (attributes) such as age and gender into the model.
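Analogously, the user bias can pool over the user's attribute values (again a sketch, with $A(u)$ my notation):

$$b_u \;\leftarrow\; b_u + \frac{1}{|A(u)|}\sum_{a \in A(u)} b_a$$

where $A(u)$ is the set of attribute values of user $u$ (e.g., an age group and a gender), so a new user starts with a demographic prior.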

(4) Mix the user and item contexts to obtain a further optimization.
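Combining the pieces, the full context-aware predictor would look like (sketch):

$$\hat{r}_{ui} = \mu + b_u + \frac{1}{|A(u)|}\sum_{a \in A(u)} b_a + b_i + \frac{1}{|G(i)|}\sum_{g \in G(i)} b_g + q_i^{\top} p_u$$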

Model learning
Define an objective function and learn the context model by minimizing the prediction error over all examples in the labeled (training) set L.
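A regularized least-squares objective consistent with this setup (a sketch; $\lambda$ is the regularization weight):

$$\min_{b_*,\,p_*,\,q_*}\; \sum_{(u,i) \in L} \big( r_{ui} - \hat{r}_{ui} \big)^2 + \lambda \Big( b_u^2 + b_i^2 + \|p_u\|^2 + \|q_i\|^2 + \textstyle\sum_a b_a^2 + \sum_g b_g^2 \Big)$$

typically minimized by stochastic gradient descent over the examples in $L$.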

3. Semi-supervised Co-training
CSEL consists of three major steps:
(1) Constructing multiple regressors
(2) Co-training
(3) Assembling the results

3.1 Constructing Multiple Regressors
(1) by Manipulating the Training Set
The base learner for generating the regressors can be a standard factorization model; e.g., train one regressor per random subsample of the training set, as sketched below.
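Sample two subsets $L_1, L_2 \subset L$ and train one factorization model on each (sketch; superscripts mark per-regressor parameters):

$$h_k(u,i) = \mu + b_u^{(k)} + b_i^{(k)} + q_i^{(k)\top} p_u^{(k)}, \qquad h_k \text{ trained on } L_k,\; k \in \{1, 2\}$$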

(2) by Manipulating the Attributes
Another way is to split the attributes across multiple regressors, giving each regressor a different view of the data, as in the split sketched below.
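One plausible split (my assumption about which attributes go where): one regressor sees the user context and the other the item context,

$$h_1(u,i) = \mu + b_u + \tfrac{1}{|A(u)|}\textstyle\sum_{a \in A(u)} b_a + b_i + q_i^{\top} p_u$$

$$h_2(u,i) = \mu + b_u + b_i + \tfrac{1}{|G(i)|}\textstyle\sum_{g \in G(i)} b_g + q_i^{\top} p_u$$

which keeps the two views different enough for co-training to be useful.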

(3) by a Hybrid Method, i.e., combining the two strategies above
3.2 Semi-supervised Co-training
(1) Confidence for the FactCF Model
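The paper assigns a confidence score to each prediction. One plausible proxy (my assumption, not the paper's exact definition) is that a factorization model is more confident on user-item pairs with more observed ratings:

$$C_1(u,i) \propto \log(1 + n_u) \cdot \log(1 + n_i)$$

where $n_u$ and $n_i$ are the numbers of ratings by user $u$ and on item $i$.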

(2) Confidence for the Context-aware Model
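Analogously (again an assumed form, not the paper's verbatim equation), the context-aware model can be treated as more confident when its shared context components are well supported:

$$C_2(u,i) \propto \log\Big(1 + \textstyle\sum_{a \in A(u)} n_a\Big) \cdot \log\Big(1 + \textstyle\sum_{g \in G(i)} n_g\Big)$$

where $n_a$ and $n_g$ count the ratings observed under attribute value $a$ and genre $g$.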

Constructing and Co-training with the Teaching Set
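A minimal sketch of the co-training loop with a teaching set, assuming a generic regressor interface with `fit`, `predict`, and `confidence` methods (the interface and all names are my assumptions, not the paper's API):

```python
def co_train(h1, h2, labeled, unlabeled, rounds=10, m=100):
    """Co-train two rating regressors (a sketch of the CSEL idea).

    labeled:   list of (user, item, rating) triples
    unlabeled: list of (user, item) pairs with unknown ratings
    m:         number of confident examples each model teaches per round
    """
    train1, train2 = list(labeled), list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        h1.fit(train1)
        h2.fit(train2)
        # Each model pseudo-labels the pool; its most confident
        # predictions become a "teaching set" for the other model.
        for teacher, student_train in ((h1, train2), (h2, train1)):
            ranked = sorted(pool, key=lambda ui: teacher.confidence(*ui),
                            reverse=True)
            teaching_set = ranked[:m]
            for (u, i) in teaching_set:
                student_train.append((u, i, teacher.predict(u, i)))
            taught = set(teaching_set)
            pool = [ui for ui in pool if ui not in taught]
    return h1, h2
```

Each round, confidently pseudo-labeled pairs migrate from the unlabeled pool into the other regressor's training set; this is how the two views teach each other and extend coverage to cold-start users and items.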

3.3 Assembling the Results
(1) a straightforward method is to take the average, i.e.,
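For two regressors (sketch):

$$\hat{h}(u,i) = \tfrac{1}{2}\big(h_1(u,i) + h_2(u,i)\big)$$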

(2) Considering the confidence of the different regressors, assemble the results by a weighted vote of the individual regressors, i.e.,
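A confidence-weighted combination consistent with this description (sketch; $C_k$ is regressor $h_k$'s confidence from Section 3.2):

$$\hat{h}(u,i) = \sum_k \frac{C_k(u,i)}{\sum_j C_j(u,i)}\; h_k(u,i)$$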

4. Thoughts
(1) co-training
Co-training is a semi-supervised learning technique that requires two views of the data. It assumes that each example is described by two different feature sets that provide different, complementary information about the instance. Ideally, the two views are conditionally independent (i.e., the two feature sets of each instance are conditionally independent given the class) and each view is sufficient (i.e., the class of an instance can be accurately predicted from each view alone). Co-training first learns a separate classifier for each view using any labeled examples. The most confident predictions of each classifier on the unlabeled data are then used to iteratively construct additional labeled training data.

(2) Collaborative filtering (CF) comes in two kinds: memory-based and model-based. Memory-based CF includes user-based and item-based recommendation. Model-based CF includes latent factor models such as matrix factorization, which can incorporate user/item features and social relationships.
