Recommendation System for E-commerce customers — Part III (Hybrid Recommendation)

Raymond Kwok
Published in Analytics Vidhya · 6 min read · Jul 11, 2019

Explained with code

Part I: Content-based

Part II: Collaborative

Part III: Hybrid

This is the third article of this recommender series, focusing on the hybrid approach. For the introduction and the data used in the series, please refer to the first article.

Hybrid approach

In the content-based approach of the first article, each recommended item is described with our knowledge about it; for a movie, the description could include its director, cast, genres, language, runtime, professional reviews, use of effects such as computer graphics, and so on. Based on what a customer has liked in the past, new but similar items can then be recommended. In the collaborative filtering of the second article, there is no explicit description of an item; instead, the interaction history between items and customers is used to find behavioral patterns among customers, so that a customer can be suggested items that similar customers like.

There is no standard rule for building a hybrid system; I think the term applies whenever the strengths of different approaches are combined cleverly. That is the case here: my attempt is to take the category embeddings obtained in the content-based approach, together with the customer embeddings from the collaborative filtering approach, feed them into a new neural network, and see if this makes a difference.

The trained model could be used in two different modes —

(1) use it to re-order the recommendations ranked by the previous approaches, and

(2) use it to make its own recommendations.

We will compare how well all these approaches work with a plot at the end.

Another NN to make use of previously prepared embeddings

Let us start with the code, which is in fact a simple stack of five dense layers.

from keras.layers import Input, Dense
from keras.models import Model
# nfeats: int, number of features. In our case, this is the size of a user
# embedding plus two times the size of a category embedding.
inputlayer = Input(name='input', shape=[nfeats])
denselayer = Dense(int(nfeats*1.5), activation='relu')(inputlayer)
denselayer = Dense(nfeats, activation='relu')(denselayer)
denselayer = Dense(nfeats//4, activation='relu')(denselayer)
denselayer = Dense(nfeats//16, activation='relu')(denselayer)
outputlayer = Dense(1, activation='sigmoid')(denselayer)
model = Model(inputs=[inputlayer], outputs=outputlayer)
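The article does not show the training call itself, so here is a minimal compile-and-fit sketch; the optimizer, epoch count, batch size, and the X_train / y_train arrays are assumptions rather than the actual settings used. Binary cross-entropy is a natural loss given the single sigmoid output and labels between 0 and 1.

# A minimal training sketch (hyperparameters are assumptions, not the
# article's actual settings). X_train / y_train are hypothetical arrays of
# input rows and labels, built as described in the following paragraphs.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_split=0.1, epochs=20, batch_size=256)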

The architecture of this NN could be tuned for the best performance; what is worth discussing is the inputs. The customer event record is roughly 3 months long, and it is divided into a number of periods, each 18 days long. Consecutive periods overlap, which expands the input data size.
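To illustrate the splitting, here is a hypothetical sketch in pandas; the 9-day step is an assumption (the article only says the periods overlap), and the events DataFrame with its timestamp column is an assumed name.

import pandas as pd

# Cut ~3 months of events into overlapping 18-day periods.
# `events` is a hypothetical DataFrame with a 'timestamp' column.
period_len = pd.Timedelta(days=18)
step = pd.Timedelta(days=9)           # assumed step; consecutive periods overlap
start, end = events['timestamp'].min(), events['timestamp'].max()
periods = []
t = start
while t + period_len <= end:
    mask = (events['timestamp'] >= t) & (events['timestamp'] < t + period_len)
    periods.append(events[mask])
    t += step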

As a reminder, each customer record contains a customer ID and the category ID of the item he interacted with (viewed / added-to-cart / purchased).

In each period, for every customer, we take (1) the average of all category embeddings he interacted with, together with (2) the customer’s embedding, and (3) the embedding of a category that he interacted with in the next period.

A 5-layer NN. The input consists of three parts: (1) blue: the customer’s embedding, (2) green: the averaged category embedding in the current period, (3) red: a category embedding in the next period. The red box could be substituted with different categories during prediction to test how likely the customer is to interact with each of them in the future.
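Putting these three parts together, a sketch of building one input row could look like the following; the cust_emb and cat_emb lookup dictionaries and the build_row helper are hypothetical names standing in for the embeddings obtained in Parts I and II.

import numpy as np

# Hypothetical lookups built in the earlier parts of the series:
#   cust_emb[customer_id] -> 1-D customer embedding (collaborative filtering)
#   cat_emb[category_id]  -> 1-D category embedding (content-based)
def build_row(customer_id, current_cats, candidate_cat):
    # current_cats: categories interacted with in the current period
    # candidate_cat: a category from the next period to score
    avg_cat = np.mean([cat_emb[c] for c in current_cats], axis=0)
    return np.concatenate([cust_emb[customer_id],    # blue box
                           avg_cat,                  # green box
                           cat_emb[candidate_cat]])  # red box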

The choice of inputs matters a lot, and if a more comprehensive input could be provided to the NN, the result could be even better! What is presented here is certainly just one way to do it.

This NN tells us, given a customer’s properties (the blue box) and his record (the green box), how likely he is to interact with the category in the red box. During training, different categories can be put in the red box, with the expected output label between 0 and 1 depending on whether, and how many times, he interacted with it in the next period. Both positive ( label > 0 ) and negative ( label = 0 ) samples should be provided in the training process.
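As an example, here is a hypothetical sketch of building the training samples for one customer and one period, reusing the build_row helper from the sketch above and assuming a next_counts dict of his next-period interaction counts per category; the count cap and the number of negatives are illustrative choices, not the article’s actual settings.

import numpy as np

def make_samples(customer_id, current_cats, next_counts, all_cats,
                 n_neg=5, cap=5):
    rows, labels = [], []
    # Positive samples: categories actually interacted with in the next period.
    for cat, cnt in next_counts.items():
        rows.append(build_row(customer_id, current_cats, cat))
        labels.append(min(cnt, cap) / cap)     # scale counts into (0, 1]
    # Negative samples: categories not touched in the next period.
    neg_pool = [c for c in all_cats if c not in next_counts]
    chosen = np.random.choice(neg_pool, size=min(n_neg, len(neg_pool)),
                              replace=False)
    for cat in chosen:
        rows.append(build_row(customer_id, current_cats, cat))
        labels.append(0.0)
    return np.array(rows), np.array(labels)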

Training curve measured in accuracy. The purple line is the actual training curve of the model, with an increasing trend over epochs. The red line monitors the accuracy on the positive samples only; it fluctuates a lot because new negative samples are added during training.

Comparison of all approaches

Let us name the NN above the Hybrid model. The content-based approach (red line in the plot below), the collaborative filtering (purple line), and this hybrid model (green line) can each produce its own recommendations for customers. Moreover, the categories recommended by the content-based and collaborative filtering approaches can be fed into the hybrid model for re-ranking (dashed red and dashed purple lines). The results of all these approaches are examined by plotting the percentage of customers who get a match, WITHOUT the recommendations ever actually being sent to them. The x-axis is the number of recommendations made, and naturally, the more recommendations you make, the more likely you are to get a match.

Percentage of customers with a matched recommendation in the next period WITHOUT notifying them of the recommendations — essentially this measures the precision of the recommendations.
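The metric itself is simple; a hypothetical sketch, where recs[cust] is a ranked list of recommended category IDs per customer and actual[cust] is the set of categories the customer interacted with in the next period (both are assumed names):

import numpy as np

def match_rate(recs, actual, k):
    # Share of customers for whom at least one of the top-k recommended
    # categories appears among their next-period interactions.
    hits = [len(set(recs[c][:k]) & actual[c]) > 0 for c in recs]
    return np.mean(hits)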

Although the content-based approach (red line) performs the worst, always lying below all the others, this is not necessarily a problem with the approach itself; it could be that the category descriptions are simply not good enough to highlight the common characteristics across categories (to increase the true positives) or to make them well distinguishable (to reduce the false positives). Additional features for the categories may help.

Given that lack of good descriptions, the collaborative filtering (purple line) consistently does better, by roughly 15%, showing that the interaction history between customers and categories has good predictive power on its own.

The hybrid model (green line) beats the collaborative model (purple line) beyond a certain point, which is an encouraging result: we could switch between these two models in different cases to get the best match!

The dashed lines show the recommendations from the content-based and collaborative approaches re-ordered by the hybrid model. An improvement can be seen in the content-based case, since the hybrid model is the better of the two, but the opposite happens in the collaborative case. Therefore, if we want to make safer (which is the nature of the content-based approach) but better recommendations to customers, we could go for the hybrid + content-based mode.
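The re-ranking mode itself is just a matter of scoring every candidate with the hybrid model and sorting; a sketch, reusing the hypothetical build_row helper and the trained model from above, where candidate_cats would be the top categories coming out of the content-based or collaborative model:

import numpy as np

def rerank(customer_id, current_cats, candidate_cats):
    # Score each candidate category with the hybrid model and sort by the
    # predicted interaction probability, highest first.
    X = np.stack([build_row(customer_id, current_cats, c)
                  for c in candidate_cats])
    scores = model.predict(X).ravel()
    order = np.argsort(-scores)
    return [candidate_cats[i] for i in order]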

Lastly, the customers were not actually notified of the recommendations in this result, so it could have done better if they had been. It should not be forgotten that recommendation is not only about making precise predictions (which, of course, is very important too, because disturbing customers with very bad recommendations is unwanted); in my opinion it is also about letting customers discover.

Summary

Three ways of making recommendations have been presented and compared. It is important to note that they are only different ways of using the data we have about the customers, and they should not be the only ways.

Sometimes the approach is limited by the degree of understanding of the customers and the recommended items, sometimes by the size of the data and the computation resources, and sometimes by the business case or the expected level of conservativeness. The right approach is the one that does a better job given all these constraints.

These words may sound like common sense, but they are a good reminder that this is where creativity needs to come into play to make things happen well, so please never feel bound by these approaches.

Thank you for reading the article and for following the series. Please feel free to share your comments and feedback.
