Recommender system on the MovieLens dataset using an Autoencoder and TensorFlow in Python

Soumya Ghosh
4 min read · Mar 17, 2018

I’m a huge fan of autoencoders. They have a ton of uses: dimensionality reduction, like I show here, image denoising, like I show in this tutorial, and a lot of other stuff.

Today I’ll use one to build a recommender system on the MovieLens 1M dataset. You can download it yourself from here. I was mostly inspired by this research paper to build this model. First, let me show you what the neural net looks like. I took this picture straight out of the research paper.

This is a shallow neural net with only one hidden layer. I’ll feed in all the ratings a user has given and expect a more generalized rating distribution for that user to come out. I can use that to get an idea of what their ratings would be for movies they haven’t watched.

Now, there are only about six thousand users in this dataset (6,040, to be exact). That’s nowhere near what we need to build a good neural net model, but it makes for a good exercise. Let’s start by importing some libraries.

# Importing tensorflow
import tensorflow as tf
# Importing some more libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error as MSE

Then I read the ratings data. It has a user_id, a movie_id, a rating and a timestamp. I drop the timestamp and pivot the data so that there is one row per user, with that user’s movie ratings as the features. Then I create train and test sets from the pivoted data.

# reading the ratings data
ratings = pd.read_csv('ml-1m/ratings.dat',
                      sep="::", header=None, engine='python')
# Let's pivot the data to get it at a user level
ratings_pivot = pd.pivot_table(ratings[[0,1,2]],
                               values=2, index=0, columns=1).fillna(0)
# creating train and test sets
X_train, X_test = train_test_split(ratings_pivot, train_size=0.8)
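To make the pivot concrete, here is a small sketch of what it produces: one row per user, one column per movie, and 0 wherever the user gave no rating. It builds the matrix with plain NumPy rather than pd.pivot_table, and the toy rows and the helper name pivot_to_user_matrix are made up for illustration.

```python
import numpy as np

# Hypothetical toy ratings: (user_id, movie_id, rating) rows,
# in the same column order as ratings.dat once the timestamp is dropped.
toy_ratings = np.array([
    [1, 10, 5],
    [1, 20, 3],
    [2, 20, 4],
    [3, 30, 1],
])

def pivot_to_user_matrix(rows):
    """Return (user_ids, movie_ids, matrix) with one row per user,
    one column per movie, and 0 where the user gave no rating."""
    user_ids = np.unique(rows[:, 0])    # sorted unique users -> row labels
    movie_ids = np.unique(rows[:, 1])   # sorted unique movies -> column labels
    matrix = np.zeros((len(user_ids), len(movie_ids)))
    # Map each id to its row/column position in the sorted label arrays.
    row_pos = np.searchsorted(user_ids, rows[:, 0])
    col_pos = np.searchsorted(movie_ids, rows[:, 1])
    matrix[row_pos, col_pos] = rows[:, 2]
    return user_ids, movie_ids, matrix

user_ids, movie_ids, user_matrix = pivot_to_user_matrix(toy_ratings)
print(user_matrix)
```

Each row of user_matrix is what one user would look like when fed to the network.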

So right now I have the data at user level, with each user’s ratings as the features. There are 3706 movies in total, so the dataset has 3706 features per user. I replace every rating the user didn’t give with 0; it makes things simpler. Now let’s start building the model.
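One side effect of filling with 0 is that any error computed against this matrix treats those zeros as if they were real ratings. A variation, which is not what this post does, is to score only the entries a user actually rated. Here is a minimal NumPy sketch of that idea; the function name masked_mse and the sample numbers are assumptions for illustration.

```python
import numpy as np

def masked_mse(predicted, actual):
    """Mean squared error over entries where actual != 0 (i.e. rated)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    rated = actual != 0                     # True where a real rating exists
    if not rated.any():
        raise ValueError("no rated entries to score")
    return float(np.mean((predicted[rated] - actual[rated]) ** 2))

# Hypothetical values: one user, four movies, two of them rated.
actual = np.array([[5.0, 0.0, 3.0, 0.0]])
predicted = np.array([[4.0, 2.0, 3.0, 1.0]])

plain = float(np.mean((predicted - actual) ** 2))   # zeros count as ratings
masked = masked_mse(predicted, actual)              # only rated entries
print(plain, masked)
```

The plain version penalises the model for predicting anything other than 0 on unrated movies, while the masked version ignores them.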

# Deciding how many nodes each layer should have
n_nodes_inpl = 3706
n_nodes_hl1 = 256
n_nodes_outl = 3706
# hidden layer has (3706+1)*256 weights; the extra row is for the bias node
hidden_1_layer_vals = {'weights':tf.Variable(tf.random_normal(
    [n_nodes_inpl+1,n_nodes_hl1]))}
# output layer has (256+1)*3706 weights; the extra row is for the bias node
output_layer_vals = {'weights':tf.Variable(tf.random_normal(
    [n_nodes_hl1+1,n_nodes_outl]))}

I’ve defined how many nodes each layer has and what the weight matrices look like. Notice that I’m not defining any bias vector for the layers. That’s because I’m going to add a bias node to each layer instead, with a constant value of one. This is a bit different from how I usually build a network, but I’m sticking to what’s depicted in the original picture.
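If the bias node trick looks odd, this small NumPy sketch shows why it should give the same result as a separate bias vector: appending a column of ones to the input and adding one extra row to the weight matrix is the same arithmetic as multiplying by the weights and then adding a bias. The tiny sizes and variable names are placeholders, not the real 3706 and 256.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small sizes standing in for 3706 inputs and 256 hidden nodes.
n_users, n_inputs, n_hidden = 4, 6, 3

x = rng.random((n_users, n_inputs))                 # a batch of inputs
w_aug = rng.normal(size=(n_inputs + 1, n_hidden))   # weights incl. bias row

# Bias-node form: append a column of ones, then one matrix multiply.
ones = np.ones((n_users, 1))
out_bias_node = np.concatenate([x, ones], axis=1) @ w_aug

# Separate-bias form: the last row of w_aug plays the role of the bias vector.
weights, bias = w_aug[:-1], w_aug[-1]
out_separate_bias = x @ weights + bias

print(np.allclose(out_bias_node, out_separate_bias))
```

The last row of the weight matrix ends up being learned just like a bias would be.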

# a user with 3706 ratings goes in
input_layer = tf.placeholder('float', [None, 3706])
# add a constant node to the first layer
# it needs to have the same number of rows as the input layer for me to be
# able to concatenate it later
input_layer_const = tf.fill([tf.shape(input_layer)[0], 1], 1.0)
input_layer_concat = tf.concat([input_layer, input_layer_const], 1)
# multiply output of input_layer with a weight matrix
layer_1 = tf.nn.sigmoid(tf.matmul(input_layer_concat,
                                  hidden_1_layer_vals['weights']))
# adding one bias node to the hidden layer
layer1_const = tf.fill([tf.shape(layer_1)[0], 1], 1.0)
layer_concat = tf.concat([layer_1, layer1_const], 1)
# multiply output of hidden with a weight matrix to get final output
output_layer = tf.matmul(layer_concat, output_layer_vals['weights'])
# output_true shall have the original shape for error calculations
output_true = tf.placeholder('float', [None, 3706])
# define our cost function
meansq = tf.reduce_mean(tf.square(output_layer - output_true))
# define our optimizer
learn_rate = 0.1 # how fast the model should learn
optimizer = tf.train.AdagradOptimizer(learn_rate).minimize(meansq)

Usually I add a bias vector instead of a bias node, but it’s supposed to act the same way. Alright, now that we’re done building the model, I’ll define how many epochs I want to run and the batch size.

# initialising variables and starting the session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
# defining batch size and number of epochs
batch_size = 100 # how many users to use together for training
hm_epochs = 200 # how many times to go through the entire dataset
tot_users = X_train.shape[0] # total number of users

Now I’ll start the training. All I’m doing is iterating through the data in batches, training the model and printing out the train and test error after each epoch.

# running the model for 200 epochs, taking 100 users per batch
# train and test error are printed out after each epoch
for epoch in range(hm_epochs):
    epoch_loss = 0 # initializing error as 0

    for i in range(int(tot_users/batch_size)):
        epoch_x = X_train[ i*batch_size : (i+1)*batch_size ]
        _, c = sess.run([optimizer, meansq],
                        feed_dict={input_layer: epoch_x,
                                   output_true: epoch_x})
        epoch_loss += c

    output_train = sess.run(output_layer,
                            feed_dict={input_layer:X_train})
    output_test = sess.run(output_layer,
                           feed_dict={input_layer:X_test})

    print('MSE train', MSE(output_train, X_train),'MSE test', MSE(output_test, X_test))
    print('Epoch', epoch, '/', hm_epochs, 'loss:',epoch_loss)
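One detail of the batching worth knowing: the number of batches is the number of users divided by the batch size, rounded down, so any users left over after the last full batch are not used for training in that pass. The sketch below lists the row ranges with and without that leftover batch. The helper batch_bounds and the row count 4832 are assumptions for illustration; the real count is whatever X_train.shape[0] turns out to be.

```python
def batch_bounds(n_rows, batch_size, keep_remainder=False):
    """Return (start, stop) row ranges for one pass over n_rows rows."""
    n_full = n_rows // batch_size
    bounds = [(i * batch_size, (i + 1) * batch_size) for i in range(n_full)]
    if keep_remainder and n_rows % batch_size:
        # Add the final, smaller batch holding the leftover rows.
        bounds.append((n_full * batch_size, n_rows))
    return bounds

# Assumed row count for illustration only; use X_train.shape[0] in practice.
n_train = 4832
full_only = batch_bounds(n_train, 100)
with_tail = batch_bounds(n_train, 100, keep_remainder=True)
print(len(full_only), len(with_tail), with_tail[-1])
```

For an exercise like this the leftover rows hardly matter, but it is an easy thing to change if you want every user to be seen in each epoch.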

I’m almost finished. Now, to get the predicted ratings for a user, all you have to do is pass their ratings through the net once. If you’re going to use the output to recommend that user some movies, you can pick the ones with the highest predicted ratings that they haven’t seen yet.

# pick a user
sample_user = X_test.iloc[99,:]
# get the predicted ratings
sample_user_pred = sess.run(output_layer, feed_dict={input_layer:[sample_user]})
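Picking the recommendations from those predictions could look something like the sketch below. It hides every movie the user has already rated and returns the positions of the highest predicted ratings among the rest. The function name recommend_top_n and the sample numbers are made up for illustration.

```python
import numpy as np

def recommend_top_n(predicted, actual, n=3):
    """Return column positions of the n highest predictions among unrated movies."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    scores = np.where(actual == 0, predicted, -np.inf)  # hide rated movies
    order = np.argsort(-scores)                         # best first
    n_unrated = int(np.sum(actual == 0))
    return order[:min(n, n_unrated)]

# Hypothetical user: five movies, two already rated.
actual = np.array([5.0, 0.0, 0.0, 3.0, 0.0])
predicted = np.array([4.8, 3.9, 1.2, 2.5, 4.4])

top = recommend_top_n(predicted, actual, n=2)
print(top)
```

With the variables from this post, the prediction comes back with shape (1, 3706), so you would pass sample_user_pred[0] and sample_user.values, and ratings_pivot.columns[top] should map the positions back to movie ids.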

Thanks for making it to the end. Like and leave a comment if this was helpful to you.
