Recommendation System: Using Deep Learning

@Omkarade
3 min read · Jan 3, 2023


Part 2: Solving the Cold Movie Problem Using BERT Embeddings

A movie recommendation system, or movie recommender system, is an ML-based approach to filtering or predicting users’ film preferences based on their past choices and behavior. It is a filtering mechanism that predicts which movies a given user is likely to choose and how much they will like a domain-specific item, here a movie. The data used for this task is the MovieLens data set.

Model Architecture:

The inputs are X = [userId, movieId, genres] and the target is Y = [rating]. First, I load the data:

import pandas as pd

# Folder containing the MovieLens ml-latest-small CSV files
PATH = '/ml-latest-small/'
ratings = pd.read_csv(PATH + 'ratings.csv')
movies = pd.read_csv(PATH + 'movies.csv')
movies.head()

Together, the two files contain the columns movieId, userId, title, genres, and rating.

import numpy as np
from sklearn.preprocessing import LabelEncoder

# Attach each movie's genres string to its ratings
ratings = pd.merge(ratings, movies[['genres', 'movieId']], on="movieId")

# Map raw user ids to contiguous indices 0..n_users-1
user_enc = LabelEncoder()
ratings['user'] = user_enc.fit_transform(ratings['userId'].values)
n_users = ratings['user'].nunique()

# Map raw movie ids to contiguous indices 0..n_movies-1
item_enc = LabelEncoder()
ratings['movie'] = item_enc.fit_transform(ratings['movieId'].values)
n_movies = ratings['movie'].nunique()

ratings['rating'] = ratings['rating'].values.astype(np.float32)
min_rating = min(ratings['rating'])
max_rating = max(ratings['rating'])
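The ids are re-encoded because an embedding layer expects indices that run from 0 to n-1, while the raw MovieLens ids have gaps. As a rough sketch of the mapping LabelEncoder produces, the snippet below uses NumPy's np.unique as a stand-in; the sample ids are made up for illustration.

```python
import numpy as np

# Made-up raw movie ids with gaps, standing in for ratings['movieId'].
raw_ids = np.array([1, 50, 50, 296, 1, 1210])

# np.unique returns the sorted unique ids and, for every element,
# its position in that sorted list: a contiguous 0..n-1 index.
classes, encoded = np.unique(raw_ids, return_inverse=True)
n_items = len(classes)

def decode(indices):
    """Map contiguous indices back to the raw ids."""
    return classes[indices]

print(encoded.tolist(), n_items)
```

Keeping the fitted encoder (here, the classes array) around is what lets a predicted index be turned back into a real movieId later.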

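For model training I use only userId, movieId, and genres. userId and movieId are converted to contiguous integer indices with LabelEncoder(), and each movie's genres string is converted to a single integer (a 1x1 value) using the BERT tokenizer. I do this because one-hot encoding or w2v would turn the genres into a multi-dimensional vector instead of a single id that an embedding layer can take as input.

Converting genres with the BERT tokenizer: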

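from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
new = []
for i in movies['genres'].values:
    gen1 = tokenizer(i)
    gen2 = gen1['input_ids']
    gen2.pop(0)    # drop the leading [CLS] token id
    gen2.pop(-1)   # drop the trailing [SEP] token id
    new.append(sum(gen2))  # collapse the genre token ids into one integer
movies['genres_token'] = new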
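To make the summing step concrete without downloading the pretrained vocabulary, here is a small sketch with a toy tokenizer. TOY_VOCAB, toy_tokenize, and genres_to_id are hypothetical names, and the ids are invented; the real BERT ids will be different numbers.

```python
import re

# Hypothetical toy vocabulary; real ids come from 'bert-base-uncased'.
TOY_VOCAB = {"action": 11, "comedy": 23, "drama": 31, "romance": 3, "|": 2}

def toy_tokenize(text):
    """Lowercase and split on '|', keeping the separator as its own token."""
    return [t for t in re.split(r"(\|)", text.lower()) if t]

def genres_to_id(genres):
    """Collapse a genre string into one integer by summing its token ids."""
    return sum(TOY_VOCAB[t] for t in toy_tokenize(genres))

print(toy_tokenize("Action|Comedy"))   # ['action', '|', 'comedy']
print(genres_to_id("Action|Comedy"))   # 11 + 2 + 23 = 36
print(genres_to_id("Comedy|Action"))   # same tokens, same sum: 36
```

Because only the sum is kept, the result does not depend on genre order, and two different genre combinations can land on the same integer (in this toy vocabulary, "Drama|Romance" also sums to 36).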

Here is a sample of the converted genre data:

Splitting the data into train and test sets:

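from sklearn.model_selection import train_test_split

# Bring the per-movie genre token into the ratings table
ratings = ratings.merge(movies[['movieId', 'genres_token']], on='movieId')
X = ratings[['user', 'movie', 'genres_token']].values
y = ratings['rating'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)
# Embeddings 50 dim
n_factors = 50
# Keras takes one array per input: user, movie, genres
X_train_array = [X_train[:, 0], X_train[:, 1], X_train[:, 2]]
X_test_array = [X_test[:, 0], X_test[:, 1], X_test[:, 2]]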

Building Model-

I use an EmbeddingLayer for all three inputs [user, movie, genres], which gives a 50-dimensional vector for each input. I then concatenate the [movie, genres] embedding vectors, concatenate the result with the [user] vector, and pass that through Dense layers with Dropout. I use the mean_squared_error loss function with the Adam optimizer and train for 15 epochs.

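from keras.layers import Activation, Embedding, Lambda, Reshape
from keras.regularizers import l2

class EmbeddingLayer:
    """Embedding lookup followed by a reshape to a flat vector."""
    def __init__(self, n_items, n_factors):
        self.n_items = n_items
        self.n_factors = n_factors

    def __call__(self, x):
        # (batch, 1) integer ids -> (batch, 1, n_factors)
        x = Embedding(self.n_items, self.n_factors, embeddings_initializer='he_normal',
                      embeddings_regularizer=l2(1e-6))(x)
        # -> (batch, n_factors)
        x = Reshape((self.n_factors,))(x)
        return x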
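Conceptually, an embedding layer is a trainable lookup table, and the Reshape only drops the extra length-1 axis. The following is a NumPy sketch of that shape flow, not the Keras implementation; the sizes and the random table are placeholders chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_factors = 5, 3   # tiny placeholder sizes

# The embedding table: one row of n_factors values per item index.
table = rng.normal(size=(n_items, n_factors))

# A batch of ids shaped (batch, 1), like Input(shape=(1,)).
ids = np.array([[2], [0], [2]])

looked_up = table[ids]                     # shape (batch, 1, n_factors)
flat = looked_up.reshape(-1, n_factors)    # shape (batch, n_factors)

print(looked_up.shape, flat.shape)
```

Rows 0 and 2 of flat are identical because both inputs carry the same id, which is the sense in which every user, movie, or genre value gets its own learned vector.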

from keras.layers import Concatenate, Dense, Dropout, Input
from keras.models import Model
from keras.optimizers import Adam

def RecommenderNet(n_users, n_movies, gen, n_factors, min_rating, max_rating):
    user = Input(shape=(1,), name='user')
    u = EmbeddingLayer(n_users, n_factors)(user)

    movie = Input(shape=(1,), name='movie')
    m = EmbeddingLayer(n_movies, n_factors)(movie)

    genres = Input(shape=(1,), name='genres')
    g = EmbeddingLayer(gen, n_factors)(genres)

    # Combine the movie and genres vectors first, then add the user vector
    n = Concatenate()([m, g])
    x = Concatenate()([u, n])
    x = Dropout(0.5)(x)

    x = Dense(42, kernel_initializer='he_normal')(x)
    x = Activation('relu')(x)
    x = Dropout(0.5)(x)

    # Sigmoid output in (0, 1), rescaled to the rating range
    x = Dense(1, kernel_initializer='he_normal')(x)
    x = Activation('sigmoid')(x)
    x = Lambda(lambda x: x * (max_rating - min_rating) + min_rating)(x)
    model = Model(inputs=[user, movie, genres], outputs=x)
    opt = Adam(learning_rate=0.001)
    model.compile(loss='mean_squared_error', optimizer=opt)
    return model

# Assumption: `gen` is not defined in the original post. The genres embedding
# needs one row per possible token value, so use the largest value plus one.
gen = int(movies['genres_token'].max()) + 1
model = RecommenderNet(n_users, n_movies, gen, n_factors, min_rating, max_rating)
history = model.fit(x=X_train_array, y=y_train, batch_size=32, epochs=15, verbose=1, validation_data=(X_test_array, y_test))
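The last two layers of the model (the sigmoid followed by the Lambda) keep every prediction inside the valid rating range. Here is a plain-Python sketch of that arithmetic; scale_to_rating is a hypothetical helper, and the defaults of 0.5 and 5.0 assume the usual MovieLens rating bounds rather than values computed from the data.

```python
import math

def scale_to_rating(logit, min_rating=0.5, max_rating=5.0):
    """Squash a raw network output into the range [min_rating, max_rating]."""
    s = 1.0 / (1.0 + math.exp(-logit))   # sigmoid, strictly between 0 and 1
    return s * (max_rating - min_rating) + min_rating

print(scale_to_rating(0.0))    # midpoint of the range: 2.75
print(scale_to_rating(-10.0))  # just above 0.5
print(scale_to_rating(10.0))   # just below 5.0
```

However large or small the Dense layer's output gets, the prediction can approach but never leave the rating range.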

After 15 epochs, the lowest validation loss is 0.71.

Demo-

References

1] https://medium.com/@jdwittenauer/deep-learning-with-keras-recommender-systems-e7b99cb29929

2] https://labelyourdata.com/articles/movie-recommendation-with-machine-learning

