In this article, we will explore a variety of matrix factorization models, and how to optimize them with gradient descent. — This repo/jupyter notebook contains all the runnable code that goes along with this article. Before we start off, this article assumes you have at least a very basic understanding of collaborative filtering, matrix factorization, gradient descent and neural networks. To refresh your memory on these subjects, let’s relate them: