Note: In the AutoRec paper, the network was trained once per item, feeding the ratings from every user for that item into the input layer.
A sigmoid activation function was used in the output layer.

  • Slightly better results than RBM
  • Here, compared to RBM, we have different sets of weights as well (not just different sets of biases)
  • Easier to implement in modern frameworks such as TF or Keras.
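
For instance, a minimal AutoRec-style model can be sketched in Keras as below. This is only a sketch: the sizes `n_users` and `n_hidden` are illustrative, not taken from the paper, and the sigmoid output assumes ratings have been normalized into [0, 1].

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

n_users = 1000   # illustrative: one input slot per user (item-based AutoRec)
n_hidden = 100   # illustrative hidden-layer size

# One training example per item: that item's ratings from every user.
ratings_in = layers.Input(shape=(n_users,))
hidden = layers.Dense(n_hidden, activation="sigmoid")(ratings_in)
# Sigmoid in the output layer, so ratings are assumed normalized to [0, 1].
reconstructed = layers.Dense(n_users, activation="sigmoid")(hidden)

autorec = Model(ratings_in, reconstructed)
autorec.compile(optimizer="adam", loss="mse")
```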

THE SPARSITY CHALLENGE

The research paper clearly states: “We only consider contribution of observed ratings”

That is, they were careful to process each path through the neural net individually, propagating information only from ratings that actually exist in the training data and ignoring the contribution of input nodes corresponding to missing data.

As a result, this is hard to implement in TensorFlow. Even though TensorFlow has sparse tensors, there is no simple way to restrict the chain of matrix multiplications and additions that makes up the neural net to just the input nodes that actually contain data. A straightforward implementation in TF or Keras ignores that problem and models missing ratings as zeros. This gives decent results, but it has a fundamental flaw!
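
One common workaround (a sketch of the general technique, not necessarily what the paper or our repo does) is to keep the zeros in the forward pass but mask the loss so that only observed ratings contribute to training. This assumes unobserved entries are stored as 0 and real ratings are never exactly 0 (e.g., a 1–5 scale):

```python
import tensorflow as tf

def masked_mse(y_true, y_pred):
    # Treat zeros in y_true as "missing": they contribute nothing to the loss.
    mask = tf.cast(tf.not_equal(y_true, 0.0), tf.float32)
    squared_error = tf.square((y_true - y_pred) * mask)
    # Average only over the observed ratings (epsilon avoids division by zero).
    return tf.reduce_sum(squared_error) / (tf.reduce_sum(mask) + 1e-8)

# Usage with a hypothetical model: model.compile(optimizer="adam", loss=masked_mse)
```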

ARCHITECTURE — AUTOENCODERS

Encoding Inputs: Building up the weights and biases between the input and hidden layers is referred to as encoding the input. We are actually encoding patterns in the data.

Decoding Outputs: Reconstructing the output through the weights between the hidden and output layers is referred to as decoding.

Hence, the first set of weights is used in the encoding stage and the second set in the decoding stage.

Note: In an RBM we encoded on the forward pass and decoded on the backward pass, so the two are conceptually similar.
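
As a toy illustration (sizes are made up), here is the autoencoder forward pass in NumPy with two independent weight sets; in an RBM, the decoding weights would instead be tied to the transpose of the encoding weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 6, 3                       # toy sizes

W_enc = rng.normal(size=(n_in, n_hidden))   # first weight set (encoding)
b_enc = np.zeros(n_hidden)
W_dec = rng.normal(size=(n_hidden, n_in))   # second, independent weight set (decoding)
b_dec = np.zeros(n_in)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = rng.random(n_in)                  # a toy (normalized) ratings vector
h = sigmoid(x @ W_enc + b_enc)        # encode: input -> hidden
x_hat = sigmoid(h @ W_dec + b_dec)    # decode: hidden -> reconstruction
# In an RBM we would reuse W_enc.T here instead of learning a separate W_dec
# (shared weights, but separate bias vectors).
```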

SOME OTHER IDEAS

  • Deep neural nets with a large number of hidden layers
  • One-hot encoding user and item data into a single input layer

One big problem is that there is no difference between a missing rating and a rating whose value is zero!

Note: The above idea was unable to outperform matrix factorisation because of data sparsity
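
A toy sketch of what that single one-hot input layer looks like (all sizes and IDs are made up):

```python
import numpy as np

n_users, n_items = 5, 4    # toy sizes
user_id, item_id = 2, 1    # one (user, item) training example

user_vec = np.zeros(n_users)
user_vec[user_id] = 1.0
item_vec = np.zeros(n_items)
item_vec[item_id] = 1.0

# One input layer: the user one-hot and the item one-hot concatenated together.
x = np.concatenate([user_vec, item_vec])   # shape: (n_users + n_items,)
```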

Amazon’s DSSTNE actually solved the problem of handling missing ratings, and the results are very good!

  • Our implementation in the GitHub code is different from the one in the paper
  • Treating missing values as zero ratings is messing up our predictions. Our network can’t differentiate between a missing rating and a zero rating; to put it another way, it thinks that people hate pretty much everything!!
  • A complex algorithm won’t help unless we have enough data

Note: So far in our GitHub implementation, RBMs and AutoRecs have fallen short of our expectations. But more is to come.

SESSION-BASED RECOMMENDATIONS WITH RNNs

Complex! Complex! Complex! A very complicated approach!!

To be specific, Recurrent Neural Nets are:

  • Good at patterns and sequences in data (predicting what comes next)
  • Complicated beasts
  • Dependent on complicated structures such as LSTMs or GRUs (Gated Recurrent Units) instead of simple neurons
  • One good example is GRU4Rec

Note: This is a complex solution. The simpler our solution is, the better it will be.
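
To make the idea concrete, here is a heavily simplified sketch of a GRU4Rec-style model in Keras. It is not the actual GRU4Rec implementation (which uses session-parallel mini-batches and ranking losses), and all sizes are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

n_items = 10000        # illustrative catalogue size (item id 0 reserved for padding)
embed_dim = 64
max_session_len = 20

# Input: the sequence of item ids clicked so far in a session, zero-padded.
clicks = layers.Input(shape=(max_session_len,), dtype="int32")
x = layers.Embedding(n_items, embed_dim, mask_zero=True)(clicks)
x = layers.GRU(128)(x)                                       # summarizes the session so far
next_item = layers.Dense(n_items, activation="softmax")(x)   # scores every item as "next"

model = Model(clicks, next_item)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```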

DEEP FACTORISATION MACHINES

  • More general-purpose than SVD
  • Sometimes finds features that SVD cannot find
  • A combination of Deep Neural Nets and Factorisation Machines
  • A hybrid approach that takes the best of both worlds and outperforms both!

Note: Uses HIGHER-ORDER FEATURE INTERACTIONS. The factorisation-machine side captures the lower-order (pairwise) interactions, while the deep net is better with higher-order ones.
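
A minimal sketch of the DeepFM idea in Keras, using only user and item IDs as features (a real DeepFM handles many categorical fields; all names and sizes here are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

n_users, n_items = 1000, 500   # illustrative sizes
k = 16                         # latent-factor dimension

user_in = layers.Input(shape=(1,), dtype="int32")
item_in = layers.Input(shape=(1,), dtype="int32")

# Shared latent embeddings, used by both the FM part and the deep part.
user_vec = layers.Flatten()(layers.Embedding(n_users, k)(user_in))
item_vec = layers.Flatten()(layers.Embedding(n_items, k)(item_in))

# FM part: first-order bias terms plus the pairwise interaction <v_user, v_item>.
user_bias = layers.Flatten()(layers.Embedding(n_users, 1)(user_in))
item_bias = layers.Flatten()(layers.Embedding(n_items, 1)(item_in))
fm_out = layers.Add()([user_bias, item_bias,
                       layers.Dot(axes=1)([user_vec, item_vec])])

# Deep part: an MLP over the concatenated embeddings, for higher-order interactions.
deep = layers.Concatenate()([user_vec, item_vec])
deep = layers.Dense(64, activation="relu")(deep)
deep = layers.Dense(32, activation="relu")(deep)
deep_out = layers.Dense(1)(deep)

# Best of both worlds: add the low-order FM signal and the high-order deep signal.
prediction = layers.Add()([fm_out, deep_out])
model = Model([user_in, item_in], prediction)
model.compile(optimizer="adam", loss="mse")
```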

Note: Recommender Algorithms can perpetuate biases that exist in training data and even amplify them!!

It is an ensemble approach: a mixture of two or more different algorithms

  • It is better to run two different algorithms in parallel and combine the results at the end
  • This parallel approach is not only good for accuracy but also acts as a backup if one of the algorithms fails!
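
A minimal sketch of that parallel blend (the function, the fallback behaviour, and the 50/50 weighting are all hypothetical choices):

```python
import numpy as np

def blend_scores(scores_a, scores_b, w=0.5):
    """Weighted blend of two recommenders' scores for the same candidate items."""
    if scores_a is None:           # backup: algorithm A failed, use B alone
        return scores_b
    if scores_b is None:           # backup: algorithm B failed, use A alone
        return scores_a
    return w * scores_a + (1.0 - w) * scores_b

# Hypothetical usage: both algorithms score the same three candidate items.
svd_scores = np.array([0.9, 0.2, 0.5])
autorec_scores = np.array([0.7, 0.4, 0.1])
ranked = np.argsort(-blend_scores(svd_scores, autorec_scores))  # best items first
```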
