RankNet, LambdaRank TensorFlow Implementation — part II

Louis Kit Lung Law · Published in The Startup · Feb 3, 2021

In part I, I went through RankNet, which was published by Microsoft in 2005. Two years later, Microsoft published another paper, Learning to Rank with Nonsmooth Cost Functions, which introduced a sped-up version of RankNet (which I call “Factorised RankNet”) and LambdaRank.

However, before I jump into Factorised RankNet and LambdaRank, I’d like to show you how to implement RankNet using a custom training loop in TensorFlow 2.

This is important because Factorised RankNet and LambdaRank cannot be implemented with the Keras API alone; as we will see later, they need the lower-level APIs of a framework such as TensorFlow or PyTorch.

Custom Training Loop

Recall that in part I, the model training was done with just two lines of code:

ranknet.compile(...)
ranknet.fit(...)

In order to write our own training loop, we need to understand what happens behind these two lines of code. This is outlined in the pseudo code below.

training pseudo code
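In rough outline (a sketch pieced together from the step references below, with the numbering kept consistent with them), those two calls boil down to:

1. for each epoch:
2.     for each batch (xi, xj, pij) in the training set:
3.         forward pass: compute the predicted pij for the pair (xi, xj)
4.         compute the loss between the predicted and the true pij
5.         compute the gradients of the loss w.r.t. the trainable weights
6.         update the weights with the optimizer
7.         store the loss of the batch

9.     compute the loss on the validation set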

Now that we know what needs to be done, let’s break the whole process down into a few functions:

  • an apply_gradient function, which covers lines 3 to 6
  • a train_data_for_one_epoch function, which loops through the batches (lines 2–7)
  • a compute_val_loss function, which covers line 9
  • an outer loop over epochs, which puts everything together

Let’s see how to implement these.

apply_gradient

  • line 4 computes Pij
  • line 5 computes the loss
  • line 7 gets the gradients of all trainable weights
  • line 8 applies back propagation to update the weights
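As a rough illustration of such a step, here is a minimal sketch using tf.GradientTape, assuming a simple feed-forward scoring network (the architecture and variable names are illustrative, not necessarily those from part I, and pij is assumed to arrive with shape (batch_size, 1)):

import tensorflow as tf

# Illustrative scoring model: maps a document feature vector to a single score
scorer = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])

bce = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.Adam()

def apply_gradient(xi, xj, pij):
    with tf.GradientTape() as tape:
        si = scorer(xi, training=True)       # score of document i
        sj = scorer(xj, training=True)       # score of document j
        pij_pred = tf.nn.sigmoid(si - sj)    # Pij = P(doc i ranked above doc j)
        loss = bce(pij, pij_pred)            # binary cross-entropy loss
    # gradients of the loss w.r.t. all trainable weights
    grads = tape.gradient(loss, scorer.trainable_weights)
    # apply back propagation to update the weights
    optimizer.apply_gradients(zip(grads, scorer.trainable_weights))
    return loss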

train_data_for_one_epoch

  • line 5 converts the function apply_gradient to graph mode
  • line 7 prints a progress bar
  • line 8 loops through the training set batch by batch
  • line 10 stores the loss of the current batch
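Continuing the same sketch, the epoch-level function could look like this (train_dataset is assumed to be a tf.data.Dataset yielding (xi, xj, pij) batches):

# Run apply_gradient in graph mode for speed
apply_gradient_graph = tf.function(apply_gradient)

def train_data_for_one_epoch(train_dataset):
    losses = []
    pbar = tf.keras.utils.Progbar(target=None)     # progress bar over the batches
    for step, (xi, xj, pij) in enumerate(train_dataset):
        loss = apply_gradient_graph(xi, xj, pij)   # one gradient step on this batch
        losses.append(loss)                        # store the loss of the current batch
        pbar.update(step + 1)
    return losses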

compute_val_loss

This function simply loops through the validation set and computes the loss.
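Sketched with the same assumptions (val_dataset yields (xi, xj, pij) batches):

def compute_val_loss(val_dataset):
    losses = []
    for xi, xj, pij in val_dataset:
        si = scorer(xi, training=False)
        sj = scorer(xj, training=False)
        pij_pred = tf.nn.sigmoid(si - sj)
        losses.append(bce(pij, pij_pred))   # loss of this validation batch
    return tf.reduce_mean(losses)           # mean validation loss over all batches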

Putting all these together

  • lines 3 & 4 define the data pipeline
  • line 8 defines the loss function; we use binary cross-entropy
  • line 9 defines the optimizer
  • line 19 starts the training of RankNet
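A rough version of that outer loop, reusing the scorer, loss and optimizer defined in the sketches above (xi_train, xj_train, pij_train and their validation counterparts are assumed to be the pairwise arrays built in part I):

# Data pipeline over the pre-built pairs
train_dataset = tf.data.Dataset.from_tensor_slices(
    (xi_train, xj_train, pij_train)).shuffle(10_000).batch(32)
val_dataset = tf.data.Dataset.from_tensor_slices(
    (xi_val, xj_val, pij_val)).batch(32)

epochs = 50
for epoch in range(epochs):
    train_losses = train_data_for_one_epoch(train_dataset)
    val_loss = compute_val_loss(val_dataset)
    print(f"Epoch {epoch + 1}: "
          f"train loss = {float(tf.reduce_mean(train_losses)):.4f}, "
          f"val loss = {float(val_loss):.4f}")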

From the results below, we can see that the training results are very similar to those in part I.

first 5 epochs of training
training loss and validation loss plot

Benefit of Custom Training Loop

Now you may be wondering what the advantage of a custom training loop is. Let me show you how it lets us be more memory efficient.

Recall that in part I, when we generated the data, we first generated doc_features and then transformed it into pairs, xi and xj, before feeding them into RankNet. This means that each document is repeated several times in memory.

For example, if we have two queries, q1 and q2:

  • q1 has three documents, d1, d2, d3
  • q2 has four documents, d4, d5, d6, d7

Then we will have the following pairs:

  • q1’s pairs: d1 & d2 | d1 & d3 | d2 & d3
  • q2’s pairs: d4 & d5 | d4 & d6 | d4 & d7 | d5 & d6 | d5 & d7 | d6 & d7

To be more memory efficient, we should only fetch the documents we need on the fly during training. This can be achieved by modifying two functions, as shown below:

line 8 does the trick
line 4 does the trick
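One way to sketch the idea (illustrative, not the exact modification): keep each document only once in doc_features, materialise only integer index pairs per query, and gather the feature vectors at training time. Here doc_features, pair_idx (shape (num_pairs, 2)) and pair_labels (shape (num_pairs, 1)) are assumed inputs:

# The dataset now carries only index pairs and labels, not document features
pair_dataset = tf.data.Dataset.from_tensor_slices(
    (pair_idx, pair_labels)).shuffle(10_000).batch(32)

def apply_gradient_on_the_fly(idx, pij):
    # gather the two documents of each pair only when the batch is processed
    xi = tf.gather(doc_features, idx[:, 0])
    xj = tf.gather(doc_features, idx[:, 1])
    with tf.GradientTape() as tape:
        pij_pred = tf.nn.sigmoid(scorer(xi, training=True) -
                                 scorer(xj, training=True))
        loss = bce(pij, pij_pred)
    grads = tape.gradient(loss, scorer.trainable_weights)
    optimizer.apply_gradients(zip(grads, scorer.trainable_weights))
    return loss

# usage inside the epoch loop
for idx, pij in pair_dataset:
    loss = apply_gradient_on_the_fly(idx, pij)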

Now we have got rid of the variables xi, xj and pij completely, which means no document is repeated in memory. If you use these functions to train the model, it works exactly the same!

Conclusion

We have now implemented RankNet using a custom training loop in TensorFlow 2.

In part III, I will talk about how to speed up the training of RankNet and how to implement it.

Stay Tuned!
