Handling Position Bias for Unbiased Restaurants Ranking

Mehdi Bennaceur
The Glovo Tech Blog
5 min read · Feb 11, 2022

In modern multi-sided marketplaces, ranking and recommender systems play a huge role in helping customers discover content. At Glovo, a large selection of stores is available and can be explored through different discoverability components in the app. One of these is the restaurant wall, which allows our customers to explore a wide variety of restaurants.

When ranking systems are trained, it can be difficult to distinguish between content that is truly relevant for a customer, and content that happens to be displayed first. Indeed, when we analyze customer behavior in this part of the app, we observe a tendency to order from restaurants that are ranked high on the wall. This can be seen by looking at the conversion rate, defined as the probability of placing an order given an impression, per position.
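As an illustration of the metric, conversion rate per position can be computed directly from impression logs. The column names and toy data below are hypothetical, not Glovo's actual schema:

```python
import pandas as pd

# hypothetical impression log: one row per (customer, store) impression,
# with the wall position shown and whether an order was placed
logs = pd.DataFrame({
    "position": [0, 0, 1, 1, 2, 2, 2, 3],
    "ordered":  [1, 1, 1, 0, 0, 1, 0, 0],
})

# conversion rate = P(order | impression), computed per wall position
cvr = logs.groupby("position")["ordered"].mean()
print(cvr)
```

On real data, plotting `cvr` against position produces the steeply decreasing curve described above.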

This exponential decrease in conversion rate as the rank increases can be explained by several factors [2]; the two most important are:

  1. The stores in top positions are superior to the others in terms of their relevance to the customers.
  2. Position bias, which expresses the fact that customers are more likely to order from stores in top positions, regardless of their relevance.

A Feedback Loop

Glovo’s ranking system relies in part on a machine learning model that estimates the probability of a customer ordering from a store. The model is trained on historical data, which is inherently biased: the positive signals it records are due not only to the relevance of the store itself but also to the position in which it was displayed. Indeed, people are generally biased towards clicking on higher-ranked content, whether out of laziness, habit, or trust bias. As a consequence, our conversion model is biased too, and when used within the ranking system it reinforces the existing bias, creating a self-reinforcing feedback loop.

This phenomenon is bad for our marketplace as it maintains the system in a suboptimal spot with respect to a key metric, conversion rate. It also makes it more difficult for the system to surface relevant content that was historically ranked lower or that is new to the platform. To address this issue, we must first measure or estimate this position bias.

Estimating the bias

Fortunately, this problem is well known in ranking and recommender systems and has been studied extensively in the scientific literature [1, 2]. We adapted a popular model called, unsurprisingly, the position bias model [1] to our specific use case. It assumes that the observed order, a Bernoulli variable O, depends on two hidden Bernoulli variables E and R, where E represents the event of a customer examining a store at a certain position k, and R represents the event of a store being relevant in a given context x. More formally, we can write the model as follows:

P(O = 1 | k, x) = P(E = 1 | k) · P(R = 1 | x)

The first term on the right-hand side can be interpreted as the degree of attention customers give to position k; this is our estimate of position bias. The second term can be interpreted as the inherent relevance of the store in that particular context. The context represents the conditions in which the store is presented; it may include the customer's preference profile and temporal information. For example, a burger restaurant will certainly not have the same relevance in the morning as at dinner time, or on a weekend versus a weekday. We estimate the parameters of this model by maximum likelihood estimation using Expectation Maximization.
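The post doesn't include its estimation code, so here is a minimal sketch of how fitting this model with EM might look on synthetic click logs; the variable names and the simulation are our own illustration, not Glovo's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_pos, n_items, n_logs = 5, 10, 100_000

# ground-truth parameters, used only to simulate the logs
true_theta = np.array([1.0, 0.7, 0.5, 0.35, 0.25])  # P(E=1 | position k)
true_gamma = rng.uniform(0.1, 0.6, n_items)         # P(R=1 | store d)

# simulated impression log: position shown, store shown, order placed or not
pos = rng.integers(0, n_pos, n_logs)
item = rng.integers(0, n_items, n_logs)
click = rng.random(n_logs) < true_theta[pos] * true_gamma[item]

# EM for the position bias model: P(O=1 | k, d) = theta_k * gamma_d
theta = np.full(n_pos, 0.5)
gamma = np.full(n_items, 0.5)
for _ in range(300):
    p_click = theta[pos] * gamma[item]
    # E-step: posterior probabilities of examination (E) and relevance (R);
    # an order implies both happened, a non-order leaves them uncertain
    p_e = np.where(click, 1.0, theta[pos] * (1 - gamma[item]) / (1 - p_click))
    p_r = np.where(click, 1.0, (1 - theta[pos]) * gamma[item] / (1 - p_click))
    # M-step: average the posteriors per position and per store
    theta = np.bincount(pos, weights=p_e, minlength=n_pos) / np.bincount(pos, minlength=n_pos)
    gamma = np.bincount(item, weights=p_r, minlength=n_items) / np.bincount(item, minlength=n_items)

# theta and gamma are only identified up to a constant; normalise by position 0
position_bias = theta / theta[0]
```

On this synthetic data the normalised `position_bias` recovers the decreasing examination curve that the simulation was built from.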

As expected, position bias is highest in the first few positions of the Restaurant Wall and decreases exponentially thereafter, although less steeply than the conversion rate curve. Put another way, the lower the position, the less attention customers give to its content. If we correct for this bias in our training data, we can estimate the relevance of different restaurants more accurately, and thereby offer our customers much more tailored content.
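One common way to apply such a correction, which we sketch here as an assumption since the post does not spell out its exact method, is inverse propensity weighting: positive labels are up-weighted by the inverse of the estimated examination probability at the position where they occurred.

```python
import numpy as np

# hypothetical examination probabilities per position (e.g. from the EM estimate)
theta = np.array([1.0, 0.7, 0.5, 0.35, 0.25])

# toy training examples: the position each store was shown at, and the label
positions = np.array([0, 1, 2, 4, 1, 3])
ordered = np.array([1, 0, 1, 0, 1, 0])

# inverse-propensity weights: an order placed at a low-attention position is
# stronger evidence of true relevance, so it counts more during training
weights = np.where(ordered == 1, 1.0 / theta[positions], 1.0)
```

These weights can then be passed to the conversion model at training time (e.g. via a `sample_weight` argument in most libraries), so the model learns relevance rather than position.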

Results with an unbiased model

Using the estimated bias for each position, we can de-bias our training data and therefore our machine learning model. But how does it perform in practice? To evaluate the real-world impact of de-biasing our model, we designed a controlled experiment (A/B test) in which 50% of our customers saw content ranked by our unbiased system and 50% were held out as a control group. The results showed a statistically significant lift of 1.5% in order conversion rate in the treatment group relative to the control group. Considering that the change consisted solely of de-biasing our training data, with no UI changes or complex model changes, this result is remarkable.

Conclusion

When biased data is consumed by a machine learning model used in production, it may create a feedback loop that reinforces the initial bias. That can mean missing out on significant opportunities to improve key business metrics and provide a better product to customers. We have just seen a real-life example of this and of how to address it.

References

[1] Wang et al., “Position Bias Estimation for Unbiased Learning to Rank in Personal Search,” WSDM 2018.

[2] Joachims et al., “Accurately Interpreting Clickthrough Data as Implicit Feedback,” SIGIR 2005.

Acknowledgments

Special thanks to Anna Via and Victor Bouzas for their contributions to this work, and to the Glovo Data Science community, which provided helpful suggestions and comments on earlier drafts of this post.
