Reciprocal Rank Fusion (RRF) explained in 4 mins — How to score results from multiple retrieval methods in RAG
Unlock the power of Reciprocal Rank Fusion in Retrieval-Augmented Generation.
Retrieval-augmented generation (RAG) is a powerful technique in natural language processing, combining the strengths of retrieval-based and generative models.
The retrieval stage can make or break your RAG pipeline.
If the retriever fails to fetch relevant documents, precision drops and the likelihood of hallucination increases.
Some queries are well suited to keyword-based retrieval techniques like BM25, while others perform better with dense retrieval methods that use embeddings from a language model. Hybrid approaches run both and then need a principled way to combine their results; Reciprocal Rank Fusion (RRF) is one such method.
In this blog post, we’ll dive deep into RRF, its mechanics, mathematical intuition, and application in RAG systems.
What is Reciprocal Rank Fusion?
Reciprocal Rank Fusion is a rank aggregation method that combines rankings from multiple sources into a single, unified ranking. In the context of RAG, these sources typically use different retrieval models or approaches.
The RRF Formula
The core of RRF is captured in its formula (a short worked example follows the definitions):
RRF(d) = Σ(r ∈ R) 1 / (k + r(d))
Where:
- d is a document
- R is the set of rankers (retrievers)
- k is a constant (typically 60)
- r(d) is the rank of document d in ranker r
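As a quick worked example, suppose a document d is ranked 1st by a dense retriever and 3rd by a BM25 retriever, with k = 60:

RRF(d) = 1/(60 + 1) + 1/(60 + 3) ≈ 0.0164 + 0.0159 ≈ 0.0323

A document returned by only one retriever contributes a single term, so documents that appear high in several rankings naturally accumulate larger scores.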
How RRF Works in RAG
Let’s break down the process of using RRF in a RAG system (a code sketch follows the steps):
1. User Query: The process begins when a user inputs a question or query.
2. Multiple Retrievers: The query is sent to multiple retrievers. These could be different retrieval models (e.g., dense, sparse, hybrid).
3. Individual Rankings: Each retriever produces its own ranking of relevant documents.
4. RRF Fusion: The rankings from all retrievers are combined using the RRF formula.
5. Final Ranking: A unified ranking is produced based on the RRF scores.
6. Generation: The generative model uses the top-ranked documents to produce the final answer.
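To make these steps concrete, here’s a minimal sketch in Python. The retriever and generator calls (dense_retrieve, bm25_retrieve, generate) are hypothetical placeholders for whatever your stack provides; calculate_rrf and get_final_ranking are implemented later in this post:

def answer(query, k=60, top_docs=5):
    # Steps 1-3: each retriever ranks documents for the query, best first.
    rankings = [
        dense_retrieve(query),  # hypothetical dense retriever
        bm25_retrieve(query),   # hypothetical sparse retriever
    ]
    # Steps 4-5: fuse the per-retriever rankings into one RRF ranking.
    fused = get_final_ranking(calculate_rrf(rankings, k))
    # Step 6: generate the answer from the top-ranked documents.
    return generate(query, context=fused[:top_docs])  # hypothetical LLM call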
Mathematical Intuition Behind RRF
Understanding the mathematical intuition behind RRF helps explain why it’s effective:
1. Reciprocal Ranking
Using 1/(rank + k), RRF gives more weight to higher ranks (lower rank numbers). This ensures that documents ranked highly by multiple retrievers are favoured in the final ranking.
2. Diminishing Returns
The contribution to the score decreases non-linearly as rank increases. This models the intuition that the difference in relevance between ranks 1 and 2 is likely larger than the difference between ranks 100 and 101.
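With k = 60, the gap between adjacent ranks shrinks quickly:

- Between ranks 1 and 2: 1/61 - 1/62 ≈ 0.00026
- Between ranks 100 and 101: 1/160 - 1/161 ≈ 0.00004

Swapping two documents near the top of a ranking therefore changes the fused score far more than swapping two documents deep in the tail.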
3. Rank Aggregation
By summing the reciprocal ranks across all retrievers, RRF effectively combines evidence from multiple sources. This makes the final ranking more robust and less susceptible to the quirks or biases of any single retriever.
4. Normalization
The constant k acts as a smoothing factor. It prevents any single retriever from dominating the results and helps handle ties more gracefully, especially among lower-ranked items.
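To see the smoothing effect, compare k = 0 with k = 60. With k = 0, a rank-1 document scores 1/1 = 1.0 against 1/2 = 0.5 for rank 2, so one retriever’s top hit carries twice the weight of its runner-up. With k = 60, the same documents score 1/61 ≈ 0.0164 and 1/62 ≈ 0.0161, a difference small enough that agreement across retrievers, rather than any single retriever’s top pick, decides the final order.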
The Mystery of k = 60
One aspect of RRF that often raises questions is the choice of k = 60. While this value isn’t set in stone, it’s commonly used due to several factors:
1. Empirical Performance
Studies have shown that k = 60 performs well across various datasets and retrieval tasks.
2. Balancing Influence
It provides a good balance between the influence of top-ranked and lower-ranked items. For example:
- For rank 1: 1/(1+60) ≈ 0.0164
- For rank 10: 1/(10+60) ≈ 0.0143
- For rank 100: 1/(100+60) ≈ 0.00625
3. Effective Tie-Breaking
k = 60 helps break ties effectively, especially for lower-ranked items where small differences in the original rankings might not be significant.
4. Robustness
This value has been shown to be robust across different types of retrieval systems and data distributions.
It’s worth noting that while k = 60 is common, the optimal value can vary depending on the specific application and data characteristics. Some systems may benefit from tuning this parameter.
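If you do tune k, a simple grid search over a validation set is usually enough. Here’s a minimal sketch, assuming you have per-query rankings from your retrievers and some quality metric; evaluate is a hypothetical placeholder (e.g. recall@10 against relevance labels), and calculate_rrf / get_final_ranking are defined in the next section:

def tune_k(rankings_per_query, candidate_ks=(10, 20, 60, 100, 200)):
    # rankings_per_query: for each validation query, the list of
    # per-retriever rankings to be fused.
    best_k, best_score = None, float("-inf")
    for k in candidate_ks:
        fused = [get_final_ranking(calculate_rrf(r, k))
                 for r in rankings_per_query]
        score = evaluate(fused)  # hypothetical metric over all queries
        if score > best_score:
            best_k, best_score = k, score
    return best_k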
Python implementation of RRF
Here’s a simple, runnable implementation of RRF in Python:
def calculate_rrf(rankings, k=60):
    # rankings: one ranking per retriever, where each ranking is a
    # list of document ids ordered best-first. Documents missing from
    # a ranking simply contribute nothing for that retriever.
    scores = {}
    for ranked_docs in rankings:
        for rank, doc in enumerate(ranked_docs, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

def get_final_ranking(scores):
    # Sort document ids by fused RRF score, highest first.
    return sorted(scores, key=scores.get, reverse=True)
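To see it in action, here’s a tiny example with hypothetical document ids:

rankings = [
    ["doc_a", "doc_b", "doc_c"],  # e.g. from a dense retriever
    ["doc_b", "doc_d", "doc_a"],  # e.g. from BM25
]
scores = calculate_rrf(rankings, k=60)
print(get_final_ranking(scores))
# ['doc_b', 'doc_a', 'doc_d', 'doc_c']

doc_b is ranked highly by both retrievers, so it comes out on top; documents returned by only one retriever trail behind.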
Conclusion
Reciprocal Rank Fusion is a powerful tool in the arsenal of RAG systems. Its ability to combine rankings from multiple retrievers in a mathematically sound and intuitively appealing way makes it invaluable for producing robust and relevant document rankings. By understanding the mechanics and intuition behind RRF, practitioners can implement and potentially improve this technique in their RAG systems more effectively.