Hybrid search with Re-ranking

Sowmiya Jaganathan
4 min read · Jul 24, 2023


When building a search system, we rely on multiple techniques to surface the most relevant results. However, combining the results of different methods can be a challenge.

Let’s take two major approaches for our experiment: semantic similarity and statistical methods like BM25 or TF-IDF. Each has its strengths in retrieving information, depending on the query type, but finding a balance that decides which results to prioritize in a given situation isn’t easy.

In this scenario, directly sorting the combined results becomes tricky due to the mismatch in document score ranges. Keyword-search scores fall within a positive range, usually between 0 and some maximum value that depends on the query. On the other hand, semantic search (e.g., with cosine similarity) generates document scores between 0 and 1.
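For instance, the mismatch might look like this (the scores below are made up purely for illustration):

# Illustrative scores from the two systems (values are made up)
keyword_scores  = {"doc-1": 7.4, "doc-2": 12.8}   # BM25: unbounded positive scores
semantic_scores = {"doc-1": 0.83, "doc-2": 0.61}  # cosine similarity: between 0 and 1
# Sorting the raw scores together would always push the keyword results to the top.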

So today, we’ll explore how to effectively combine the results with re-ranking techniques.

Reciprocal Rank Fusion (RRF)

Reciprocal Rank Fusion (RRF) is a method used to combine results from different retrieval systems by leveraging the positions/ranks of the documents in each result list.

RRF_score(d ∈ D) = Σ_{r ∈ R} [1 / (k + r(d))]

# R is the set of rankers (retrieval systems) whose results are being combined
# k is a constant that helps balance between high and low rankings (commonly set to 60)
# r(d) is the rank/position of document d in ranker r’s result list

Let’s understand with an example.

Let’s calculate RRF for each document and rerank:
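Since the exact contents of the documents don’t matter for the calculation, here is a minimal sketch with illustrative ranks from a keyword ranker and a semantic ranker, using the commonly chosen constant k = 60:

# RRF over two illustrative ranked lists (the ranks are hypothetical, k = 60)
keyword_ranks  = {"Document-2": 1, "Document-4": 2, "Document-3": 3, "Document-1": 4}  # e.g. BM25
semantic_ranks = {"Document-3": 1, "Document-2": 2, "Document-1": 3, "Document-4": 4}  # e.g. cosine similarity

def rrf_score(doc, rankings, k=60):
    # Sum 1 / (k + rank) over every ranking the document appears in
    return sum(1 / (k + ranks[doc]) for ranks in rankings)

fused = {doc: rrf_score(doc, [keyword_ranks, semantic_ranks]) for doc in keyword_ranks}
for doc, score in sorted(fused.items(), key=lambda item: item[1], reverse=True):
    print(doc, round(score, 4))
# Document-2 0.0325, Document-3 0.0323, Document-4 0.0318, Document-1 0.0315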

In the reranked results, Document-2 takes the top position, followed by Document-3 and Document-4.

Re-ranking with Cross-Encoder model

Let’s recap Bi-Encoder models: the documents are encoded and stored in the embedding space ahead of time. When a user query comes in, it is encoded at run time, and the similarity between the query and each document is computed.
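As a minimal sketch with sentence-transformers (the checkpoint name below is an assumption; any bi-encoder model works):

# Bi-Encoder: encode documents ahead of time, encode the query at run time, compare with cosine similarity
from sentence_transformers import SentenceTransformer, util

bi_encoder = SentenceTransformer('all-MiniLM-L6-v2')  # assumed checkpoint

corpus = ["Monarch governs the nation.", "The country cherishes its royal heritage."]
corpus_embeddings = bi_encoder.encode(corpus, convert_to_tensor=True)   # stored offline

query_embedding = bi_encoder.encode("king rules the country", convert_to_tensor=True)
similarities = util.cos_sim(query_embedding, corpus_embeddings)         # one score per document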

Now, with the Cross-Encoder model, we pass the query and the document in together. The model encodes both and generates contextual embeddings for each token in the input sequences, then combines them into a joint representation that captures the interactions between the pair. A classification layer uses this joint representation to predict whether the pair is related.

This leads to more accurate representations and better performance than a Bi-Encoder.

(Figure: Bi-Encoder vs. Cross-Encoder, source: sbert.net)

Let’s understand this with an example: given a list of corpus sentences, we re-rank them based on the scores obtained from both the Cross-Encoder and Bi-Encoder models.

# Sample script: score (query, sentence) pairs with a Cross-Encoder
from sentence_transformers.cross_encoder import CrossEncoder

model = CrossEncoder('model_name_or_path')  # e.g. 'cross-encoder/ms-marco-MiniLM-L-6-v2'
scores = model.predict([["king rules the country", "The monarch holds sway in the nation."],
                        ["king rules the country", "The country cherishes its royal heritage."]])
# predict() returns one relevance score per (query, sentence) pair

# Cross-Encoder
Query: king rules the country
0.75 The monarch holds sway in the nation.
0.59 Majesty commands the nation.
0.59 Sovereign reigns over the realm.
0.58 Ruler leads the land.
0.51 Emperor controls the country.
0.50 Monarch governs the nation.
0.27 The country cherishes its royal heritage.


# Bi-Encoder
Query: king rules the country
0.67 Monarch governs the nation.
0.63 The monarch holds sway in the nation.
0.52 Ruler leads the land.
0.49 Emperor controls the country.
0.47 Majesty commands the nation.
0.46 Sovereign reigns over the realm.
0.46 The country cherishes its royal heritage.

Both models performed well, except on the last sentence, which conveys a different meaning despite overlapping semantically with the query. The Cross-Encoder was able to capture that difference, while the Bi-Encoder gave it a score close to the others because the sentences look semantically similar.

Now that we understand the concepts, let’s look at the full picture: candidates are retrieved with both keyword and semantic search, the ranked lists are fused (for example with RRF), and a Cross-Encoder re-ranks the top results before they are returned.
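Here is a rough end-to-end sketch of one way to wire it together (the corpus, the rank_bm25 dependency, and the model checkpoints are just assumptions for illustration):

# Hybrid search sketch: BM25 + Bi-Encoder retrieval, RRF fusion, Cross-Encoder re-ranking
from rank_bm25 import BM25Okapi                                   # assumed keyword-search library
from sentence_transformers import SentenceTransformer, CrossEncoder, util

corpus = [
    "The monarch holds sway in the nation.",
    "Sovereign reigns over the realm.",
    "The country cherishes its royal heritage.",
]
query = "king rules the country"

# 1. Keyword ranking (BM25)
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
kw_scores = bm25.get_scores(query.lower().split())
kw_order = sorted(range(len(corpus)), key=lambda i: -kw_scores[i])

# 2. Semantic ranking (Bi-Encoder)
bi_encoder = SentenceTransformer('all-MiniLM-L6-v2')              # assumed checkpoint
sims = util.cos_sim(bi_encoder.encode(query), bi_encoder.encode(corpus))[0]
sem_order = sorted(range(len(corpus)), key=lambda i: -float(sims[i]))

# 3. Fuse the two ranked lists with RRF (k = 60)
def rrf(orders, k=60):
    scores = {}
    for order in orders:
        for rank, doc_id in enumerate(order, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores

fused = rrf([kw_order, sem_order])
candidates = sorted(fused, key=fused.get, reverse=True)

# 4. Re-rank the fused candidates with a Cross-Encoder
cross_encoder = CrossEncoder('cross-encoder/ms-marco-MiniLM-L-6-v2')  # assumed checkpoint
pair_scores = cross_encoder.predict([[query, corpus[i]] for i in candidates])
reranked = [corpus[i] for _, i in sorted(zip(pair_scores, candidates), reverse=True)]
print(reranked)

In practice the Cross-Encoder is applied only to the top few dozen fused candidates, since scoring every (query, document) pair is far more expensive than a Bi-Encoder lookup.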

Thank you for reading. Stay Tuned!
