Flora Oladipupo
12 min read · Jul 27, 2024

Optimizing Text Retrieval with MultiVector Search and Payload-Based Reranking in Qdrant: A Case Study

Photo by Flora Oladipupo, inspired by Canva

Overview of Qdrant

Qdrant is a high-performance vector database and similarity search engine designed to handle high-dimensional vectors. It powers AI applications with advanced, open-source vector search technology, enabling efficient processing of embeddings produced by neural network encoders for tasks such as advanced search, recommendation systems, retrieval-augmented generation, and anomaly detection. Qdrant offers cloud-native scalability, ease of deployment, and cost efficiency with built-in compression options. It integrates with leading embedding models and frameworks, providing a versatile solution for various AI-driven applications.

Payload (Metadata) Based Reranking

A payload is additional information stored alongside vectors. It typically refers to the data associated with each vector (or entity) stored in the vector database. It can include various data types such as integers, floats, booleans, keywords, geographical coordinates, and datetime values, all represented in JSON format.

Payload (Metadata) Based Reranking in Qdrant involves using additional metadata associated with vectors to refine and rerank search results. Incorporating metadata in this way makes vector search considerably more powerful. When a search query is performed, vectors are first ranked by semantic similarity. Payloads (metadata) are then used to adjust the ranking based on specific criteria, such as user preferences, contextual information, or domain-specific filters. This ensures that the search results are both relevant in terms of content and aligned with the additional context provided by the metadata.

Multi-Vector Search in Qdrant

MultiVector Search refers to the capability of searching across multiple vector spaces or multiple vectors simultaneously. This is particularly useful in applications where data entities are represented by more than one vector, capturing different aspects or modalities of the data; that is, where a single query vector may not fully capture the search intent. For example, an image might be represented by one vector for its visual content and another for its textual description. By leveraging multiple vectors, a search can incorporate various aspects of the query, enhancing the relevance and accuracy of the results. Typical use cases include multimedia content search, image and text retrieval, and recommendation systems.

The advantages this presents include:

  • Improved search accuracy: In a scenario where products are searched using both their images and descriptions, Multi-Vector Search can significantly improve accuracy by combining the image vector and the text vector. For example, a user is looking for a red dress with floral patterns. The image vector captures the visual features of the products, such as color, shape, and pattern, while the text vector encodes the semantic meaning from product descriptions, like “red,” “dress,” and “floral patterns.” By using both vectors, the search engine can match products that not only look similar to the query image but are also described similarly in the text, ensuring a comprehensive match. This also reduces ambiguity and enhances contextual understanding.
  • Enhanced Reranking Capabilities: Consider an e-commerce platform where users search for products like clothing, electronics, or home decor. Enhanced reranking with Multi-Vector Search can significantly improve the quality of the results. Say a user searches for “summer dresses”, has previously shown a preference for eco-friendly brands with several past purchases from them, and prefers dresses in light colors and flowing fabrics. The search engine first retrieves dresses based on visual features such as color, style, and fabric texture, ensuring the initial results include a variety of summer dresses matching the visual style the user might be looking for. Simultaneously, it matches product descriptions against keywords like “summer,” “light fabric,” “flowing,” and “dress,” narrowing the initial results to those that semantically align with the query. The system then considers the user’s past interactions and purchase history: products similar to those they have previously bought or interacted with receive a boost in ranking. Finally, results can be reranked based on a combination of scores from multiple vectors, improving the relevance and quality of the final output.
  • Rich Data Representation: Rich data representation in multi-vector search significantly enhances the ability to perform comprehensive and accurate searches across complex and multi-faceted datasets. In the context of a news agency, it maintains a large database of multimedia content, including articles, images, videos, and audio clips. Journalists and content creators frequently need to search this database to find relevant content for new stories or to repurpose existing material. By considering multiple vectors, the search engine provides a wide range of relevant content, including articles, images, videos, and audio clips.
  • Scalability and Performance: Multi-vector search operations can be parallelized, leveraging multi-core processors and distributed systems to maintain quick response times. Modern vector databases like Qdrant are optimized to handle multiple vectors per entity efficiently, ensuring high performance even with large-scale datasets.

Creating a Qdrant Collection

Creating a collection in Qdrant means setting up a structured storage space within the Qdrant database to organize and manage vector data along with associated metadata (payloads). A collection in Qdrant is analogous to a table in relational databases but is specifically designed to handle high-dimensional vector data and enable efficient similarity searches. This article gives a walkthrough involving the setup of a Qdrant collection with appropriate payloads and examples of various payload filtering techniques.

Preliminary step: Set up Qdrant, making sure its requirements are fulfilled. Chief among them is Docker, which can be installed locally by downloading it from Docker’s official website.

Step 1: Installing Qdrant

Once Docker is installed and running, the terminal or command prompt should be opened. The Qdrant Docker image will be pulled from Docker Hub using the following command:

docker pull qdrant/qdrant

Afterwards, start Qdrant by running the command below. It starts a new container and maps port 6333 of the container to port 6333 on your host. This mapping allows access to the Qdrant server at http://localhost:6333 from the local machine.

docker run -p 6333:6333 qdrant/qdrant

Step 2: Creating the Collection

This walkthrough creates a Qdrant collection for a text retrieval system. The collection will be called “articles”, with appropriate payloads and various ways to filter them. The snippet below imports the necessary modules from the Qdrant client library, initializes a connection to a Qdrant server running on localhost at port 6333, and creates the collection.

from qdrant_client import QdrantClient
from qdrant_client.http import models

# Initialize Qdrant client
client = QdrantClient("localhost", port=6333)

# Define the collection name
collection_name = "articles"

# Create the collection with 384-dimensional vectors and cosine similarity
client.create_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE)
)

This creates a new collection named “articles” with 384-dimensional vectors using cosine similarity.

Step 3: Defining Appropriate Payloads

For a text retrieval system, we’ll use the following payload structure:

  • title: string
  • content: string
  • author: string
  • date: string
  • category: string
  • tags: list of strings
  • views: integer
  • rating: float

Prepare a list of PointStruct objects. Each PointStruct represents a point in the vector space and includes:

  • id: A unique identifier for the point.
  • vector: The vector representation of the point (in this case, we’re using placeholder vectors of the same dimension as our collection).
  • payload: A dictionary containing the metadata associated with this point.

# Prepare the points with payloads
points = [
    models.PointStruct(
        id=1,
        vector=[0.1] * 384,
        payload={
            "title": "Introduction to Qdrant",
            "content": "Qdrant is a vector database management system...",
            "author": "John Doe",
            "date": "2023-07-01",
            "category": "Technology",
            "tags": ["database", "vector search", "Qdrant"],
            "views": 1000,
            "rating": 4.5
        }
    )
]

Use the client.upsert method to add these points to the collection. The upsert operation will:

  • Insert new points if their IDs don’t already exist in the collection.
  • Update existing points if their IDs are already present in the collection.

Finally, we print the status of the upsert operation.

# Upsert the points into the collection
operation_info = client.upsert(
    collection_name=collection_name,
    points=points
)
print(operation_info.status)

Step 4: Payload Filtering Examples

Exact Match Filtering: This filter will return only the articles where the “category” field exactly matches “Technology”. It’s useful when precise matching is needed on specific fields.

# Exact Match Filtering
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="Technology")
            )
        ]
    ),
    limit=10
)

Range Filtering: This filter returns articles with view counts greater than or equal to 500 and less than 2000. Range filtering is particularly useful for numeric fields when you want to find items within a specific range.

# Range Filtering
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="views",
                range=models.Range(gte=500, lt=2000)
            )
        ]
    ),
    limit=10
)

Multiple Condition Filtering: This filter combines multiple conditions. It will return articles in the “Technology” category with a rating of 4.0 or higher. This is useful when you need to apply multiple criteria to your search.

# Multiple Condition Filtering
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="Technology")
            ),
            models.FieldCondition(
                key="rating",
                range=models.Range(gte=4.0)
            )
        ]
    ),
    limit=10
)

Array Contains Filtering: This filter checks if the “tags” array contains the value “Qdrant”. It’s useful when you have fields that contain arrays (like tags or categories) and you want to find items that include specific values in those arrays.

# Array Contains Filtering
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="tags",
                match=models.MatchAny(any=["Qdrant"])
            )
        ]
    ),
    limit=10
)

Combining AND and OR Conditions: This filter combines AND and OR conditions. It will return articles that have at least 500 views AND are authored by either John Doe OR Jane Smith. The should clause acts like an OR condition, while the must clause acts like an AND condition.

# Combining AND and OR Conditions
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        should=[
            models.FieldCondition(
                key="author",
                match=models.MatchValue(value="John Doe")
            ),
            models.FieldCondition(
                key="author",
                match=models.MatchValue(value="Jane Smith")
            )
        ],
        must=[
            models.FieldCondition(
                key="views",
                range=models.Range(gte=500)
            )
        ]
    ),
    limit=10
)

Negative Filtering (NOT condition): This filter excludes all articles in the “Sports” category. The must_not clause is useful when you want to exclude certain items from your search results.

# Negative Filtering (NOT condition)
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must_not=[
            models.FieldCondition(
                key="category",
                match=models.MatchValue(value="Sports")
            )
        ]
    ),
    limit=10
)

Prefix Matching: This filter uses full-text matching (MatchText) to find articles whose titles contain the text “Intro”. Note that MatchText requires a full-text index on the field; when that index is configured with prefix tokenization, this behaves as prefix matching, which is useful for partial string matching at the beginning of text fields.

# Prefix Matching
results = client.search(
    collection_name="articles",
    query_vector=[0.1] * 384,
    query_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="title",
                match=models.MatchText(text="Intro")
            )
        ]
    ),
    limit=10
)

These filtering techniques provide a powerful toolkit for refining search results in Qdrant. By combining these methods, one can create complex queries that precisely target the data needed, enhancing the effectiveness of the vector search system.

Performing a Multi-Vector Search

To perform a multi-vector search that includes payload filtering and reranks the final results using the payload, follow the above steps of creating a Qdrant collection and payload, but include Qdrant’s multi-vector indexing and search capabilities. This means conducting a search operation using multiple vectors (query points) simultaneously within the vector database.

Following this example of a multi-vector search with a text retrieval case study, this will be achieved in two major steps: First, by performing a multi-vector search with payload filtering. Then, reranking the results using information from the payload.

Performing a Multi-Vector Search with Payload Filtering

Before going into multi-vector search, there is the need to prepare the data by converting texts to vectors using sentence embeddings; the sentence-transformers library (SentenceTransformer) can do this. A Google Colab GPU instance will be used for this step before proceeding to carry out the Multi-Vector Search.

import pandas as pd
from sentence_transformers import SentenceTransformer

# Load the dataset
data_path = '/path/to/your/text_retrieval_task_dataset.csv'
data = pd.read_csv(data_path)

# Load a pre-trained SentenceTransformer model
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')

# Transform texts to vectors and store them in a new column
data['vector'] = data['text'].apply(lambda x: model.encode(x).tolist())

# Optional: Display the DataFrame to verify
print(data.head())
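The next section loads the embeddings from a vectors.csv file, so a small bridging step (assumed here, not shown in the original) is to serialize each vector as a JSON string so it survives the CSV round-trip, then write the file out. The stand-in DataFrame below mirrors the column names (id, text, category, sentiment, length, vector) used by the payload later; with the real data, one would apply json.dumps to the existing vector column instead.

```python
import json
import pandas as pd

# Assumed bridging step: build a DataFrame shaped like the real one,
# JSON-encode the vectors, and write the CSV the next section loads.
df = pd.DataFrame({
    "id": [1, 2],
    "text": [
        "A journey of a thousand miles begins with a single step",
        "The only thing we have to fear is fear itself",
    ],
    "category": ["inspirational", "inspirational"],
    "sentiment": ["positive", "negative"],
})
df["length"] = df["text"].str.len()
# Placeholder vectors; in practice these come from model.encode(...)
df["vector"] = [json.dumps([0.1] * 384), json.dumps([0.2] * 384)]

df.to_csv("vectors.csv", index=False)
```

Storing vectors as JSON strings is what allows the loading code below to recover them with json.loads(item['vector']).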

After generating the file containing the vector embeddings, start the Qdrant Docker container, then import the necessary libraries (the Qdrant client for interacting with Qdrant) and load the dataset.

import json
from qdrant_client import QdrantClient
from qdrant_client.http import models
import pandas as pd
from io import StringIO
import csv

# Initialize Qdrant client
client = QdrantClient("localhost", port=6333)

# Load the data
file_path = r'c:\Users\ADMIN\Documents\Data_Science\vectors.csv'

# Open the CSV file and read its contents
with open(file_path, 'r') as file:
    content = file.read()

# Use StringIO to handle the CSV content as if it were a file
csv_data = StringIO(content)

# Create a CSV reader object to process the CSV data
reader = csv.DictReader(csv_data)

# Convert the CSV data to a list of dictionaries for easy processing
data = list(reader)

A new collection named “quotes” is created in Qdrant. If the collection already exists, it’s deleted first. The collection is configured for 384-dimensional vectors using cosine distance. The script iterates through the CSV data, creating PointStruct objects for each entry. Each point includes an ID, vector, and payload (text, category, length, and sentiment). Simultaneously, it creates SearchRequest objects for each point. The prepared points are inserted into the Qdrant collection.

# Create a new collection
collection_name = "quotes"

# Check if the collection exists and delete it if it does
if client.collection_exists(collection_name):
    client.delete_collection(collection_name)

# Create the collection
client.create_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE)
)

# Prepare points and search requests
points = []
search_requests = []
for item in data:
    vector = json.loads(item['vector'])
    points.append(
        models.PointStruct(
            id=int(item['id']),
            vector=vector,
            payload={
                "text": item['text'],
                "category": item['category'],
                "length": int(item['length']),
                "sentiment": item['sentiment']
            }
        )
    )
    search_requests.append(
        models.SearchRequest(
            vector=vector,
            filter=models.Filter(
                must=[
                    models.FieldCondition(
                        key="category",
                        match=models.MatchValue(value="inspirational")
                    ),
                    models.FieldCondition(
                        key="sentiment",
                        match=models.MatchValue(value=item['sentiment'])
                    )
                ]
            ),
            with_payload=True,
            limit=2
        )
    )

# Insert points into the collection
client.upsert(
    collection_name=collection_name,
    points=points
)

The multi-vector search is performed using client.search_batch(). It uses exact match conditions to ensure that the search results are strictly relevant to the specified attributes of the data points. Each search includes filters for category and sentiment matching the original point.

The multi-vector search is performed using the search_batch method, allowing submission of multiple search requests at once. This is more efficient than sending individual search requests, especially when there’s a large number of queries to process. In each search request, one can specify payload filters. These filters allow narrowing down the search results based on metadata associated with each vector. For example, you might filter by category, sentiment, or any other attribute stored in the payload.

# Perform multi-vector search with payload filtering
search_result = client.search_batch(
    collection_name=collection_name,
    requests=search_requests
)

The batch search results are combined and deduplicated, then sorted by their initial similarity score.


# Combine and deduplicate results
combined_results = {}
for batch in search_result:
    for hit in batch:
        if hit.id not in combined_results or hit.score > combined_results[hit.id].score:
            combined_results[hit.id] = hit

# Convert to a list and sort by score
results = list(combined_results.values())
results.sort(key=lambda x: x.score, reverse=True)

Rerank Results Using Payload Information

Implement a custom reranking function, rerank_score(), which adjusts the original similarity score based on payload information. It combines the original similarity score with length and sentiment factors.

# Rerank results using payload information
def rerank_score(hit):
    payload = hit.payload
    base_score = hit.score
    length_score = 1 - abs(payload['length'] - 45) / 45  # 45 is considered the ideal length
    sentiment_score = 1 if payload['sentiment'] == 'positive' else 0.5

    final_score = base_score * (0.5 + length_score * 0.25 + sentiment_score * 0.25)
    return final_score

Apply this reranking to the results and sort them based on the new scores.

# Apply reranking
reranked_results = sorted(results, key=rerank_score, reverse=True)

Finally, print out the top 2 results after reranking, showing both the original and reranked scores.

# Print top 2 results after reranking
for i, hit in enumerate(reranked_results[:2], 1):
    print(f"{i}. ID: {hit.id}, Text: {hit.payload['text']}, "
          f"Original Score: {hit.score:.4f}, "
          f"Reranked Score: {rerank_score(hit):.4f}")

1. ID: 2, Text: A journey of a thousand miles begins with a single step, Original Score: 1.0000, Reranked Score: 0.9500
2. ID: 5, Text: The only thing we have to fear is fear itself, Original Score: 1.0000, Reranked Score: 0.8694

The search returned the top two results, both from the “inspirational” category as specified in the filter conditions. Both results have the same original score (1.0000), which suggests that they were equally similar to the query vectors in terms of vector similarity. After reranking, the scores changed:

  • “A journey of a thousand miles begins with a single step” got a reranked score of 0.9500
  • “The only thing we have to fear is fear itself” got a reranked score of 0.8694

The reranking process considered additional factors from the payload, like the length of the text and sentiment. The fact that “A journey of a thousand miles begins with a single step” ended up with a higher reranked score suggests that it had a better combination of length and sentiment according to the reranking criteria.

This demonstrates how the reranking process can differentiate between results that initially appeared equally relevant based on vector similarity alone. The reranking takes into account additional metadata to provide a more nuanced ordering of results. The reranking can be fine-tuned by adjusting the weights of different factors in the rerank_score function. For example, you might decide that sentiment is more important than length, or vice versa.
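That tuning idea can be sketched as a parameterized variant of the rerank_score function; the weight values and ideal length below are illustrative assumptions for experimentation, not values prescribed by the article.

```python
# Illustrative parameterized reranker; the weights and ideal length are
# tuning assumptions. base_w + length_w + sentiment_w should sum to 1.0.
def make_reranker(base_w=0.5, length_w=0.25, sentiment_w=0.25, ideal_len=45):
    def rerank(base_score, payload):
        # Length factor: 1.0 at the ideal length, decaying linearly (floored at 0)
        length_score = max(0.0, 1 - abs(payload["length"] - ideal_len) / ideal_len)
        sentiment_score = 1.0 if payload["sentiment"] == "positive" else 0.5
        return base_score * (base_w + length_score * length_w + sentiment_score * sentiment_w)
    return rerank

# Default weights mirror the walkthrough's 0.5/0.25/0.25 split;
# a sentiment-heavy variant shifts weight from length to sentiment.
default_rerank = make_reranker()
sentiment_heavy = make_reranker(base_w=0.5, length_w=0.1, sentiment_w=0.4)
```

Because the weights are closed over by the inner function, different rerankers can be produced for different contexts (e.g. one per user segment) without changing the search code.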

This approach utilizes both the power of vector similarity search and the structured information in the payloads. The initial search uses vector similarity and basic payload filtering to find relevant results, while the reranking step fine-tunes these results based on additional payload information that might indicate quality or relevance. This method can be particularly effective when there is a need to balance pure content similarity with other factors that might indicate an article’s overall quality or relevance to users.

In this case, the reranking favored the “A journey of a thousand miles” quote, most likely because its positive sentiment outweighed the small penalty it incurred for its length exceeding the ideal 45 characters. This showcases how Qdrant can be used for sophisticated search operations that combine vector similarity with metadata filtering and custom ranking algorithms.

Conclusion

The article offers an exploration of Qdrant’s capabilities as a high-performance vector database tailored for advanced AI applications. Qdrant’s standout features, Multi-Vector Search and Payload (Metadata) Based Reranking, are instrumental in refining search functionality and enhancing result accuracy. By enabling detailed payload configurations and simultaneous multi-vector searches, Qdrant caters to complex data queries where multiple data representations are essential for thorough analysis. The step-by-step walkthrough on setting up a Qdrant environment, including collection creation and payload filtering, offers guidance on how to utilize Qdrant effectively. Overall, Qdrant is a solution that not only streamlines AI-driven applications but also fosters an environment where search precision is markedly improved by integrating vector similarity with metadata insights.

The link to the code and dataset can be found on my GitHub page.

Flora Oladipupo

Data Scientist and AI Advocate putting Africans on the AI map