Building Personalized Recommender Systems with Qdrant: A Comprehensive Guide

Rayyan Shaikh
8 min read · Nov 27, 2023

In the current AI ecosystem, personalized recommender systems play a pivotal role in enhancing user experience by delivering content tailored to individual preferences. One tool that has gained prominence in building such systems is Qdrant. In this comprehensive guide, we will delve into the world of personalized recommender systems, exploring the principles behind Qdrant and providing practical insights with examples and code snippets.

Table of Contents

1. Understanding Personalized Recommender Systems

  • The Importance of Personalization
  • Qdrant: An Overview

2. Getting Started with Qdrant

  • Installation

3. Setting Up Qdrant: Creating and Configuring a Collection

  • Connecting to Qdrant
  • Creating a Collection

4. Indexing Data into Qdrant: Building the Foundation

  • Generating Random Data
  • Indexing the Data into Qdrant

5. Enhancing Data with Payloads: Adding Context to Vectors

  • Generating Realistic Payloads
  • Adding Payloads to Indexed Data

6. Exploring Similar Vectors: Searching and Filtering in Qdrant

  • Generating a Query Vector
  • Searching for Similar Vectors
  • Filtering Search Results

7. Personalized Recommendations: Unveiling Advanced Features in Qdrant

  • Basic Recommendations
  • Advanced Recommendation with Positive and Negative Vectors
  • Fine-Tuning Recommendations with a Score Threshold
  • Recommendation Based on Specific Filters

8. Conclusion

Understanding Personalized Recommender Systems


In the AI landscape of 2023, personalized recommender systems are indispensable tools designed to enhance user experience by tailoring content recommendations to individual preferences. These systems operate on the principle that users with similar tastes will likely enjoy similar content. By analyzing user behavior, historical data, and preferences, recommender systems aim to predict and deliver content that aligns with the unique interests of each user.

The Importance of Personalization

The need for personalization arises from the information overload users face daily. Consider a streaming service like Netflix: a user interested in science fiction may not find romance films as engaging. Personalization mitigates this challenge by leveraging algorithms that learn from user interactions, providing content suggestions that are not only relevant but also likely to be appreciated.

Qdrant: An Overview

Qdrant, a powerful open-source vector database, stands out in the realm of personalized recommendations. It specializes in Approximate Nearest Neighbors (ANN) search, a technique that efficiently finds items or users with similar features. Qdrant is adept at handling large-scale datasets, making it an ideal choice for recommendation systems where quick and accurate similarity searches are crucial.

For a clearer understanding, let’s consider an example. Imagine an e-commerce platform recommending products based on a user’s browsing history. Qdrant, by efficiently retrieving similar products, ensures that the recommendations are personalized and align with the user’s preferences.
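To make the idea of "retrieving similar products" concrete, here is a minimal pure-Python sketch of exact nearest-neighbor search with cosine similarity. It is not how Qdrant works internally: ANN indexes avoid scoring every item, whereas this brute-force version scores all of them. The product names and 3-dimensional vectors are hypothetical, chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, items, k):
    """Score every item against the query and return the k best (id, score) pairs."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical product vectors (illustration only, not real embeddings)
products = {
    "running_shoes": [0.9, 0.1, 0.0],
    "trail_shoes": [0.8, 0.3, 0.1],
    "blender": [0.0, 0.2, 0.9],
}
user_vector = [1.0, 0.2, 0.0]  # stand-in for a user's browsing profile

top_two = exact_top_k(user_vector, products, k=2)
print(top_two)
```

The two shoe products rank ahead of the blender because their vectors point in nearly the same direction as the user vector. An ANN index aims to return the same kind of ranking without comparing the query against every stored item.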

Getting Started with Qdrant

To implement personalized recommender systems with Qdrant, the first step is installing the Qdrant server and its Python client, then preparing the data for indexing. This section will guide you through the installation process and the initial data setup.

Installation

To get started with Qdrant, follow these simple installation steps:

Install Docker

Qdrant is available as a Docker image. Make sure you have Docker installed on your machine. If not, follow the installation instructions in the official Docker documentation.

Download Qdrant Image

docker pull qdrant/qdrant

Initialize Qdrant

docker run -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant

Install Qdrant Python Client

pip install qdrant-client

Install Required Libraries

pip install numpy
pip install faker
pip install faker_food

Setting Up Qdrant: Creating and Configuring a Collection

We’ll walk through the essential steps of connecting to Qdrant, creating a collection, and configuring its vector parameters. This initial setup is crucial for preparing the foundation of your personalized recommender system.

Connecting to Qdrant

To begin, we establish a connection to the Qdrant instance using the QdrantClient:

from qdrant_client import QdrantClient
from qdrant_client.http import models
import numpy as np

# Connect to Qdrant
client = QdrantClient(host="localhost", port=6333)

Here, we create an instance of the QdrantClient class, specifying the host and port where your Qdrant instance is running. Adjust the host and port values to match your Qdrant setup.

Creating a Collection

Once connected, we proceed to create a collection within Qdrant. A collection serves as the container for the vectors that will be indexed and used for similarity searches. In this example, we’ll create a collection named “first_collection”:

# Create a collection
my_collection = "first_collection"
first_collection = client.recreate_collection(
    collection_name=my_collection,
    vectors_config=models.VectorParams(size=100, distance=models.Distance.COSINE)
)
print(first_collection)

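Note that recreate_collection deletes any existing collection with the same name before creating the new one. The call returns a boolean, so print(first_collection) shows True when the collection was created. To inspect the collection's vector parameters and status, call client.get_collection(my_collection).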
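The collection above uses cosine distance. As far as the Qdrant documentation describes it, cosine similarity is computed as a dot product over normalized vectors. The sketch below only illustrates that equivalence in plain Python. The helper names l2_normalize and dot are my own, not part of any Qdrant API.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(vec, vec))
    return [x / norm for x in vec]

a = [3.0, 4.0]
b = [4.0, 3.0]

# Cosine similarity computed directly from the definition
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# The same value obtained as a plain dot product of unit-length vectors
normalized_dot = dot(l2_normalize(a), l2_normalize(b))

print(cosine, normalized_dot)
```

Because only the direction of a vector matters under cosine distance, scaling a vector by any positive factor does not change its similarity scores.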

Indexing Data into Qdrant: Building the Foundation

With the Qdrant collection set up, the next crucial step is indexing your data. This involves converting your raw data into vectors and storing them in the Qdrant collection for efficient similarity searches. Let’s explore the process of indexing data into Qdrant using the provided code.

Generating Random Data

To illustrate the indexing process, we generate a set of random vectors representing your data:

import numpy as np

data = np.random.uniform(low=-1.0, high=1.0, size=(1_000, 100))
index = list(range(len(data)))
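One thing worth knowing about random data like this: in 100 dimensions, independently drawn uniform vectors tend to be nearly orthogonal, so cosine scores between them cluster around zero. The sketch below checks this with the standard library only. The seed and the sample size of 50 are arbitrary choices for illustration.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is repeatable
DIM = 100

def random_vector(dim):
    """Draw one vector with components uniform in [-1, 1]."""
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

vectors = [random_vector(DIM) for _ in range(50)]

# Compare the first vector against all the others
scores = [cosine_similarity(vectors[0], v) for v in vectors[1:]]
mean_abs_score = sum(abs(s) for s in scores) / len(scores)

print(round(mean_abs_score, 3))
```

This is why similarity scores on random data look low compared with scores on real embeddings, where related items point in similar directions.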

Indexing the Data into Qdrant

Now, we use the upsert method to index this data into the Qdrant collection:

# Index the data into Qdrant
data_list = client.upsert(
    collection_name=my_collection,
    points=models.Batch(
        ids=index,
        vectors=data.tolist()
    )
)
print(data_list)

The printed result is an update summary containing an operation ID and a status (for example, completed). It confirms that the operation succeeded rather than listing the individual vectors that were added or updated.
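The example above sends all 1,000 points in a single request. For larger datasets, one common approach is to split the points into smaller batches and upsert them one at a time. The helper below is a hypothetical sketch of that splitting step in plain Python. The name iter_batches and the batch size of 256 are assumptions, not Qdrant recommendations.

```python
def iter_batches(ids, vectors, batch_size):
    """Yield (ids, vectors) slices of at most batch_size points each."""
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    for start in range(0, len(ids), batch_size):
        stop = start + batch_size
        yield ids[start:stop], vectors[start:stop]

# Stand-in data: 1,000 ids with tiny 2-dimensional vectors
ids = list(range(1_000))
vectors = [[float(i), float(-i)] for i in ids]

batches = list(iter_batches(ids, vectors, batch_size=256))
batch_sizes = [len(batch_ids) for batch_ids, _ in batches]
print(batch_sizes)

# Each pair could then be passed to the upsert call shown in this section, e.g.
# client.upsert(collection_name=my_collection,
#               points=models.Batch(ids=batch_ids, vectors=batch_vectors))
```

Smaller batches keep each request modest in size and make it easier to retry a single failed batch instead of the whole upload.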

Enhancing Data with Payloads: Adding Context to Vectors

In the realm of personalized recommender systems, adding context or metadata to your vectors can significantly enhance the quality of recommendations. This section explores the process of creating and incorporating payloads into your indexed data within Qdrant.

Generating Realistic Payloads

To simulate a more realistic scenario, we use the Faker library to generate diverse and context-rich payloads:

from faker import Faker
from faker_food import FoodProvider

# Instantiate Faker with FoodProvider for diverse food-related data
fake_something = Faker()
fake_something.add_provider(FoodProvider)

# Initialize an empty list to store payloads
payload = []

# Populate the payload list with diverse data for each vector
for i in range(len(data)):
    payload.append(
        {
            "restaurant": fake_something.name(),
            "ethnic_category": fake_something.ethnic_category(),
            "dish_description": fake_something.dish_description(),
            "dish": fake_something.dish(),
            "url": fake_something.url(),
            "year": fake_something.year(),
            "country": fake_something.country()
        }
    )

In the above code, the payload list is populated with dictionaries, each containing diverse attributes such as restaurant names, ethnic categories, dish descriptions, and more. This variety of data mimics a scenario where vectors represent items, and the payloads provide additional context about each item.

Adding Payloads to Indexed Data

The final step involves incorporating these payloads into the existing indexed data in the Qdrant collection:

# Add payloads to the indexed data
payload_upsert = client.upsert(
    collection_name=my_collection,
    points=models.Batch(
        ids=index,
        vectors=data.tolist(),
        payloads=payload
    )
)

The upsert method is utilized again, but this time, the payloads parameter is configured with the generated payload list. This operation enriches the indexed vectors with additional context, creating a more comprehensive dataset for personalized recommendations.
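Since ids, vectors, and payloads are passed as three parallel lists, a mismatch between them is an easy mistake to make. The following is a small, hypothetical pre-flight check in plain Python. The function validate_batch is my own helper, not part of qdrant-client, and it only covers the basic consistency problems listed in its body.

```python
def validate_batch(ids, vectors, payloads, dim):
    """Return a list of problems found; an empty list means the batch looks consistent."""
    problems = []
    if not (len(ids) == len(vectors) == len(payloads)):
        problems.append(
            f"length mismatch: {len(ids)} ids, {len(vectors)} vectors, {len(payloads)} payloads"
        )
    if len(set(ids)) != len(ids):
        problems.append("duplicate ids")
    for point_id, vec in zip(ids, vectors):
        if len(vec) != dim:
            problems.append(f"point {point_id}: expected {dim} dimensions, got {len(vec)}")
    return problems

# A consistent batch: two ids, two 2-dimensional vectors, two payloads
good = validate_batch(
    [0, 1],
    [[0.1, 0.2], [0.3, 0.4]],
    [{"country": "Australia"}, {"country": "Peru"}],
    dim=2,
)

# A broken batch: repeated id, one short vector, one missing payload
bad = validate_batch([0, 0], [[0.1, 0.2], [0.3]], [{"country": "Australia"}], dim=2)

print(good)
print(bad)
```

In the tutorial's setting, the call would look like validate_batch(index, data.tolist(), payload, dim=100) before the upsert.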

Exploring Similar Vectors: Searching and Filtering in Qdrant

Now that we have indexed data with enriched payloads, let’s delve into the exciting capabilities of Qdrant by performing searches and filters. This section will guide you through the process of searching for similar vectors and applying filters to narrow down search results.

Generating a Query Vector

To initiate a search, we create a query vector fooding_poca representing a hypothetical user’s preferences. This vector is generated randomly within the same vector space as your indexed data:

fooding_poca = np.random.uniform(low=-1.0, high=1.0, size=100).tolist()

The fooding_poca vector serves as the reference point for similarity searches within your Qdrant collection.

Searching for Similar Vectors

With the query vector ready, we can perform a search to find vectors similar to the user’s preferences:

# Search for similar vectors
query_vector_search = client.search(
    collection_name=my_collection,
    query_vector=fooding_poca,
    limit=3
)
print(query_vector_search)

The search method takes the collection name (my_collection), the query vector, and a limit on the number of results. It returns a list of scored points, each carrying the point ID, its similarity score, and its payload, ordered from most to least similar.

Filtering Search Results

In addition to basic searches, Qdrant allows you to apply filters to refine your results. In this example, we create a filter to find vectors associated with Australian foods:

# Filter search results
australian_foods = models.Filter(
    must=[models.FieldCondition(key="country", match=models.MatchValue(value="Australia"))]
)

place_filter = client.search(
    collection_name=my_collection,
    query_vector=fooding_poca,
    query_filter=australian_foods,
    limit=2
)
print(place_filter)

Here, the query_filter parameter is configured with a Filter object specifying a condition: a point is eligible only if its “country” payload field equals “Australia”. The search then ranks only those eligible points, so the results are the most similar vectors among the Australian entries.
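The effect of a must condition can be mimicked in a few lines of plain Python: keep only the points whose payload satisfies every condition, then rank the survivors by similarity. This sketch reproduces the outcome only. Qdrant's actual filtering happens inside its index and is not implemented this way. The four sample points and the helper names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def matches_all(payload, must):
    """True when every (key, value) condition is satisfied by the payload."""
    return all(payload.get(key) == value for key, value in must.items())

def filtered_search(query, points, must, limit):
    """Drop points that fail the conditions, then return the ids of the closest ones."""
    candidates = [p for p in points if matches_all(p["payload"], must)]
    candidates.sort(key=lambda p: cosine_similarity(query, p["vector"]), reverse=True)
    return [p["id"] for p in candidates[:limit]]

# Hypothetical points with 2-dimensional vectors
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"country": "Australia"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"country": "Peru"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"country": "Australia"}},
    {"id": 4, "vector": [0.7, 0.7], "payload": {"country": "Australia"}},
]

result = filtered_search([1.0, 0.1], points, must={"country": "Australia"}, limit=2)
print(result)
```

Point 2 is very close to the query but is excluded because its country is not Australia, which is exactly the trade-off a filter introduces: relevance is ranked only within the eligible set.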

Personalized Recommendations: Unveiling Advanced Features in Qdrant

Advanced features in Qdrant elevate the personalization of recommendations. The provided code demonstrates different recommendation scenarios, from basic positive vectors to advanced techniques involving positive, negative, and filter-based recommendations.

Basic Recommendations

The foundational recommendation operation involves suggesting vectors similar to a given positive vector (e.g., a user’s preference). Here, we recommend 5 vectors similar to the one with ID 17:

# Basic recommendation
recommendation = client.recommend(
    collection_name=my_collection,
    positive=[17],
    limit=5
)

print(recommendation)

Advanced Recommendation with Positive and Negative Vectors

To refine recommendations, you can supply both positive and negative examples: results are pulled toward the positive examples and pushed away from the negative ones. The recommend method works from example points rather than a separate query vector, so this call recommends points similar to point 17 while steering away from point 120:

# Advanced recommendation with positive and negative vectors
p_and_n_recommendation = client.recommend(
    collection_name=my_collection,
    positive=[17],
    negative=[120],
    limit=5
)

print(p_and_n_recommendation)
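To give some intuition for how positive and negative examples can be combined, here is a plain-Python sketch of the average-vector approach that the Qdrant documentation describes as its default recommendation strategy: average the positives, average the negatives, and search with avg_positive + (avg_positive - avg_negative). The helper names and the tiny 2-dimensional vectors are my own illustration, and the real implementation may differ in detail.

```python
def mean_vector(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def recommendation_query(positive, negative):
    """Build a single search vector from positive and negative example vectors."""
    avg_pos = mean_vector(positive)
    if not negative:
        # With no negatives, the search vector is just the positive average
        return avg_pos
    avg_neg = mean_vector(negative)
    # avg_pos + (avg_pos - avg_neg): move further along the direction away from the negatives
    return [2 * p - n for p, n in zip(avg_pos, avg_neg)]

liked = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical vectors of liked items
disliked = [[0.5, -0.5]]           # hypothetical vector of a disliked item

query = recommendation_query(liked, disliked)
positives_only = recommendation_query(liked, [])
print(query, positives_only)
```

The resulting vector is then used for an ordinary similarity search, which is why recommendations return the same kind of scored points as a regular search.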

Fine-Tuning Recommendations with a Score Threshold

To refine the sensitivity of recommendations, a score threshold can be applied. This fine-tuning operation filters out recommendations below a specified similarity score (e.g., 0.22):

# Fine-tune recommendations with a score threshold
fine_tune = client.recommend(
    collection_name=my_collection,
    positive=[17],
    negative=[120, 180],
    score_threshold=0.22,
    limit=5
)

print(fine_tune)
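The practical consequence of a score threshold is that you may get fewer results than the limit asks for. The sketch below shows that behavior on a hypothetical list of (id, score) pairs. Whether a score exactly equal to the threshold is kept is an assumption of this sketch, not something taken from the Qdrant documentation.

```python
def apply_score_threshold(scored_points, threshold, limit):
    """Keep pairs scoring at or above the threshold, best first, capped at limit."""
    kept = [(point_id, score) for point_id, score in scored_points if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]

# Hypothetical (id, score) pairs, as a recommendation call might rank them
candidates = [(3, 0.41), (8, 0.22), (5, 0.19), (9, 0.30), (2, 0.05)]

kept = apply_score_threshold(candidates, threshold=0.22, limit=5)
print(kept)
```

Only three of the five candidates survive, even though the limit is 5. With random vectors like those in this tutorial, scores are low to begin with, so even a modest threshold can remove most or all results.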

Recommendation Based on Specific Filters

For a more targeted approach, recommendations can be combined with a payload filter. In this example, recommendations are constrained to points whose “country” payload is “Australia”:

# Recommendation based on specific filters
filter_recommendation = client.recommend(
    collection_name=my_collection,
    query_filter=models.Filter(
        must=[models.FieldCondition(key="country", match=models.MatchValue(value="Australia"))]
    ),
    positive=[17],
    negative=[120],
    limit=5
)
print(filter_recommendation)

Conclusion

Building personalized recommender systems with Qdrant opens up many possibilities for improving user experience. We began with the role recommender systems play in reducing information overload and increasing engagement, and introduced Qdrant as an open-source vector database built around Approximate Nearest Neighbors (ANN) search.

From there, we installed Qdrant, created a collection, indexed vectors, and attached payloads to give each vector context. We then used search and filtering to retrieve similar items, and the recommend method to combine positive examples, negative examples, score thresholds, and filters. Together, these building blocks give developers a practical starting point for building personalized recommender systems.


Rayyan Shaikh

Python Developer | AI & ML Researcher | Transforming Complex Concepts into Engaging Content