Milvus Integrated into WhyHow.AI’s Open-Source Rule-Based Retrieval Package

Chia Jeng Yang
Apr 29, 2024

We’re pleased to announce that we’ve extended our open-source rule-based retrieval package to support Milvus.

This was done in conjunction with the Zilliz team, who were awesome to collaborate with. Since this is an open-source package, we encourage anyone to contribute.

The Rule-based Retrieval package is a Python package that enables you to create and manage Retrieval Augmented Generation (RAG) applications with advanced filtering capabilities. It seamlessly integrates with OpenAI for text generation and Pinecone or Milvus for efficient vector database management.

How does it work?

For more detailed information about how the package works, check out the GitHub repo here, or read a longer article about how it works here.

The rule-based retrieval SDK does a few things for the user:

Index & namespace creation — the SDK creates a vector database index and namespace on your behalf. This is where chunk embeddings will be stored.
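In Milvus terms, the index corresponds to a collection and the namespace roughly to a partition. A setup sketch of what the SDK does on your behalf, assuming pymilvus is installed and a Milvus instance is reachable at the default local URI (the collection and partition names below are illustrative, not the package’s defaults):

```python
from pymilvus import MilvusClient

# Assumes a Milvus server is reachable locally; the URI is illustrative.
client = MilvusClient(uri="http://localhost:19530")

# A collection to hold chunk embeddings; 1536 dimensions matches
# OpenAI's text-embedding-3-small output size.
client.create_collection(
    collection_name="rbr_chunks",  # illustrative name
    dimension=1536,
)

# A partition, Milvus's rough analogue of a Pinecone namespace.
client.create_partition(
    collection_name="rbr_chunks",
    partition_name="my_namespace",  # illustrative name
)
```

This is configuration-style setup against a live server, so it is a sketch of the shape of the calls rather than something runnable in isolation.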

Splitting, chunking, and embedding — when you upload PDF documents, the SDK automatically splits, chunks, and creates embeddings of the document before upserting them into the vector database index. We’re leveraging LangChain’s PyPDFLoader and RecursiveCharacterTextSplitter for PDF processing, metadata extraction, and chunking. For embedding, we’re using OpenAI’s text-embedding-3-small model.
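To make the split-then-chunk step concrete, here is a simplified pure-Python sketch of the recursive character splitting idea: try coarse separators (paragraph breaks) first, falling back to finer ones when a piece is still too long. The package itself uses LangChain’s RecursiveCharacterTextSplitter; this is an illustration of the idea, not its actual implementation.

```python
def split_text(text, chunk_size=200, separators=("\n\n", "\n", " ")):
    """Recursively split text, preferring coarse separators over fine ones."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: hard-split at chunk_size.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, rest = separators[0], separators[1:]
    chunks, buf = [], ""
    for piece in text.split(sep):
        candidate = piece if not buf else buf + sep + piece
        if len(candidate) <= chunk_size:
            # Greedily pack pieces into the current chunk.
            buf = candidate
        else:
            if buf:
                chunks.append(buf)
            if len(piece) > chunk_size:
                # Piece is still too big: recurse with finer separators.
                chunks.extend(split_text(piece, chunk_size, rest))
                buf = ""
            else:
                buf = piece
    if buf.strip():
        chunks.append(buf)
    return chunks
```

Each chunk would then be embedded (e.g. with text-embedding-3-small) and upserted together with its metadata.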

Auto-filtering — using a set of rules defined by the user, the SDK automatically builds a metadata filter to narrow the query run against the vector database index.
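As a concrete illustration of what auto-filtering produces, the sketch below turns rule fields into a Milvus-style boolean filter expression. The helper and the metadata field names (`filename`, `page_number`, `text`) are hypothetical, not the package’s actual API; they just show the kind of filter a rule compiles down to.

```python
def rule_to_expr(filename=None, page_numbers=None, keywords=None):
    """Build a Milvus-style boolean filter expression from rule fields.
    All names here are illustrative, not the package's actual API."""
    clauses = []
    if filename:
        clauses.append(f'filename == "{filename}"')
    if page_numbers:
        pages = ", ".join(str(p) for p in page_numbers)
        clauses.append(f"page_number in [{pages}]")
    if keywords:
        # Any single keyword match suffices, so OR the keyword clauses.
        kw = " || ".join(f'text like "%{k}%"' for k in keywords)
        clauses.append(f"({kw})")
    # All rule fields must hold at once, so AND the clauses together.
    return " && ".join(clauses)
```

For example, a rule restricting retrieval to pages 1 and 2 of report.pdf would compile to `filename == "report.pdf" && page_number in [1, 2]`, which is passed as the filter on the vector search.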

What’s next

Looking forward, there are several features we’d like to add to improve the effectiveness and usability of deterministic retrieval workflows:

  • Integration with knowledge graphs

WhyHow.AI is streamlining knowledge graph creation and integration to bring more determinism and accuracy to RAG. While you can use this SDK as a standalone solution for building, customizing, and managing rules for your unstructured data retrieval, we believe this feature is best implemented alongside a knowledge graph. By deterministically linking text chunks and vector embeddings to their nodes, developers can benefit from the best of multiple solutions and reliably add trustworthy context to the semantic reasoning they perform in a knowledge graph.

  • Natural language rules

Developers should be able to leverage LLMs to define and enforce deterministic retrieval rules in natural language. By building a text-to-rule engine, we can enable users to build and manage rules in plain English, and even dynamically create and enable rules at runtime based on the question they’re asking their RAG system.
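One illustrative sketch of the text-to-rule idea: map a plain-English instruction onto structured rule fields. A real engine would use an LLM for this; the toy regex parser below, with entirely hypothetical rule fields, only shows the shape of the mapping.

```python
import re

def parse_rule(text):
    """Toy text-to-rule parser (illustrative only): pull a filename and an
    optional page range out of a plain-English instruction."""
    rule = {}
    # Look for a PDF filename anywhere in the instruction.
    m = re.search(r"(\S+\.pdf)", text)
    if m:
        rule["filename"] = m.group(1)
    # Look for "page 3", "pages 2-4", or "pages 2 to 4".
    m = re.search(r"pages?\s+(\d+)(?:\s*(?:-|to)\s*(\d+))?", text)
    if m:
        start = int(m.group(1))
        end = int(m.group(2) or start)
        rule["page_numbers"] = list(range(start, end + 1))
    return rule
```

An instruction like “only use pages 2-4 of report.pdf” would map to a rule restricting retrieval to those pages of that file; the resulting rule could then feed the same metadata-filter machinery described above.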

  • Segmentation by document sections and chunks

While this SDK allows users to write rules that filter retrieval by page number, it becomes even more powerful when users can explore and specify individual chunks or document sections for filtering retrieval. This is an extraction and usability problem that we’re excited to solve.

WhyHow.AI is building tools to help developers bring more determinism and control to their RAG pipelines using graph structures. If you’re thinking about, are in the process of, or have already incorporated knowledge graphs in RAG, we’d love to chat. On our roadmap, we will be linking deterministic rule-based access with knowledge graphs to ensure deterministic semantic retrieval. To get access as one of our design partners, ping us at, or follow our newsletter at WhyHow.AI. Join our discussions about rules, determinism, and knowledge graphs in RAG on our newly created Discord.

Check out the code on our GitHub, and install the package using pip.