Dynamic Expert Feedback: Knowledge Graph RAG as a Two-Way Retrieval & Memory System vs Vector-Only RAG as a One-Way System

Chia Jeng Yang
WhyHow.AI
5 min read · Sep 19, 2024

We have a structural problem in AI. While LLMs and AI systems keep getting better, one problem still stands out in particular: context injection.

Context can come from:

  • Documents (Static digitized knowledge base)
  • Expert Feedback (Dynamic & current non-digitized implicit knowledge)

Implementations of Vector-Only RAG systems today are a ‘One-Way’ street. The common implementation pattern is a multi-agent system on top of a vector database, so that natural language queries retrieve the right information for the LLM to construct an answer. This data, typically drawn from documents and databases, can tell us what the data is, but not how the data should be used. For example, if an expert tells us something about the knowledge base, e.g. “That manual should only be used for customers in Asia”, that expert knowledge cannot easily be taken into account by a Vector RAG system the way a human can.

This problem space around expert-knowledge infrastructure is what we’re building towards. RAG is just the starting point for KGs. The point of a schema is that it bridges the gap between how the LLM and a human expert interpret data. This allows us to create ‘Two-Way’ RAG systems that let your enterprise capture and store expert feedback about how your information should be interpreted and used going forward.

A KG RAG system can then be a two-way street, where a human can use natural language to contribute directly to the KG’s memory in ways that do not really work for a vector RAG system, which is better viewed as a one-way retrieval system. ‘Memory’ in vector systems just means adding additional embeddings for retrieval; it does not in any way affect the meaning or status of the existing embeddings. The fact that the graph is also human-readable is an additional benefit.
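The contrast can be made concrete with a minimal sketch (all names here are hypothetical, and plain Python structures stand in for a real vector store and graph store):

```python
# One-way: vector "memory" only appends; existing chunks are untouched.
vector_store = [
    {"id": "chunk-1", "text": "Manual v1: reset the valve before startup."},
]

def add_feedback_vector(store, note):
    # The correction becomes just another chunk competing at retrieval
    # time; it does not change the meaning or status of chunk-1.
    store.append({"id": f"chunk-{len(store) + 1}", "text": note})

# Two-way: graph memory attaches feedback to the node it qualifies.
graph = {
    "nodes": {"manual-v1": {"text": "Manual v1: reset the valve before startup."}},
    "edges": [],
}

def add_feedback_graph(g, node_id, relation, value):
    # The feedback directly amends how the existing knowledge is
    # interpreted, e.g. scoping manual-v1 to a region.
    g["edges"].append((node_id, relation, value))

add_feedback_vector(vector_store, "That manual should only be used in Asia.")
add_feedback_graph(graph, "manual-v1", "applies_only_to_region", "Asia")
```

In the vector case the caveat may or may not be retrieved alongside the manual; in the graph case it is structurally attached to it.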

We want to map implicit knowledge, not in an arbitrary way, but in a way that ties the expert feedback to your existing knowledge base. Put another way, expert feedback is always tied to the context and the knowledge base you are using. Expert feedback only becomes valuable if you can easily capture it and systematically tie it back to that context, rather than simply throwing it into a knowledge base where it cannot be reliably retrieved at the right time.

In order to systematically tie expert feedback back into a knowledge base, the language that experts use must map systematically onto how the LLM interprets and stores data within memory.

Schemas are the structural medium that lets both the LLM and the experts ‘speak’ the same precise language. A schema is a precise representation that directs feedback to the particular piece of knowledge it should be stored alongside.
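One way to picture this: a schema declares the entity and relation types both sides agree on, so a piece of feedback can be validated and routed rather than dumped as free text. The sketch below is illustrative; the type names and the `Triple`/`validate` helpers are assumptions, not a real API:

```python
from dataclasses import dataclass

# A hypothetical schema: the vocabulary shared by the expert and the LLM.
SCHEMA = {
    "entity_types": {"Document", "Region", "ProcessStep"},
    "relation_types": {
        "applies_to": ("Document", "Region"),
        "precedes": ("ProcessStep", "ProcessStep"),
    },
}

@dataclass
class Triple:
    head: str
    head_type: str
    relation: str
    tail: str
    tail_type: str

def validate(triple: Triple) -> bool:
    """Check that a piece of feedback fits the schema, so it can be
    stored beside the right knowledge instead of floating freely."""
    expected = SCHEMA["relation_types"].get(triple.relation)
    return expected == (triple.head_type, triple.tail_type)

# "That manual should only be used for customers in Asia" becomes:
fb = Triple("manual-v1", "Document", "applies_to", "Asia", "Region")
```

Feedback that fails validation can be flagged for clarification instead of silently polluting the graph.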

For example, in a RAG system built around manufacturing manuals, there can be a disconnect between the stored digital knowledge from static documents and the realities of how the physical system works. Being able to capture that difference in real time through human feedback is important. The schema determines exactly how the feedback should affect the knowledge base.

These RAG systems are then not just valuable in themselves, but can serve as a gateway for interacting with employees and collecting up-to-date information, along with employee heuristics about how that information should be used.

A Knowledge Graph RAG system can take feedback across a range of dimensions:

  • Being able to amend an existing graph to add a missing step in the process
  • Being able to remove relationships in a document graph that maps how a document is relevant to specific scenarios
  • Being able to spin up and amend a personalized graph that reflects the specific user’s background
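The three dimensions above can be sketched as edits on a simple adjacency structure (a stand-in for a production graph store; the node names, relations, and helper functions are all illustrative):

```python
# Each edge is (head, tail) -> relation.
graph = {
    ("step-1", "step-3"): "precedes",
    ("manual-v1", "scenario-cold-start"): "relevant_to",
}

def add_edge(g, head, tail, relation):
    g[(head, tail)] = relation

def remove_edge(g, head, tail):
    g.pop((head, tail), None)

# 1. Amend the graph to add a missing step in the process.
remove_edge(graph, "step-1", "step-3")
add_edge(graph, "step-1", "step-2", "precedes")
add_edge(graph, "step-2", "step-3", "precedes")

# 2. Remove a relationship in the document graph: the manual turns
#    out not to be relevant to this scenario after all.
remove_edge(graph, "manual-v1", "scenario-cold-start")

# 3. Spin up a personalized graph reflecting a specific user's background.
user_graph = {}
add_edge(user_graph, "user-42", "step-2", "already_trained_on")
```

Each edit is a targeted, human-readable change to existing structure, which is precisely what appending embeddings to a vector store cannot express.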

Although there are ways to change and tune embeddings through fine-tuning, scenarios that rely on RAG are sensitive to the real-time nature of the knowledge base. By choosing to build a RAG system rather than fine-tune a model, we have already decided that fine-tuning against a dynamic dataset is not worth the effort/reward. It would therefore be difficult to argue that fine-tuning on expert feedback (which tends to be very specific and tied to particular parts of the knowledge base, rather than generally applicable) is a practical alternative.

Instead of a binary choice between a static knowledge base (pre-LLMs) and fine-tuning a model (expensive and difficult against dynamic datasets), a third option emerges: an intermediary semantic layer between the expert, the vector embeddings, and the LLM. This layer lets expert feedback intelligently amend the knowledge graph representation that sits between the vector embeddings and the LLM.

This does not have to be done from the ground up. We do not mean that every system must have only a graph, or that the schema must be fully defined before feedback can be introduced (although that is the ideal set-up). There are a number of ways the schema can be built up iteratively. For example, we can start with a vector database and a sparse schema (i.e. a document graph), then build more granular schemas where needed. We can also build schemas iteratively from questions, answers, and feedback: an LLM constructs a schema to store relevant questions, their answers, and the feedback provided, with the initial information coming from vector RAG.
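A minimal sketch of that iterative loop, under the assumption that relation patterns which recur in feedback get promoted into the formal schema (the counter, threshold, and relation names are all illustrative):

```python
from collections import Counter

# Start with a sparse document-graph schema.
schema_relations = {"mentions"}
observed = Counter()

def log_feedback(relation):
    # Record each relation pattern seen in expert feedback on
    # question/answer pairs coming from vector RAG.
    observed[relation] += 1

def promote(threshold=2):
    # Promote recurring patterns into the formal schema so future
    # feedback can be routed with more granularity.
    for relation, count in observed.items():
        if count >= threshold:
            schema_relations.add(relation)

log_feedback("applies_to_region")
log_feedback("applies_to_region")
log_feedback("supersedes")
promote()
```

The schema thus grows where feedback actually accumulates, rather than being designed exhaustively up front.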

It is far easier to build these KG systems (with Hybrid Vector RAG) from the ground up, but if you have an existing Vector RAG system and are interested in augmenting your existing systems with a graph representation on top, hit us up as we are doing some R&D here.

WhyHow.AI’s Knowledge Graph Studio Platform (currently in Beta) is the easiest way to build modular, agentic Knowledge Graphs, combining workflows from LLMs, developers and non-technical domain experts.

If you’re thinking about, in the process of, or have already incorporated knowledge graphs in RAG for accuracy, memory and determinism, we’d love to chat at team@whyhow.ai, or follow our newsletter at WhyHow.AI. Join our discussions about rules, determinism and knowledge graphs in RAG on our Discord.
