From Conventional RAG to Graph RAG
When Large Language Models Meet Knowledge Graphs
Introduction
Large Language Models (LLMs) operate on fixed datasets, with their knowledge frozen at the point of their last training update.
Regular users of ChatGPT might recognise the well-known limitation: “Training Data Up to Sep 2021”.
This limitation can lead to inaccurate or outdated responses, as these models “hallucinate” information to fill the gaps.
Updating them with new information or enhancing their context comprehension without retraining or fine-tuning can be challenging, in terms of both resources and manpower.
Retrieval-Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG for short) was introduced as a technique to improve Large Language Models’ (LLMs’) outputs by incorporating information from external, reliable knowledge bases.
The principle behind RAG is straightforward: when an LLM is asked a question, it does not just rely on what it already knows.
Instead, it first looks up relevant information from a specified knowledge source.
This approach ensures that the generated output references a vast body of contextually enriched data, augmented by the most current and relevant information available.
RAG primarily functions through a two-phase process: retrieval and content generation.
Retrieval Phase
During the retrieval phase, the algorithm locates and gathers relevant snippets of information pertinent to the user’s prompt or inquiry.
For instance, if you’re searching for the recipe for Hokkien Mee, your prompt might be “What are the ingredients for Hokkien Mee?”
The system identifies documents whose content is semantically related to the query and scores relevance using a similarity measure, typically the cosine similarity between their embedding vectors (see “Is Cosine-Similarity of Embeddings Really About Similarity?”).
After collating external knowledge, it appends this to the user’s prompt and sends it as an enriched input to the language model.
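To make the retrieval phase concrete, here is a minimal sketch of cosine-similarity ranking. The vectors below are made up for illustration; in practice they would come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for the query and two candidate documents
query_vec = [0.9, 0.1, 0.3]
documents = {
    "hokkien_mee_recipe": [0.8, 0.2, 0.4],
    "car_maintenance_guide": [0.1, 0.9, 0.2],
}

# Rank documents by similarity to the query, highest first
ranked = sorted(documents,
                key=lambda d: cosine_similarity(query_vec, documents[d]),
                reverse=True)
print(ranked[0])  # the recipe document is the closest match
```

The top-ranked snippets are what get appended to the user’s prompt in the next step.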
Content Generation Phase
In the subsequent generative phase, the LLM combines this augmented prompt with its own training data representation to produce a response that is customised to the user’s query.
This response provides a mix of personalised and verifiable information, suitable for applications such as chatbots.
Why RAG Matters
In today’s technology landscape, LLMs drive numerous natural language applications capable of understanding or generating human-like text.
While incredibly powerful, these models can sometimes fall short.
At times, they can be so confident while generating incorrect responses that one might easily be persuaded by their extremely cogent arguments.
RAG attempts to mitigate these issues by guiding the LLM to draw on information from trusted sources, thus keeping the model’s outputs relevant and accurate.
Limitations of RAG
As with all things in life, the conventional RAG approach has its complexities and challenges.
While groundbreaking in enhancing the capabilities of LLMs, RAG also has certain limitations that can impact its effectiveness and applicability.
One of the main challenges involves the accuracy of retrieved information and data source heterogeneity.
For RAG to be effective, it often relies on multiple external sources, which may come in varying formats, standards, and levels of reliability (think PDFs, flat files, Markdown, CSVs, web content, and so on).
RAG implementations also encounter difficulties with ambiguous queries or those requiring a deep understanding of context.
These issues, intrinsic to the technology’s design, stem mainly from the retrieval process, which occasionally overlooks nuances that are necessary for precise responses.
Improvements to RAG
Improving the retrieval accuracy and efficiency of RAG systems is an ongoing area of research within natural language processing and machine learning.
There are several strategies that could be pursued to achieve these improvements, but I would like to highlight two notable enhancements that are achievable with today’s technology.
1. Implementing more sophisticated retrieval algorithms that can better understand the semantics of a query could improve the relevance of fetched documents.
2. Efficiently indexing the knowledge base to speed up the retrieval process without sacrificing the quality of the results.
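As a small illustration of the second point: pre-normalising document vectors at indexing time reduces each cosine-similarity computation at query time to a single dot product. The tiny two-dimensional index below is hypothetical.

```python
import math

def normalise(v):
    # Scale a vector to unit length so cosine similarity becomes a dot product
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Indexing time: normalise once, store the unit vectors
raw_index = {"doc_a": [3.0, 4.0], "doc_b": [1.0, 0.0]}
index = {doc_id: normalise(vec) for doc_id, vec in raw_index.items()}

# Query time: a plain dot product now equals cosine similarity
query = normalise([0.0, 1.0])
scores = {doc_id: sum(q * d for q, d in zip(query, vec))
          for doc_id, vec in index.items()}
print(max(scores, key=scores.get))  # doc_a points mostly "up", so it wins
```

Real systems push this further with approximate nearest-neighbour indexes, but the principle — pay the normalisation cost once at indexing time — is the same.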
Which leads us to…
Graph RAG: RAG x Knowledge Graphs
Graph RAG builds on the concept of RAG by leveraging knowledge graphs (KGs).
This innovative approach, pioneered by NebulaGraph, transforms the way LLMs interpret and respond to queries through the integration of graph databases.
Graph RAG operates by integrating the structured data from KGs into the LLM’s processing, providing a more nuanced and informed basis for the model’s responses.
KGs are structured representations of real-world entities and their relationships.
They consist of two main components: nodes and edges.
Nodes represent individual entities such as people, places, objects, or concepts.
Edges, on the other hand, represent the relationships between these nodes, indicating how they are connected to each other.
This structure greatly improves LLMs’ capacity to generate informed responses by enabling the models to access precise and contextually relevant data.
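A knowledge graph can be sketched as a set of (subject, relationship, object) triples. The entities and relationship names below are made up for illustration:

```python
# Each triple is (subject_node, relationship_edge, object_node)
triples = [
    ("AiSay", "DEVELOPED_BY", "Singapore Government"),
    ("AiSay", "IS_A", "Document Reader"),
    ("AiSay", "USES", "OCR"),
]

def neighbours(graph, node):
    """Return every (relationship, other_node) pair attached to a node."""
    out = []
    for subj, rel, obj in graph:
        if subj == node:
            out.append((rel, obj))
        elif obj == node:
            out.append((rel, subj))
    return out

print(neighbours(triples, "AiSay"))
```

Traversing a node’s edges like this is what lets a Graph RAG system pull in facts that are related to, but not textually present in, the user’s query.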
The innovation of Graph RAG lies in its integration of graph databases with LLMs to enrich the model’s context before generating a response.
Some popular graph database offerings are Ontotext GraphDB, NebulaGraph, and Neo4j.
What Makes Graph RAG Significant?
As LLMs continue to grow in sophistication and capability, Graph RAG has the potential to make a substantial impact on the AI landscape.
Here’s how I envision this integration evolving:
1. Future LLMs are expected to exhibit improved understanding of complex queries and reasoning capabilities.
Graph RAG can leverage these advancements to offer more precise and context-rich answers.
The structured knowledge from knowledge graphs, combined with more sophisticated LLMs, could lead to breakthroughs in AI’s ability to grasp abstract ideas, reason through them, and produce nuanced responses.
2. With the seemingly inexorable progress of LLMs, their integration with knowledge graphs will likely become more dynamic and seamless.
This could involve real-time updates to KGs based on global events or discoveries.
LLMs could play a role in automatically enhancing and updating knowledge graphs by incorporating new information gathered from user interactions or other data sources.
Techniques such as Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF) can further help align models with human preferences and adhere to the HHH principles (not the wrestler, but helpfulness, honesty, harmlessness).
3. With Nvidia’s effort to democratise AI computing, future advancements in LLMs and Graph RAG implementations will focus on improving computational efficiency and scalability.
This shift will allow Graph RAG to be used in a wider variety of applications, including those requiring real-time responses or operating in settings with limited resources.
4. LLMs are expected to have broader and deeper knowledge across multiple domains in the future. Graph RAG could facilitate the transfer of knowledge across different fields, which would enable the generation of insights or solutions that draw upon information from disparate fields.
For instance, applying findings from cognitive science could lead to the development of more natural human-robot interaction models, or combining cybersecurity with psychology might improve security measures’ efficacy.
5. As Graph RAG technology evolves, adopting standards like the Resource Description Framework (RDF) for knowledge graphs can improve interoperability among various systems.
This could allow different implementations to interact and collaborate, in turn driving wider adoption and innovation.
Graph RAG Demo
For this demo, we will be using the product information from GovTech’s Developer Portal as our knowledge base.
1. Setup
- Start a Neo4j local instance with Neo4j Desktop
- Connect to the Neo4j Database locally using LangChain. The good news is LangChain has a ready-to-use template for easy setup.
2. Extraction
- Use prompt engineering and an LLM to extract information, nodes, and their connections. An example of the prompt is shown below:
# Instructions for Creating Knowledge Graphs
## Overview
You are engineered for organising data into knowledge graphs.
- **Nodes**: Represent entities and ideas.
- The objective is to ensure the knowledge graph is straightforward and intelligible for broad use.
## Node Labeling
- **Uniformity**: Stick to simple labels for nodes. For instance, label any entity that is an organisation as "company", rather than using terms like "Facebook" or "Amazon".
- **Identifiers for Nodes**: Opt for textual or comprehensible identifiers over numerical ones.
- **Permissible Node Labels**: If there are specific allowed node labels, list them here.
- **Permissible Relationship Types**: If there are specific allowed relationship types, list them here.
## Managing Numerical Data and Dates
- Integrate numerical information directly as attributes of nodes.
- **Integrated Dates/Numbers**: Refrain from creating distinct nodes for dates or numbers, attaching them instead as attributes.
- **Format for Properties**: Use a key-value pairing format.
- **Avoiding Quotation Marks**: Do not use escaped quotes within property values.
- **Key Naming**: Adopt camelCase for naming keys, such as `dateTime`.
## Uniformity
- **Entity Uniformity**: Ensure consistent identification for entities across various mentions or references.
## Adherence to Guidelines
Strict adherence to these instructions is mandatory. Non-adherence will result in termination.
3. Graph Construction
- Use CSVLoader and document segmentation to process our documents
- Map the extracted information to graph nodes and relationships
- Process the documents through our extraction pipeline and store the information in Neo4j
- Unfortunately, not all node labels are useful for our context or fit our needs.
  {
    "identity": 1040,
    "labels": ["Feedbackstatus"],
    "properties": {
      "id": "Feedback-Success",
      "message": "Sent. Thank you for the feedback!"
    },
    "elementId": "4:81cd2613-0f18-49c1-8134-761643e88b7a:1040"
  },
  {
    "identity": 1582,
    "labels": ["Feedbackstatus"],
    "properties": {
      "id": "Feedbacksuccess",
      "status": "Sent. Thank you for the feedback!"
    },
    "elementId": "4:81cd2613-0f18-49c1-8134-761643e88b7a:1582"
  },
  {
    "identity": 1405,
    "labels": ["Header"],
    "properties": {
      "id": "Modalcardhead",
      "class": "sgds-modal-card-head"
    },
    "elementId": "4:81cd2613-0f18-49c1-8134-761643e88b7a:1405"
  },
  {
    "identity": 1112,
    "labels": ["Feedbackindicator"],
    "properties": {
      "id": "Feedbacksuccess",
      "title": "check",
      "message": "Sent. Thank you for the feedback!"
    },
    "elementId": "4:81cd2613-0f18-49c1-8134-761643e88b7a:1112"
  },
  ...
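The mapping step above can be sketched as a small post-processing function that turns the LLM’s extraction output into node and edge records. The JSON shape here (with `nodes` and `relationships` keys) is a hypothetical convention; the real shape depends on the extraction prompt.

```python
import json

# Hypothetical extraction output from the LLM
llm_output = '''
{
  "nodes": [
    {"id": "AiSay", "label": "Application"},
    {"id": "GovTech", "label": "Organization"}
  ],
  "relationships": [
    {"source": "AiSay", "type": "DEVELOPED_BY", "target": "GovTech"}
  ]
}
'''

def to_graph_records(raw):
    # Parse the LLM output and flatten it into node and edge records
    data = json.loads(raw)
    nodes = {n["id"]: n["label"] for n in data["nodes"]}
    edges = [(r["source"], r["type"], r["target"])
             for r in data["relationships"]]
    return nodes, edges

nodes, edges = to_graph_records(llm_output)
print(nodes)  # {'AiSay': 'Application', 'GovTech': 'Organization'}
print(edges)  # [('AiSay', 'DEVELOPED_BY', 'GovTech')]
```

Records in this form map directly onto `CREATE`/`MERGE` statements when writing to Neo4j.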
4. Evaluation and Refinement
- We’ll specify which node labels the LLM should extract to refine our approach
permissible_nodes_to_extract = [
"Aisubfield",
"Application",
"Cloudservice",
"Concept",
"Digitalsolution",
"Division",
"Entity",
"Feature",
"Fundinginitiative",
"Initiative",
"Link",
"Location",
"Organization",
"Person",
"Platform",
"Policy",
"Program",
"Resource",
"Role",
"Schema",
"Service",
"Standard",
"Technology",
"Technologyplatform",
"Technologystack",
"Webframework",
"Webresource",
"Website"
]
- Explore our newly constructed knowledge graph using Neo4j Browser
- Continuously refine our extraction and graph construction methods
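The label allow-list above can also be enforced after the fact with a simple filter over the extracted node records. The sample records here are abridged from the output shown earlier, and the shortened allow-list is just for illustration:

```python
permissible_nodes_to_extract = {"Application", "Organization", "Technology"}

# Abridged sample of extracted node records
extracted = [
    {"id": "AiSay", "labels": ["Application"]},
    {"id": "Feedbacksuccess", "labels": ["Feedbackstatus"]},  # UI noise, not useful
    {"id": "GovTech", "labels": ["Organization"]},
]

# Keep only nodes carrying at least one permissible label
kept = [n for n in extracted
        if any(label in permissible_nodes_to_extract for label in n["labels"])]
print([n["id"] for n in kept])  # ['AiSay', 'GovTech']
```

Constraining the labels in the prompt is cheaper, but a post-hoc filter like this is a useful safety net when the LLM drifts from the allow-list.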
After all the effort, the magic finally begins!
Consider the following query:
Q: “Tell me more about AiSay, who developed this product and its benefits and applications. Also tell me how is AiSay unique.”
Base RAG: “AiSay is an AI-powered document reader developed by the Singapore Government. It can automatically detect, scan, and comprehend various types of documents, including handwritten notes, printed articles, multilingual text, and structured/unstructured documents. Unlike traditional OCR systems, AiSay can provide structured responses for documents such as images and PDF files. It uses OCR, Document Question-Answering techniques, and a Large-Language Model to derive context and answer queries. AiSay is scalable with both synchronous and asynchronous invocation options, allowing for the processing of large file sizes. It is hosted serverless on the cloud, making it scalable in volume and capable of handling large file sizes. AiSay is under active development and supports security and data classification up to RESTRICTED/SENSITIVE NORMAL. Moving forward, more document types will be supported. AiSay’s unique features include its ability to provide structured responses for various document types, its scalability, and its support for security and data classification.”
Graph RAG: “AiSay is an AI-powered document reader developed by the Singapore Government. It is a document parser that can provide structured responses, extract, and integrate information from various documents into existing systems. AiSay can automatically detect, scan, and comprehend a variety of documents, including handwritten notes, printed articles, multilingual text, and structured/unstructured documents. Unlike traditional Optical Character Recognition (OCR) systems, AiSay can provide structured responses for documents such as images (JPEG, PNG, TIFF) and PDF files.
Some key benefits of AiSay include:
1. Reduction of manual errors and improved data accuracy due to auto-filling of key data for form submission from document uploads.
2. Improved digital service experience as key data can be filled in automatically by AiSay once documents are uploaded.
AiSay is unique in its ability to provide structured responses for a wide range of document types, supporting various formats and languages. It also offers a confidence score per task to determine if additional support is required, allowing for minimal human intervention in tasks with high confidence levels. Additionally, AiSay is scalable as it is hosted serverless on the cloud, making it capable of handling large file sizes and extraction processes asynchronously.”
The benefits of Graph RAG, when compared to conventional RAG methods, were evident in this instance.
The response not only demonstrates accuracy but is also enriched with context and relationships, providing a level of depth absent in standard RAG approaches.
Graph RAG’s secret lies in its ability to analyse a user’s query, pinpoint relevant data within a graph database, and incorporate this contextual insight to guide the LLM’s response.
This approach taps into a web of interconnected information that conventional methods might miss, resulting in a more nuanced understanding of the query.
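That retrieve-then-augment loop can be sketched end to end with a few toy triples. Entity matching here is naive keyword spotting, purely for illustration; production systems use entity linking or LLM-generated graph queries (e.g. Cypher) instead.

```python
# Toy knowledge graph; relationship names are made up for illustration
triples = [
    ("AiSay", "DEVELOPED_BY", "Singapore Government"),
    ("AiSay", "USES", "OCR"),
]

def graph_context(graph, query):
    """Pull every triple whose subject or object is mentioned in the query."""
    facts = [f"{s} {r} {o}" for s, r, o in graph
             if s.lower() in query.lower() or o.lower() in query.lower()]
    return "\n".join(facts)

query = "Tell me more about AiSay."
augmented_prompt = (
    "Answer using the graph facts below.\n"
    f"Facts:\n{graph_context(triples, query)}\n"
    f"Question: {query}"
)
print(augmented_prompt)
```

The `augmented_prompt` is what gets sent to the LLM, so its answer is grounded in the graph’s relationships rather than the model’s parametric memory alone.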
Let’s try with a few more examples
Q: “I need to remove sensitive information from some of my documents. What products does Govtech offer that has these capabilities?”
Base RAG: “Govtech offers the products Cloak and FileSG, which have capabilities to help remove sensitive information from documents.”
Graph RAG: “GovTech offers a product called Cloak, which is a central privacy toolkit for policy-compliant data anonymization. This product helps public officers apply data anonymization techniques to datasets and review re-identification risks in compliance with guidelines.”
Q: “How is DIAB different from SIaC?”
Base RAG: “The provided documents do not directly compare Data Infrastructure in a Box (DIAB) with anything named SIaC, nor do they mention SIaC. Therefore, based on the given context, it’s not possible to directly outline differences between DIAB and SIaC. However, we can summarize the key aspects of DIAB as follows:
1. **Purpose and Development**: DIAB is developed by GovTech’s Data Science and Artificial Intelligence Division (DSAID) to serve as an accelerator for data infrastructure deployment and maintenance on the Government on Commercial Cloud (GCC). It is specifically designed to assist government agencies in architecting, deploying, and securing data infrastructure.
2. **Focus on Data Infrastructure**: DIAB emphasizes the construction of consistent and scalable data infrastructure, allowing agencies to save time and reduce overheads when designing and implementing cloud infrastructure. This focus enables agencies to concentrate more on building the applications layer.
3. **Use of Infrastructure-as-Code (IaC)**: DIAB utilizes IaC templates and reference architecture to aid in data engineering considerations, streamlining the process of setting up and managing data infrastructure.
Without information on SIaC, it’s not feasible to draw a comparison. DIAB’s characteristics highlight its role as a tool for government agencies to efficiently deploy and manage data infrastructure in the cloud, leveraging IaC for efficiency and scalability.”
Graph RAG: “DIAB (Data Infrastructure in a Box) and SIaC (Secure Infrastructure-as-Code) are both products aimed at facilitating cloud infrastructure deployment for government agencies. However, they differ in their specific focus and features.
DIAB serves as an accelerator for data infrastructure deployment and maintenance on the Government on Commercial Cloud (GCC). It helps agencies architect, deploy, and secure data infrastructure, focusing on data engineering considerations and providing Infrastructure-as-Code (IaC) templates. DIAB aims to help agencies build consistent and scalable data infrastructure, with key benefits including ease of use, enhanced efficiency, built-in compliance, flexibility in deployment, and future-proofing.
On the other hand, SIaC provides customisable templates for deploying cloud infrastructure in a secure, compliant, and automated manner. It focuses on secure configuration of managed services and tools from Cloud Service Providers, with modular and easily adaptable templates. Key features of SIaC include compliance with Singapore Government standards, modularity for different use cases, integrability with SGTS components, and cloud-native design.
In summary, while both DIAB and SIaC aim to streamline cloud infrastructure deployment for government agencies, DIAB specifically targets data infrastructure deployment with a focus on data engineering considerations and IaC templates, while SIaC focuses on secure and compliant cloud infrastructure deployment with customisable templates and modular design.”
Conclusion
The shift from conventional Retrieval-Augmented Generation (RAG) to Graph RAG represents an important step forward in how we interact with Large Language Models (LLMs).
This transition addresses a fundamental challenge: how to improve LLMs’ ability to provide contextually accurate answers to complex queries.
When comparing the two approaches, the advantages of Graph RAG in handling contextually complex queries become apparent.
Conventional RAG techniques often miss the mark on contextually complex questions.
In contrast, Graph RAG taps into a more complex network of data, delivering responses that capture a more profound comprehension of the query’s subtleties.
However, Graph RAG is not a one-size-fits-all solution.
It is still highly contingent on the quality, depth, and breadth of the underlying KGs.
In scenarios where the KG is limited or biased towards specific domains, Graph RAG’s performance may not surpass conventional RAG methods.
That said, one hopes this transition will lead to AI systems that better mirror human thought and discovery.