Building a robust GraphRAG System for a specific use case - Part Three

kirouane Ayoub · Published in InfinitGraph
12 min read · Sep 14, 2024

In the first two parts of this series, we meticulously crafted a custom dataset of question-Cypher pairs and then fine-tuned a Llama 3.1 model to master the art of translating natural language questions into their corresponding Cypher queries. Now, armed with this powerful language model, we’re ready to build the final piece of the puzzle: a fully functional Question-Answering (Q&A) system that can extract insightful answers from our graph database.

In this part, we will explore two prominent frameworks for implementing our Q&A system: LangChain and Llama-index.

With LangChain, we will utilize the GraphCypherQAChain in conjunction with Dynamic Few-Shot Prompting to enhance the accuracy and relevance of the generated Cypher queries.

With Llama-index, we will employ the PropertyGraphIndex, which provides a sophisticated approach for building and querying knowledge graphs using natural language and a variety of retrieval methods.

LangChain’s GraphCypherQAChain with Dynamic Few-Shot Prompting

The GraphCypherQAChain, as its name suggests, is specifically designed for querying graph databases like Neo4j using natural language. It leverages an LLM to translate user questions into Cypher queries, executes these queries against the database, and returns the results as the answer.

To further enhance the LLM’s performance in generating accurate Cypher queries, we will incorporate Dynamic Few-Shot Prompting. This technique involves providing the LLM with a few examples of question-Cypher pairs that are relevant to the user’s current query. By dynamically selecting the most pertinent examples based on semantic similarity, we can guide the LLM to generate more accurate and contextually appropriate Cypher queries, ultimately leading to more precise answers.

Install dependencies

pip install langchain_community neo4j langchain sentence-transformers langchain_openai

Connecting to Neo4j

Here we establish a connection to our Neo4j graph database using the Neo4jGraph class from langchain_community.graphs. We provide the URI of our Neo4j instance, along with the username and password for authentication.

from langchain_community.graphs import Neo4jGraph

NEO4J_URI="neo4j+s://xxxxxxxxxxxxx.databases.neo4j.io"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="xxxxxxxxxxxxxxxxxxx"

graph = Neo4jGraph(
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    sanitize=True,
    enhanced_schema=True,
)

We also set sanitize to True, which strips oversized, embedding-like values from query results before they are passed to the LLM, and enhanced_schema to True to retrieve more detailed schema information from the database.
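As an optional sanity check, you can refresh and print the schema that will later be injected into the prompts; both calls are part of the Neo4jGraph class used above:

graph.refresh_schema()  # re-read node labels, relationship types, and properties
print(graph.schema)     # the schema string that will be passed to the LLM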

Initializing LLMs

Here we initialize two instances of the ChatOpenAI class, one for generating Cypher queries (cypher_llm) and another for general question answering (qa_llm). We specify the OpenAI API endpoint, API key (which can be left empty if using a local model), and the names of the models we want to use.

from langchain_openai import ChatOpenAI

OPENAI_ENDPOINT = "http://127.0.0.1:30000/v1"
OPENAI_API_KEY="EMPTY"

CYPHER_MODEL_NAME = "llama3.1-cypher"
QA_MODEL_NAME = "llama3.1"

cypher_llm = ChatOpenAI(
    base_url=OPENAI_ENDPOINT,
    api_key=OPENAI_API_KEY,
    model=CYPHER_MODEL_NAME,
)

qa_llm = ChatOpenAI(
    base_url=OPENAI_ENDPOINT,
    api_key=OPENAI_API_KEY,
    model=QA_MODEL_NAME,
)

In this case, we’re using llama3.1-cypher (Our fine-tuned model) and llama3.1 (for general QA).

Creating and Applying the LLMGraphTransformer

The LLMGraphTransformer leverages the LLM to convert unstructured text documents into structured graph documents by identifying entities and relationships within the text. The text_splitter is used to divide the input text into smaller chunks for more efficient processing.

from langchain_core.documents import Document
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm_transformer = LLMGraphTransformer(llm=qa_llm)
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)

text = """
In the city of Novus, a renowned architect named Alice Johnson was busy working on her latest project. Alice had been designing buildings for over 15 years and was well-known for her collaboration with her mentor, Robert Lee, who was also a famous architect. Robert had taught Alice everything she knew, and they remained close friends.
Alice was married to David Johnson, a software engineer who worked at TechCorp. David was passionate about his work and often collaborated with his colleague, Emily Smith, a data scientist at TechCorp. Emily was also Alice’s best friend from college, where they studied together. She frequently visited Alice and David’s home, and they often discussed their work over dinner.
Alice and David had a daughter, Sophie Johnson, who was 8 years old and loved spending time with her grandparents, John and Mary Johnson. John was David’s father, a retired professor, and Mary was a retired nurse. They lived in a neighboring town called Greenville and visited their family in Novus every weekend.
One day, Alice received an invitation from the Novus City Council to present her latest building design. She was excited to showcase her work and immediately contacted Robert Lee to review her plans. Robert was delighted to help, as he had always admired Alice’s talent. Meanwhile, David was busy at TechCorp, where he and Emily were working on a new AI project under the supervision of their manager, Michael Brown.
As the day of the presentation approached, Alice prepared her designs with Robert’s guidance. David and Sophie also attended the event to support Alice. The Novus City Council was impressed with her work and decided to approve the project, marking another success for Alice. After the event, the family celebrated with a dinner at their favorite restaurant, The Green Olive, where they were joined by Emily and Robert.
"""

documents = text_splitter.create_documents([text])
graph_documents = llm_transformer.convert_to_graph_documents(documents)
print(f"Nodes:" , [graph_documents[i].nodes for i in range(len(graph_documents))])
print(f"Relationships: " ,[ graph_documents[i].relationships for i in range(len(graph_documents))] )

We then apply the LLMGraphTransformer to a sample text document, extracting nodes and relationships and printing them.

Defining Allowed Nodes, Relationships, and Properties

We can specify the allowed node types, relationship types, and node properties that the LLM should consider during extraction. This allows us to customize the graph construction process based on our specific needs and the structure of our target knowledge graph.

llm_transformer_props = LLMGraphTransformer(
    llm=qa_llm,
    allowed_nodes=["Person", "City", "Organization"],
    allowed_relationships=["NATIONALITY", "LOCATED_IN", "WORKED_AT", "SPOUSE"],
    node_properties=["born_year"],
)

graph_documents_props = llm_transformer_props.convert_to_graph_documents(documents)
print("Nodes:", [doc.nodes for doc in graph_documents_props])
print("Relationships:", [doc.relationships for doc in graph_documents_props])

Storing the Graph Documents in Neo4j

Here we store the generated graph documents into our Neo4j database using the add_graph_documents method of the graph object.

graph.add_graph_documents(graph_documents_props)
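If you also want each extracted node to carry a generic __Entity__ label and to stay linked to the source text chunk it came from, add_graph_documents accepts two optional flags (a small variation on the call above):

graph.add_graph_documents(graph_documents_props,
                          baseEntityLabel=True,   # add a generic __Entity__ label to every node
                          include_source=True)    # link each node to its source Document chunk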

Building the Q&A System

Hand-Crafted Examples

Here we define a list of hand-crafted examples that demonstrate the desired behavior of our Q&A system. Each example consists of a natural language question and its corresponding Cypher query.

Example :

examples = [
    {
        "question": "Which workers live in Canada and speak German?",
        "query": "MATCH (p:Person)-[:LIVES_IN]->(:Country {{name: 'Canada'}}), (p)-[:SPEAKS]->(:Language {{name: 'German'}}) RETURN p.name",
    },
    {
        "question": "In which countries do workers who speak Spanish live?",
        "query": "MATCH (p:Person)-[:SPEAKS]->(:Language {{name: 'Spanish'}})<-[:SPEAKS]-(worker:Person)-[:LIVES_IN]->(c:Country) RETURN DISTINCT c.name AS Country",
    },
    {
        "question": "What companies do workers named John work in?",
        "query": "MATCH (p:Person {{name: 'John'}})-[:WORKS_IN]->(c:Company) RETURN c.name",
    },
    {
        "question": "How many workers in the Hospitals and Health Care industry are able to speak Korean?",
        "query": "MATCH (p:Person)-[:WORKS_IN]->(:Company)-[:IS_IN]->(:Industry {{name: 'Hospitals and Health Care'}}), (p)-[:SPEAKS]->(:Language {{name: 'Korean'}}) RETURN COUNT(DISTINCT p) AS NumberOfWorkers",
    },
    {
        "question": "Give me a list of all companies in the software development industry.",
        "query": "MATCH (c:Company)-[:IS_IN]->(:Industry {{name: 'Software Development'}}) RETURN c.name",
    },
    {
        "question": "Where do workers named Alice live?",
        "query": "MATCH (p:Person {{name: 'Alice'}})-[:LIVES_IN]->(c:Country) RETURN c.name",
    },
]

These examples will be used for dynamic few-shot prompting to guide the LLM in generating accurate Cypher queries for similar user questions.

Creating the Semantic Similarity Example Selector

Here we create a SemanticSimilarityExampleSelector using the hand-crafted examples, Hugging Face embeddings, and a Neo4j vector store. The example_selector will select the most relevant examples from the provided list based on the semantic similarity between the user's question and the example questions.

from langchain_community.vectorstores import Neo4jVector
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_community.embeddings import HuggingFaceEmbeddings

example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples,
    HuggingFaceEmbeddings(),
    Neo4jVector,
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    k=3,
    input_keys=["question"],
)

This selection is done using cosine similarity between the embeddings of the questions.

Testing the Example Selector

We provide a placeholder question and the select_examples method returns the most similar examples from the hand-crafted list.

example_selector.select_examples({"question": "Write your question here .. "})

Creating the Dynamic Prompt Template

We specify the example_selector, example_prompt (which defines how each example is formatted in the prompt), prefix (which provides general instructions to the LLM), suffix (which formats the user's question in the prompt), and input_variables.

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

example_prompt = PromptTemplate.from_template(
    "User input: {question}\nCypher query: {query}"
)

dynamic_prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.\n\nHere is the schema information\n{schema}.\n\nBelow are a number of examples of questions and their corresponding Cypher queries. Don't add any preambles, just return the correct cypher query",
    suffix="User input: {question}\nCypher query: ",
    input_variables=["question", "schema"],
)

This dynamic prompt template will incorporate the most relevant examples into the prompt based on the user’s question, guiding the LLM to generate more accurate Cypher queries.
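To preview exactly what the LLM will receive, you can format the prompt for one of the sample questions; the schema placeholder is filled with the schema string from the graph object (a quick sanity check, not required for the chain itself):

print(dynamic_prompt.format(question="Which workers live in Canada and speak German?",
                            schema=graph.schema))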

Creating the GraphCypherQAChain

Here we create the GraphCypherQAChain using the from_llm method. We provide the graph object, the LLMs for Cypher generation and question answering, the dynamic prompt template, and other configuration options.

from langchain.chains import GraphCypherQAChain

chain_with_dynamic_few_shot = GraphCypherQAChain.from_llm(
    graph=graph,
    cypher_llm=cypher_llm,
    qa_llm=qa_llm,
    cypher_prompt=dynamic_prompt,
    verbose=True,
    validate_cypher=True,
    use_function_response=True,
)

We enable verbose mode for debugging, validate the generated Cypher queries before execution, and set use_function_response to True so the query results are handed to the QA model as a function response rather than injected as plain text.

Testing the Q&A System

We provide a placeholder question and the invoke method returns the result, which will contain the answer retrieved from the graph database based on the generated and executed Cypher query.

question = "write your question here .. "
result = chain_with_dynamic_few_shot.invoke(question)['result']
print(result)

Llama-Index’s PropertyGraphIndex

The PropertyGraphIndex in Llama-index offers a powerful and flexible way to build and interact with knowledge graphs. It leverages labeled property graphs, allowing for a richer representation of data compared to traditional knowledge graphs. This means we can assign labels and properties to both nodes and relationships, capturing more nuanced information about the entities and their connections.

The PropertyGraphIndex supports hybrid search capabilities, combining symbolic and vector-based retrieval methods. This enables us to query the graph using keywords, vector similarity, or even complex Cypher queries, depending on the nature of the question and the desired level of precision.

By integrating with various storage options, including Neo4j, the PropertyGraphIndex offers adaptability to different data management needs. It also supports various graph extraction methods, providing fine-grained control over how the knowledge graph is constructed from different sources.

This versatility makes the PropertyGraphIndex an excellent choice for building dynamic and expressive knowledge graphs that can be easily queried and expanded, especially in applications involving LLMs and natural language interactions.

Installing Dependencies

pip install llama-index llama-index-graph-stores-neo4j

Importing Modules

from llama_index.core import SimpleDirectoryReader
from llama_index.core import PropertyGraphIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.indices.property_graph import SchemaLLMPathExtractor

from llama_index.graph_stores.neo4j import Neo4jPropertyGraphStore

Connecting to Neo4j

Here we establish a connection to our Neo4j graph database using the Neo4jPropertyGraphStore class.

NEO4J_URI="neo4j+s://xxxxxxxxxxxxx.databases.neo4j.io"
NEO4J_USERNAME="neo4j"
NEO4J_PASSWORD="xxxxxxxxxxxxxxxxxxx"

graph_store = Neo4jPropertyGraphStore(
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    url=NEO4J_URI,
)

We provide the username, password, and URI of our Neo4j instance. This store will be used to persist and interact with the knowledge graph.

Initializing LLM and Embedding Model

Pull the bge-large embedding model first:

ollama pull bge-large:latest

Here we initialize the OpenAI LLM and embedding model. We specify the API keys, base URLs for both the LLM and embedding endpoints, and the model names.

LLM_OPENAI_ENDPOINT = "http://127.0.0.1:11434/v1"
EMBEDDING_OPENAI_ENDPOINT = "http://127.0.0.1:11434/api/embeddings"  # or you can use http://127.0.0.1:11434/v1

OPENAI_API_KEY = "EMPTY"

LLM_MODEL_NAME = "llama3.1-cypher"
EMBEDDING_MODEL_NAME = "bge-large:latest"

llm = OpenAI(
    api_key=OPENAI_API_KEY,
    api_base=LLM_OPENAI_ENDPOINT,
    model=LLM_MODEL_NAME,
)

embedding_model = OpenAIEmbedding(
    model_name=EMBEDDING_MODEL_NAME,
    api_key=OPENAI_API_KEY,
    api_base=EMBEDDING_OPENAI_ENDPOINT,
)

In this case, we’re using llama3.1-cypher for the LLM and bge-large for the embedding model.

Defining the Knowledge Graph Schema and Extractor

Here we define the schema for our knowledge graph, specifying the allowed entity types (entities) and relationship types (relations). We also define a validation schema (validation_schema) that dictates which relationship types are valid for each entity type.

from typing import Literal

# best practice to use upper-case
entities = Literal["PERSON", "PLACE", "ORGANIZATION"]
relations = Literal["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"]

# define which entities can have which relations
validation_schema = {
    "PERSON": ["HAS", "PART_OF", "WORKED_ON", "WORKED_WITH", "WORKED_AT"],
    "PLACE": ["HAS", "PART_OF", "WORKED_AT"],
    "ORGANIZATION": ["HAS", "PART_OF", "WORKED_WITH"],
}

kg_extractor = SchemaLLMPathExtractor(
    llm=llm,
    possible_entities=entities,
    possible_relations=relations,
    kg_validation_schema=validation_schema,
    # if false, allows for values outside of the schema
    # useful for using the schema as a suggestion
    strict=True,
)

We then create a SchemaLLMPathExtractor using the LLM, entities, relations, and validation schema. This extractor will use the LLM to extract knowledge from documents and structure it into a knowledge graph according to the defined schema. We set strict to True to enforce the schema during extraction.

Here we load documents from a directory named “data” using the SimpleDirectoryReader.

documents = SimpleDirectoryReader("data").load_data()

These documents will be used to build the knowledge graph.

Creating the PropertyGraphIndex

We specify the knowledge graph extractor, embedding model, property graph store (Neo4j), and a simple vector store.

from llama_index.core import PropertyGraphIndex
from llama_index.core.vector_stores.simple import SimpleVectorStore

index = PropertyGraphIndex.from_documents(
    documents,
    kg_extractors=[kg_extractor],
    embed_model=embedding_model,
    property_graph_store=graph_store,
    vector_store=SimpleVectorStore(),
    show_progress=True,
)

We also enable progress display during index construction.

Retrieving Nodes

Here we use the as_retriever method of the index to retrieve relevant nodes based on a query.

retriever = index.as_retriever(
    include_text=True,  # include source chunk with matching paths
    similarity_top_k=2,  # top k for vector kg node retrieval
)
nodes = retriever.retrieve("write your question here .. ")
print(nodes)

We include the source text chunks and set similarity_top_k to retrieve the top 2 most similar nodes based on vector similarity.
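Each retrieved item is a NodeWithScore, so you can inspect the similarity score and a preview of the matched content (a minimal sketch):

for node in nodes:
    print(node.score, node.node.get_content()[:200])  # score and the first 200 characters of the match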

Querying the Index

Here we use the as_query_engine method to create a query engine for the index. We include the source text and set similarity_top_k to 2.

query_engine = index.as_query_engine(
    include_text=True,
    similarity_top_k=2,
)
response = query_engine.query("write your question here .. ")
print(response)

We then use the query engine to answer a question and print the response.

Hybrid Search (Property graph + Similarity Search)

from llama_index.core.indices.property_graph import (
    LLMSynonymRetriever,
    VectorContextRetriever,
)

llm_synonym = LLMSynonymRetriever(
    index.property_graph_store,
    llm=llm,
    include_text=False,
)

The LLMSynonymRetriever generates synonyms for the query terms using the LLM and retrieves nodes based on these synonyms.

vector_context = VectorContextRetriever(
    index.property_graph_store,
    embed_model=embedding_model,
    include_text=False,
)

The VectorContextRetriever retrieves nodes based on vector similarity between the query and the node embeddings.

Creating a Query Engine with Sub-Retrievers

Here we create a query engine that utilizes the two retrievers we created earlier as sub-retrievers.

query_engine = index.as_query_engine(
    sub_retrievers=[
        llm_synonym,
        vector_context,
    ],
    llm=llm,
)

This allows the query engine to leverage both synonym-based and vector-based retrieval methods to find relevant information in the knowledge graph.
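The same sub-retrievers can also be plugged into a plain retriever if you only want to see which paths and chunks come back, without answer synthesis (a sketch using the retrievers defined above):

hybrid_retriever = index.as_retriever(
    sub_retrievers=[llm_synonym, vector_context],
)
nodes = hybrid_retriever.retrieve("write your question here .. ")
print(nodes)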

Querying the Index with the Enhanced Query Engine

Here we use the enhanced query engine (with sub-retrievers) to answer a question and print the response.

response = query_engine.query("Write your question here .. ")

print(str(response))

This demonstrates how to combine different retrieval methods to improve the accuracy and relevance of the answers.

Saving and Loading the Index

Here we demonstrate how to persist the index to disk and load it back from storage. We use the persist method of the storage context to save the index to a directory named "storage". We then use load_index_from_storage to load the index back from the same directory. We also show how to load an index from an existing graph store and optionally a vector store.

# save and load
index.storage_context.persist(persist_dir="./storage")

from llama_index.core import StorageContext, load_index_from_storage

index = load_index_from_storage(
    StorageContext.from_defaults(persist_dir="./storage")
)

# load from an existing graph store (and optional vector store)
index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store
)
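If you keep embeddings in a separate vector store, from_existing also accepts the vector store, embedding model, and LLM, so the reloaded index behaves like the freshly built one (a sketch; adjust to the stores you actually used):

index = PropertyGraphIndex.from_existing(
    property_graph_store=graph_store,
    # vector_store=my_vector_store,  # optional, hypothetical store if embeddings live outside Neo4j
    embed_model=embedding_model,
    llm=llm,
)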

Create UI using Mesop

To make our GraphRAG system more user-friendly, we’ll build a simple chat interface using Mesop, a framework for creating web apps with Python.

Installing Mesop

First, we need to install the Mesop library:

pip install mesop

Building the Chat Interface

Now, let’s create a basic chat interface using Mesop:

Open app.py:

import mesop as me
import mesop.labs as mel
from main import query_engine  # import query_engine

@me.page(
    security_policy=me.SecurityPolicy(
        allowed_iframe_parents=["https://google.github.io"]
    ),
    path="/chat",
    title="GraphRAG Demo Chat",
)
def page():
    mel.chat(transform, title="GraphRAG", bot_user="GraphRAG bot")


def transform(input_text: str, history: list[mel.ChatMessage]):
    response = query_engine.query(input_text)
    yield str(response)

This code creates a basic chat interface where users can type their questions and receive answers from the GraphRAG system. You can further customize the interface using Mesop’s components and styling options.

To run the UI app, use this command:

mesop app.py

And that’s a wrap!

We’ve built a question-answering system that lets you ask your graph database questions in plain English. Pretty cool, right?

This concludes our three-part GraphRAG series, and it’s just the beginning of my exploration into this exciting field. Expect to see more GraphRAG content from me, diving into new techniques and applications. Stay tuned.

Happy prompting!
