GraphRAG Architecture Overview and User Feedback on Practical Application

pamperherself
6 min read · Aug 26, 2024


shot by pamperherself

Overview of Last Month’s Hot Open-Source Project from Microsoft: GraphRAG

Last month, Microsoft’s open-source project, GraphRAG, garnered significant attention. At that time, I only briefly glanced at it, opting to wait nearly a month to see if others would deploy and validate it before I dove deeper. This approach seemed efficient, especially given my non-technical background, where experimenting with new technologies often carries a high cost of trial and error.

Now, let’s explore the architecture of GraphRAG and review the feedback from users who have deployed it.

Architecture

GraphRAG aims to address a key limitation of previous RAG (Retrieval-Augmented Generation) tools, which struggled to provide comprehensive answers to global questions that require summarizing entire documents. Traditional RAG tools could only retrieve and answer based on existing document content, failing to summarize relationships or overarching themes.

Even when traditional RAG tools attempted to summarize, the summaries were often generic, generated by the GPT model without a true question-focused approach. Moreover, when uploading documents to LLMs, there’s a risk of hitting the context window limit, leading to information being “lost in the middle” of longer contexts. The LLM would then generate summaries based on its understanding, which might differ in information density and accuracy from what we actually need.

I recently learned about the concept of Chain of Density, which emphasizes solving the issue of information density in LLM-generated summaries. The goal is to extract and summarize the precise information we need, rather than producing broad and vague answers.

“RAG fails on global questions directed at an entire text corpus, such as ‘What are the main themes in the dataset?’ since this is inherently a query-focused summarization (QFS) task, rather than an explicit retrieval task.”

The general approach of GraphRAG involves first generating a knowledge graph from the user-uploaded documents (currently only .txt files are supported) and then using a community detection algorithm to cluster entities and their relationships. An LLM is then used to summarize each cluster of closely related entities.

Finally, based on the user’s query, these summaries are further abstracted to answer the question. This method is more suitable for global searches/answers, such as “What are the main themes in the dataset?”

“Our approach uses an LLM to build a graph-based text index in two stages: first to derive an entity knowledge graph from the source documents, then to pre-generate community summaries for all groups of closely-related entities.”

While LLMs are still used to generate summaries, GraphRAG takes it a step further by first organizing information into semantically, hierarchically, and relationally categorized modules, and then summarizing these smaller modules before creating an overall summary. This is a more specific approach compared to directly feeding a document to an LLM and asking for a summary.
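This index-time flow can be sketched in plain Python. Everything below is a toy stand-in: capitalized tokens stand in for LLM entity extraction, connected components stand in for the Leiden algorithm, and a string join stands in for the LLM-written community summary.

```python
from collections import defaultdict
from itertools import combinations

def extract_entities(doc: str) -> set[str]:
    # Toy stand-in for LLM entity extraction: keep capitalized tokens.
    return {t.strip(".,") for t in doc.split() if t[0].isupper()}

def build_graph(docs: list[str]) -> dict[str, set[str]]:
    # Entities that co-occur in the same document get an edge.
    graph: dict[str, set[str]] = defaultdict(set)
    for doc in docs:
        for a, b in combinations(sorted(extract_entities(doc)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def communities(graph: dict[str, set[str]]) -> list[set[str]]:
    # Connected components stand in for Leiden community detection.
    seen: set[str] = set()
    result = []
    for node in graph:
        if node in seen:
            continue
        comp, stack = set(), [node]
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(graph[n] - comp)
        seen |= comp
        result.append(comp)
    return result

docs = [
    "Alice founded Acme in Boston.",
    "Acme later hired Bob.",
    "Carol lives in Denver.",
]
graph = build_graph(docs)
# One placeholder "summary" per community, pre-generated at index time.
summaries = [" / ".join(sorted(c)) for c in communities(graph)]
```

The key structural point survives even in the toy version: summaries are produced per community at index time, before any user query arrives.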

Below is a test result from a user abroad. The left side shows a more general response, while the right side, using graph-based search, provides a more precise answer.

The following image is an excerpt from the GraphRAG paper, which I found to be the clearest explanation — far more understandable than secondary interpretations. For these open-source projects, it’s recommended to read the original paper, as it usually provides the most straightforward and accurate explanation, closely aligned with the development team’s intent.

The community detection algorithm in GraphRAG (specifically the Leiden algorithm mentioned in the paper) groups entities into communities based on the relationships and hierarchical structure of the knowledge graph (different colors represent different communities in the figure below). An LLM then pre-summarizes each small community.

When a user query is made, the summaries of relevant communities are organized in parallel and then further summarized to produce a final global answer.

“First using each community summary to answer the query independently and in parallel, then summarizing all relevant partial answers into a final global answer.”
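The quoted map-reduce step can be illustrated with a toy sketch (no real LLM involved): word overlap stands in for the LLM rating each community summary's relevance to the query, and string concatenation stands in for the final reduce summarization.

```python
from concurrent.futures import ThreadPoolExecutor

COMMUNITY_SUMMARIES = [
    "Community 0: recurring theme of supply-chain risk across vendors.",
    "Community 1: entities related to data-privacy regulation.",
    "Community 2: sports trivia unrelated to this corpus.",
]

def map_step(summary: str, query: str) -> tuple[int, str]:
    # Toy relevance score: word overlap stands in for an LLM that answers
    # the query from one community summary and rates its own helpfulness.
    words = set(summary.lower().replace(".", " ").replace(":", " ").split())
    score = sum(w in words for w in query.lower().split())
    return score, summary

def global_answer(query: str) -> str:
    # Map: query every community summary independently and in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda s: map_step(s, query),
                                 COMMUNITY_SUMMARIES))
    # Reduce: fold the relevant partial answers into one global answer.
    relevant = [s for score, s in sorted(partials, reverse=True) if score > 0]
    return " ".join(relevant)
```

Irrelevant communities score zero in the map step and never reach the reduce step, which is how the method stays focused despite touching the whole corpus.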

Community Hierarchy refers to the hierarchical community structure formed by entities within a knowledge graph, reflecting the relationships and organization among entities. Typically, it includes:

  • Top-level communities: Representing the broadest categories or themes (root communities at level 0).
  • Mid-level communities: More specific sub-categories or sub-themes (sub-communities at level 1).
  • Bottom-level communities: The most granular classifications, usually directly containing specific entities.
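As a concrete (and entirely hypothetical) illustration of those three levels, the hierarchy can be modeled as a tree of communities where only the bottom level holds individual entities:

```python
from dataclasses import dataclass, field

@dataclass
class Community:
    level: int
    summary: str
    entities: list[str] = field(default_factory=list)
    children: list["Community"] = field(default_factory=list)

# Hypothetical hierarchy: a level-0 root, a level-1 sub-community,
# and level-2 leaves that directly contain specific entities.
root = Community(0, "Herbal medicine corpus", children=[
    Community(1, "Therapeutic herbs", children=[
        Community(2, "Calming herbs", entities=["chamomile", "lavender"]),
        Community(2, "Stimulants", entities=["ginseng"]),
    ]),
])

def leaf_entities(c: Community) -> list[str]:
    # Collect entities from the most granular (bottom-level) communities.
    if not c.children:
        return c.entities
    return [e for child in c.children for e in leaf_entities(child)]
```

Queries can then be answered at whichever level of summary granularity fits the question, from root-level themes down to single entities.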

GraphRAG supports auto-prompt generation and can calculate how many tokens the prompt will occupy, ensuring it stays within the token limit. Users can also manually create prompts.
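That budget check can be approximated with a simple character heuristic; note that GraphRAG itself counts tokens with a real tokenizer (tiktoken), so the numbers below are only illustrative.

```python
def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per English token.
    # GraphRAG uses an actual tokenizer for this accounting.
    return max(1, len(text) // 4)

def fits_context(prompt: str, context_window: int = 8192,
                 reserved_for_output: int = 1024) -> bool:
    # Leave headroom for the model's answer inside the context window.
    return rough_token_count(prompt) <= context_window - reserved_for_output
```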

The Local Search method combines structured data from the knowledge graph with unstructured data from the input document during a query, enhancing LLM context with relevant entity information. This approach is particularly suited for answering questions that require understanding specific entities mentioned in the input document (e.g., “What are the therapeutic properties of chamomile?”).
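A toy sketch of that combination, using a hypothetical in-memory fact store and chunk list (the real local search retrieves from the indexed output via embeddings):

```python
# Hypothetical structured facts (knowledge-graph edges) and raw text chunks.
ENTITY_FACTS = {
    "chamomile": ["chamomile -[USED_FOR]-> relaxation",
                  "chamomile -[TYPE_OF]-> herb"],
    "ginseng": ["ginseng -[USED_FOR]-> energy"],
}
TEXT_CHUNKS = [
    "Chamomile tea is brewed from dried flowers and is mildly sedative.",
    "Ginseng is popular in East Asian cuisine.",
]

def build_local_context(query: str) -> str:
    q = query.lower()
    mentioned = [e for e in ENTITY_FACTS if e in q]
    # Structured side: graph facts about entities named in the query.
    facts = [f for e in mentioned for f in ENTITY_FACTS[e]]
    # Unstructured side: source chunks mentioning the same entities.
    chunks = [c for c in TEXT_CHUNKS if any(e in c.lower() for e in mentioned)]
    return "\n".join(facts + chunks)
```

The assembled context (facts plus chunks) is what gets handed to the LLM along with the question, so the answer is grounded in both the graph and the source text.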

The Question Generation method combines structured data from the knowledge graph with unstructured data in the input document to generate candidate questions related to specific entities, working similarly to local search.

For those interested in further exploring GraphRAG, here are some relevant resources I found:

Current User Feedback

Here’s some feedback from users who have already deployed GraphRAG:

  • Creating a reliable knowledge graph is inherently challenging, and maintaining and iterating on the graph is resource-intensive. Token consumption also rises sharply, which can slow the LLM's responses.
  • “GraphRAG has a lot of pitfalls. The local search and global search response effectiveness is poor.”
  • “At least a 70B parameter LLM is needed; smaller models don’t work well, and even then, local models can struggle.”
  • “GraphRAG currently only supports .txt files, so you need to convert PDFs to .txt first. The knowledge graph is extracted using GraphRAG, not other triple extraction tools. After extraction, it can be visualized using Neo4j.”
  • After uploading and indexing the .txt file, GraphRAG generates structured files for the knowledge graph, which can then be queried with Neo4j and RAG tooling (e.g., LangChain with Neo4j).
  • Here’s how the code is executed:
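The original code listing did not survive extraction. For reference, a typical run of the GraphRAG CLI as of its mid-2024 releases looked roughly like the following; treat the exact flags as version-dependent and check the current documentation:

```shell
pip install graphrag

# Workspace layout: GraphRAG reads plain-text files from ./ragtest/input
mkdir -p ./ragtest/input
cp book.txt ./ragtest/input/

# Scaffold settings.yaml and .env (add your API key to .env)
python -m graphrag.index --init --root ./ragtest

# Build the knowledge graph and community summaries (token-intensive)
python -m graphrag.index --root ./ragtest

# Global search: corpus-level questions answered via community summaries
python -m graphrag.query --root ./ragtest --method global \
    "What are the main themes in the dataset?"

# Local search: entity-specific questions
python -m graphrag.query --root ./ragtest --method local \
    "What are the therapeutic properties of chamomile?"
```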

Conclusion

These are the main points about the GraphRAG project. In the future, we can expect frameworks that are even more efficient and user-friendly. For now, I’ll continue to watch developments and see if new versions or adaptations make deployment more practical.

Yesterday, a developer in China released a modified version of GraphRAG that is reportedly more user-friendly; it may be worth checking out.

shot by pamperherself


pamperherself

AI blogger focusing on RAG, prompting, and agentic workflows. 📌 Occasionally sharing fashion brands, thoughts, books, or movies 📍 Based in Beijing