Learnings in Q1 ’24 about Knowledge Graphs and RAG

Chia Jeng Yang
3 min read · Apr 28, 2024

At WhyHow.AI, we build tooling for knowledge graphs and structured knowledge representation within your existing RAG pipelines. Here are a few takeaways on how we have been thinking about RAG over the past quarter.

Why the market for RAG is huge

  1. Most white-collar work is information retrieval. Most workflows can be described as retrieving information from someone or somewhere, transforming it in a way that reflects the requirements of the job (e.g. marketing vs. accounting vs. legal), performing some basic reasoning over it, and providing an answer or a report. In fact, a senior manager’s job is mostly context injection: inserting the right piece of information in the right place at the right time. This work also has a structure to it.
  2. Most information retrieval can be documented as a workflow, and information retrieval is improved by structured knowledge representation. This structure can take the form of how the data is arranged, the steps by which the information is retrieved, and how to react in edge cases where the process fails.
  3. RAG currently faces issues with information retrieval because of the lack of granular tooling for structured knowledge representation (something Garry Tan of YC points out). This is the gap that WhyHow.AI and other companies are actively building tools for.
  4. A specific area of interest of mine is workflow tooling that lets the non-technical domain expert and the RAG/Agent developer work together to inject context and translate business workflows into RAG systems.
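
To make the idea of a documented retrieval workflow concrete, here is a minimal Python sketch. It is an illustration only, not WhyHow.AI tooling: the `Step` class, the `run_workflow` helper, and the toy `POLICIES` data are all hypothetical names we made up for this example. It shows the three kinds of structure mentioned above: how the data is arranged, the order of the steps, and what happens in an edge case where a step fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical toy data store: how the data is arranged.
POLICIES = {"refund": "Refunds are issued within 14 days."}


@dataclass
class Step:
    """One named step of a workflow, with an optional edge-case handler."""
    name: str
    run: Callable[[dict], dict]
    on_failure: Optional[Callable[[dict, Exception], dict]] = None


def run_workflow(steps: list, context: dict) -> dict:
    """Run the steps in order, passing the context from one to the next."""
    for step in steps:
        try:
            context = step.run(context)
        except Exception as exc:
            if step.on_failure is None:
                raise  # no documented reaction for this edge case
            context = step.on_failure(context, exc)
    return context


def retrieve(ctx: dict) -> dict:
    # Raises KeyError when the topic is not covered by the data store.
    return {**ctx, "source": POLICIES[ctx["topic"]]}


def escalate(ctx: dict, exc: Exception) -> dict:
    # Edge case: nothing retrieved, so hand over to a person.
    return {**ctx, "source": "No policy found; escalate to a domain expert."}


def report(ctx: dict) -> dict:
    return {**ctx, "answer": f"[{ctx['topic']}] {ctx['source']}"}


WORKFLOW = [
    Step("retrieve", retrieve, on_failure=escalate),
    Step("report", report),
]
```

Calling `run_workflow(WORKFLOW, {"topic": "refund"})` fills in an `answer` from the stored policy, while an unknown topic such as `"shipping"` takes the escalation path instead of failing silently. In this sketch, the domain expert would own the content of `POLICIES` and the escalation rule, and the developer would own the plumbing.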

RAG is a process, not a system

  1. Since RAG is an information retrieval workflow, each step of the process is nuanced and reflects the complexity of that particular workflow, just as each company and individual may have their own process for performing similar workflows.
  2. The goal is therefore to have more modular tools for granular control and context injection, working alongside more and more intelligent reasoning foundation models.
  3. Any RAG system that claims to work out of the box can only do so for undifferentiated information workflows. This implies that the underlying workflows are simple, the underlying data is simple, the answers do not need to be perfectly precise or exhaustive, or a tremendous amount of work went into capturing every single edge case related to the workflow (unlikely given the nascent state of AI today). This is not to say that out-of-the-box RAG systems cannot work. Rather, for them to work, the specific information retrieval process needs to be extremely well defined. It then behooves the RAG designer to think about how to group and scope the different types of likely information requests and workflows, and tackle them in batches.
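
One way to picture grouping and scoping information requests is a small router that assigns each incoming question to a named, well-defined workflow and sets aside anything it does not recognise. The sketch below is a deliberately naive keyword matcher under our own assumptions: the scope names, the keyword sets, and the `route` and `group_requests` functions are hypothetical, and a real system would likely use something stronger than keyword overlap.

```python
from collections import defaultdict

# Hypothetical scopes: each maps a workflow name to trigger keywords.
SCOPES = {
    "contract_lookup": {"clause", "contract", "termination"},
    "invoice_lookup": {"invoice", "payment", "overdue"},
}


def route(question: str) -> str:
    """Return the scope whose keywords overlap most with the question."""
    words = set(question.lower().replace("?", " ").split())
    best, best_hits = "out_of_scope", 0
    for name, keywords in SCOPES.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best


def group_requests(questions: list) -> dict:
    """Batch questions by scope so each batch can be handled by one workflow."""
    batches = defaultdict(list)
    for question in questions:
        batches[route(question)].append(question)
    return dict(batches)
```

For example, `group_requests(["Which invoice is overdue?", "What is the termination clause?", "What is the weather today?"])` puts the first question under `invoice_lookup`, the second under `contract_lookup`, and the third under `out_of_scope`. The `out_of_scope` batch is the useful part: it makes explicit which requests the system was never designed to answer.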

Minimum Viable Graphs as the future of GraphRAG

  1. We have written about the concept of small graphs and the ‘minimum viable graph’ in more detail here.
  2. Given how far LLMs have advanced, we should view knowledge graphs as tools for sharpening semantic focus, not just as aggregators of data. The resulting small graphs do not need to be fully complete, because LLMs bring their own understanding of semantics.
  3. This means that in many cases, it is not necessary to create a perfectly described version of the world. The LLM is able to take different pieces of structured data and add its own understanding on top of them.
  4. Small graphs are use-case-specific graphs that capture the types of data that the LLM in your current RAG system fails to navigate in a clean, deterministic way. Check out the different types of graphs one can create here.
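
As a rough illustration of what a small, use-case-specific graph could look like in code, the sketch below stores a handful of triples, pulls out the ones within a couple of hops of any entity named in the question, and formats them as plain-text facts for a prompt. Everything here is an assumption made for the example: the triples, the `graph_context` and `build_prompt` functions, and the simple substring match used to spot entities. It is not the WhyHow.AI SDK, and no LLM is called.

```python
# Hypothetical small graph: (subject, predicate, object) triples.
TRIPLES = [
    ("Acme Corp", "has_contract", "MSA-2023"),
    ("MSA-2023", "termination_notice", "90 days"),
    ("MSA-2023", "governed_by", "New York law"),
    ("Globex", "has_contract", "NDA-7"),
]


def graph_context(question: str, triples: list, hops: int = 2) -> str:
    """Collect triples within `hops` of entities mentioned in the question."""
    entities = {e for s, _, o in triples for e in (s, o)}
    # Naive entity spotting: case-insensitive substring match.
    frontier = {e for e in entities if e.lower() in question.lower()}
    selected, seen = [], set()
    for _ in range(hops):
        next_frontier = set()
        for triple in triples:
            s, _, o = triple
            if (s in frontier or o in frontier) and triple not in seen:
                seen.add(triple)
                selected.append(triple)
                next_frontier.update((s, o))
        frontier = next_frontier
    return "\n".join(f"{s} {p.replace('_', ' ')} {o}" for s, p, o in selected)


def build_prompt(question: str, triples: list) -> str:
    """Place the selected facts ahead of the question for the LLM."""
    return f"Facts:\n{graph_context(question, triples)}\n\nQuestion: {question}"
```

For the question "What notice must Acme Corp give to terminate?", the context contains the three Acme-related facts and leaves out the unrelated Globex triple. The graph is far from a complete description of Acme's contracts, but it pins down the facts the retrieval step needs to return deterministically, and the LLM supplies the rest of the semantics.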

WhyHow.AI KG tools currently in closed beta:

WhyHow.AI is building tools to help developers bring more determinism and control to their RAG pipelines using graph structures. If you’re thinking about incorporating, in the process of incorporating, or have already incorporated knowledge graphs in RAG for accuracy, memory and determinism, we’d love to chat at team@whyhow.ai, or follow our newsletter at WhyHow.AI. Join our discussions about rules, determinism and knowledge graphs in RAG on our newly created Discord.