movchinar
Feedback Intelligence
3 min read · Jun 27, 2024


(Part 2) The RAG Onion: Layered Analysis of System Failures

In the previous article, we traced the development of Root Cause Analysis (RCA), i.e., from traditional software RCA to ML-enabled RCA methods to RCA for AI. Now that we are in the generative AI world, let’s see how we can implement RCA for GenAI… aka LLMs.

LLMs can be applied using any combination of the following:

  • RAG
  • Fine-tuning
  • Prompt engineering

Today, let’s dive into the RAG approach:

🧠 What is RAG? Retrieval-Augmented Generation (RAG) is a technique in natural language processing that combines information retrieval with text generation.

Here’s a brief overview of how it works:

  • Retrieval: The system searches a large database or knowledge base (context) to find information relevant to a given query or task.
  • Augmentation: The retrieved information is then used to augment or enhance the input to the language model.
  • Generation: Finally, the language model generates a response based on both the original query and the retrieved information.

Basically, it’s a way to improve LLM generation by grounding it in external knowledge sources, which aims to reduce hallucinations and allows the base LLM to access more up-to-date or domain-specific information.
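To make those three steps concrete, here is a minimal, self-contained sketch of the retrieve → augment → generate loop. The tiny in-memory knowledge base, the word-overlap scoring, and the call_llm stub are illustrative assumptions, not any particular framework’s API.

```python
# Minimal RAG sketch: retrieve -> augment -> generate.
# The knowledge base and the call_llm stub below are illustrative stand-ins.

KNOWLEDGE_BASE = [
    "RAG combines information retrieval with text generation.",
    "Root Cause Analysis (RCA) traces a failure back to its underlying cause.",
    "Hallucinations are outputs that are not grounded in the retrieved context.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (stand-in for a vector store)."""
    q_words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def augment(query: str, docs: list[str]) -> str:
    """Build a grounded prompt from the query and the retrieved context."""
    context = "\n".join(f"- {d}" for d in docs)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

def call_llm(prompt: str) -> str:
    """Placeholder for whatever base LLM you use (hosted API or local model)."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

def rag_answer(query: str) -> str:
    docs = retrieve(query)          # Retrieval
    prompt = augment(query, docs)   # Augmentation
    return call_llm(prompt)         # Generation

print(rag_answer("What is RAG?"))
```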

Now that we know about RAG, let’s talk about where RCA plays a role.

🧅 You can think of RCA for RAG like this…

There are many layers that need to be peeled back. RCA for RAG makes sense of this: it allows AI teams to identify and resolve issues specific to this hybrid approach, which combines large language models with external knowledge retrieval.

This process is as follows (a rough code sketch follows the list):

  • Identify the issue (e.g., irrelevant retrievals, factual inconsistencies)
  • Collect data (queries, retrieved documents, model outputs)
  • Analyze each component: retriever, generator, and their interaction
  • Investigate discrepancies between retrieved info and generated content
  • Identify root causes in retrieval or generation steps
  • Implement and verify solutions
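As a rough illustration of the collect/analyze/investigate steps, the sketch below logs each interaction as a trace and flags two common symptoms: retrieved documents that barely overlap with the query, and answers that are weakly supported by the retrieved context. The overlap heuristic and the thresholds are placeholders for illustration, not Feedback Intelligence’s actual metrics.

```python
from dataclasses import dataclass

@dataclass
class RAGTrace:
    """One logged interaction: the raw material for root cause analysis."""
    query: str
    retrieved_docs: list[str]
    answer: str

def word_overlap(a: str, b: str) -> float:
    """Crude overlap score in [0, 1]; a real system would use embeddings or an LLM judge."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa), 1)

def diagnose(trace: RAGTrace,
             retrieval_threshold: float = 0.2,
             grounding_threshold: float = 0.3) -> list[str]:
    """Return suspected root causes for a single trace."""
    issues = []
    # Analyze the retriever: did any document actually match the query?
    if all(word_overlap(trace.query, d) < retrieval_threshold for d in trace.retrieved_docs):
        issues.append("retrieval: documents look irrelevant to the query")
    # Analyze the retriever/generator interaction: is the answer grounded in the docs?
    if word_overlap(trace.answer, " ".join(trace.retrieved_docs)) < grounding_threshold:
        issues.append("generation: answer is weakly supported by the retrieved context")
    return issues

trace = RAGTrace(
    query="What causes hallucinations in RAG systems?",
    retrieved_docs=["Quarterly revenue grew by 12% year over year."],
    answer="Hallucinations happen when the model ignores the retrieved context.",
)
print(diagnose(trace))  # Flags both a retrieval and a grounding problem for this trace.
```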

It sounds simple, right? We wish it was… That’s why Feedback Intelligence exists 🚀

The key throughout this process is to take the following into account (a small scoring sketch follows the list):

  • Retrieval accuracy and relevance
  • Integration of retrieved information into generated responses
  • Consistency between retrieved facts and model outputs
  • Query formulation effectiveness
  • Overall response quality and factual correctness
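One lightweight way to keep track of these dimensions is to score each of them per interaction and average them over a batch, so you can see where the system is weakest. The field names and the 0–1 scale below are assumptions for illustration, not Feedback Intelligence’s actual metrics.

```python
from dataclasses import dataclass, asdict
from statistics import mean

@dataclass
class RAGQualityScores:
    """Per-interaction scores for the dimensions above, each on a 0-1 scale (illustrative)."""
    retrieval_relevance: float   # Were the retrieved documents on-topic?
    integration: float           # Did the answer actually use the retrieved information?
    consistency: float           # Does the answer agree with the retrieved facts?
    query_effectiveness: float   # Did the formulated query capture the user's intent?
    response_quality: float      # Overall fluency and factual correctness

def aggregate(scores: list[RAGQualityScores]) -> dict[str, float]:
    """Average each dimension over a batch to surface the weakest link."""
    return {field: mean(getattr(s, field) for s in scores)
            for field in asdict(scores[0])}

batch = [
    RAGQualityScores(0.9, 0.4, 0.8, 0.7, 0.6),
    RAGQualityScores(0.8, 0.3, 0.9, 0.6, 0.5),
]
print(aggregate(batch))  # A persistently low dimension tells you where to dig deeper.
```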

Even though RAG is relatively new in the AI world, there are common techniques that are important to implement for RCA purposes (a small fact-checking sketch follows the list):

  • Query analysis and optimization
  • Retrieval system evaluation
  • Document relevance assessment
  • Knowledge integration analysis
  • Fact-checking against retrieved information
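As a small example of the last technique, the sketch below runs a naive sentence-level fact-check of an answer against the retrieved documents. The word-overlap support score is a deliberately simple assumption; a production system would use an NLI model or an LLM judge to decide entailment.

```python
import re

def fact_check(answer: str, retrieved_docs: list[str], support_threshold: float = 0.5) -> list[str]:
    """Return answer sentences that have weak support in the retrieved documents."""
    context_words = set(" ".join(retrieved_docs).lower().split())
    unsupported = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = set(sentence.lower().split())
        if not words:
            continue
        support = len(words & context_words) / len(words)  # Fraction of the sentence seen in context
        if support < support_threshold:
            unsupported.append(sentence)
    return unsupported

docs = ["The retriever returned three documents about vector databases."]
answer = "The retriever found documents about vector databases. The system was deployed in 2019."
print(fact_check(answer, docs))  # -> ['The system was deployed in 2019.']
```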

At Feedback Intelligence (FI), we have implemented novel metrics, built on an orchestration of LLMs and unsupervised learning, that help AI teams find the root cause of RAG system failures.

FI provides actionable insights into negative feedback signals (aka issues), pinpointing what’s wrong in the system, e.g., there is a knowledge hole, the fusion parameters are not optimal, or re-ranking is the issue.

In the upcoming articles, we will present some examples of RAG systems equipped with Feedback Intelligence.

See you next time, Intelligentsias 😎

Co-author: Haig Douzdjian
