Evaluation of RAG (Retrieval-Augmented Generation) performance (Part 5 of RAG Series)
Quantifying the accuracy and relevance of the RAG output
This is part 5 of the “Retrieval-Augmented Generation (RAG) — Basics to Advanced Series”. Links to the other blogs in the series are at the bottom of this blog. It builds on part 1 (RAG Basics), part 2 (Chunking), part 3 (Embedding) and part 4 (Vector Databases and Vector Libraries). In this blog, we will focus on the evaluation of RAG.
Before getting into the details of this blog, I would like to conclude the story of the remaining components of the RAG architecture/framework that we have been referencing in the past blogs. Picking up from the last blog: once the retrieval engine has retrieved data from the vector database, the retrieved information is sent to the LLM through the RAG interface. The LLM uses the retrieved information to “generate” the final output (this is the “Generation” part of “Retrieval-Augmented Generation”), which is then sent back to the user through the RAG interface (highlighted in blue).
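To make this last step concrete, here is a minimal Python sketch of the flow described above: retrieved chunks are combined with the user's question into a prompt, and the LLM generates the final answer from it. All names here (build_prompt, rag_answer, toy_retrieve, toy_generate) are hypothetical and only for illustration. The retriever and the LLM are simple stand-ins, not a real vector database or model API, and the prompt wording is just one possible choice.

```python
from typing import Callable, List


def build_prompt(question: str, retrieved_chunks: List[str]) -> str:
    """Combine the retrieved chunks and the user question into one prompt."""
    context = "\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def rag_answer(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve -> augment the prompt -> generate the final output."""
    chunks = retrieve(question, top_k)       # retrieval engine + vector database
    prompt = build_prompt(question, chunks)  # augmentation step
    return generate(prompt)                  # the "Generation" part of RAG


# --- Stand-ins (assumptions): swap in a real retriever and LLM client ---
def toy_retrieve(question: str, top_k: int) -> List[str]:
    """Rank a tiny in-memory corpus by word overlap with the question."""
    corpus = [
        "Chunks are embedded and stored in a vector database.",
        "The retrieval engine returns the chunks closest to the query.",
        "The LLM generates the final output from the retrieved chunks.",
    ]
    words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda chunk: len(words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def toy_generate(prompt: str) -> str:
    """Placeholder for an LLM call; only reports the prompt size."""
    return f"(LLM output for a prompt of {len(prompt)} characters)"


if __name__ == "__main__":
    print(rag_answer("How does the LLM use the retrieved chunks?",
                     toy_retrieve, toy_generate, top_k=2))
```

Passing the retriever and the generator in as functions keeps the two halves separate, which is convenient for the rest of this blog: retrieval quality and generation quality can then be evaluated independently.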