Build an End-to-End RAG Pipeline with Model Monitoring Pipeline

Stan
7 min read · Jun 9, 2024

Hong T., Jing W. and Jing Z.

Photo is courtesy of unsplash.com.

Previously, we shared articles on a reranker-based RAG pipeline, query augmentation and reranker filtering, RAGAS-based model evaluation, and ground truth dataset generation. In this article, we demonstrate a use case that puts all of these parts together into an end-to-end RAG pipeline. It includes both the RAG pipelines and a model monitoring pipeline (ground truth generation and LLM metrics generated by RAGAS). The pipeline evaluation is modular and simplified. The next step is MLOps cloud deployment and automation of pipeline monitoring. Project teams can use the metrics to decide whether to fine-tune, retrain, or decommission a model.

In the architecture diagram of the pipeline, user queries and documents are fed into the RAG pipeline. A subset of the documents can be used to automatically generate a ground truth dataset for model monitoring. Human labels can also be included in the curated ground truth dataset.
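To make this architecture concrete, here is a minimal, self-contained sketch of the data flow. It is not our production code: the lexical scorers are toy stand-ins for an embedding model and a reranker, `hit_rate` is a simple stand-in for the RAGAS metrics, and every name in it (`retrieve`, `rerank`, `DOCS`, `GROUND_TRUTH`, and so on) is hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    # Lowercase and strip basic punctuation; a toy tokenizer for illustration.
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t]


def cosine(a, b):
    # Bag-of-words cosine similarity; stand-in for embedding similarity.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def retrieve(query, docs, k):
    # First stage: rank all documents by similarity to the query, keep top k.
    return sorted(docs, key=lambda d: cosine(query, d), reverse=True)[:k]


def rerank(query, candidates, k):
    # Second stage: rescore the candidates (assumed scorer: fraction of
    # query tokens covered by the document) and keep the top k.
    q = set(tokenize(query))

    def score(doc):
        return len(q & set(tokenize(doc))) / len(q) if q else 0.0

    return sorted(candidates, key=score, reverse=True)[:k]


def hit_rate(pipeline, ground_truth):
    # Monitoring metric (assumed): share of questions whose reference
    # document appears in the pipeline's output.
    hits = sum(1 for question, expected in ground_truth if expected in pipeline(question))
    return hits / len(ground_truth)


# Hypothetical documents fed into the pipeline.
DOCS = [
    "RAGAS computes metrics such as faithfulness and answer relevancy.",
    "A reranker reorders retrieved passages by relevance to the query.",
    "Ground truth datasets pair questions with reference answers.",
]

# Hypothetical ground truth built from a subset of the documents;
# human-labeled pairs could be appended to the same list.
GROUND_TRUTH = [
    ("What does a reranker do to retrieved passages?", DOCS[1]),
    ("Which metrics does RAGAS compute?", DOCS[0]),
]


def embedding_pipeline(query):
    return retrieve(query, DOCS, k=1)


def rerank_pipeline(query):
    return rerank(query, retrieve(query, DOCS, k=3), k=1)


if __name__ == "__main__":
    print("embedding-based:", hit_rate(embedding_pipeline, GROUND_TRUTH))
    print("reranker-based:", hit_rate(rerank_pipeline, GROUND_TRUTH))
```

The point of the sketch is the shape of the system rather than the scores: both pipelines share the same inputs and the same ground truth dataset, so any metric computed by the monitoring step is directly comparable between them.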

In this article, we use metrics provided by RAGAS to compare the performance of a reranker-based RAG pipeline and a regular embedding-based RAG pipeline. Our current use…


Stan

A director data scientist working in a tech start-up who is passionate about making a positive impact on people around him