Vertex AI RAG Engine goes GA: List of Resources
Vertex AI RAG Engine (formerly known as RAG API) is now generally available (GA).
What is Vertex AI RAG Engine
It is a fully managed service that helps you build and deploy RAG implementations with your own data and methods.
You can think of Vertex AI RAG Engine as a service that helps you choose from:
- Models: LLMs in Vertex AI Model Garden, including Google’s Gemini, Llama, and Claude, plus embedding models from Google and others in Model Garden.
- Vector Databases: Built-in RagManagedDb (default), Vector Search, Vertex AI Feature Store, Pinecone, Weaviate.
- Data Sources: Import files from Google Drive, Google Cloud Storage, Slack, Jira, SharePoint.
- File Formats: Google Docs, Slides, Drawings, HTML, JSON, Markdown, PPTX, DOCX, PDF, Text.
- Fine-tuning RAG transformations: parameters such as chunk_size and chunk_overlap, which control how documents are split into chunks.
RAG Corpus: This is an index, a collection of your documents. You manage it, including file uploads, imports, and deletions, via an API.
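To make the chunk_size and chunk_overlap settings listed above a bit more concrete, here is a minimal sketch of overlapping chunking in plain Python. It is not RAG Engine's actual implementation and does not call its API; the `chunk_tokens` helper and the word-level "tokens" are assumptions chosen purely for illustration.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of up to chunk_size tokens.

    Consecutive windows share chunk_overlap tokens. Hypothetical helper
    for illustration only, not part of any Vertex AI SDK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Words stand in for tokens here (an assumption for readability).
words = "a b c d e f g h i j".split()
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

A larger chunk_overlap repeats more context across neighbouring chunks, which can help retrieval when an answer straddles a chunk boundary, at the cost of a bigger index.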
How to use Vertex AI RAG Engine
Once you have set up the corpus of your documents, the question is how to use it.
There are a couple of interesting ways, and they demonstrate how well it connects into the ecosystem:
1. Vertex AI Search as a retriever: Once the corpus is built, simply use the Retriever API in your application.
2. Better still, if you are prompting Gemini, the easiest way to ground its responses is to specify Vertex AI RAG as a Tool in Gemini.
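Both options above follow the same retrieve-then-ground pattern, which can be sketched conceptually as follows. This is a toy, self-contained illustration and not the RAG Engine or Gemini API: the bag-of-words `embed` function stands in for a real embedding model, and `retrieve` and `grounded_prompt` are hypothetical helper names.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(corpus, query, top_k=2):
    """Return the top_k corpus entries most similar to the query."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:top_k]


def grounded_prompt(query, contexts):
    """Build a prompt that asks the model to answer from retrieved context."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"


corpus = [
    "RAG Engine manages a corpus of documents",
    "Gemini can use a retrieval tool for grounding",
    "Bananas are rich in potassium",
]
question = "how does gemini use grounding"
hits = retrieve(corpus, question, top_k=1)
prompt = grounded_prompt(question, hits)
print(prompt)
```

In the first option your application performs the retrieval step itself and decides what to do with the results; in the second, the retrieval is attached to the model as a tool so the grounding happens as part of the Gemini call.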
All of the above features are well captured in the announcement blog post:
Resources
- Official Documentation Page: https://cloud.google.com/vertex-ai/generative-ai/docs/rag-overview
- Quickstart: https://cloud.google.com/vertex-ai/generative-ai/docs/rag-quickstart
- RAG Engine Notebooks from Generative AI Repository: https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/rag-engine
- Sascha Heyer wrote one of the first articles on the RAG API. https://medium.com/google-cloud/google-cloud-rag-api-c7e3c9931b3e
- Mete Atamel wrote an article that helps you understand quickly what it does. https://medium.com/google-cloud/rag-api-powered-by-llamaindex-on-vertex-ai-ead985eab647
- Kamaljeet Singh has written a fantastic deep dive on Product Recommendations using Gemini 2.0. In it, he compares multiple Google Cloud offerings, such as LangChain on Vertex AI (which is still in Beta) and Vertex AI RAG Engine. https://medium.com/google-cloud/building-rag-for-product-recommendation-using-google-gemini-2-0-apis-9ecca5089ae2