Gen AI Grounding with Vertex AI LLM

Dineshbathla
Google Cloud - Community
4 min read · Mar 6, 2024

Grounding Overview

In generative AI, grounding is the ability to connect LLM output to verifiable sources of information. By giving an LLM access to specific data sources, grounding ties its output to that data and reduces the chance that the model invents content of its own. This is particularly important where accuracy and reliability matter, for instance in financial and health-related use cases.

Grounding in Vertex AI lets you use language models to generate content that is grounded in your own data corpus. This capability lets the model access information that goes beyond its training data. By linking to data stores within Vertex AI Search, the grounded model can produce more accurate, factual and relevant answers.

At the time of writing, grounding is available for the text-bison and text-unicorn models. It may be released for other models such as Gemini in the future (subject to change).

Grounding Benefits

  1. Reduces hallucinations, that is, cases where the model generates content that is not factual.
  2. Bases model responses on specific private data or information.
  3. Increases the trustworthiness and applicability of the generated content.

How to enable Grounding in Vertex AI (with example)

There are a few steps to follow to ground a model in Generative AI on Vertex AI, and we need to complete them one by one. These steps include creating a Vertex AI Search data store, enabling Enterprise edition, and linking the data store to our app in Vertex AI Search. The data store serves as the source for grounding text-bison or text-unicorn in Generative AI on Vertex AI.

Now, let's go through all the steps with an example.

Step-1

Creating a Vertex AI Search datastore: In this step we prepare data for ingestion into Vertex AI Search and create a datastore. In Vertex AI Search, click on datastore, create a new datastore, select Cloud Storage, and then select Unstructured documents. You can create a source of private data from unstructured documents such as PDFs. See the screenshot below for reference.

I am using a sample blood report in PDF format, as shown below.

Next, create a Search app and link the datastore we just created. Note down the datastore ID, as it will be needed in later steps.

Step-2

Enabling grounding for the text-bison model: Grounding is available for the text-bison and text-unicorn models; we will use text-bison here.

Go to Vertex AI Studio, select Language from the left-side menu, and click the text prompt option to create a new prompt, as shown below.

Select the text-bison model, as shown below.

Scroll down a bit, expand the advanced options, toggle the Enable Grounding option, and click Customize, as shown below.

Now, from the grounding source dropdown, select Vertex AI Search. Enter the Vertex AI Search data store path to your content. The path should follow this format:

projects/{project_id}/locations/global/collections/default_collection/dataStores/{data_store_id}

Replace project_id with your GCP project ID and data_store_id with the ID of the datastore we created in Step-1. The screenshot below is for reference.
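
Because a typo in this path is easy to make, here is a minimal Python sketch that fills in the format shown above. The function name build_data_store_path, the sample IDs, and the character check are my own illustrative assumptions, not part of Vertex AI; the check only catches obvious mistakes such as a leftover {placeholder} or a stray slash.

```python
import re

# Template copied from the path format shown above.
PATH_TEMPLATE = (
    "projects/{project_id}/locations/global/"
    "collections/default_collection/dataStores/{data_store_id}"
)

# Hypothetical sanity check (not an official rule): rejects empty values,
# slashes, whitespace, and unfilled "{...}" placeholders.
_SEGMENT = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.:-]*$")


def build_data_store_path(project_id: str, data_store_id: str) -> str:
    """Return the Vertex AI Search data store path for the given IDs."""
    for name, value in (("project_id", project_id), ("data_store_id", data_store_id)):
        if not _SEGMENT.match(value):
            raise ValueError(f"{name} is not a plain identifier: {value!r}")
    return PATH_TEMPLATE.format(project_id=project_id, data_store_id=data_store_id)


# Sample IDs are placeholders; substitute your own.
print(build_data_store_path("my-project", "blood-report-store"))
# prints: projects/my-project/locations/global/collections/default_collection/dataStores/blood-report-store
```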

Step-3

Enter the prompt and get responses: now let's try asking a few questions about the blood report and see the results.

Prompt-1 What is the RBC COUNT value in the report

You can see the response: it's 5.83, which you can verify against the report screenshot in Step-1.

Prompt-2 What is the MCV value in the report.

As you can see in the screenshot above, the response value is 64, which you can verify against the report screenshot in Step-1. The response also includes a “Grounding source”, as shown above.

We can also ask generic questions and see the result.

Prompt-3 What is RBC

As you can see, this is a generic LLM answer. It does not come from the report, which is why there is no grounding source.
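
The walkthrough above uses the console, but the same three prompts could probably also be sent from code. Below is a rough sketch using the Vertex AI SDK for Python (the google-cloud-aiplatform package) and its GroundingSource.VertexAISearch helper. The function name ask_grounded, the us-central1 region, and the generation parameters are my assumptions, not settings from this walkthrough. It also needs a configured GCP project with credentials, and the model name may need a version suffix.

```python
# The three prompts used in Step-3.
PROMPTS = [
    "What is the RBC COUNT value in the report",
    "What is the MCV value in the report.",
    "What is RBC",
]


def ask_grounded(prompt: str, project_id: str, data_store_id: str,
                 model_name: str = "text-bison"):
    """Send one prompt to the model, grounded on a Vertex AI Search datastore."""
    # Imported here so this file can be loaded without the SDK installed.
    import vertexai
    from vertexai.language_models import GroundingSource, TextGenerationModel

    # Assumption: us-central1 as the model region; adjust to your setup.
    vertexai.init(project=project_id, location="us-central1")
    model = TextGenerationModel.from_pretrained(model_name)

    # "global" matches the location segment of the data store path in Step-2.
    source = GroundingSource.VertexAISearch(
        data_store_id=data_store_id, location="global"
    )
    # Assumed generation settings; low temperature suits factual lookups.
    return model.predict(
        prompt,
        grounding_source=source,
        temperature=0.0,
        max_output_tokens=256,
    )
```

Calling ask_grounded(PROMPTS[0], "your-project-id", "your-datastore-id") returns a response object. Its text attribute holds the answer, and its grounding_metadata attribute holds the grounding information, which should correspond to the “Grounding source” shown in the console.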

In summary, you can use unstructured data files such as PDFs as a grounding source and get factual, grounded results.

Disclaimer: This is to inform readers that the views, thoughts, and opinions expressed in the text belong solely to the author, and not necessarily to the author’s employer, organization, committee or other group or individual.