How and why you use grounding with your language model

Google Cloud - Community · 6 min read · May 30, 2024

By: Luis Urena and Max Saltonstall

“Tell me about the first elephant on the moon,” may not seem like a serious generative AI model test, but silly questions can be extremely helpful when grounding your model.

Developers have a vested interest in producing more accurate and relevant responses from gen AI models, and sourcing model results from verifiable data helps ensure the model is useful. You can use grounding to constrain the data sources a model draws from for an answer, or to focus the model on a specific set of records it must base its responses on. Ultimately, grounding can help reduce the chances of hallucinations.

Grounding is a technique for feeding relevant information from external datasets to the model. This gives the model access to information it doesn’t otherwise know, such as recent developments or private, confidential data. With grounding, you can expect a lower risk of hallucinations and more relevant, up-to-date responses from the model.
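Conceptually, grounding amounts to retrieving relevant passages from a trusted corpus and handing them to the model together with the question. As a rough, hand-rolled sketch in Python (the helper name and example passage below are purely illustrative, not part of any Vertex AI API):

```python
def build_grounded_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user question into a single prompt."""
    context = "\n\n".join(passages)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# The passages would normally come from a retrieval step over your own corpus.
prompt = build_grounded_prompt(
    "How does BeyondCorp authorize requests?",
    ["BeyondCorp evaluates the context of each request before granting access."],
)
print(prompt)
```

Managed grounding features automate both the retrieval step and the prompt assembly, which is what the rest of this post relies on.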

Specific knowledge

In many scenarios, accuracy and relevance are critical. For example, let’s say we’re building a BeyondCorp helper application that can answer questions our developer and security teams have about Zero Trust security. To build the chatbot, we’ve collected years’ worth of whitepapers, beginning with the 2016 publication of BeyondCorp: Design to Deployment at Google. All of the documents are stored in a Vertex AI Agent Builder data store.
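If you prefer to script this setup rather than use the console, importing the whitepapers into an existing data store might look roughly like the sketch below, using the google-cloud-discoveryengine client library. The project ID, data store ID, and Cloud Storage path are illustrative assumptions:

```python
from google.cloud import discoveryengine_v1 as discoveryengine

# Illustrative identifiers; replace with your own project and data store.
PROJECT_ID = "my-project"
DATA_STORE_ID = "beyondcorp-docs"

client = discoveryengine.DocumentServiceClient()
parent = client.branch_path(
    project=PROJECT_ID,
    location="global",
    data_store=DATA_STORE_ID,
    branch="default_branch",
)

# Import the collected whitepapers (PDFs staged in Cloud Storage).
operation = client.import_documents(
    request=discoveryengine.ImportDocumentsRequest(
        parent=parent,
        gcs_source=discoveryengine.GcsSource(
            input_uris=["gs://my-bucket/beyondcorp-whitepapers/*.pdf"],
            data_schema="content",
        ),
        reconcile_mode=discoveryengine.ImportDocumentsRequest.ReconcileMode.INCREMENTAL,
    )
)
operation.result()  # Block until the import finishes.
```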

This data store represents the source of truth for the bot, and the foundation model is expected to generate results based on the documents it contains. After all, factuality is critical for this application; otherwise, the application may hallucinate and misinform the user. Yet when testing the first version of our hypothetical app, it’s clear that some of its responses can’t have come from the BeyondCorp dataset: they’re not true at all.

Baseline results

Before we provide any documentation, we want to establish a baseline and determine how well the foundation model generates a response.
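With the Vertex AI SDK for Python, that baseline is a single ungrounded generate_content call. The project, region, and model name below are assumptions for illustration:

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Illustrative project and region.
vertexai.init(project="my-project", location="us-central1")

model = GenerativeModel("gemini-1.0-pro")
response = model.generate_content("Tell me about the first elephant on the moon")
print(response.text)
```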

Elephants on the moon?

From the above example, it’s clear that the model generated a fabricated response since we’re pretty sure elephants have not yet been to the moon. This is also a problem for our BeyondCorp chat application: The application generated a response that isn’t related to BeyondCorp at all.

So how can we address these hallucinations? Enthusiastic and experienced foundation model developers could build a Retrieval Augmented Generation (RAG) system to ground our application’s responses, but this may take months or even up to a year to finish. Luckily, our application is being built on Vertex AI, which has a built-in feature to ground foundation model responses based on a dataset you provide.

Adding grounding

Let’s attempt to resolve this problem by enabling grounding and defining the source as the Vertex AI Agent Builder data store containing a single document, “BeyondCorp: Design to Deployment at Google”, so we can assess whether grounding helps generate more accurate responses.
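In the SDK, this roughly means attaching a retrieval tool that points at the data store. The data store resource name below is an illustrative assumption, and in older SDK versions the grounding module lives under vertexai.preview.generative_models:

```python
from vertexai.generative_models import GenerativeModel, Tool, grounding

# Full resource name of the Agent Builder (Vertex AI Search) data store; illustrative.
DATA_STORE = (
    "projects/my-project/locations/global/"
    "collections/default_collection/dataStores/beyondcorp-docs"
)

grounding_tool = Tool.from_retrieval(
    grounding.Retrieval(grounding.VertexAISearch(datastore=DATA_STORE))
)

model = GenerativeModel("gemini-1.0-pro")
response = model.generate_content(
    "Tell me about the first elephant on the moon",
    tools=[grounding_tool],
)
print(response.text)
```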

After enabling grounding and supplying the same prompt, the foundation model generates a response that asks us to rephrase the question.

This may be considered an appropriate response — after all, the foundation model did not hallucinate a response about Luna — but further testing is necessary to ensure that grounding is helping us resolve the hallucination issue.

Much better!

Improve response accuracy

How can grounding help when the foundation model receives a prompt that actually is related to BeyondCorp? This is the core of our use case: it’s great that the application isn’t hallucinating about elephants on the moon, but we also need to ensure that it generates accurate responses in the subject area we care about. We’ll evaluate grounding’s effectiveness by comparing the foundation model’s responses with grounding disabled versus enabled.
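A small helper keeps this side-by-side check repeatable; it assumes the model and grounding_tool objects from the earlier sketch:

```python
from vertexai.generative_models import GenerativeModel, Tool


def compare_responses(model: GenerativeModel, grounding_tool: Tool, prompt: str) -> None:
    """Send the same prompt with and without the grounding tool and print both answers."""
    baseline = model.generate_content(prompt)
    grounded = model.generate_content(prompt, tools=[grounding_tool])
    print("Without grounding:\n", baseline.text)
    print("\nWith grounding:\n", grounded.text)


compare_responses(model, grounding_tool, "How did Google build BeyondCorp?")
```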

With grounding disabled, the prompt “How did Google build BeyondCorp?” results in a response that seems fairly accurate.

Despite the apparent accuracy, this response seems to be a mix of BeyondCorp principles (device attestation, network segmentation) and general security practices that should be followed even without a BeyondCorp model (multi-factor authentication, application security, user training).

Let’s enable grounding and compare the response:

Now we see more details, including the events that led up to Google developing BeyondCorp and its different components. Not only do we know that Google’s network is segmented, but we also learn that there is a layered approach to security: a perimeter firewall and a set of internal firewalls that protect the internal networks from each other, backed by intrusion detection systems that monitor for suspicious activity, encryption to protect data, and supplemented with access controls that enforce the principle of least privilege.

Greater detail

Let’s follow up on that response, but with grounding disabled, so we can assess response quality with more complex prompts. For example, the last response mentioned that “BeyondCorp is not without its challenges”, which may lead our users to want to learn more. Prompting the foundation model for more details generates the following response:

This isn’t entirely correct. It’s certainly true that BeyondCorp relies on planning and coordination, but it doesn’t “require users to authenticate themselves every time they access a resource”. Rather, BeyondCorp relies on the context of an end user’s request to ensure each request is authenticated and authorized, without placing undue burden on the user.

Rather than just enabling grounding this time, we will augment the application’s data store by importing new data. After all, our customers expect the latest information — especially for a question like this — and the data source is still just a single document published in 2016.

Luckily, in 2023 Google released a research paper, “BeyondCorp and the long tail of Zero Trust,” highlighting the most challenging use cases associated with implementing BeyondCorp. After importing that document into our data store and enabling grounding, the application can answer questions that rely on more recent data.
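Adding the new paper is another incremental import into the same data store; the Cloud Storage path is again an illustrative assumption, and the snippet reuses the client and parent from the earlier import sketch:

```python
# Import the 2023 paper alongside the existing document.
operation = client.import_documents(
    request=discoveryengine.ImportDocumentsRequest(
        parent=parent,
        gcs_source=discoveryengine.GcsSource(
            input_uris=["gs://my-bucket/beyondcorp-long-tail-of-zero-trust.pdf"],
            data_schema="content",
        ),
        # INCREMENTAL adds the new document without removing the existing one.
        reconcile_mode=discoveryengine.ImportDocumentsRequest.ReconcileMode.INCREMENTAL,
    )
)
operation.result()
```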

Now this is a response we can trust. We receive additional context around the challenges Google experienced, notably in terms of planning and coordination, without hallucinations about end users needing to authenticate themselves. Best of all, the response includes citations, so we know exactly where its claims originated.
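Programmatically, those citations ride along with the response as grounding metadata. The exact field layout varies between vertexai SDK versions, so the snippet below just prints the whole structure rather than assuming specific field names:

```python
# Inspect the grounding metadata attached to the first candidate of a grounded response.
metadata = response.candidates[0].grounding_metadata
print(metadata)  # Includes the retrieved sources backing the answer.
```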

Do it yourself

To learn more, explore the Integrate Search in Applications using Vertex AI Search lab on Cloud Skills Boost, where you’ll learn how to configure grounding for Google language models in Vertex AI Studio. You can also explore a notebook that helps you create and populate a Vertex AI Agent Builder data store, create an app connected to that data store, and submit queries.

To learn more about grounding, including retrieval augmented generation (RAG) techniques, we recommend you read:

  1. How to use Grounding for your LLMs with text embeddings
  2. RAGs powered by Google Search technology, Part 1
  3. RAGs powered by Google Search technology, Part 2
