Designing Data Store Hybrid Agents with Dialogflow CX & Vertex AI Agents

Gabriele Randelli
Google Cloud - Community
16 min read · Jun 28, 2024

Introduction

In the first article of this series we designed a purely intent-based agent with deterministic flows.

Deterministic agents are effective when a repeatable and auditable process is mandatory (think of an offer subscription, for example). Companies have invested heavily in these agents over the past years, so they favor re-using, at least partially, their current assets.

On the other hand, generative AI unleashes new opportunities for customer service, such as the possibility to create more natural and engaging conversations and to answer a broader range of topics without the need to explicitly structure them as intent-based flows. Such an approach is particularly appreciated in conversational commerce scenarios.

Dialogflow CX offers the possibility to get the best of both paradigms, combining intent-based and generative flows through built-in generative features: agent apps, generative fallback, generators and integration with data store agents. Within the scope of this article, we are going to address data store agents and generative fallback.

In particular, our goal is to:

  • Extend our car rental chatbot designed in the first article, mixing deterministic and generative flows by integrating data stores, thus designing a data store hybrid agent;
  • Enable generative fallback, to gracefully handle those situations where the user’s intent is not properly matched;
  • Provide some best practices on hybrid agents, mainly based on empirical experiences.

New expectations from conversational agents

The agent presented in our first article is a car rental bot conceived to handle car booking requests. It has been implemented with an intent-based paradigm, the essence of a purely deterministic approach: each flow is a state machine where the agent transitions in an orderly fashion from one page to another by detecting intents, collecting parameters and fulfilling the customer’s requests.

Let’s assume we would also like to support our beloved customers on generic inquiries, such as: what’s a luxury car, what manufacturers are available, what’s the maximum speed of a specific car model, and so on. These questions shift the role of agents from post-sales assistance to presales, a key activity in conversational commerce experiences.

We can recap these new emerging requirements as follows:

  • Questions like these may happen at any time, even in the middle of the booking process — the agent should intercept these conversational digressions and then return to the current page, without restarting the deterministic process from scratch every time;
  • Open questions pose a maintainability challenge in terms of data volumes and variety. Codifying each possible question with an intent-based approach is almost impossible and we need a more scalable approach;
  • Generating answers is fine, but controlling the final output is even more important — above all, we want to ground the answer in our company knowledge base, not in the LLM’s knowledge.

A purely deterministic approach cannot fulfill these needs with low effort. We need to shift our paradigm.

The Role of Data Stores

Generative AI fits these new requirements very well. In particular, this article focuses on integrating data store agents. Terminology-wise, keep in mind that data store agents are also known as chat apps and, at the time of writing, they are part of Vertex AI Agents, Google Cloud’s brand-new natural language understanding (NLU) platform built on LLMs to create agents. We’ll overview this platform in the third article of this series.

Data stores are collections of indexed websites and documents that reference your enterprise data. There are three main ingredients when adopting data stores for AI agents:

  • Grounding the customers’ answers in your enterprise data;
  • Generating a coherent answer with the underlying LLM’s capabilities;
  • Considering the past conversation between the user and the agent.

Data stores in Google Cloud Platform can be classified according to the following dimensions:

  • Data source: websites, BigQuery, Cloud Storage, 3rd party systems via connectors;
  • Data format: structured (e.g. a table or a FAQ) vs unstructured data (e.g. a PDF file);
  • Metadata availability;
  • Configuration: parsing, chunking, and so on.

It’s important to highlight a few aspects when leveraging the built-in integration provided by Dialogflow CX:

  • Do not confuse agent data stores, addressed in this article, with search data stores;
  • The only structured data store type supported is FAQ;
  • Apps with both chunked and non-chunked data stores are not supported.

The latter point is particularly important in our scenario. Since we’re going to use both an unstructured and a structured data store, and chunking is only supported for unstructured data stores, we need to disable chunking. Whenever any of the current limitations affects the accuracy of your implementation, my advice is to bypass the built-in integration and to directly call a search data store via the Vertex AI Search API.
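As a sketch of that escape hatch, the snippet below builds a request for the Vertex AI Search REST endpoint (the `servingConfigs/…:search` method). The project, location and data store IDs are placeholders, and authentication (an OAuth access token) is only hinted at in the comments:

```javascript
// Sketch: querying a search data store directly via the Vertex AI Search
// REST API, bypassing the Dialogflow CX built-in integration.
// Project, location and data store IDs below are illustrative placeholders.
function buildSearchRequest(projectId, location, dataStoreId, query) {
  const endpoint =
    `https://discoveryengine.googleapis.com/v1/projects/${projectId}` +
    `/locations/${location}/collections/default_collection` +
    `/dataStores/${dataStoreId}/servingConfigs/default_search:search`;
  return {
    endpoint,
    body: {
      query,       // the end-user question
      pageSize: 5, // limit the number of retrieved documents
    },
  };
}

// At runtime you would POST the body to the endpoint with an OAuth token:
// fetch(req.endpoint, { method: 'POST',
//   headers: { Authorization: `Bearer ${token}` },
//   body: JSON.stringify(req.body) })
const req = buildSearchRequest(
  'my-project', 'global', 'car-rental-kb', "what's the fastest car?"
);
```

This gives you full control over retrieval (filters, boosting, chunk selection) at the cost of assembling the final answer yourself.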

Hybrid Agent Main Concepts

Design

We are going to extend our car rental bot by adding three new components:

  • A structured data store, based on a FAQ CSV file, for specific questions already addressed and stored in your company knowledge base;
  • An unstructured data store with metadata, based on a set of PDF files, for more open questions;
  • Last, generative fallback, enabled across the agent to handle no-match scenarios in intent detection.

All these features will be activated in the start page of the default start flow, hence they will be available across the whole agent, to intercept these questions in any page of every deterministic conversation flow.

Dialogflow CX evaluates the end-user input in the following order of preference:

  • Intent match for routes in scope;
  • FAQ data store content;
  • Unstructured data store content.

This evaluation order always prioritizes intent-based fulfillment, when available, since this is the most deterministic. Again, we are not throwing away our existing agent; generative AI is extending the current implementation.
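The evaluation order above can be sketched as a small routing function (the handler names are illustrative, not an actual Dialogflow CX API):

```javascript
// Simplified sketch of Dialogflow CX's evaluation order for end-user input:
// intent routes in scope first, then the FAQ data store, then the
// unstructured data store, and finally no-match (generative fallback).
function evaluateInput(userInput, { matchIntent, searchFaq, searchUnstructured }) {
  const intent = matchIntent(userInput);
  if (intent) return { source: 'intent', result: intent }; // most deterministic
  const faq = searchFaq(userInput);
  if (faq) return { source: 'faq-data-store', result: faq };
  const doc = searchUnstructured(userInput);
  if (doc) return { source: 'unstructured-data-store', result: doc };
  return { source: 'no-match', result: null }; // generative fallback kicks in
}

// Example: a booking intent wins over any data store answer.
const outcome = evaluateInput('I want to book a car', {
  matchIntent: (q) => (q.includes('book') ? 'book_car' : null),
  searchFaq: () => null,
  searchUnstructured: () => null,
});
```

A matched intent always short-circuits the data store lookups, which is exactly why the deterministic flows keep working unchanged.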

Architecture

The architecture is almost the same as that of the previous deterministic bot:

  • The chat widget embedded in your site is based on Dialogflow Messenger;
  • The agent is implemented using Dialogflow CX;
  • The agent data stores are created with Vertex AI Agent Builder;
  • The existing Google Cloud Storage bucket also contains the raw data indexed by the data stores.

Below is a comprehensive architecture diagram, reporting both the deterministic and generative flows. However, feel free to ignore some of the components (webhook, Cloud Function, Gemini API), since they relate to the agent’s multimodal capabilities presented in the first article of this series.

Best Practices

Before moving to the actual implementation, which is in scope for the next section, it’s worth pinpointing some best practices with data stores, to get the best out of this technology.

The following high-level considerations for data store agents are only based on my empirical experience in former projects:

  • A single data store is rarely the solution to all your needs — In most of the scenarios, you need a finer grain knowledge decomposition, with different data stores serving different sub-domains;
  • Intents can disambiguate the knowledge subdomain and redirect the user to the right data store — binding a single data store at the agent level might look like the simplest way to cover all your needs, but I’m generally more in favor of using intents to disambiguate the user’s real intent;
  • If you’re looking for full control and repeatability, go for pure intent-based agents. If, instead, you’re looking for information not present in your company’s knowledge base, turn to a webhook directly invoking the LLM. Data stores sit in the middle: use them when you need to address open questions grounded in your own data.

Turning to our specific scenario, why did we choose to use a structured data store?

Whenever such a knowledge base is available in a company, retrieving answers from FAQs is generally more accurate and less prone to LLM hallucinations than unstructured data. Again, when modeling an agent, my suggestion is to always analyze whether you already own this kind of knowledge in your enterprise systems, or if you can implement an easy process to acquire it.

You may argue that most of the FAQ questions could be addressed with an intent-based approach, and that’s true. However, modeling intents is more time-consuming than using a FAQ, and the nature of these questions is not critical enough to require a fully deterministic approach (car damage while driving, by contrast, is a critical problem for which a deterministic process would definitely be the best approach).

Turning to the unstructured data store, couldn’t we answer these questions with the LLM’s own knowledge? Again, the rationale is to gain more control. By using our own enterprise data, we control which information we expose and tailor it to our needs. Moreover, the LLM’s knowledge might be outdated, while we can always provide fresh content.

You can find additional best practices on Google Cloud’s documentation or by running a self-service evaluation script.

Pimp My Deterministic Agent!

Convert Your Agent

Agent data stores, also known as chat apps, are formally part of the new Vertex AI Agents platform. Dialogflow CX can integrate with them through a built-in feature. However, the very first time we add a data store to a legacy Dialogflow CX agent, we need to convert the agent into an agent app. Bear in mind that the flow is slightly different if you create the data store agent directly from the Vertex AI Agent Builder console.

Let’s start by creating a new Dialogflow CX agent and restoring the car rental agent defined in our first article (an exported version of this agent is available on GitHub).

Next, expand Start Page, select Add State Handler and tick Data Stores to visualize the corresponding group in the page.

In the new Data Store group click on + and you’ll notice on the right-side panel the following message:

Click on Create Vertex AI Search and Conversation app (names may change in the future) and you’ll be redirected to the Vertex AI Agent Builder console. Accept the terms, assign a Company Name and continue.

We now move to data store creation, which will be addressed in the next two sections.

Structured Data Store

In our example, the structured data store is a FAQ stored in Google Cloud Storage as a CSV file, without any metadata.

Let’s put into practice what we have described so far. Copy the snippet below, containing some basic user inquiries, into a CSV file named car_rental_faq.csv and upload it to a GCS bucket, within a sub-folder car-rental-faq.

question,answer
What kind of economy car do you have?,Economy cars are Mitsubishi Mirage and Nissan Versa.
What kind of luxury car do you have?,Luxury cars are Chevy Tahoe and Dodge Charger.
What kind of cars do you have?,Economy cars are Mitsubishi Mirage and Nissan Versa. Luxury cars are Chevy Tahoe and Dodge Charger.
What number can I call to get info?,+1 123 345 6789
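Before uploading, it may be worth sanity-checking the file: to my understanding, the FAQ import expects a header row with question and answer columns. A minimal, illustrative check:

```javascript
// Illustrative sanity check for the FAQ CSV: verify the header row has
// the expected "question" and "answer" columns, and count the data rows.
// (This is a helper sketch, not part of any Google Cloud SDK.)
function validateFaqCsv(csvText) {
  const lines = csvText.trim().split('\n');
  const header = lines[0].split(',').map((h) => h.trim());
  if (header[0] !== 'question' || header[1] !== 'answer') {
    throw new Error(`unexpected header: ${lines[0]}`);
  }
  // every data row needs at least one comma separating question and answer
  return lines.slice(1).filter((l) => l.includes(',')).length;
}

const rows = validateFaqCsv(
  'question,answer\nWhat kind of luxury car do you have?,Luxury cars are Chevy Tahoe and Dodge Charger.'
);
```

Catching a malformed header locally is much faster than waiting for an import job to fail.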

Go back to the Vertex AI Agent Builder UI and create the first data store.

Click on Cloud Storage and select the last radio button, Structured FAQ data. Click on File and, on the right side panel, select the file you uploaded to the GCS bucket.

Assign a name to the data store and create it.

Now you have to wait a few minutes to let the data store propagate to Dialogflow CX (refresh the UI page to see the applied changes).

Proceed with the following steps:

  • Go back to the Dialogflow CX UI and double check that the data store you just created is visible and selected in the FAQ documents combo box;
  • Save the page and close the side panel.

Unstructured Data Store with Metadata

The unstructured data store is composed of a set of PDF files, a sort of booklet for each car. The main purpose is to provide generic information about each model, to fulfill open questions (e.g. what’s the fastest car you have?).

Below is a small excerpt from one of these PDF files:

The Chevrolet Tahoe is a full-size SUV known for its spacious interior, powerful engine options, and impressive towing capabilities. Here's a breakdown of its main technical details:
Engine Options:
● 5.3L V8 engine: This is the standard engine, delivering 355 horsepower and 383 lb-ft of torque. It offers a good balance of power and fuel efficiency.
● 6.2L V8 engine: Available on higher trim levels, this engine boasts 420 horsepower and 460 lb-ft of torque, providing exceptional acceleration and towing capacity.
● 3.0L Duramax Turbo-Diesel engine: For those prioritizing fuel economy, this diesel engine offers 277 horsepower and 460 lb-ft of torque while delivering impressive fuel efficiency.
Transmission: All Tahoe models come equipped with a smooth-shifting 10-speed automatic transmission, ensuring optimal performance and efficiency.
Drivetrain:
● Rear-wheel drive (RWD): Standard on LS, LT, RST, and Z71 trims.
● Four-wheel drive (4WD): Available on all trims and standard on Premier and High Country. Offers enhanced traction and control in various driving conditions.

You can find some sample files in my GitHub repository. Alternatively, feel free to create your own documentation. Upload the files to the same GCS bucket, this time within a sub-folder car-rental-kb/files and ensure no other file is in this folder.

Since we are going to use metadata, we need to create a JSON Lines file, with each line describing a document to import, associated with its URI on Cloud Storage. You must follow the format described in the Google Cloud documentation.

In our case, we are going to use some metadata later on, which is why I’ve explicitly added an additional car-type field. Our JSONL file is reported below.

{ "id": "d001", "content": {"mimeType": "application/pdf", "uri": "gs://car-rental-bkt/car-rental-kb/files/chevrolet.pdf"}, "structData": {"title": "Chevrolet Tahoe", "car-type": "Luxury"} }
{ "id": "d002", "content": {"mimeType": "application/pdf", "uri": "gs://car-rental-bkt/car-rental-kb/files/dodge.pdf"}, "structData": {"title": "Dodge Charger", "car-type": "Luxury"} }
{ "id": "d003", "content": {"mimeType": "application/pdf", "uri": "gs://car-rental-bkt/car-rental-kb/files/mitsubishi.pdf"}, "structData": {"title": "Mitsubishi Mirage", "car-type": "Economy"} }
{ "id": "d004", "content": {"mimeType": "application/pdf", "uri": "gs://car-rental-bkt/car-rental-kb/files/nissan.pdf"}, "structData": {"title": "Nissan Versa", "car-type": "Economy"} }

Last, store this JSONL file in the car-rental-kb folder. The final result should be similar to the picture below.
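With many documents, the JSONL metadata file can also be generated programmatically rather than by hand. A minimal sketch, reusing the bucket layout from the example above (the car list here is just a subset for illustration):

```javascript
// Sketch: generating the JSONL metadata lines from a list of cars.
// Bucket name and car-type values mirror the example above; adapt to your setup.
const cars = [
  { id: 'd001', file: 'chevrolet.pdf', title: 'Chevrolet Tahoe', type: 'Luxury' },
  { id: 'd003', file: 'mitsubishi.pdf', title: 'Mitsubishi Mirage', type: 'Economy' },
];

const jsonl = cars
  .map((c) =>
    JSON.stringify({
      id: c.id,
      content: {
        mimeType: 'application/pdf',
        uri: `gs://car-rental-bkt/car-rental-kb/files/${c.file}`,
      },
      structData: { title: c.title, 'car-type': c.type },
    })
  )
  .join('\n'); // one JSON object per line, ready to save in car-rental-kb/
```

This keeps the metadata in sync with the files you actually upload, which matters once the catalog grows beyond a handful of PDFs.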

To create the unstructured data store we need to repeat the process of the previous section, but this time the Dialogflow CX agent has already been converted:

  • Enter the Dialogflow CX bot UI, go to Default Start Flow, select Start Page then Edit Data Stores;
  • Select on the right side panel the combo Unstructured documents and click on + Create data store. You’ll be redirected to Create Data Store widget within Vertex AI Agent Builder in the GCP Console;
  • Click on Cloud Storage;
  • Select the last radio button, Linked unstructured documents, click on File and select the JSONL file on the right side panel;
  • Assign a name to the data store, select Digital Parser and ensure that the Enable advanced chunking configuration checkbox is disabled, since we cannot use chunking;
  • Wait a few minutes, go back to the Dialogflow CX UI and double check that the data store you just created is selected in the Unstructured documents combo box;
  • Save the page and close the side panel.

To recap, what we’ve so far accomplished is:

  • Restoring our previous legacy Dialogflow CX agent;
  • Converting this agent to a Vertex AI Agents chat app, to integrate with data stores;
  • Adding a first structured data store when converting the agent;
  • Adding a second unstructured data store with metadata.

Generative Fallback

Despite intents and data stores, no-match conditions may still happen. Imagine a scenario where the user asks something significantly out of context. This is where generative fallback comes into action.

We can configure a text prompt to instruct the LLM on how to respond. Dialogflow CX automatically provides a default prompt that can handle basic scenarios. Otherwise, you can provide your own prompt and use different placeholders to inject conversation data. If response generation is unsuccessful, Dialogflow CX will fall back to the old static fulfillment text, if any.

To properly use generative fallback, a best practice is to properly describe flows and intents, since this is contextual info the generative fallback will leverage.

Again, since we want to activate this feature across the whole agent:

  • Go to Default Start Flow and select Start Page;
  • Under Event Handler click on View all and on the right side panel select No-match 1;
  • Under the agent responses section enable the generative fallback feature.

It’s worth highlighting that below the checkbox there’s also the static fulfillment, to be triggered upon generation failure.

If you want to change the prompt passed to the LLM, you need to act through the agent settings, as reported below.

Generative Settings

Let’s review the basic settings of our brand new data store agent. You can access them from Agent Settings on the right side of the Dialogflow CX UI, then select the Generative AI tab.

Go to the Generative agent section and double check that gemini-1.0-pro-001 is the selected model, or feel free to choose a newer one, if available.

Go to the Data store section and start by enabling grounding and setting the confidence score to medium. This enforces a minimum confidence that the information in the response is supported by information in the data stores. Also tick the apply grounding heuristics filter, to suppress likely inaccurate and hallucinated responses. As you can see, there are different ways to control the generative flow and guarantee high-quality responses.

You can also add contextual information about the bot’s purpose, which can further improve its quality. See the snippet below for an example (whatever your agent’s language, pass this information in English).

Next, select the default summarization prompt and the summarization model (at the time of writing, gemini-1.5-flash-001 is available). If you want to customize the summarization prompt, follow Google Cloud’s documentation and respect all the hints provided there, so as not to lower the overall quality. If unsure, just stick with the default prompt, as we do here.

Last, tick both the Fallback link and Enable Generative AI checkboxes. The former ensures that, when the generated response is suppressed, at least the link to the first retrieved data store result is returned.

Dialogflow Messenger

The web application conceived in the first article can serve our new agent without any change except the new agent ID, so I won’t spend too much time on how to embed the chat widget in your HTML page.

Nonetheless, there are a couple of Dialogflow Messenger features that can be particularly useful with data store agents.

The first one is personalization, that is, passing relevant information about the end user to the LLM generating the answer. This information must be passed in JSON format. No specific schema is required and no transformation is applied under the hood.

Dialogflow Messenger supports personalization through the setContext method. Let’s modify our previous implementation by adding the following snippet within the script tag:

// search personalization metadata
const dfMessenger = document.querySelector('df-messenger');
const metadata = {
  "gps-detected-location": "NYC",
  "preferred-vehicle-category": "luxury",
  "devices owned": [
    { model: "Google Pixel 8" },
    { model: "Google Pixel Tablet" },
  ],
};
dfMessenger.setContext(metadata);

In this example we are passing information about the user’s device, location and car preferences. The latter is typically retrieved from some backend system (e.g. a CRM). If so, the best option is to fetch this information with a webhook on the Dialogflow CX agent side.
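As a sketch of that webhook-based approach: a Dialogflow CX webhook can return the fetched preferences as session parameters in its JSON response. The CRM lookup and the parameter name below are hypothetical placeholders:

```javascript
// Sketch of a Dialogflow CX webhook (e.g. a Cloud Function) returning
// user preferences fetched from a backend as session parameters.
// The CRM record shape and parameter name are hypothetical.
function buildWebhookResponse(crmRecord) {
  return {
    sessionInfo: {
      parameters: {
        preferred_vehicle_category: crmRecord.preferredCategory,
      },
    },
  };
}

// In a Cloud Function you would wire it roughly as:
// exports.fetchPreferences = (req, res) => {
//   const record = lookupCrm(req.body.sessionInfo.session); // hypothetical helper
//   res.json(buildWebhookResponse(record));
// };
const resp = buildWebhookResponse({ preferredCategory: 'luxury' });
```

The returned session parameters are then available to the agent’s pages and routes for the rest of the conversation.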

The second feature is the boost and filter search configuration, used to boost, bury and filter documents. In particular, in this article we are introducing boost controls, which change search result ranking by applying a boost value (greater than zero for higher ranking, less than zero for lower ranking) to specific documents. In this case a specific JSON format is required:

"searchConfig": {
  "boostSpecs": [
    {
      "dataStores": ["DATASTORE_ID"],
      "spec": [
        {
          "conditionBoostSpecs": {
            "condition": "CONDITION",
            "boost": "1.0"
          }
        }
      ]
    }
  ]
}

DATASTORE_ID is the full name of the data store in the format projects/your_project_id/locations/your_location/collections/your_collection_name/dataStores/your_datastore_name and can be retrieved from the Vertex AI Agent Builder UI, entering the Data Stores section.

CONDITION must follow the Vertex AI Agent Builder’s filter expression syntax and the boost value is a number between -1.0 (for lower ranking) and 1.0 (for higher ranking).
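Since the full resource name is easy to get wrong, a small helper (illustrative, assuming the default collection) can assemble it from its parts:

```javascript
// Illustrative helper assembling the full data store resource name
// expected by searchConfig, following the format described above.
function dataStoreName(projectId, location, dataStoreId,
                       collection = 'default_collection') {
  return `projects/${projectId}/locations/${location}` +
         `/collections/${collection}/dataStores/${dataStoreId}`;
}

const id = dataStoreName('my-project', 'global', 'car-rental-unstructured');
```

Note the `dataStores` path segment: a typo there (e.g. `data-stores`) silently produces a name that matches nothing.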

Again, Dialogflow Messenger supports search configuration through the setQueryParameters method, and we can add a new listener within the script tag in our HTML page.

In this example, our condition boosts data sources related to luxury cars — this is why we adopted an unstructured data store with metadata, to explicitly associate each document with a car category (economy or luxury):

dfMessenger.addEventListener('df-messenger-loaded', function (event) {
  const searchConfig = {
    "boostSpecs": [
      {
        "dataStores": [ "projects/your-project-id/locations/your-datastore-region/collections/default_collection/dataStores/your-datastore-id" ],
        "spec": [
          {
            "conditionBoostSpecs": {
              "condition": "car-type: ANY(\"Luxury\")",
              "boost": "0.7"
            }
          }
        ]
      }
    ]
  };
  dfMessenger.setQueryParameters(searchConfig);
});

Summary

In this article we explored how to create a data store hybrid agent by expanding a legacy, deterministic, intent-based Dialogflow CX implementation with new generative AI capabilities. In particular, we introduced three new built-in generative AI components: structured data stores, unstructured data stores with metadata and generative fallback.

Our main goal was to showcase that generative AI can greatly help in fulfilling open questions grounded in your data, without the need to explicitly model each topic as an intent. This is a scalable approach that positions AI agents more firmly in conversational commerce use cases. Furthermore, you don’t need to restart from scratch: your existing Dialogflow CX agents can be integrated with the new Vertex AI Agents features.

Below you can find additional resources, and a basic implementation of this article published on my GitHub repository. I strongly encourage you to deploy the code and further customize it according to your use case.

Stay tuned for the last article of this series, where we’ll shift to a pure Generative AI agent app, built with Vertex AI Agents, Google Cloud’s latest technology for agent development!

What’s Next?

Documentation

Data Stores

Data Store Agents

Dialogflow CX Generative Fallback

Vertex AI Agent Builder

Vertex AI Agents

dfcx-scrapi

GitHub Sources

https://github.com/grandelli/dfcx-datastore-hybrid-agent (the main GitHub repository for this project — contains the modified DFCX agent, the data stores, the revised Dialogflow Messenger client and the existing components coming from the deterministic agent)

Feel free to leave your comments here or connect with me on LinkedIn.
