BigQuery, Gemini & Google Search: Grounding Generative AI for Accurate Information

Olejniczak Lukasz
Google Cloud - Community
5 min read · Jun 29, 2024
Source: https://www.polskieradio.pl/395/7790/Artykul/3395764,football-poland-draw-11-with-france-in-euro-2024-farewell

The thrill of Euro 2024 is electrifying stadiums across Germany. Interestingly, the Polish national team has embraced a well-known strategy inspired by their country’s football tradition: the first match is seen as an opening, the middle match determines everything, and the last one is played for honor (these are often the best for us). BTW, I am Polish and have witnessed this strategy in action a few times. This philosophy has been ingeniously applied to their Euro 2024 campaign, and thanks to this we have more time to prepare for the next tournament ;) But jokes aside. I don’t usually watch full matches, but I like to stay updated on the key details online. Google Search is my go-to for quickly finding the latest scores, team standings, and match highlights. And as someone who spends most of their time working with generative AI, I’m constantly reminded of the crucial role Google Search plays in this AI transformation.

So, this article isn’t about football. It’s about showcasing a remarkable combination of Google technologies: BigQuery, Gemini, and Google Search. Together they merge the data processing power of BigQuery with the multimodal capabilities of Gemini and the efficiency, relevance, and factuality of Google Search. This fusion can change how we, as users and businesses, interact with information, one search or query at a time, but only if we can trust the generated content.

In one of my previous articles on Medium, [10 Things People Get Wrong About AI models (Don’t Be One of Them!)], the second myth on the list is “The AI Encyclopedia: Is Everything on the Internet Stored Inside Language Models”.

These models are specifically designed to generate human-like text based on the patterns they learned during training. As such, they can answer questions, summarize information, or even create content, but they don’t have perfect recall of every piece of information they’ve encountered. You may think of them as an AI encyclopedia, but the reality is that even the most sophisticated AI models have their limitations. Their knowledge isn’t a solid block; it’s more like a piece of Swiss cheese.

This is especially true when you take into account that the world is constantly changing and new information is emerging all the time. Language models cannot keep up with this pace in real time. They rely on periodic updates and retraining to stay relatively current, but there will always be a gap between the most recent information and what the model has been trained on.

There’s strength in numbers, or as they say in Poland, “w grupie siła.” Let’s explore how the powerful trio of BigQuery, Gemini, and Google Search can bring this proverb to life in the digital realm, helping us build trust into the information presented by Generative AI systems.

Asking Gemini models in BigQuery is as easy as running a SELECT statement with the ML.GENERATE_TEXT function (for more details, check out my article [BigQuery and Gemini: The Catalyst for Scaling Generative AI Skills]). So, let’s see if Gemini can accurately answer: “What was the result of the Poland vs France match at Euro 2024?”

SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `genai-app-builder.speech2textdataset.vertexaigemini`,
    (
      SELECT CONCAT('What was the result of the Poland vs France match at Euro 2024?') AS prompt
    ),
    STRUCT(
      0 AS temperature,
      1024 AS max_output_tokens,
      0.2 AS top_p,
      15 AS top_k,
      TRUE AS flatten_json_output)
  );

In its response, Gemini explains that it does not have access to real-time information or information about future events, including the results of sporting matches, and therefore cannot provide the result of the Poland vs France match at Euro 2024. All it can say is that the Euro 2024 tournament is scheduled to take place from June 14 to July 14, 2024.

Google Search would know …

So, how do we seamlessly integrate Gemini, BigQuery, and Google Search? As it turns out, BigQuery natively supports this integration. To leverage Google’s Grounding mechanism against Google Search and get the most up-to-date and accurate information, all you need to do is set the ground_with_google_search parameter to TRUE.

We are ready to repeat our test:
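A minimal sketch of the repeated test, assuming the same model reference and sampling settings as in the first query. The only change is the extra ground_with_google_search option in the STRUCT, kept together with flatten_json_output; your own project, dataset, and model names will differ.

```sql
-- Same question as before, now grounded against Google Search.
-- Assumption: the remote model is the one used in the first query.
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `genai-app-builder.speech2textdataset.vertexaigemini`,
    (
      SELECT 'What was the result of the Poland vs France match at Euro 2024?' AS prompt
    ),
    STRUCT(
      0 AS temperature,
      1024 AS max_output_tokens,
      0.2 AS top_p,
      15 AS top_k,
      TRUE AS flatten_json_output,
      -- Turns on grounding with Google Search for this call.
      TRUE AS ground_with_google_search)
  );
```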

Well done, BigQuery + Gemini + Google Search!

But you might be wondering, how can we be sure about the accuracy of this answer? The beauty of this solution is that the result includes a column called ml_generate_text_grounding_result, which provides citations with confidence scores to back up the generated response. This gives us additional assurance of the answer's validity.
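To inspect the citations next to the answer, one option is to select only the relevant columns instead of SELECT *. This is a sketch under assumptions: it reuses the model reference from the earlier query, and it assumes the flattened output exposes the generated text as ml_generate_text_llm_result alongside the grounding column named above.

```sql
-- Return the prompt, the generated answer, and the grounding sources side by side.
-- Assumption: ml_generate_text_llm_result is the flattened text column;
-- the grounding column is returned when both flags below are TRUE.
SELECT
  prompt,
  ml_generate_text_llm_result AS answer,
  ml_generate_text_grounding_result AS grounding
FROM
  ML.GENERATE_TEXT(
    MODEL `genai-app-builder.speech2textdataset.vertexaigemini`,
    (
      SELECT 'What was the result of the Poland vs France match at Euro 2024?' AS prompt
    ),
    STRUCT(
      0 AS temperature,
      1024 AS max_output_tokens,
      TRUE AS flatten_json_output,
      TRUE AS ground_with_google_search)
  );
```

The grounding column holds a JSON value, so it can be stored as-is or unpacked later with BigQuery's JSON functions once you have looked at its structure in your own results.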

Summary:

While generative AI holds immense potential for synthesizing information, concerns about the accuracy and reliability of AI-generated content have emerged. Google’s efforts are focused on grounding AI-generated responses in factual information to mitigate these concerns. By combining the ability of generative AI to synthesize information with the trustworthiness of traditional search results, the blend of Gemini, Google Search, and BigQuery brings the advantages of both worlds to users today.

This article is authored by Lukasz Olejniczak, Customer Engineer at Google Cloud. The views expressed are those of the author and don’t necessarily reflect those of Google.

Please clap for this article if you enjoyed reading it. For more about Google Cloud, data science, data engineering, and AI/ML, follow me on LinkedIn.
