June 2024: Your guide to GenAI articles @ Google Cloud Medium

Romin Irani
Google Cloud - Community
11 min read · Jul 5, 2024

The Google Cloud Medium publication has seen great contributions in the form of articles submitted by community members: 500+ articles this year alone. First up, a big thank you to all the contributors and the readers.

Here are some interesting stats for everyone:

  • 500+ articles since January 1, 2024
  • Generative AI is a trending topic and, as you might have guessed, close to 40–50% of the articles submitted each month in the April–June 2024 timeframe covered Generative AI alone.
  • 200+ authors have submitted articles on Google Cloud Medium in 2024. If you’d like to become one, it’s simple: reach out in the comments with your Medium ID and you’ll be added as a writer.

Now, on to the articles that I am classifying under Generative AI. I am broadening the definition to include articles that cover the following:

  • Generative AI topic in general
  • Vertex AI services, including Agent Builder
  • BigQuery ML, Vertex AI Integration
  • LangChain and other GenAI tools/frameworks

and other similar technologies. You get the drift.

So, let’s dive into the articles. I have done my best to give a commentary and to classify the articles accordingly. All mistakes are mine :-)

Prompt Engineering

Lee Boonstra has started a series on Prompt Engineering, sharing her experiences from working on major business use cases and large innovation projects for a range of clients, including automating drive-thru orders at Wendy’s. Check out the two installments in the series:

Erwin Huizenga tackles an interesting question in The prompt paradox: Why your LLM shines during experimentation but fails in production. The article first talks about overfitting and then introduces the reader to why Generalization matters in Prompt Engineering, via a series of examples.

Gemini Flash

Rubens Zimbres takes us through the high-level features of Gemini Flash and then puts it to the test via a LangChain evaluation notebook. Rubens gives a deep dive into the “Knowledge Distillation” technique used in the creation of Gemini Flash, which, as the blog post states, is “a technique used in deep learning to transfer knowledge from a large, complex model (teacher) to a smaller, simpler model (student) while aiming to retain the accuracy of the teacher model.” Several multimodal use cases are tested in the article titled Multimodality with Gemini-1.5-Flash: Technical Details and Use Cases.
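To make that quoted definition a little more concrete, here is a toy numpy sketch of the core distillation idea (pushing a student’s temperature-softened predictions towards a teacher’s). It is only an illustration of the technique, not a description of how Gemini Flash itself was trained, and the logits and temperature are made up:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits into a probability distribution, softened by a temperature."""
    z = logits / temperature
    z = z - z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened distributions.

    Minimizing this pushes the small student model to mimic the large teacher.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))

# Toy example: the student roughly agrees with the teacher, so the loss is small.
teacher = np.array([4.0, 1.0, 0.5])
student = np.array([3.5, 1.2, 0.4])
print(distillation_loss(student, teacher))
```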

Gemini and Function Calling

The ability of models to invoke external functions is opening up interesting possibilities. You probably haven’t thought of the things the next set of authors have done with it.

First up, Karl Weinmeister takes on the task of demonstrating how you can take an interesting piece of code in a Colab notebook and turn it into an application. Wait, that’s not the end of it: what if you could also host that application on Cloud Run? Check out From notebook to Cloud Run service in 10 minutes: applied to Gemini Function Calling.

Guillaume Laforge needs no introduction when it comes to the Groovy language. In the article Let’s make Gemini Groovy!, Guillaume demonstrates how you can make Gemini invoke a Groovy script for you, all via the power of function calling.

Caching and GenAI Applications

As GenAI applications move towards production, costs become an important discussion, and caching is emerging as one way to address this requirement. Gemini recently introduced context caching. At a high level, it means that “you can pass some content to the model once, cache the input tokens, and then refer to the cached tokens for subsequent requests”.
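As a rough illustration of that flow, here is a minimal sketch using the Vertex AI Python SDK’s preview caching module. The project ID, Cloud Storage URI, prompts and model version are placeholders, so treat it as an outline rather than a drop-in recipe:

```python
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholder project

# 1. Cache a large piece of content (here, a PDF in Cloud Storage) once.
cached_content = caching.CachedContent.create(
    model_name="gemini-1.5-pro-001",
    system_instruction="Answer questions using only the cached document.",
    contents=[Part.from_uri("gs://my-bucket/large-doc.pdf", mime_type="application/pdf")],
    ttl=datetime.timedelta(minutes=60),
)

# 2. Subsequent requests reference the cached tokens instead of resending them.
model = GenerativeModel.from_cached_content(cached_content=cached_content)
print(model.generate_content("Summarize section 2 of the document.").text)
print(model.generate_content("List the key dates mentioned.").text)
```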

Olejniczak Lukasz covers a great use case in the article titled Practical Guide: Using Gemini Context Caching with Large Codebases. The article uses Gemini’s context caching feature to ask questions about a full codebase from a Git repository and demonstrates how successive calls to the model result in a significantly smaller number of tokens being passed.

Arun Shankar discusses optimization in the context of the term “Semantic Caching”. As the author states, “It allows for a cached response for repetitive semantically-similar queries, eliminating the need to resort to your AI provider. This results in cost savings and latency reduction.” The article Implementing Semantic Caching: A Step-by-Step Guide to Faster, Cost-Effective GenAI Workflows provides not just an overview of Semantic Caching but a detailed step-by-step guide to implementing it using Google Cloud services.
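Arun’s guide builds this out properly with Google Cloud services; purely to illustrate the idea, here is a minimal in-memory sketch that embeds each query, reuses a cached answer when a new query is semantically close enough, and only calls the model on a cache miss. The embedding model name, similarity threshold and project ID are assumptions for the example:

```python
import numpy as np
import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
llm = GenerativeModel("gemini-1.5-flash-001")
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")

cache = []  # list of (query_embedding, answer) pairs; a vector store in a real setup
SIMILARITY_THRESHOLD = 0.9  # assumption; tune for your workload

def embed(text: str) -> np.ndarray:
    return np.array(embedder.get_embeddings([text])[0].values)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer(query: str) -> str:
    q_emb = embed(query)
    # Serve from cache if a semantically similar query has been answered before.
    for cached_emb, cached_answer in cache:
        if cosine(q_emb, cached_emb) >= SIMILARITY_THRESHOLD:
            return cached_answer
    # Cache miss: call the model and remember the result.
    result = llm.generate_content(query).text
    cache.append((q_emb, result))
    return result

print(answer("What is Cloud Run?"))
print(answer("Explain Cloud Run to me"))  # likely served from the cache
```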

Gen AI Use Cases

Marc Cohen has started a 6-part series of articles to help us understand how the super app Quizaic was built. Quizaic uses generative AI to create high-quality trivia quizzes and manage the interactive quiz playing experience. If you are looking to build out a GenAI application from scratch, starting with design, making technology choices, hosting your application for scale and more, this is the series to bookmark and read.

📹 Video : https://www.youtube.com/watch?v=2mpCsRZcKmk

Three posts in the series were published in June.

Abirami Sukumaran has published a 3-part series that shows how to build a Smart Retail Assistant. As the first part in the series states, it highlights the following:

  • Build a knowledge-driven chat application designed to answer customer questions, guide product discovery, and tailor search results.
  • Combine several Google Cloud technologies: AlloyDB, Agent Builder, Gemini models and more.

Check out the series here:

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Ivan Nardini mentions that its accuracy is poor for low-resource languages. One way to address this is to fine-tune the model on limited data, and Ivan takes on the task of tuning and serving Whisper with Ray on Vertex AI in the article titled Whisper Goes to Wall Street: Scaling Speech-to-Text with Ray on Vertex AI — Part I.

Looking to build a custom classification API for text, images and more? Maximilian Weiss has written a detailed article on how you can do that. The article first goes into the theory of product taxonomies, classification using embeddings and more, and then dives into the various Google Cloud services used to achieve it. Check out the article Building a Custom Classification API on Google Cloud: A Technical Deep Dive.

In a retail scenario, ensuring the accuracy and quality of product information is important. Neerajshivhare discusses a retail use case where Gemini Pro Vision and Vertex AI are used to verify the accuracy of product attributes provided by sellers. The solution utilizes product attribute metadata, sample data and images to perform the verification. Check out the solution in Revolutionizing Retail: Automated Attribute Identification with Google’s Gemini Pro Model.

Not all use cases need GenAI. Dazbo (Darren Lester) has written detailed articles on Google Cloud, and this article titled Building a Serverless Image Text Extractor and Translator Using Google Cloud Pre-Trained AI is no exception. The solution demonstrates developing an end-to-end application on Google Cloud that uses Cloud Run, Cloud Functions, and Google Cloud’s pre-trained ML APIs (Vision API, Natural Language API, Video Intelligence API and Translation API).

Agents, AI Agents, Agentic workflows

Agentic workflows are often described as the direction most of our Gen AI applications will move towards. Fermin Blanco has started a series on this topic, the first of which describes an agentic workflow (perceive the environment, make a decision, take action, get feedback and learn) and how it all comes together. Check out My Journey to 🤖 AI Agents: The Backbones.

The Reasoning Engine service (also called LangChain on Vertex AI) provides a managed runtime for your customized agentic workflows in generative AI applications. You can create an application using orchestration frameworks such as LangChain, and deploy it with Reasoning Engine. Johanes Glenn investigates how it works by reworking an earlier use case that utilized Gemini Pro Vision.

📹 Video : https://www.youtube.com/watch?v=U3nhTmKmBH8

Check out Johanes’ article titled Exploring Google Cloud Reasoning Engine.
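For a rough sense of the shape of the API (not Johanes’ exact code), here is a minimal sketch of defining a LangChain agent and deploying it with the Reasoning Engine preview module of the Vertex AI SDK. The project, staging bucket and the toy tool function are placeholders, so treat it only as an outline:

```python
import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(
    project="my-project",                      # placeholder project
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",   # placeholder bucket
)

def get_product_price(product_name: str) -> str:
    """Toy tool the agent can call; replace with a real lookup."""
    return f"The price of {product_name} is $42."

# A LangChain-based agent template provided by the SDK.
agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro",
    tools=[get_product_price],
)

# Test locally, then deploy to the managed Reasoning Engine runtime.
print(agent.query(input="How much does the blue widget cost?"))

remote_agent = reasoning_engines.ReasoningEngine.create(
    agent,
    requirements=["google-cloud-aiplatform[langchain,reasoningengine]"],
    display_name="demo-agent",
)
print(remote_agent.query(input="How much does the blue widget cost?"))
```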

GenAI and Security

This is an interesting field, and Imran Roshan gives us an overview of Security-Focused Large Language Models (SecLMs) and ways of building one on Google Cloud. Check out the article titled SecLM — What? Why? and How?

Sita Lakshmi Sangameswaran highlights that the usage of GenAI has brought attention back to securing these applications. As the blog post mentions, hope cannot be a strategy. The post titled Is your AI workload secure? then looks at the key principles of a robust AI framework.

Gemini as a Networking Expert

Consider the following scenario: two retail sites connected through a Telco-managed L3 VPN service, and a set of tasks to complete: IP address planning, network topology summarization, configuration script generation, and documentation, all of which ultimately streamline the delivery of L3 VPN services. Can Gemini help? Neelaksh Sharma demonstrates how we can use the Gemini consumer application itself to help us out in the article Google Gemini for Network Configuration Assistance. No sophisticated stuff, just Gemini available through the web interface and some magical prompts.

Conversational AI Agents

Gabriele Randelli is in the process of creating a mega tutorial series on building Dialogflow CX agents. In the first article, Gabriele designs a conversational agent using Dialogflow in a deterministic fashion, while using a Gemini model to extract the date of birth from an image of a driving license. The author then steps up the complexity with a post titled Designing Data Store Hybrid Agents with Dialogflow CX & Vertex AI Agents. Note the term “Hybrid Agent”: the article addresses scenarios where an intent cannot be matched and the agent falls back on generative callbacks. The article also highlights lesser-known features of the Dialogflow Messenger bot and how one can customize it and listen in on events. Worth a read.

Vertex AI Agent Builder surely makes it simple to create a conversational agent. An ever-popular use case is talking to your PDF documents. If you have never tried out Vertex AI Agent Builder, this tutorial by Aryan Irani titled Tutorial: Vertex AI Agent Builder for Developers is a good step-by-step guide.

Gen AI App Dev

How do you send a list of questions (prompts) to the model and wait for all the answers to come back? That’s what Paul Balm calls “prompting the model asynchronously”, and he shows you how to do that in the article titled How to prompt Gemini asynchronously using Python on Google Cloud. The article uses the Python packages asyncio and tenacity to invoke the model asynchronously.
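This is not Paul’s exact code, but a minimal sketch of the pattern, assuming the Vertex AI SDK’s generate_content_async method and a tenacity retry policy; the project ID, model version and prompts are placeholders:

```python
import asyncio

import vertexai
from tenacity import retry, stop_after_attempt, wait_exponential
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-flash-001")

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
async def ask(prompt: str) -> str:
    """Send one prompt, retrying with exponential backoff on transient errors."""
    response = await model.generate_content_async(prompt)
    return response.text

async def main(prompts: list[str]) -> None:
    # Fire off all prompts concurrently and wait for every answer.
    answers = await asyncio.gather(*(ask(p) for p in prompts))
    for prompt, answer in zip(prompts, answers):
        print(f"{prompt}\n-> {answer}\n")

asyncio.run(main(["What is BigQuery?", "What is Cloud Run?", "What is Pub/Sub?"]))
```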

Maddula Sampath Kumar has written a definitive set of articles on deploying your GenAI Streamlit applications to Google Cloud Run. Streamlit is popular with Gen AI developers and Cloud Run should be too! The first post focuses on just deploying a Streamlit app, with no Gen AI involved. The next article utilizes Gemini Flash, and it’s interesting to note that Sampath has used the OpenAI interface to invoke Gemini Flash.

Check out the blog posts:

Continuing on Streamlit, is there a competitor to that framework on the horizon? It is definitely super early days, but the Mesop framework from Google has caught a bit of attention. Om Kamath has a provocatively titled article Did Google Just Kill Streamlit?

We’ve seen several ways to build our RAG applications. Nathaly Alarcón, in the article titled Unlock the Power of Conversational AI: RAG 101 with Gemini & LangChain, covers a detailed tutorial on building a conversational RAG application over PDF documents that utilizes Google Gemini AI, LangChain and ChromaDB as the vector store.

How often have we struggled to control the output of our LLMs? Sascha Heyer says that it is a thing of the past with Controlled Generation of the output, which was announced recently at Google I/O. Check out the article titled Vertex AI Controlled Generation with Gemini.
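As a rough illustration (not Sascha’s exact code), here is a minimal sketch of constraining Gemini’s output to a JSON schema via the Vertex AI SDK’s response_mime_type and response_schema generation settings; the project ID, model version and schema are placeholders:

```python
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-pro-001")

# Hypothetical schema: force the model to return a sentiment label and a score.
response_schema = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "confidence": {"type": "number"},
    },
    "required": ["sentiment"],
}

response = model.generate_content(
    "Classify the sentiment of: 'The keynote was fantastic!'",
    generation_config=GenerationConfig(
        response_mime_type="application/json",
        response_schema=response_schema,
    ),
)
print(response.text)  # JSON conforming to the schema above
```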

Testing LLM Outputs

How do you know if your LLM is performing as per your expectations? Do you have a suite of test cases to try out? How does one go about addressing a requirement like that? Mete Atamel says that while there is a Vertex AI Model Evaluation service, it was interesting to build out an approach to testing out your LLM. In the article titled Give your LLM a quick lie detector test, Mete takes us through Python application code showing how you would design and implement a test suite for such a requirement.
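This is not Mete’s implementation, just a minimal pytest-flavoured sketch of the general idea of running an LLM against a fixed suite of expectations; the prompts, expected substrings, project ID and model version are all assumptions:

```python
import pytest
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project
model = GenerativeModel("gemini-1.5-flash-001")

# Each case pairs a prompt with a substring the answer is expected to contain.
TEST_CASES = [
    ("Which Google Cloud service runs stateless containers?", "Cloud Run"),
    ("Which Google Cloud data warehouse lets you query data with SQL?", "BigQuery"),
]

@pytest.mark.parametrize("prompt,expected", TEST_CASES)
def test_llm_answers(prompt, expected):
    """Fail the suite if the model's answer misses the expected fact."""
    answer = model.generate_content(prompt).text
    assert expected.lower() in answer.lower()
```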

BigQuery and Gemini

Olejniczak Lukasz deep dives into how BigQuery and its underlying support for invoking Gemini models can be used to analyze call center recordings (audio files) in the article BigQuery and Gemini: The Catalyst for Scaling Generative AI Skills. The author then follows up with an article titled BigQuery, Gemini & Google Search: Grounding Generative AI for Accurate Information, where you learn how to combine BigQuery and Gemini and ground your results with Google Search, all while still using a simple SQL syntax.
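For a flavour of what that “simple SQL syntax” can look like (not the author’s exact queries), this sketch calls a Gemini-backed BigQuery ML remote model with ML.GENERATE_TEXT from the Python client. The project, dataset, table and remote model names are hypothetical, and the remote model is assumed to have been created beforehand over a Gemini endpoint:

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project

# `my_dataset.gemini_model` is a hypothetical BigQuery ML remote model over a Gemini endpoint.
sql = """
SELECT ml_generate_text_llm_result AS summary
FROM ML.GENERATE_TEXT(
  MODEL `my_dataset.gemini_model`,
  (SELECT transcript AS prompt FROM `my_dataset.call_transcripts` LIMIT 5),
  STRUCT(0.2 AS temperature, TRUE AS flatten_json_output)
)
"""

for row in client.query(sql).result():
    print(row.summary)
```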

Apps Script and Gemini

I have believed for some time that Google Workspace tools, along with Apps Script, present a fantastic opportunity to integrate and build solutions for a market that has millions of active users per day.

Kanshi Tanaike has been the #1 contributor on our Google Cloud Medium publication, and he has authored some fantastic articles on ways to integrate Gemini with Apps Script.

Expanding Error Messages for Google Apps Script with Gemini 1.5 Flash
This article uses Gemini to help expand (provide more detailed information on) error messages that come up in your Apps Script code. The article builds out a function that takes the current error message and then uses Google Generative AI behind the scenes to get more information on it. What is interesting in this article is that Kanshi replaced the Gemini 1.0 Pro model (covered in an earlier article) with the Flash model to cut down on response times. This is critical since Google Apps Script currently has a maximum execution time of 6 minutes.

Unlock Smart Invoice Management: Gemini, Gmail, and Google Apps Script Integration
This is an Invoice Processing application built entirely with Apps Script and Gemini. It is actually the culmination of multiple articles written by Kanshi that formed the foundation of this application. The application integrates with your Gmail: an incoming email with an invoice triggers the whole process.
📹 Video : https://www.youtube.com/watch?v=Dc2WPQkovZE
🗂️ Code: https://github.com/tanaikech/UnlockSmartInvoiceManagementWithGeminiAPI

GenAI Trends and Opinions

We have several high-level articles that highlight GenAI trends, how it could be applied to different industries and more. Check them out:

Allan Alfonso takes a look at how education has changed with the advent of Google Search and YouTube, and then highlights further efforts being made via various initiatives in the blog post titled Leveling the Playing Field: Google’s LearnLM and the Quest for Equitable Education.

Rahul Ranganathan: The AI-Powered Evolution of Technical Debt Management

Madhu Vadlamani: The Era of Substantial AI: A Generative AI Cookbook and The Trio — BigQuery: Empowering Gen AI and Gemini for Complex Problem Solving.

Ashutosh Madan: The Transformational Power of Artificial Intelligence in the Telecommunications Industry

Feedback

Thank you for reading. If you liked this roundup, please drop some feedback in the comments on how I could do better, along with any ideas you might have for taking this to the next level. Thank you for your time.
