[MLStory] A Guide to Using VertexAI and Google’s Embeddings for Generative Projects

Ashmi Banerjee
Google Developer Experts
2 min read · Sep 11, 2023

In this step-by-step tutorial, we’ll explore an alternative approach to a common use case we discussed in a previous blog post:

Imagine you own a bustling restaurant with a multitude of customer reviews. Your goal is to gain meaningful insights from this wealth of data. This includes identifying menu favorites, understanding customer preferences, and finding areas for enhancement.

In this tutorial, we’ll leverage the power of generative AI to extract these insights efficiently.

By the end of this tutorial, you’ll be able to identify popular menu items, understand customer sentiments, and pinpoint areas for improvement without manually sifting through hundreds of reviews.

Problem Statement

Before we dive into the technical details, let’s briefly grasp the concept of embeddings.

Embeddings are numerical representations of text: each word, phrase, or document is mapped to a vector of numbers that a computer can compare and compute with.

  • They’re crucial for various applications, including search and recommendation systems.
  • Similar texts share similar embeddings, making them ideal for our analysis.

Our approach uses VertexAI’s TextEmbeddingModel to generate an embedding for each review, then compares each review embedding against the embedding of a query using their dot product.
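As a minimal sketch of the embedding step, the helper below sends texts to a model's `get_embeddings` method in small batches. `TextEmbeddingModel.from_pretrained` and `get_embeddings` come from the Vertex AI Python SDK, but the model name `textembedding-gecko@001`, the batch size of 5, and the helper names `load_vertex_model` and `embed_texts` are assumptions, not details taken from this article. Calling the real service also assumes the `google-cloud-aiplatform` package is installed and `vertexai.init(project=..., location=...)` has been run with valid credentials.

```python
from typing import List, Sequence


def load_vertex_model(model_name: str = "textembedding-gecko@001"):
    """Load a Vertex AI text embedding model (the model name is an assumption)."""
    # Imported lazily so the rest of this sketch runs without the SDK installed.
    from vertexai.language_models import TextEmbeddingModel

    return TextEmbeddingModel.from_pretrained(model_name)


def embed_texts(model, texts: Sequence[str], batch_size: int = 5) -> List[List[float]]:
    """Return one embedding vector per text, requesting them in small batches.

    `model` is any object with a `get_embeddings(list_of_str)` method whose
    results expose a `.values` list, as TextEmbeddingModel does.
    The batch size of 5 is an assumed per-request limit; adjust it for your model.
    """
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        for embedding in model.get_embeddings(batch):
            vectors.append(list(embedding.values))
    return vectors
```

With credentials configured, `embed_texts(load_vertex_model(), reviews)` would return one vector per review, in the same order as the input list.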

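When embeddings are normalized to unit length, the dot product equals the cosine similarity and ranges from -1 to +1: -1 means the vectors point in opposite directions (maximally dissimilar), 0 means they are orthogonal (unrelated), and +1 means they point in the same direction (maximally similar). For unnormalized vectors the dot product is unbounded, so normalize first or compute cosine similarity directly.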
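To make the comparison concrete, here is a small sketch in plain Python. The three-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions), and `dot` and `normalize` are hypothetical helper names. It shows that once vectors are scaled to unit length, their dot product falls between -1 and +1.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


# Toy vectors, invented for illustration only.
query = normalize([1.0, 2.0, 0.0])
same_direction = normalize([2.0, 4.0, 0.0])
orthogonal = normalize([0.0, 0.0, 3.0])
opposite = normalize([-1.0, -2.0, 0.0])

print(round(dot(query, same_direction), 6))  # 1.0
print(round(dot(query, orthogonal), 6))      # 0.0
print(round(dot(query, opposite), 6))        # -1.0
```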

Proposed Workflow using Text Embeddings

Let’s look at a sketch of our algorithm:
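The workflow described earlier (embed every review, embed the query, score each review by its dot product with the query, keep the best matches) can be sketched as follows. This is an illustrative outline, not the code from the linked repository: `rank_reviews` and `toy_embed` are hypothetical names, and `toy_embed` is a keyword-counting stand-in for a real embedding model so that the example runs offline.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def rank_reviews(
    query: str,
    reviews: Sequence[str],
    embed: Callable[[str], Vector],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k reviews most similar to the query, best first."""
    query_vec = embed(query)
    scored = []
    for review in reviews:
        review_vec = embed(review)
        # Dot product between the query embedding and the review embedding.
        score = sum(q * r for q, r in zip(query_vec, review_vec))
        scored.append((score, review))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Stand-in embedder: counts a few hand-picked keywords and normalizes the result.
# A real pipeline would call an embedding model here instead.
KEYWORDS = ("pizza", "pasta", "service", "slow", "dessert")


def toy_embed(text: str) -> Vector:
    words = text.lower().split()
    counts = [float(words.count(k)) for k in KEYWORDS]
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts


reviews = [
    "The pizza was fantastic and the pasta was fresh",
    "Slow service ruined the evening",
    "Great dessert menu",
]
for score, review in rank_reviews("best pizza", reviews, toy_embed, top_k=2):
    print(f"{score:.2f}  {review}")
```

Passing the embedder in as a function keeps the ranking logic independent of any particular model, so the toy version can be swapped for a real embedding call without touching `rank_reviews`.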

Advantages

  • Suitable for similarity-based recommendations and search methods.

Drawbacks

  • Doesn’t generate new text but rather filters the most similar reviews.
  • Embeddings must be computed on the fly or stored in memory, which is potentially computationally expensive, depending on your data volume.
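To get a feel for the storage cost mentioned in the last drawback, a rough back-of-the-envelope estimate helps. The sketch below assumes 768-dimensional embeddings stored as 4-byte floats; both numbers are assumptions to replace with your model's actual output size and storage format, and `embedding_memory_bytes` is a hypothetical helper.

```python
def embedding_memory_bytes(num_texts: int, dim: int = 768, bytes_per_value: int = 4) -> int:
    """Approximate raw storage for num_texts embeddings, ignoring container overhead."""
    return num_texts * dim * bytes_per_value


for n in (1_000, 100_000, 10_000_000):
    mib = embedding_memory_bytes(n) / (1024 ** 2)
    print(f"{n:>10,} reviews -> about {mib:,.1f} MiB")
```

Under these assumptions a few thousand reviews fit comfortably in memory, while tens of millions would call for precomputing the embeddings once and storing them outside the application process.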

By the end of this tutorial, you’ll have a practical understanding of how to apply VertexAI and embeddings to gain valuable insights from textual data efficiently. Let’s get started! 🚀

The source code on GitHub can be accessed here.
The references and further readings on this topic have been summarized here.

If you liked this article, please subscribe to get my latest ones.
To get in touch, contact me on LinkedIn or via ashmibanerjee.com.
