GenAI: Deconstruct place reviews

PaLM 2: A magic wand to filter the pros and cons of tourist places.

Aniket Agrawal
Google Cloud - Community
6 min read · Oct 17, 2023


In the realm of travel, where every destination whispers a unique tale, the voices of fellow tourists echo through the corridors of online reviews. Amidst this symphony of experiences, Generative AI emerges as a maestro, conducting a harmonious summarization of tourist place reviews so that you can instantly get the gist of thousands of opinions. Sounds too good to be true, right?

Leveraging Google Cloud GenAI Tech (Vertex AI PaLM 2 API) for Tourist Place Review Insights

What are we doing exactly?

We leveraged Colab Enterprise and a large, publicly available Kaggle dataset (over 1 million rows) containing Indian cities, tourist places, raw reviews, and ratings. We then concatenated the reviews for each tourist place to create a more comprehensive and informative representation of each location.
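For intuition, here is a minimal pandas-only sketch of that concatenation step (the file and column names are taken from the notebook code shown later; the notebook itself performs this step with PandaSQL):

import pandas as pd

df = pd.read_csv("Place_Review.csv")

# One row per (City, Place): average rating plus all reviews joined into one string.
per_place = (
    df.groupby(["City", "Place"], as_index=False)
      .agg(
          Av_Rating=("Rating", "mean"),
          Review=("Review", lambda s: " ".join(s.astype(str))),
      )
)
per_place["Av_Rating"] = per_place["Av_Rating"].round(2)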

Then, with its deft touch, the generative AI model sifts through the vast expanse of these concatenated reviews, distilling the essence of each place into concise, informative snippets. Like a seasoned travel guide, it paints a vivid picture of each destination, highlighting its captivating charm and potential drawbacks. Have a quick look at the GIF below for a few results:

Gradio UI: It could be any Indian city; this GIF provides highlights of various places in ‘Bengaluru’.

Note: The pros and cons shown in the above UI reflect the views of neither me nor Google AI. The model simply pulls from and plays with the dataset’s reviews without contributing any information, or hallucinations, of its own.

Enough philosophy; let’s get down to technical business. How are we doing it?

Zero-shot prompting (ZSP) is a powerful technique for using large language models (LLMs) to perform tasks without any explicit training or examples. So, we’re dependent mainly on the LLMs’ language abilities.

ZSP: Basic Prompting Technique with ‘Zero’ Examples

For ZSP, it is tricky to create prompts that are explicit enough to instruct the LLM to produce the intended output while still being general enough to allow the LLM to use its knowledge creatively. Furthermore, evaluating the LLM performance is difficult as we do not explicitly indicate the correct output. Because the use case mentioned here is so simple, ZSP suffices.
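To make ‘zero examples’ concrete, here is a tiny, purely illustrative comparison (the review text and wording below are made up; only the zero-shot form is used in this post):

reviews = "Great place for a team offsite, but parking is a nightmare..."

# Zero-shot: the instruction alone, with no solved examples.
zero_shot_prompt = "List the pros and cons of this place from the reviews:\n" + reviews

# Few-shot (not used here): the same instruction preceded by a worked example.
few_shot_prompt = (
    "Reviews: Beautiful gardens, very crowded on weekends.\n"
    "Pros: beautiful gardens\nCons: crowded on weekends\n\n"
    "Reviews: " + reviews + "\nPros and cons:"
)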

We use ZSP in order to get the pros and cons of a place from reviews by using the following single-line prompt:

review_prompt = '''try to make sense of the following reviews for the 
tourist place and list down the pros and cons of the place:'''

The Google Cloud Console provides a Model Garden where multiple Vertex AI models and APIs can be explored. For review summarization, we specifically rely on ‘text-bison@001’, one of the foundational models.

But wait, what are foundational models, and how are they named?

Foundation models are pre-trained multitask large models that can be tuned or customized for specific tasks using Generative AI Studio, the Vertex AI API, and the SDK for Python.

If a model is the latest version, its name has only two components: ‘use case’ and ‘model size’. The naming convention for its stable counterpart appends a three-digit version number, in the format <use case>-<model size>@<3-digit-stable-version>. So, the model text-bison@001 is a stable Bison model, version ‘001’, for text-related use cases. Simple, right?
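As a quick illustration of the convention (a minimal sketch; it assumes the Vertex AI SDK has already been installed and initialised, as shown in the implementation section below):

from vertexai.preview.language_models import TextGenerationModel

# Latest version: <use case>-<model size>, with no version suffix.
latest_model = TextGenerationModel.from_pretrained("text-bison")

# Stable version: <use case>-<model size>@<3-digit-stable-version>.
stable_model = TextGenerationModel.from_pretrained("text-bison@001")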

Foundational Models (‘Bison’ highlighted), Image Source: Google I/O 2023

Here, our use case is so beginner-friendly that you may not even notice a difference between the outputs of different model versions.

Product Details

Before diving into the implementation details and code, let’s look into the products used in greater detail:

i) Colab Enterprise was recently launched at Google Cloud Next ’23. It is a managed notebook environment that combines the power of Google Cloud with the ease of use of Colab.

It’s very simple to create a Colab Enterprise Notebook, isn’t it?

ii) Pandas dataframes can be queried with SQL syntax using PandaSQL. It aims to provide a more familiar way of handling and cleaning data for those who know SQL better than they know Python or Pandas. The SQL query is passed to sqldf('Query') as follows:

from pandasql import sqldf
sqldf('''SELECT * FROM df''')
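For instance, here is a minimal, self-contained sketch of the idea (the toy dataframe is purely illustrative):

import pandas as pd
from pandasql import sqldf

df = pd.DataFrame({
    "Place": ["Lalbagh Botanical Garden", "Cubbon Park"],
    "Rating": [4.5, 4.3],
})

# The query refers to in-scope dataframes by their Python variable names.
print(sqldf("SELECT Place, Rating FROM df WHERE Rating > 4.4"))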
Pandas and Bison join forces for our tourism application :) Image Source: https://pixabay.com/

iii) Bison is a foundational model from Google AI, standing as a titan in the realm of language understanding and generation. As mentioned before, we specifically rely on ‘Text-Bison’ and ZSP for our simplified use case.

from vertexai.preview.language_models import TextGenerationModel

model = TextGenerationModel.from_pretrained("text-bison@001")
review_place = "Great place for team offsite..."
# review_prompt and parameters are defined in the implementation below.
model.predict(review_prompt + review_place, **parameters).text

Implementation Time

Assuming that the notebook has been created, first things first! Install the required packages, set up authentication, a project ID, location variables, etc. Then, import the required libraries.

# For interacting with Generative AI Studio, install the Vertex AI SDK.
!pip install google-cloud-aiplatform  # Vertex AI / AI Platform Python library
!pip install pandasql                 # Library to query pandas dataframes with SQL
!pip install SQLAlchemy==1.4.17       # Python SQL toolkit used by pandasql

# Restart the kernel after the installs, then continue below.

PROJECT_ID = "<Project-ID>"  # @param {type:"string"}
REGION = "us-central1"       # @param {type:"string"}

from google.cloud import aiplatform
aiplatform.init(project=PROJECT_ID, location=REGION)

# Import all required libraries
import pandas as pd
from pandasql import sqldf
from vertexai.preview.language_models import TextGenerationModel
# Read the Kaggle dataset and drop the columns we do not need
df = pd.read_csv('Place_Review.csv')
df1 = df.drop(['Raw_Review', 'Name', 'Date'], axis=1)

# Build and execute the SQL query: filter on the user-supplied city,
# then average the ratings and concatenate the reviews per place.
city = input('Please enter the city you want to explore: ')
sqlstring = '''SELECT Place, ROUND(AVG(Rating), 2) AS Av_Rating,
                      GROUP_CONCAT(Review, ' ') AS Review
               FROM df1
               WHERE City = '{}'
               GROUP BY Place'''.format(city)
df2 = sqldf(sqlstring)
# Now, declare the model, its parameters, and the prompt.
model = TextGenerationModel.from_pretrained("text-bison@001")

parameters = {
    "temperature": 0.3,        # higher temperature -> more creative (random) responses
    "max_output_tokens": 1024, # maximum length of the generated response, in tokens
    "top_p": 0.7,              # higher top_p -> more diverse text
    "top_k": 40,               # higher top_k -> more candidate tokens considered
}

review_prompt = '''try to make sense of the following reviews for the
tourist place and list down the pros and cons of the place:'''
# Finally, take the 'place' input from the user and 
# pass its concatenated reviews with the prompt to get results.

place = input('Please enter the place: ')
place_review = df2[df2['Place'].str.contains(place)].iloc[0]['Review']

model.predict(review_prompt + place_review, **parameters).text

Finally, we code all of these functionalities into a Gradio UI (shown above) to enhance the user experience.
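The UI layer itself is not part of the notebook snippets above, so here is a minimal, hypothetical Gradio sketch of how the pieces could be wired together. It assumes df2, model, review_prompt, and parameters from the notebook are already defined; the function and component names are my own, and the actual app may differ:

import gradio as gr

def summarize_place(place: str) -> str:
    """Look up the concatenated reviews for a place and ask the model for its pros and cons."""
    matches = df2[df2['Place'].str.contains(place, case=False)]
    if matches.empty:
        return "No reviews found for that place."
    place_review = matches.iloc[0]['Review']
    return model.predict(review_prompt + place_review, **parameters).text

demo = gr.Interface(
    fn=summarize_place,
    inputs=gr.Textbox(label="Tourist place"),
    outputs=gr.Textbox(label="Pros and cons"),
    title="Deconstruct place reviews",
)
demo.launch()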

Conclusion

In this beginner-friendly GenAI blog, from the bustling streets of ancient cities to the serene embrace of nature’s wonders, we navigate the diverse landscape of tourist reviews. To capture the essence of each experience, ZSP reveals hidden gems (pros) and steers travellers away from potential pitfalls (cons), filtered from hundreds of reviews. Stay curious, adios, and see you soon with another blog!

Note: Should you have any concerns or queries about this post or my notebook implementation, please feel free to connect with me on LinkedIn! Thanks!

Feel free to leave comments below. You may mention the Indian city for which you want a similar GIF. I will try my best to create and paste that here later!


Aniket Agrawal
Google Cloud - Community

AI/ML | Cloud Engineer at Google, GenAI | Cybersecurity | ML | NLP | Image Processing Research Enthusiast https://www.linkedin.com/in/aniket-agrawal-a18990266/