How to transform Customer Review Analytics with Snowflake Cortex

Published in

Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

10 min readJul 5, 2024

Enabling customer review analytics by leveraging leadings LLMs with Snowflake Cortex to better customer experience

Introduction

Customer review analytics has become an indispensable tool for businesses seeking to gain a deep understanding of their customers’ experiences, preferences, and pain points. By effectively analyzing the wealth of feedback contained within customer reviews, companies can identify areas for improvement, unmet needs, and opportunities for innovation. However, the sheer volume of unstructured text data from reviews can make manual analysis a daunting and inefficient task. This is where large language models (LLMs) emerge as a game-changing solution, leveraging their remarkable natural language processing capabilities to automate and streamline the review analysis process. With industry leading LLMs available with Snowflake Cortex, businesses can efficiently extract granular insights, sentiment analysis, and actionable intelligence from vast amounts of customer feedback, enabling data-driven decision-making, product enhancements, and ultimately, improved customer satisfaction and loyalty.

Snowflake Cortex LLM Functions

Snowflake Cortex is an intelligent, fully managed service that hosts and serves top-performing LLMs and vector functions which quickly and securely process text and build AI applications contextualized with data stored in unstructured and semi-structured formats. We can access it via serverless SQL / Python functions. The available capabilities include:

Perform complex reasoning tasks with our text data and using our choice of supported top-tier language model using COMPLETE.
Extract information from unstructured or semi-structured data using EXTRACT_ANSWER.
Automatically get a synopsis of text with SUMMARIZE.
Detect the mood or tone of text using SENTIMENT.
Translate documents to other languages using TRANSLATE.
Get vector embedding for text using EMBED_TEXT_768 and EMBED_TEXT_1024.

Snowflake Notebook

Snowflake Notebooks is an interactive, cell-based development interface within Snowsight that combines Python and SQL. It enables exploratory data analysis, machine learning model development, and data science/engineering tasks in a single, integrated environment. With Snowflake Notebooks, we can analyze data, build models, and streamline your entire data workflow seamlessly.

Enabling Customer Review Analytics

In this article, we will explore customer reviews for a fictional food truck company called Tasty Bytes which operates across the globe. They collect customer reviews to get customer feedback on their food-trucks which come in from multiple sources and span multiple languages. This enables them to better understand the areas which require improvement and drive up customer loyalty along with satisfaction.

You can follow along by using this step by step guide.

We will look at how we analyze these collated customer reviews using Snowflake Cortex & Snowflake Notebook to understand :

How likely their customers are to recommend Tasty Bytes food trucks to someone they may know
How was our customer’s overall experience
How is the customer sentiment across customer base
What customers are feeling about different categories like food, price etc
What are the main areas of improvement highlighted in customer reviews

Translate multilingual reviews

We will start off by first translating all the non-English reviews to English to enable easier analysis and unlock the full potential of the customer feedback data to gain a comprehensive understanding of global customer sentiments. We do so by leveraging Translate — one of the Snowflake Cortex specialized LLM functions are available in Snowpark ML.

# Conditionally translate reviews that are not english using Cortex Translate
reviews_df = reviews_df.withColumn('TRANSLATED_REVIEW',when(F.col('LANGUAGE') != F.lit("en"), \
                                                            cortex.Translate(F.col('REVIEW'), \
                                                                             F.col('LANGUAGE'), \
                                                                             "en")) \
                                   .otherwise(F.col('REVIEW')))

reviews_df.filter(F.col('LANGUAGE') != F.lit("en")) \
.select(["REVIEW","LANGUAGE","TRANSLATED_REVIEW"]).show(3)

Categorize at scale

In this section, we will look at categorizing the customer reviews by utilizing Snowflake Cortex Complete function to understand:

How their overall experience based on their review
How likely are they to recommend Tasty bytes to their friends and family

Categorizing customer reviews at scale is essential for businesses to effectively manage and derive actionable insights from the vast amounts of customer feedback they receive. It enables efficient analysis, prioritization, trend monitoring, targeted improvements, and scalability, ultimately driving better customer experiences and business growth.

Get ratings based on review

Based on a given customer’s review, we extract the rating that customer is likely to rate their experience at the food truck by passing the review to the Snowflake Cortex Complete along with the prompt explaining the task. In our prompt, provide the instruction of the task along with a sample example of the completed task also known as one-shot learning.

# Prompt to get a rating based on a customer review
# We provide one shot incontext learning to better the answers we get from LLM
prompt = """[INST]
### 
You are tasked with rating cutsomer reviews for global food truck network called tasty bytes. \
Rating can be one of the following - awful, poor, okay, good, excellent such that awful is the worst \
possible rating and excellent is the best possible rating. Include only the rating in the output \
without any additional text. \
Rate the following review:
The "Freezing Point" ice cream truck in Seoul offered a mix of classic and unique \
options. The Sugar Cone and Waffle Cone were both fresh and crisp, providing a satisfying crunch. The \
Bottled Water and Ice Tea were standard, as expected. The standout, however, was the Mango Sticky Rice \
- a delightful blend of sweet and tangy, it was a refreshing twist to the usual ice cream flavors. The \
service was efficient, but lacked a personal touch. Overall, it\'s a decent choice for a quick, cool \
treat in the city.
Rating : good
Rate the following review: 
###"""

# Ask cortex complete and create a new column
review_df = reviews_df.withColumn('RATING', cortex.Complete('mixtral-8x7b', \
                                                            F.concat(F.lit(prompt), \
                                                                     F.col('REVIEW'), \
                                                                     F.lit("""[/INST]"""))))\
.withColumn('CLEAN_RATING', when(F.contains(F.lower(F.col('RATING')), F.lit('awful')), \
                                                            F.lit('awful')) \
             .when(F.contains(F.lower(F.col('RATING')), F.lit('poor' )), \
                                                            F.lit('poor')) \
            .when(F.contains(F.lower(F.col('RATING')), F.lit('okay')), \
                                                            F.lit('okay')) \
            .when(F.contains(F.lower(F.col('RATING')), F.lit('good')), \
                                                            F.lit('good')) \
            .when(F.contains(F.lower(F.col('RATING')), F.lit('excellent')), \
                                                            F.lit('excellent')) \
                                   .otherwise(F.lit('unsure')))

review_df.select(["REVIEW","CLEAN_RATING"]).show(3)

Get intention to recommend based on review

Similar to above, we ask Cortex Complete to tell us whether the customer is likely to recommend Tasty bytes to their friends and family based on the customer’s review and answer with one of the following — Likely, Unlikely, Unsure. We called the complete function on a Snowpark Dataframe column and created new columns which enables us to use the insights delivered by Cortex in any further data exploration.

# Prompt to understand whether a customer would recommend food truck based on their review 
prompt = """[INST]
### 
Tell me based on the following food truck customer review, will they recommend the food truck to \
their friends and family? Answer should be only one of the following words - \
"Likely" or "Unlikely" or "Unsure". Make sure there are no additional additional text.
Review -
###"""

# Ask cortex complete and create a new column
reviews_df = reviews_df.withColumn('RECOMMEND', cortex.Complete('snowflake-arctic', \
                                                            F.concat(F.lit(prompt), \
                                                                    F.col('REVIEW'), \
                                                                    F.lit("""[/INST]"""))))\
.withColumn('CLEAN_RECOMMEND', when(F.contains(F.col('RECOMMEND'), F.lit('Likely')), \
                                                            F.lit('Likely')) \
                                      .when(F.contains(F.col('RECOMMEND'), F.lit('Unlikely' )), \
                                                            F.lit('Unlikely')) \
            .when(F.contains(F.col('RECOMMEND'), F.lit('Unsure' )), \
                                                            F.lit('Unsure')) \
                                  .otherwise(F.lit('NA')))

reviews_df.select(["REVIEW","CLEAN_RECOMMEND"]).show(3)

Understand customer sentiment

So far we saw Snowflake Cortex — Translate & Complete. Next, we will look at another task specific LLM function in Cortex — Sentiment. This sentiment function is used to understand customer’s tone based on the review they provided. Sentiment return value between -1 and 1 such that -1 is the most negative while 1 is the most positive.

# Understand the sentiment of customer review using Cortex Sentiment
reviews_df = reviews_df.withColumn('SENTIMENT', cortex.Sentiment(F.col('REVIEW')))

reviews_df.select(["REVIEW","SENTIMENT"]).show(3)

Dive deeper with aspect based sentiment

We can take our analysis a step further with extracting aspect based sentiment. Aspect-based sentiment analysis (ABSA) is an advanced form of sentiment analysis that goes beyond simply classifying a piece of review as positive, negative, or neutral. Instead, here it identifies and analyzes the sentiment expressed towards specific aspects or categories like service, price, food quality etc mentioned in the review. It provides granular, actionable insights into specific aspects of products or services, enabling businesses to make data-driven decisions for improving customer experiences, enhancing offerings, and gaining a competitive edge.
It is made pretty easy to achieve with Cortex Complete function coupled with a prompt that includes one shot example as seen below.

# Prompt to understand sentiment for different categories mentioned in the customer review
# We employ one shot incontext learning to inform LLM
prompt = """[INST]
### 
You are analyzing food-truck customer reviews to undertsand what a given review says about different relevant categories like \
food quality, menu options, staff, overall experience, price, ambience, customer support, \
hygiene standards etc and if sentiment is negative,positive or neutral for that category. \
Only answer in a single valid JSON containing "category", "sentiment" and "details". \
Make sure there is no additional text and not mention categories in answer which are not \
talked in the review. \
Get category based sentiment for the following customer review:
"This food truck offers a disappointing experience. \
The menu lacks healthy options and the food quality is subpar. Finding a parking spot near the \
truck can also be a frustrating ordeal. Additionally, the value for money is not worth it. To top \
it all off, the service provided at this food truck is less than pleasant, adding to the overall \
negative dining experience. Tried reaching out the customer support but was unable to get through." 
Answer : [{     "category": "food quality",     "sentiment": "negative",    "details": "subpar quality"   }, {     "category": "menu options",     "sentiment": "negative",     "details": "lacks healthy options"   },   {     "category": "staff",     "sentiment": "negative",     "details": "unpleasant"   },   {     "category": "price",     "sentiment": "negative",     "details": "not worth the money"   },   {     "category": "experience",     "sentiment": "negative",     "details": "regrettable dining experience"   },   {     "category": "customer support",     "sentiment": "negative",     "details": "unable to get through"   } ].
Get category based sentiment for the follwoing customer review:
###"""

# Ask Cortex Complete and create a new column
review_df = reviews_df.withColumn('CATEGORY_SENTIMENT', cortex.Complete('mixtral-8x7b', \
                                                            F.concat(F.lit(prompt), \
                                                                    F.col('REVIEW'), \
                                                                    F.lit("""Answer:[/INST]"""))))
review_df.select(["REVIEW","CATEGORY_SENTIMENT"]).show(1)

Identify issues

Now that we have extracted customer sentiments from the reviews, we will identify the areas for improvement highlighted in the negative reviews. This enables the business to make data-driven decisions and continuously improve their overall customer experience. We will identify the issues that are mentioned in customer reviews and understand:

What is going wrong at business level?
What is going wrong at truck level?

Issues at business level

In this step, 100 most negative reviews are aggregated and provided to Snowflake Cortex — Complete along with a prompt to identify the main issues found in those reviews.

# Aggregate the 100 most negative reviews for tasty bytes
reviews_agg_ = reviews_df.order_by(F.col('SENTIMENT')).select(F.col('REVIEW')).first(100)

reviews_agg_str = ''.join(map(str,reviews_agg_))

# Prompt to summarize the three top issues flagged in the aggregated reviews
prompt = """[INST]###Summarize the issues mentioned in following aggregated food truck customer reviews with three \
concise bullet points under 50 words each such that each bullet point also has a heading along with \
recommendations to remedy those issues.###""" + reviews_agg_str + """[/INST]"""

# Answer from Cortex Complete
print(cortex.Complete('mistral-large', prompt))

Issues at truck level

Aggregate sentiment at truck level using Snowpark Dataframe functions.

# Get average sentiment of reviews for Trucks 
truck_agg_reviews_df = reviews_df.groupBy(F.col('TRUCK_ID')) \
                .agg(F.avg(F.col('SENTIMENT')).alias('AVG_SENTIMENT'),F.count(F.col('REVIEW_ID')).alias('REVIEW_COUNT')) 
truck_agg_reviews_df.show(3)

Average sentiment by truck is utilized to find the truck which is the most negatively reviewed truck and has at least 10 reviews.

# Get the truck with most negative average sentiment
truck_agg_reviews_df.filter(F.col('REVIEW_COUNT') >= 10).order_by(F.col('AVG_SENTIMENT')) \
                .select(F.col('TRUCK_ID')).limit(1).collect()[0][0]

Quick analysis of 100 most negative reviews for the most negatively reviewed truck to understand the main issues that the customers complain about by leveraging Snowflake Cortex — Complete. Similar to how it was performed at the business level.

# Aggregate the most negative reviews for Truck 5
reviews_agg_ = reviews_df.filter(F.col('TRUCK_ID') == cells.MOST_NEGATIVELY_REVIEWED_TRUCK)\
              .order_by(F.col('SENTIMENT')).select(F.col('REVIEW')).first(100)

reviews_agg_str = ''.join(map(str,reviews_agg_))

# Prompt to understand the main issues with Truck 5
prompt = """[INST]###Summarize three main issues mentioned in following aggregated customer review with three concise bullet 
points under 50 words each such that each bullet point also has a heading.###""" + reviews_agg_str + """[/INST]"""

# Print Cortex Complete's answer
print(cortex.Complete('mistral-large', prompt))

Action on insights

Armed with insights that we got out of the customer reviews using Cortex, we utilize Snowflake Cortex — Complete to draft an email to the owner of the most negatively reviewed truck summarizing the issues that are highlighted in customer reviews along with any recommendation to remedy those issues.

# Prompt to get an email draft which reports the main issues with Truck 5 with recommendations to solve
prompt =""" [INST]### Write me survey report email to the franchise owner summarizing the issues mentioned in following \
aggregated customer review with three concise bullet points under 50 words each such that each bullet \
point also has a heading along with recommendations to remedy those issues.###"""+ reviews_agg_str +""" \
Mention the truck brand name and location in the email.[/INST]"""

# Print the result from Cortex Complete
print(cortex.Complete('mixtral-8x7b', prompt))

Conclusion

To summarize, the integration of Snowflake Cortex has revolutionized customer review analytics for businesses, enabling them to harness the power of AI in seconds. With Snowflake Cortex, businesses can quickly run inference on industry-leading large language models (LLMs) by utilizing Complete, hosted within the Snowflake platform. Specialized LLM functions like Translate and Sentiment facilitate tasks like translation and sentiment analysis without requiring prompt engineering, boosting efficiency. Snowflake Notebook, integrated within Snowsight, offers a versatile development interface for exploring data using SQL, Python, and Markdown cells. This seamless integration allows businesses to analyze data, visualize results, and derive actionable insights swiftly. By automating tasks such as translation, categorization at scale, and sentiment analysis, businesses gain a comprehensive understanding of customer sentiment and preferences, driving data-driven decision-making and targeted improvements. Additionally, by identifying issues and taking proactive actions, businesses can continuously enhance offerings and maintain competitiveness in the market.

To implement what you saw in this article, follow this step by step guide.

Related Resources

Want to learn more about the tools and technologies used in this article? Check out the following resources: