Extract Topic and Mentions from Amazon Product Reviews ft. Groq

Amal
ScrapeHero
Published in
3 min readJun 6, 2024

Are you tired of scrolling through endless product reviews, trying to extract valuable insights and trends? Look no further! In this blog post, we’ll explore how to leverage the power of Groq, a high-performance inference engine, to efficiently extract topics and mentions from Amazon product reviews.

By the end of this blog, you’ll have a structured output that provides a clear picture of what customers are saying about a particular product.

Why Extracting Topics and Mentions Matters

Product reviews are a goldmine of information for businesses and consumers alike.

They offer valuable insights into customer sentiment, product strengths and weaknesses, and potential areas for improvement.

By extracting topics and mentions from these reviews, you can:

  • Gain a deeper understanding of what customers value most about a product
  • Identify common pain points or issues that need to be addressed
  • Discover emerging trends and popular features that can inform future product development
  • Benchmark your product against competitors based on customer feedback

GROQ API:

We will be utilizing Groq API to infer open-source models available now, like Llama3, Gemma and Mistral.

To create an API key: https://console.groq.com/keys

Data:

We are going to use Amazon Product Review Data from ScrapeHero.

And we would be selecting Barbie brand reviews in the USA.

Column:

‘review_id’, ‘recommendation_percent’, ‘videos’, ‘brand’, ‘review_source’,
’is_verified_purchase’,’url’,’reviewed_country’,’qa_total_reviews’,’images’,
’author_badges’,’ships_from’,’product_aspects’,’rating_count’,’manufacturer’,
’no_of_people_reacted_helpful’,’image_urls’,’review_text’,’downvote_count’,
’recommendation_count’,’total_reviews’,’is_recommended_by_author’,’rating’,
’no_of_people_reacted_unhelpful’,’review_comment_count’,
’retailer_review_summary’,’retailer_product_id’, ‘warning’, ‘review_rating’,
‘review_title’, ‘input’, ‘promotional_review’, ‘sold_by’, ‘badge’,
’author_name’, ‘retailer’, ‘reviewed_variant_asin’, ‘review_url’,’variant’,
‘quality_rating’, ‘upvote_count’, ‘product_title’,
’reviewed_product_attribute’, ‘review_date’, ‘review_header’,’average_rating’

From the columns mentioned above, we are going to select review_text, rating and product title for prompt engineering for the LLM.

Installation

pip install -q groq pandas

Topic Extraction and Mentions from Barbie Reviews:

Let’s take a look at how we can extract topics and mentions from those topics using LLM powered by Groq.

from groq import Groq
import json
import pandas as pd

# Load the data
df = pd.read_parquet(“barbie_reviews.parquet”)

# Initialize groq api
groq_client = Groq(
api_key= “YOUR_API_KEY”, max_retries= 3
)

# Sample data
index = 7
review_text = df.review_text[index]
product_title = df.product_title[index]
rating = df.average_rating[index]

Review : “Very beautiful doll with a lovely skin tone. The only thing I don’t like about this doll is that she has false eyelashes, they look so out of place and take away from her beauty. Don’t show her with those lashes in the pictures, still an amazing doll though.”

Now, let’s prepare the prompt for the LLM .

system = f’’’
Given the following review, extract only main topics,
direct quote(sentences)of those topics. The topic should represent the broad
category or comment (one word each).

Output the response in JSON format with topic as key & quote as values

Product Title: {product_title}
Rating: {rating}
Review: {review_text}
‘’’

Time to call lightning-fast inference engine from Groq.

chat_completion = groq_client.chat.completions.create(
messages=[
{
“role”: “system”,
“content”: system,
},
],
model=”llama3–70b-8192",
temperature= 0.3,
response_format = {“type”: “json_object”}
)

answer = json.loads(chat_completion.choices[0].message.content)
print(answer)

Response:

{‘Beauty’: ‘very beautiful doll with a lovely skin tone’,

‘Criticism’: “the only thing i don’t like about this doll is that she has false eyelashes, they look so out of place and take away from her beauty”,

‘Deception’: “doesn’t show her with those lashes on the pictures”,

‘Overall’: ‘still an amazing doll though’}

Conclusion:

In conclusion, extracting keywords and mentions from Amazon product reviews using LLMs opens up a world of possibilities for businesses looking to harness the power of customer feedback. By utilizing Groq’s capabilities, you can efficiently process large volumes of data, identify key insights, and generate structured outputs that drive informed decision-making.

--

--