Extract Topic and Mentions from Amazon Product Reviews ft. Groq

Published in

ScrapeHero

3 min readJun 6, 2024

Are you tired of scrolling through endless product reviews, trying to extract valuable insights and trends? Look no further! In this blog post, we’ll explore how to leverage the power of Groq, a high-performance inference engine, to efficiently extract topics and mentions from Amazon product reviews.

By the end of this blog, you’ll have a structured output that provides a clear picture of what customers are saying about a particular product.

Why Extracting Topics and Mentions Matters

Product reviews are a goldmine of information for businesses and consumers alike.

They offer valuable insights into customer sentiment, product strengths and weaknesses, and potential areas for improvement.

By extracting topics and mentions from these reviews, you can:

Gain a deeper understanding of what customers value most about a product
Identify common pain points or issues that need to be addressed
Discover emerging trends and popular features that can inform future product development
Benchmark your product against competitors based on customer feedback

GROQ API:

We will be utilizing Groq API to infer open-source models available now, like Llama3, Gemma and Mistral.

To create an API key: https://console.groq.com/keys

Data:

We are going to use Amazon Product Review Data from ScrapeHero.

And we would be selecting Barbie brand reviews in the USA.

Column:

‘review_id’, ‘recommendation_percent’, ‘videos’, ‘brand’, ‘review_source’,
’is_verified_purchase’,’url’,’reviewed_country’,’qa_total_reviews’,’images’,
’author_badges’,’ships_from’,’product_aspects’,’rating_count’,’manufacturer’,
’no_of_people_reacted_helpful’,’image_urls’,’review_text’,’downvote_count’,
’recommendation_count’,’total_reviews’,’is_recommended_by_author’,’rating’,
’no_of_people_reacted_unhelpful’,’review_comment_count’,
’retailer_review_summary’,’retailer_product_id’, ‘warning’, ‘review_rating’,
‘review_title’, ‘input’, ‘promotional_review’, ‘sold_by’, ‘badge’,
’author_name’, ‘retailer’, ‘reviewed_variant_asin’, ‘review_url’,’variant’,
‘quality_rating’, ‘upvote_count’, ‘product_title’,
’reviewed_product_attribute’, ‘review_date’, ‘review_header’,’average_rating’

From the columns mentioned above, we are going to select review_text, rating and product title for prompt engineering for the LLM.

Installation

pip install -q groq pandas

Topic Extraction and Mentions from Barbie Reviews:

Let’s take a look at how we can extract topics and mentions from those topics using LLM powered by Groq.

from groq import Groq
import json
import pandas as pd

# Load the data
df = pd.read_parquet(“barbie_reviews.parquet”)

# Initialize groq api
groq_client = Groq(
 api_key= “YOUR_API_KEY”, max_retries= 3
)

# Sample data
index = 7
review_text = df.review_text[index]
product_title = df.product_title[index]
rating = df.average_rating[index]

Review : “Very beautiful doll with a lovely skin tone. The only thing I don’t like about this doll is that she has false eyelashes, they look so out of place and take away from her beauty. Don’t show her with those lashes in the pictures, still an amazing doll though.”

Now, let’s prepare the prompt for the LLM .

system = f’’’
Given the following review, extract only main topics, 
direct quote(sentences)of those topics. The topic should represent the broad 
category or comment (one word each).

Output the response in JSON format with topic as key & quote as values 

Product Title: {product_title}
Rating: {rating}
Review: {review_text}
‘’’

Time to call lightning-fast inference engine from Groq.

chat_completion = groq_client.chat.completions.create(
         messages=[
         {
         “role”: “system”,
         “content”: system,
         },
         ],
         model=”llama3–70b-8192",
         temperature= 0.3,
         response_format = {“type”: “json_object”}
         )

answer = json.loads(chat_completion.choices[0].message.content)
print(answer)

Response:

{‘Beauty’: ‘very beautiful doll with a lovely skin tone’,

‘Criticism’: “the only thing i don’t like about this doll is that she has false eyelashes, they look so out of place and take away from her beauty”,

‘Deception’: “doesn’t show her with those lashes on the pictures”,

‘Overall’: ‘still an amazing doll though’}

Conclusion:

In conclusion, extracting keywords and mentions from Amazon product reviews using LLMs opens up a world of possibilities for businesses looking to harness the power of customer feedback. By utilizing Groq’s capabilities, you can efficiently process large volumes of data, identify key insights, and generate structured outputs that drive informed decision-making.

Extract Topic and Mentions from Amazon Product Reviews ft. Groq

Why Extracting Topics and Mentions Matters

GROQ API:

Data:

Written by Amal