Part 3 — Summarizing/Extracting Information

Rishi Khandelwal
8 min read · May 26, 2023


Welcome to the third part of this series. If you haven’t read the first two blogs, go ahead and read them here.

The internet offers huge amounts of text on almost every topic, and reading and understanding all of it quickly becomes overwhelming. Summarizing text is among the most exciting applications of LLMs.

In this blog, we will see how to write prompts that summarize effectively, focusing on the information we care about most and leaving out the surrounding noise.

Let’s see some examples with code.

Note — for setup code, please read the 1st part here.

In this example, we are going to summarize the product review shown below. This is a very common use case, as e-commerce websites are filled with product reviews. Since no one has the time to go through all of them, it is a good idea to hand this job to an LLM and ask it to summarize everything for us. With this solution, product owners can better understand how their product is being received by customers, which features they like the most, and what their pain points are.

prod_review = """
Got this panda plush toy for my daughter's birthday, \
who loves it and takes it everywhere. It's soft and \
super cute, and its face has a friendly look. It's \
a bit small for what I paid though. I think there \
might be other options that are bigger for the \
same price. It arrived a day earlier than expected, \
so I got to play with it myself before I gave it \
to her.
"""

Let’s start by giving the following prompt and see the results.

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site.

Summarize the review below, delimited by triple
backticks, in at most 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

Output

Soft and cute panda plush toy loved by daughter, but a bit small for the 
price. Arrived early.

Not bad, it’s a pretty good summary. In part two of this series, we learnt that we can also use a character count or a number of sentences to control the length of the summary. And if we have a very specific use case, such as getting feedback for the shipping department, we can modify the prompt to reflect that.

So, let’s modify the prompt to do this. I will add two things to the new prompt: the purpose of the summary (feedback for the shipping department) and an instruction to focus on shipping and delivery. The 30-word limit stays the same.

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
Shipping department.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that mention shipping and delivery of the product.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

Output

The panda plush toy arrived a day earlier than expected, but the customer felt
it was a bit small for the price paid.

Here, instead of starting off with “Soft and cute panda plush toy”, the summary now leads with the fact that the toy arrived a day earlier than expected.

Similarly, let’s now modify the prompt to get feedback for the pricing department. This should help them understand how customers perceive the value of the product relative to its price.

prompt = f"""
Your task is to generate a short summary of a product \
review from an ecommerce site to give feedback to the \
pricing department, responsible for determining the \
price of the product.

Summarize the review below, delimited by triple
backticks, in at most 30 words, and focusing on any aspects \
that are relevant to the price and perceived value.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

Output

The panda plush toy is soft, cute, and loved by the recipient, but the price 
may be too high for its size.

This summary indicates that the price may be too high for the size of the product.

So the summaries generated for the shipping and pricing departments focus a bit more on the information relevant to those specific departments.
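The three prompts above differ only in the audience and the focus, so they can be produced from one template. The following is a minimal sketch under that assumption. `build_summary_prompt` and its parameters are hypothetical names that are not part of the original setup code, and the `unit` parameter is only there to cover the sentence or character limits mentioned earlier.

```python
def build_summary_prompt(review: str, audience: str = "", focus: str = "",
                         limit: int = 30, unit: str = "words") -> str:
    """Assemble a summarization prompt similar to the ones in this post.

    Hypothetical helper: audience and focus are optional, so the same
    function covers the generic, shipping and pricing variants.
    """
    fence = "`" * 3  # triple-backtick delimiter placed around the review

    task = ("Your task is to generate a short summary of a product "
            "review from an ecommerce site")
    if audience:
        task += f" to give feedback to the {audience}"

    instruction = ("Summarize the review below, delimited by triple "
                   f"backticks, in at most {limit} {unit}")
    if focus:
        instruction += f", and focusing on any aspects that {focus}"

    return f"{task}.\n\n{instruction}.\n\nReview: {fence}{review}{fence}"


# A sentence taken from the panda review, used here as a short sample.
sample_review = ("It arrived a day earlier than expected, so I got to "
                 "play with it myself before I gave it to her.")

shipping_prompt = build_summary_prompt(
    sample_review,
    audience="Shipping department",
    focus="mention shipping and delivery of the product",
)
print(shipping_prompt)
```

The returned string can be passed to `get_completion` in the same way as the hand-written prompts, and switching departments becomes a matter of changing two arguments.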

But even though these summaries brought out the information relevant to each department, they still included other details, which may or may not be helpful: the shipping summary, for example, also mentions size and price. So, depending on what we need, we can ask the model to extract information rather than summarize it.

Let’s see, with a different prompt, how using the word “extract” in place of “summarize” changes the result.

prompt = f"""
Your task is to extract relevant information from \
a product review from an ecommerce site to give \
feedback to the Shipping department.

From the review below, delimited by triple backticks, \
extract the information relevant to shipping and \
delivery. Limit to 30 words.

Review: ```{prod_review}```
"""

response = get_completion(prompt)
print(response)

Output

"The product arrived a day earlier than expected."

Here, the model extracts only the information related to shipping and leaves out everything else, such as price and size.
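Note that the extracted sentence in the output above came back wrapped in double quotes. If the text is going to be stored or shown elsewhere, it may help to normalise it first. This is a small sketch of one way to do that; `clean_extraction` is a hypothetical helper, not something from the original setup code.

```python
def clean_extraction(text: str) -> str:
    """Trim whitespace and remove one pair of matching surrounding quotes."""
    cleaned = text.strip()
    # Only strip when the first and last characters are the same quote mark.
    if len(cleaned) >= 2 and cleaned[0] == cleaned[-1] and cleaned[0] in "\"'":
        cleaned = cleaned[1:-1].strip()
    return cleaned


# The model output shown above, including its surrounding quotes.
raw = '"The product arrived a day earlier than expected."'
print(clean_extraction(raw))
```

Text without surrounding quotes passes through unchanged, so the helper is safe to apply to every response.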

Lastly, let’s look at a concrete example of how to use this in a workflow that summarizes multiple reviews to make them easier to read.

So, here are a few reviews. Some of them are quite long, and they cover completely different products: the panda plush toy from above, a standing lamp, an electric toothbrush, and a blender.

review_1 = prod_review 

# review for a standing lamp
review_2 = """
Needed a nice lamp for my bedroom, and this one \
had additional storage and not too high of a price \
point. Got it fast - arrived in 2 days. The string \
to the lamp broke during the transit and the company \
happily sent over a new one. Came within a few days \
as well. It was easy to put together. Then I had a \
missing part, so I contacted their support and they \
very quickly got me the missing piece! Seems to me \
to be a great company that cares about their customers \
and products.
"""

# review for an electric toothbrush
review_3 = """
My dental hygienist recommended an electric toothbrush, \
which is why I got this. The battery life seems to be \
pretty impressive so far. After initial charging and \
leaving the charger plugged in for the first week to \
condition the battery, I've unplugged the charger and \
been using it for twice daily brushing for the last \
3 weeks all on the same charge. But the toothbrush head \
is too small. I’ve seen baby toothbrushes bigger than \
this one. I wish the head was bigger with different \
length bristles to get between teeth better because \
this one doesn’t. Overall if you can get this one \
around the $50 mark, it's a good deal. The manufactuer's \
replacements heads are pretty expensive, but you can \
get generic ones that're more reasonably priced. This \
toothbrush makes me feel like I've been to the dentist \
every day. My teeth feel sparkly clean!
"""

# review for a blender
review_4 = """
So, they still had the 17 piece system on seasonal \
sale for around $49 in the month of November, about \
half off, but for some reason (call it price gouging) \
around the second week of December the prices all went \
up to about anywhere from between $70-$89 for the same \
system. And the 11 piece system went up around $10 or \
so in price also from the earlier sale price of $29. \
So it looks okay, but if you look at the base, the part \
where the blade locks into place doesn’t look as good \
as in previous editions from a few years ago, but I \
plan to be very gentle with it (example, I crush \
very hard items like beans, ice, rice, etc. in the \
blender first then pulverize them in the serving size \
I want in the blender then switch to the whipping \
blade for a finer flour, and use the cross cutting blade \
first when making smoothies, then use the flat blade \
if I need them finer/less pulpy). Special tip when making \
smoothies, finely cut and freeze the fruits and \
vegetables (if using spinach-lightly stew soften the \
spinach then freeze until ready for use-and if making \
sorbet, use a small to medium sized food processor) \
that you plan to use that way you can avoid adding so \
much ice if at all-when making your smoothie. \
After about a year, the motor was making a funny noise. \
I called customer service but the warranty expired \
already, so I had to buy another one. FYI: The overall \
quality has gone done in these types of products, so \
they are kind of counting on brand recognition and \
consumer loyalty to maintain sales. Got it in about \
two days.
"""

reviews = [review_1, review_2, review_3, review_4]

All these reviews are put into a list. Below is the code that iterates over the list and summarizes the reviews one by one.

import time

for review in reviews:
    prompt = f"""
    Your task is to generate a short summary of a product \
    review from an ecommerce site.

    Summarize the review below, delimited by triple \
    backticks in at most 20 words.

    Review: ```{review}```
    """

    response = get_completion(prompt)
    print(response, "\n")
    # Pause between requests, presumably to stay under the API rate limit.
    time.sleep(25)

Output

Soft and cute panda plush toy loved by daughter, but a bit small for the
price. Arrived early.

Affordable lamp with storage, fast shipping, and excellent customer service.
Easy to assemble and missing parts were quickly replaced.

Good battery life, small toothbrush head, but effective cleaning. Good deal
if bought around $50.

The product was on sale for $49 in November, but the price increased to
$70-$89 in December. The base doesn't look as good as previous editions, but
the reviewer plans to be gentle with it. A special tip for making smoothies
is to freeze the fruits and vegetables beforehand. The motor made a funny
noise after a year, and the warranty had expired. Overall quality has
decreased.
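The fourth summary above runs well past the 20 words requested, which suggests that a word limit in a prompt is treated as a request rather than a hard guarantee. A simple check can flag such cases. The sketch below uses the four summaries printed above and counts words by splitting on whitespace, which is my own assumption about how to count them; `word_count` is a hypothetical helper.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words (a simple approximation)."""
    return len(text.split())


# The four summaries printed by the loop above.
summaries = [
    "Soft and cute panda plush toy loved by daughter, but a bit small for "
    "the price. Arrived early.",
    "Affordable lamp with storage, fast shipping, and excellent customer "
    "service. Easy to assemble and missing parts were quickly replaced.",
    "Good battery life, small toothbrush head, but effective cleaning. "
    "Good deal if bought around $50.",
    "The product was on sale for $49 in November, but the price increased "
    "to $70-$89 in December. The base doesn't look as good as previous "
    "editions, but the reviewer plans to be gentle with it. A special tip "
    "for making smoothies is to freeze the fruits and vegetables "
    "beforehand. The motor made a funny noise after a year, and the "
    "warranty had expired. Overall quality has decreased.",
]

limit = 20
# 1-based positions of the summaries that exceed the requested limit.
over_limit = [
    position
    for position, summary in enumerate(summaries, start=1)
    if word_count(summary) > limit
]
print(over_limit)  # [4]
```

Summaries flagged this way could be sent back to the model with a stricter prompt, or shortened before they are displayed.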

So, if you have a website with hundreds of reviews, you can imagine using this to build a dashboard that takes a huge number of reviews and generates short summaries of them, so that anyone can browse the reviews much more quickly. And if someone wishes to read the original review, they can just click on the short summary to get to it.

This can help different stakeholders efficiently get feedback on their products and services.
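One way to picture the data behind such a dashboard is a list of records that keep each short summary next to its original review, so a click on the summary can reveal the full text. The sketch below is only an illustration of that idea. `build_dashboard_rows` is a hypothetical helper, and `first_words` is a placeholder that stands in for the LLM call so the example runs without an API key; in practice you would pass in a function that builds the prompt and calls `get_completion`.

```python
from typing import Callable, Dict, List


def build_dashboard_rows(reviews: List[str],
                         summarize: Callable[[str], str]) -> List[Dict]:
    """Pair every review with its summary, keeping the original text."""
    rows = []
    for review_id, review in enumerate(reviews, start=1):
        rows.append({
            "id": review_id,
            "summary": summarize(review),
            "full_review": review.strip(),
        })
    return rows


def first_words(review: str, n: int = 8) -> str:
    """Placeholder summarizer (NOT an LLM): keeps the first n words."""
    return " ".join(review.split()[:n]) + "..."


# Opening sentences of two of the reviews above, used as sample data.
sample_reviews = [
    "Needed a nice lamp for my bedroom, and this one had additional "
    "storage and not too high of a price point.",
    "My dental hygienist recommended an electric toothbrush, which is "
    "why I got this.",
]

rows = build_dashboard_rows(sample_reviews, first_words)
for row in rows:
    print(row["id"], row["summary"])
```

Passing the summarizer in as an argument keeps the record-building logic separate from the model call, which also makes it easy to test.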

With that, I will wrap up the topic of summarizing. I hope you got a good sense of how to write prompts that summarize and extract useful information using an LLM.

Next on the list we will look at another capability of LLMs, which is to make inferences using text. For example, what if you had, again, product reviews and you wanted to very quickly get a sense of which product reviews have a positive or a negative sentiment? Let’s take a look at how to do that in the next blog.
