Implementing text summarization using OpenAI’s GPT-3 API.

Published in AI & Insights · 3 min read · Mar 2, 2023


Let’s explore how to implement text summarization using OpenAI’s GPT-3 API.

Text summarization is the process of distilling the most important information from a text document into a shorter, more concise summary. It’s a useful technique for condensing long documents, such as news articles, research papers, and business reports, into an easily digestible form.

There are two main approaches to text summarization: extractive summarization and abstractive summarization. Extractive summarization involves selecting the most important sentences or phrases from the original text and concatenating them to form a summary, while abstractive summarization involves generating new sentences that capture the essence of the original text.
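To make the contrast concrete, here is a minimal sketch of the extractive approach, which needs no API at all. It scores each sentence by the average frequency of its words and keeps the top-scoring ones in their original order. The function name extractive_summary and the scoring rule are illustrative assumptions, not a standard algorithm from any library.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=2):
    """Pick the highest-scoring sentences and return them in original order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text (lowercased, letters and apostrophes).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)
```

Every sentence in the result is copied verbatim from the input, which is exactly what distinguishes extractive from abstractive summarization.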

We’ll focus on abstractive summarization using OpenAI’s GPT-3 API.


Prerequisites

To follow along with this post, you’ll need to have an OpenAI API key and a basic understanding of Python programming.

Implementing Text Summarization with OpenAI API

To implement text summarization using the OpenAI API, you’ll need to follow these steps:

  1. Set up your OpenAI API credentials
  2. Load your text data
  3. Preprocess your text data
  4. Generate summaries using OpenAI’s GPT-3 API

Let’s walk through each step in more detail.

Step 1: Set up your OpenAI API credentials

To use OpenAI’s GPT-3 API, you’ll need to set up your API credentials. You can follow the instructions on the OpenAI website to sign up for an API key.

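Once you have your API key, install the openai Python package (pip install "openai<1.0") and store the key in an environment variable such as OPENAI_API_KEY. The code in this post uses the legacy openai.Completion interface, which is only available in versions of the package before 1.0. You can then read the key in Python like this:

import os

import openai

# Read the key from the environment instead of hard-coding it in the script.
openai.api_key = os.environ["OPENAI_API_KEY"]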
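If you keep the key in an environment variable, a small helper can fail early with a readable message when the variable is missing. This is only a sketch: load_api_key is a hypothetical helper, and OPENAI_API_KEY is an assumed variable name.

```python
import os


def load_api_key(var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before running the script."
        )
    return key
```

You would then assign the result once at startup, for example openai.api_key = load_api_key().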

Step 2: Load your text data

Next, you’ll need to load your text data into Python. You can use any method you prefer for loading data, such as reading from a file or scraping from a website.

For this example, we’ll load a sample article from the BBC News website:

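import requests

url = "https://www.bbc.com/news/world-us-canada-61685845"
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on HTTP errors such as 404
text = response.text  # raw HTML of the page, markup included

Note that response.text holds the page’s raw HTML, so you’ll want to strip the markup and keep only the article text before summarizing.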
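A downloaded page is HTML rather than plain prose, so tags, scripts, and styles would otherwise end up in the prompt. One way to pull out the readable text with only the standard library is sketched below. It assumes the article body lives in <p> tags, which may not hold for every site; ParagraphExtractor and html_to_text are hypothetical names.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text inside <p> tags, ignoring <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._skip_depth = 0
        self._buffer = []

    def _flush(self):
        # Collapse runs of whitespace and store the paragraph if non-empty.
        paragraph = " ".join("".join(self._buffer).split())
        if paragraph:
            self.paragraphs.append(paragraph)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "p":
            if self._in_p:
                self._flush()  # previous <p> was never closed
            self._in_p = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "p" and self._in_p:
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p and self._skip_depth == 0:
            self._buffer.append(data)


def html_to_text(html):
    """Return the paragraph text of an HTML document, one paragraph per line."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    if parser._in_p:
        parser._flush()  # keep text from a trailing unclosed <p>
    return "\n".join(parser.paragraphs)
```

With this in place, text = html_to_text(response.text) would give the summarizer prose instead of markup.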

Step 3: Preprocess your text data

Before you can generate summaries using OpenAI’s GPT-3 API, you’ll need to preprocess your text data to ensure it’s in the correct format. Specifically, you’ll need to split your text into smaller chunks that are suitable for input into the API.

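You can use the split_text function below to split your text into chunks of up to 2048 characters each, breaking at sentence boundaries. A single sentence longer than the limit is kept whole as its own chunk:

def split_text(text, max_chunk_size=2048):
    chunks = []
    current_chunk = ""
    for sentence in text.split("."):
        sentence = sentence.strip()
        if not sentence:
            continue  # skip empty pieces, e.g. after the final period
        sentence += "."
        # The +1 accounts for the space that joins two sentences.
        if len(current_chunk) + len(sentence) + 1 <= max_chunk_size:
            current_chunk = (current_chunk + " " + sentence).strip()
        else:
            if current_chunk:
                chunks.append(current_chunk)
            current_chunk = sentence
    if current_chunk:
        chunks.append(current_chunk)
    return chunks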
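The chunk size is measured in characters, while the API counts tokens, and the prompt tokens plus max_tokens must fit inside the model’s context window. A rough budget check can be sketched as follows. Both numbers are assumptions: roughly 4 characters per token is a common rule of thumb for English text, and 2049 tokens is the context window assumed here for the base davinci model. A real tokenizer would give exact counts.

```python
def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate (heuristic assumption, not the real tokenizer)."""
    return -(-len(text) // chars_per_token)  # ceiling division


def fits_context(prompt, max_tokens, context_window=2049):
    """True if the estimated prompt tokens plus max_tokens fit the window."""
    return estimate_tokens(prompt) + max_tokens <= context_window
```

Under these assumptions, a 2048-character chunk is about 512 tokens, which leaves room for a 1024-token summary.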

Step 4: Generate summaries using OpenAI’s GPT-3 API

Finally, you can use OpenAI’s GPT-3 API to generate summaries of your text data. To do this, you’ll need to use the openai.Completion.create function to generate text completions based on the input text you’ve preprocessed.

Here’s an example function that uses OpenAI’s GPT-3 API to generate a summary of a given input text:

def generate_summary(text):
    input_chunks = split_text(text)
    output_chunks = []
    for chunk in input_chunks:
        # One completion request per chunk.
        response = openai.Completion.create(
            engine="davinci",
            prompt=f"Please summarize the following text:\n{chunk}\n\nSummary:",
            temperature=0.5,  # lower values give more deterministic output
            max_tokens=1024,  # upper bound on the length of each summary
            n=1,  # number of completions per chunk
            stop=None,
        )
        summary = response.choices[0].text.strip()
        output_chunks.append(summary)
    return " ".join(output_chunks)

In this function, we split the input text into smaller chunks using the split_text function from Step 3. Then, for each chunk, we call openai.Completion.create with a prompt that contains the chunk followed by a request for a summary. The temperature parameter controls how random (and therefore how varied) the output is, max_tokens caps the length of each summary, and n sets the number of completions generated per chunk. Finally, we join the per-chunk summaries into a single string, separated by spaces.
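To try the chunk-and-join flow without spending API calls, the summarizing step can be passed in as a function. The sketch below is an assumption-laden variant, not the function above: summarize_chunks and first_sentence are hypothetical names, and first_sentence is only a stand-in that keeps the first sentence of each chunk.

```python
def summarize_chunks(chunks, summarize_fn):
    """Apply summarize_fn to each chunk and join the non-empty results."""
    summaries = []
    for chunk in chunks:
        summary = summarize_fn(chunk).strip()
        if summary:
            summaries.append(summary)
    return " ".join(summaries)


def first_sentence(chunk):
    """Stand-in summarizer for offline runs: keep only the first sentence."""
    head, sep, _ = chunk.partition(".")
    return head.strip() + sep
```

Swapping first_sentence for a function that calls the API leaves the rest of the pipeline unchanged, which makes it easier to check the chunking logic on its own.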

In this post, we’ve explored how to implement text summarization using OpenAI’s GPT-3 API. We covered the steps involved in setting up your API credentials, loading your text data, preprocessing it into chunks, and generating summaries with the API.

Try experimenting with different text inputs and tweaking the parameters of the generate_summary function to see how you can improve the quality of your summaries. Good luck!
