Workaround OpenAI’s Token Limit With Chain Types
If you are reading this article, you have encountered the token limits of OpenAI’s GPT-3 models. The limits for various models are provided here.
https://platform.openai.com/docs/models/gpt-3
To overcome this limitation, OpenAI offers a great example with the Summarizing Books with Human Feedback (https://openai.com/blog/summarizing-books/) solution as provided below.
- The original text is divided into sections, and each section is summarized.
- Section summaries are summarized again into higher-level summaries.
- The summarizing process continues until a complete summary is achieved.
I faced a similar issue while working on the article “Creating Meeting Minutes using OpenAI GPT-3 API” because most meeting transcripts exceed 4,000 tokens.
This, like everything, has many different ways to be completed. For now, I am providing three options:
- NLTK (Natural Language Toolkit). Token count using this option does not match OpenAI tokenizer, but the difference is nominal.
- Transformers. Token count using this option matches OpenAI tokenizer.
- Tiktoken. Token count using this option matches OpenAI tokenizer and is faster than Transformers.
NLTK
NLTK is a leading platform for building Python programs to work with human language data. This code was developed using Google Colab, but you can use any IDE of your choice. Some codes are specific to Goole Colab, though.
Prerequisites
The following are prerequisites for this tutorial:
- Python Package:
nltk (Natural Language Toolkit)
- Python Package:
openai
Install Python Packages
pip install openai nltk
Import Python Packages
import platform
import os
import openai
import os
import nltk
nltk.download('punkt')
from nltk.tokenize import word_tokenizeprint('Python: ', platform.python_version())
print('re: ', re.__version__)
print('nltk: ', nltk.__version__)
(Optional) Count the Number of Tokens
OpenAI GPT-3 is limited to 4,001 tokens per request, encompassing both the request (i.e., prompt) and response. We will be determining the number of tokens present in the meeting transcript.
def count_tokens(filename):
with open(filename, 'r') as f:
text = f.read()
tokens = word_tokenize(text)
return len(tokens)
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
token_count = count_tokens(filename)
print(f"Number of tokens: {token_count}")
(Optional for the second block of code) Break up the Meeting Transcript into chunks of 2,000 tokens with an overlap of 100 tokens
We will be dividing the meeting transcript into segments of 2,000 tokens, with an overlap of 100 tokens to avoid losing any information from the split.
def break_up_file(tokens, chunk_size, overlap_size):
if len(tokens) <= chunk_size:
yield tokens
else:
chunk = tokens[:chunk_size]
yield chunk
yield from break_up_file(tokens[chunk_size-overlap_size:], chunk_size, overlap_size)
def break_up_file_to_chunks(filename, chunk_size=2000, overlap_size=100):
with open(filename, 'r') as f:
text = f.read()
tokens = word_tokenize(text)
return list(break_up_file(tokens, chunk_size, overlap_size))
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
chunks = break_up_file_to_chunks(filename)
for i, chunk in enumerate(chunks):
print(f"Chunk {i}: {len(chunk)} tokens")
Function to Convert the NLTK Tokenized Text to Non-Tokenized Text
We will need to convert the tokenized text from NLTK to non-tokenized text, as the OpenAI GPT-3 API does not handle tokenized text very well, which can result in a higher token count exceeding 2,000.
def convert_to_detokenized_text(tokenized_text):
prompt_text = " ".join(tokenized_text)
prompt_text = prompt_text.replace(" 's", "'s")
return detokenized_text
Set OpenAI API Key
Set an environment variable called “OPEN_API_KEY” and assign a secret API key from OpenAI (https://beta.openai.com/account/api-keys).
os.environ["OPENAI_API_KEY"] = 'your openai api key'
openai.api_key = os.getenv("OPENAI_API_KEY")
Summarize the Meeting Transcript one chunk (of 2,000 tokens) at a time
Option 1: model=”text-davinci-003"
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []
chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks):
prompt_request = "Summarize this meeting transcript: " + convert_to_detokenized_text(chunks[i])
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["text"].strip())
Option 2: model=”gpt-3.5-turbo”
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []
chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks): prompt_request = "Summarize this meeting transcript: " + convert_to_prompt_text(chunks[i])
messages = [{"role": "system", "content": "This is text summarization."}]
messages.append({"role": "user", "content": prompt_request}) response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["message"]['content'].strip())
Consolidate the Meeting Transcript Summaries
prompt_request = "Consoloidate these meeting summaries: " + str(prompt_response)
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=1000,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
meeting_summary = response["choices"][0]["text"].strip()
print(meeting_summary)
Transformers
Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. This code was developed using Google Colab, but you can use any IDE of your choice. Some codes are specific to Goole Colab, though.
Prerequisites
The following are prerequisites for this tutorial:
- Python Package:
torch
- Python Package:
transformers
- Python Package:
openai
Install Python Packages
%%writefile requirements.txt
openai
torch
transformers
%pip install -r requirements.txt
Import Python Packages
import platform
import os
import openai
import torch
import transformers
from transformers import AutoTokenizerprint('Python: ', platform.python_version())
print('re: ', re.__version__)
print('torch: ', torch.__version__)
print('transformers: ', transformers.__version__)
(Optional) Count the Number of Tokens
OpenAI GPT-3 is limited to 4,001 tokens per request, encompassing both the request (i.e., prompt) and response. We will be determining the number of tokens present in the meeting transcript.
def count_tokens(filename):
tokenizer = AutoTokenizer.from_pretrained("gpt2")
with open(filename, 'r') as f:
text = f.read()
input_ids = torch.tensor(tokenizer.encode(text)).unsqueeze(0)
num_tokens = input_ids.shape[1]
return num_tokens
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
token_count = count_tokens(filename)
print(f"Number of tokens: {token_count}")
(Optional for the second block of code) Break up text into chunks of 2000 tokens with an overlap of 100 tokens
We will be breaking up the text into chunks of 2,000 tokens with an overlapping 100 tokens to ensure any information is not lost from breaking up the text.
def break_up_file_to_chunks(filename, chunk_size=2000, overlap=100):
tokenizer = AutoTokenizer.from_pretrained("gpt2")
with open(filename, 'r') as f:
text = f.read()
tokens = tokenizer.encode(text)
num_tokens = len(tokens)
chunks = []
for i in range(0, num_tokens, chunk_size - overlap):
chunk = tokens[i:i + chunk_size]
chunks.append(chunk)
return chunks
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
chunks = break_up_file_to_chunks(filename)
for i, chunk in enumerate(chunks):
print(f"Chunk {i}: {len(chunk)} tokens")
Set OpenAI API Key
Set an environment variable called “OPEN_API_KEY” and assign a secret API key from OpenAI (https://beta.openai.com/account/api-keys).
os.environ["OPENAI_API_KEY"] = 'paste your openai api key here'
openai.api_key = os.getenv("OPENAI_API_KEY")
Summarize the text one chunk at a time
Option 1: model=”text-davinci-003"
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []tokenizer = AutoTokenizer.from_pretrained("gpt2")chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks): prompt_request = "Summarize this meeting transcript: " + tokenizer.decode(chunks[i])
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["text"].strip())
Option 2: model=”gpt-3.5-turbo”
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []tokenizer = AutoTokenizer.from_pretrained("gpt2")chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks): prompt_request = "Summarize this meeting transcript: " + tokenizer.decode(chunks[i]) messages = [{"role": "system", "content": "This is text summarization."}]
messages.append({"role": "user", "content": prompt_request}) response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["message"]['content'].strip())
Consolidate the summaries
prompt_request = "Consoloidate these meeting summaries: " + str(prompt_response)
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=1000,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
Summary of summaries
meeting_summary = response["choices"][0]["text"].strip()
print(meeting_summary)
Tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI’s models. This code was developed using Google Colab, but you can use any IDE of your choice. Some codes are specific to Goole Colab, though.
Prerequisites
The following are prerequisites for this tutorial:
- Python Package:
tiktoken
- Python Package:
openai
Install Python Packages
%%writefile requirements.txt
openai
tiktoken
%pip install -r requirements.txt
Import Python Packages
import platform
import os
import openai
import tiktoken
print('Python: ', platform.python_version())
(Optional) Count the Number of Tokens
OpenAI GPT-3 is limited to 4,001 tokens per request, encompassing both the request (i.e., prompt) and response. We will be determining the number of tokens present in the meeting transcript.
def count_tokens(filename):
encoding = tiktoken.get_encoding("gpt2")
with open(filename, 'r') as f:
text = f.read()
input_ids = encoding.encode(text)
num_tokens = len(input_ids)
return num_tokens
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
num_tokens = count_tokens(filename=filename)
print("Number of tokens: ", num_tokens)
(Optional for the second block of code) Break up text into chunks of 2000 tokens with an overlap of 100 tokens
We will be breaking up the text into chunks of 2,000 tokens with an overlapping 100 tokens to ensure any information is not lost from breaking up the text.
def break_up_file_to_chunks(filename, chunk_size=2000, overlap=100):
encoding = tiktoken.get_encoding("gpt2")
with open(filename, 'r') as f:
text = f.read() tokens = encoding.encode(text)
num_tokens = len(tokens)
chunks = []
for i in range(0, num_tokens, chunk_size - overlap):
chunk = tokens[i:i + chunk_size]
chunks.append(chunk)
return chunks
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
chunks = break_up_file_to_chunks(filename)
for i, chunk in enumerate(chunks):
print(f"Chunk {i}: {len(chunk)} tokens")
Set OpenAI API Key
Set an environment variable called “OPEN_API_KEY” and assign a secret API key from OpenAI (https://beta.openai.com/account/api-keys).
os.environ["OPENAI_API_KEY"] = 'paste your openai api key here'
openai.api_key = os.getenv("OPENAI_API_KEY")
Summarize the text one chunk at a time
Option 1: model=”text-davinci-003"
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []encoding = tiktoken.get_encoding("gpt2")
chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks):
prompt_request = "Summarize this meeting transcript: " + encoding.decode(chunks[i]) response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["text"].strip())
Option 2: model=”gpt-3.5-turbo”
filename = "/content/drive/MyDrive/Colab Notebooks/minutes/data/Round_22_Online_Kickoff_Meeting.txt"
prompt_response = []encoding = tiktoken.get_encoding("gpt2")
chunks = break_up_file_to_chunks(filename)for i, chunk in enumerate(chunks): prompt_request = "Summarize this meeting transcript: " + encoding.decode(chunks[i])
messages = [{"role": "system", "content": "This is text summarization."}]
messages.append({"role": "user", "content": prompt_request}) response = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=messages,
temperature=.5,
max_tokens=500,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
prompt_response.append(response["choices"][0]["message"]['content'].strip())
Consolidate the summaries
prompt_request = "Consoloidate these meeting summaries: " + str(prompt_response)
response = openai.Completion.create(
model="text-davinci-003",
prompt=prompt_request,
temperature=.5,
max_tokens=1000,
top_p=1,
frequency_penalty=0,
presence_penalty=0
)
Summary of Summaries
meeting_summary = response["choices"][0]["text"].strip()
print(meeting_summary)
I hope you have enjoyed this tutorial. If you have any questions or comments, please provide them here.