Fine Tune GPT-3 model with provided training and validation dataset

Jason TC Chuang
Published in aidatatools
3 min read · Mar 30, 2023

In this tutorial, we will fine-tune our own GPT-3 model for text summarization, using a prepared training and validation dataset.

1. Getting an OpenAI API key

Go to https://platform.openai.com/account/api-keys and log in. Generate a new secret key and keep it safe.


2. Checking the prepared training and validation data

Each file should contain one JSON object per line, and each object should hold a prompt and its completion.

{"prompt": "Summarize the following text:Our new business production totaled $389 million of direct PVP exceeded by $75 million -- the direct PVP we produced in every year but once since 2010.\n....", "completion": "Turning to our fourth quarter 2020 results, adjusted operating income was $56 million or $0.69 per share compared with $87 million or $0.90 per share in the fourth quarter of 2019."}
{"prompt": "Summarize the following text:Overall, restaurant traffic has largely stabilized at about 5% below pre-pandemic levels led by the continued solid performance at quick service restaurants.\nDemand in U.S. retail channels also remained solid with overall category volumes in the quarter still up 15% to 20% from pre-pandemic levels....", "completion": "Specifically in the quarter, sales increased 13% to $984 million, with volume up 11% and price mix up 2%.\nDiluted earnings per share in the first quarter was $0.20, down from $0.61 in the prior year, while adjusted EBITDA including joint ventures was $123 million, down from $202 million."}
........................
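Before uploading, it can help to confirm that every line really is a prompt/completion pair. The following is a minimal sketch, not part of the original project: the function name `validate_jsonl` and the three sample lines are made up for illustration, and the check only covers JSON validity and the two required keys.

```python
import json


def validate_jsonl(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value:
                problems.append((number, f"missing or empty {key!r}"))
    return problems


# Hypothetical sample lines: the first is well formed, the other two are not.
sample = [
    '{"prompt": "Summarize the following text:Revenue rose 5%.", "completion": "Revenue up 5%."}',
    '{"prompt": "Summarize the following text:Costs fell."}',
    "not json",
]
print(validate_jsonl(sample))
```

To check a real file, pass the open file object, for example `validate_jsonl(open("prepared_train.jsonl", encoding="utf-8"))`. The pre-1.0 openai CLI also provides `openai api fine_tunes.prepare_data -f FILE`, which performs its own checks on the data.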

3. Let’s get started

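First, we need to install the OpenAI library and set the API key. The fine_tunes CLI commands and openai.Completion used in this tutorial exist only in openai versions before 1.0, so we pin the version:

!pip install --upgrade "openai<1.0"
import os
import openai
# Replace the placeholder with your own secret key and keep it out of source control.
os.environ["OPENAI_API_KEY"] = 'YOUR_API_KEY'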

We pass the base GPT-3 model with the -m option, for example ada, curie, or davinci. The -t and -v options point to the training and validation files:

!openai api fine_tunes.create -t "prepared_train.jsonl" -v "prepared_val.jsonl" -m ada
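The same job can also be started from Python instead of the CLI. The sketch below assumes the pre-1.0 openai package, where `File.create` and `FineTune.create` are available on the `openai` module. The helper name `start_fine_tune` is hypothetical, and the module is passed in as `client` only so that the function can be exercised without network access.

```python
def start_fine_tune(client, train_path, val_path, base_model="ada"):
    """Upload both files, create a fine-tune job, and return the job id.

    `client` is assumed to be the pre-1.0 `openai` module (or an object with
    the same File.create / FineTune.create interface).
    """
    with open(train_path, "rb") as train_file:
        train = client.File.create(file=train_file, purpose="fine-tune")
    with open(val_path, "rb") as val_file:
        val = client.File.create(file=val_file, purpose="fine-tune")
    job = client.FineTune.create(
        training_file=train["id"],
        validation_file=val["id"],
        model=base_model,
    )
    return job["id"]


# Example call (requires the pre-1.0 openai package and a valid API key):
# import openai
# job_id = start_fine_tune(openai, "prepared_train.jsonl", "prepared_val.jsonl")
```

The returned job id has the `ft-...` form used by the follow command in the next step.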

It takes time for the fine-tune job to be queued and for the model to train. Let’s wait and check the status, using the job ID printed by the previous command:

!openai api fine_tunes.follow -i ft-3Wnb4hOrXU1FuQGDRfyvNWlz
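If you prefer to wait from Python, a simple polling loop is one option. This is only a sketch: `wait_for_fine_tune` is a hypothetical helper, the terminal status strings are the ones I recall from the legacy fine-tunes API, and the lookup is passed in as a callable so that the loop itself does not depend on the openai package.

```python
import time

# Status values assumed to mark the end of a legacy fine-tune job.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_fine_tune(retrieve, job_id, poll_seconds=30, sleep=time.sleep):
    """Call retrieve(job_id) until the job reaches a terminal status, then return the job."""
    while True:
        job = retrieve(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)


# Example call (requires the pre-1.0 openai package and a valid API key):
# import openai
# job = wait_for_fine_tune(lambda i: openai.FineTune.retrieve(id=i), "ft-3Wnb4hOrXU1FuQGDRfyvNWlz")
# print(job["status"], job["fine_tuned_model"])
```

When the job succeeds, the `fine_tuned_model` field of the job holds the model name to use for completions.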

After the fine-tuned model is created, we can test it. We will use the following prompt:

Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.

Use the command line:

!openai api completions.create -m ada:ft-tpisoftware-2023-03-01-00-10-20 -p "Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance."

Or Python code:

openai.api_key = os.getenv("OPENAI_API_KEY")
openai.Completion.create(
    model="ada:ft-tpisoftware-2023-03-01-00-10-20",
    prompt="Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.",
    max_tokens=256,
    temperature=0
)

Sample result:

<OpenAIObject text_completion id=cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC at 0x1011ea4a0> JSON: {
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\nWe have been in the business of providing our customers with the best quality products and services for over 40 years."
    }
  ],
  "created": 1677631778,
  "id": "cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC",
  "model": "ada:ft-tpisoftware-2023-03-01-00-10-20",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 256,
    "prompt_tokens": 17,
    "total_tokens": 273
  }
}
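In practice you usually want only the generated text and a few fields from this object. The sketch below shows one way to pull them out; `extract_summary` is a hypothetical helper, and `sample_response` is a plain dict trimmed from the sample result above, so it runs without calling the API. A `finish_reason` of "length" means generation stopped because it reached `max_tokens`.

```python
def extract_summary(response):
    """Pull the generated text and basic usage information out of a completion response."""
    choice = response["choices"][0]
    return {
        "text": choice["text"].strip(),
        # "length" means the model stopped because it reached max_tokens.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": response["usage"]["total_tokens"],
    }


# Plain-dict copy of the relevant fields from the sample result above.
sample_response = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "logprobs": None,
            "text": "\nWe have been in the business of providing our customers with the best quality products and services for over 40 years.",
        }
    ],
    "usage": {"completion_tokens": 256, "prompt_tokens": 17, "total_tokens": 273},
}

summary = extract_summary(sample_response)
print(summary["text"])
print("truncated:", summary["truncated"], "| total tokens:", summary["total_tokens"])
```

The object returned by `openai.Completion.create` in the pre-1.0 package supports the same key-based access, so it can be passed to the helper directly.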

4. References

Chuangtc: https://chuangtc.com/projects/fine-tune-GPT3.php

Github: https://github.com/chuangtc/ECTSum-GPT3

OpenAI: https://platform.openai.com/docs/guides/fine-tuning

Jason TC Chuang
aidatatools

Google Certified Professional Data Engineer. He holds a PhD from Purdue University. He loves solving real-world problems and building better tools with ML/AI.