Fine-Tune a GPT-3 Model with Provided Training and Validation Datasets
In this tutorial, we will build our own fine-tuned GPT-3 model from the provided training and validation datasets. The model performs text summarization.
1. Getting an OpenAI API key
Go to https://platform.openai.com/account/api-keys and log in. Generate a new secret key and store it securely.
2. Checking the prepared training and validation data
Each file should contain a number of prompt-completion pairs, one JSON object per line:
{"prompt": "Summarize the following text:Our new business production totaled $389 million of direct PVP exceeded by $75 million -- the direct PVP we produced in every year but once since 2010.\n....", "completion": "Turning to our fourth quarter 2020 results, adjusted operating income was $56 million or $0.69 per share compared with $87 million or $0.90 per share in the fourth quarter of 2019."}
{"prompt": "Summarize the following text:Overall, restaurant traffic has largely stabilized at about 5% below pre-pandemic levels led by the continued solid performance at quick service restaurants.\nDemand in U.S. retail channels also remained solid with overall category volumes in the quarter still up 15% to 20% from pre-pandemic levels....", "completion": "Specifically in the quarter, sales increased 13% to $984 million, with volume up 11% and price mix up 2%.\nDiluted earnings per share in the first quarter was $0.20, down from $0.61 in the prior year, while adjusted EBITDA including joint ventures was $123 million, down from $202 million."}
........................
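Before uploading, it can help to confirm that every line of a file really is a JSON object with non-empty "prompt" and "completion" strings. The following is a minimal sketch of such a check. The function name validate_jsonl and the two sample records are made up for illustration and are not part of the provided datasets.

```python
import json


def validate_jsonl(lines):
    """Return a list of (line_number, problem) tuples for malformed records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "record is not a JSON object"))
            continue
        # Both fields must be present and contain non-whitespace text.
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty '{key}'"))
    return problems


# Illustrative in-memory records (hypothetical); the second lacks a completion.
sample = [
    '{"prompt": "Summarize the following text:Sales rose 13%.", "completion": "Sales up 13%."}',
    '{"prompt": "Summarize the following text:Traffic stabilized."}',
]
sample_problems = validate_jsonl(sample)
print(sample_problems)
```

To check a real file, pass the open file object, for example validate_jsonl(open("prepared_train.jsonl", encoding="utf-8")). An empty list means no problems were found by this check.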
3. Let’s get started
First, we need to install the OpenAI library:
!pip install --upgrade openai
import os
import openai
os.environ["OPENAI_API_KEY"] = 'YOUR_API_KEY'
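Since the key should be kept safe, one option is to avoid pasting it into the notebook at all and to read it from an environment variable that was set outside the notebook. A minimal sketch follows. The helper name load_api_key is hypothetical, and treating the placeholder 'YOUR_API_KEY' as "not set" is our own assumption.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is not set."""
    key = env.get(name, "").strip()
    # Treat the tutorial's placeholder value as missing (an assumption).
    if not key or key == "YOUR_API_KEY":
        raise RuntimeError(f"{name} is not set; export it before starting the notebook")
    return key
```

With the key exported in the shell that launches the notebook, openai.api_key = load_api_key() would then replace the hard-coded assignment.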
We pass the base GPT-3 model as a parameter, such as ada, curie, or davinci:
!openai api fine_tunes.create -t "prepared_train.jsonl" -v "prepared_val.jsonl" -m ada
Queuing the fine-tuning job and training the model takes time. Let's wait and check the status, using the job ID printed by the create command:
!openai api fine_tunes.follow -i ft-3Wnb4hOrXU1FuQGDRfyvNWlz
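The same waiting can be done from Python with a small polling loop. The sketch below is deliberately independent of the OpenAI library: it takes any function that returns the current status string. The name wait_for_job, the default intervals, and the set of terminal status strings are assumptions for illustration.

```python
import time

# Status strings assumed to mean the job has finished (an assumption).
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(fetch_status, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status or polls run out."""
    status = None
    for attempt in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"job still '{status}' after {max_polls} polls")
```

Assuming the pre-1.0 openai Python library used in this tutorial, fetch_status could be something like lambda: openai.FineTune.retrieve(id="ft-3Wnb4hOrXU1FuQGDRfyvNWlz")["status"]. The sleep parameter exists only so the loop can be exercised without real delays.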
After the fine-tuned model is created, we can test it. We will use the following prompt:
Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.
Using the command line:
!openai api completions.create -m ada:ft-tpisoftware-2023-03-01-00-10-20 -p "Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance."
Or using Python code:
openai.api_key = os.getenv("OPENAI_API_KEY")
openai.Completion.create(
    model="ada:ft-tpisoftware-2023-03-01-00-10-20",
    prompt="Summarize the following text:During the first quarter, we maintained a very safe environment with an RIR of 0.64, which is in line with our full year 2020 performance.",
    max_tokens=256,
    temperature=0
)
Sample result:
<OpenAIObject text_completion id=cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC at 0x1011ea4a0> JSON: {
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "\nWe have been in the business of providing our customers with the best quality products and services for over 40 years."
    }
  ],
  "created": 1677631778,
  "id": "cmpl-6p4x0eQuRB83JEWs19ssaSS5g7TKC",
  "model": "ada:ft-tpisoftware-2023-03-01-00-10-20",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 256,
    "prompt_tokens": 17,
    "total_tokens": 273
  }
}
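To use the result in code, we usually want only the generated text. The sketch below pulls it out of a response shaped like the sample above and also reports whether generation stopped because it reached max_tokens, which is what a finish_reason of "length" indicates. The helper name extract_summary is hypothetical, and the dictionary is a trimmed copy of the sample result.

```python
def extract_summary(response):
    """Return (text, truncated) from a completion response shaped like the sample."""
    choice = response["choices"][0]
    text = choice["text"].strip()  # drop the leading newline
    truncated = choice.get("finish_reason") == "length"
    return text, truncated


# Trimmed copy of the sample result, written as a plain dict.
sample_response = {
    "choices": [
        {
            "finish_reason": "length",
            "index": 0,
            "logprobs": None,
            "text": "\nWe have been in the business of providing our customers with the best quality products and services for over 40 years.",
        }
    ]
}
summary, truncated = extract_summary(sample_response)
print(summary)
print("truncated:", truncated)
```

The same function should work on the object returned by openai.Completion.create in the pre-1.0 library, assuming it supports dictionary-style access as the sample output suggests.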
4. Reference
Chuangtc: https://chuangtc.com/projects/fine-tune-GPT3.php