Fine-Tuning GPT Models With OpenAI APIs — Day 29 of #30daysofAI
Earlier this week I created an interactive Discord chatbot that leveraged OpenAI APIs to give users random cat facts. I decided to see how well I could improve the chatbot conversations by using OpenAI’s ability to fine-tune the ML model being used in my API calls. Here’s what I discovered.
For those missing context, this is related to a blog series called 30 Days of AI, where I challenged myself to spend the next 30 days learning more about AI.
Background
As mentioned above, I already have a working chatbot wired up to the OpenAI APIs, which gives me something to try fine-tuning on. For anyone curious how that was put together, see my previous article here.
My chatbot was originally intended to give facts about cats (of the feline variety). However, I decided to change this up a bit to make the fine-tuning exercise more informative. Instead of giving facts about the cute fuzzy animals, what if I created a company called “Creative Assistance Technology” (aka CAT) and wanted the chatbot to give random facts about the company? Since this company does not exist in the real world, GPT-3.5 Turbo (the model I’m using in the API calls) would have very little relevant training data to draw from. This scenario is a more realistic example of where fine-tuning may be useful.
Updating Catbot Prompts
Before getting started with fine-tuning I updated the code from my catbot Python service to do some better prompting based on my new targeted scenario (i.e. getting facts about “Creative Assistance Technology”). Here’s the small code change I made:
@bot.command(name='fact')
async def cat(ctx):
    # Ask GPT-3.5-turbo for a random cat fact using the chat model
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."},
            {"role": "user", "content": "Tell me a random cat fact."}
        ]
    )
    print(response)
    cat_fact = response.choices[0].message['content']
    await ctx.send(cat_fact)
Here is a sample of the output seen from the bot with the new prompting:
What Is Model Fine-Tuning?
Fine-tuning means continuing to train an existing ML model on a smaller, more specialized dataset so that it performs better on a particular task. ChatGPT/GPT models are pre-trained on a large amount of data, but they may not have data relevant to every use case, or any data more recent than their training cutoff (e.g. around 2021 for GPT-3.5 Turbo).
Fine-tuning is more costly and complex to support than prompting alone, and it involves crafting a dataset of relevant examples. It should be tried only after repeated attempts at prompt engineering fail to give good results. OpenAI has decent documentation on a few strategies for improving prompts before resorting to fine-tuning, which I also recommend reading through.
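As one illustration of the prompt-side options worth exhausting first, here is a minimal sketch of few-shot prompting, where a handful of example question and answer pairs are placed in the message list ahead of the real question. The helper name `build_few_shot_messages` is hypothetical (it is not part of the openai package), and the example fact is borrowed from the dataset shown later in this article.

```python
# Sketch: few-shot prompting, a cheaper technique to try before fine-tuning.
# build_few_shot_messages is a hypothetical helper, not an OpenAI API.

def build_few_shot_messages(system_prompt, examples, question):
    """Return a chat `messages` list with worked examples before the question."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        # Each example is replayed as a past user/assistant exchange.
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": question})
    return messages


if __name__ == "__main__":
    demo = build_few_shot_messages(
        "You are an expert on Creative Assistance Technology (CAT).",
        [("Tell me a cat fact.",
          "Creative Assistance Technology (CAT) has a programming API for premium customers.")],
        "Tell me a random cat fact.",
    )
    for message in demo:
        print(message["role"], "->", message["content"])
```

The resulting list can be passed as the `messages` argument of the chat call used in the bot. If a few examples in the prompt already give good answers, fine-tuning may not be needed at all.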
Fine-Tuning For My Catbot
To do the fine-tuning, I followed the guide in the OpenAI documentation here. I ran into a few errors along the way, as the documentation lacks some detail and only gives non-functional code snippets. Here’s a more thorough step-by-step account of what I did to make this work.
Step 1 — Create A Dataset JSON File
For my scenario, I hand-crafted some conversation data about “Creative Assistance Technology” and its CATs. Since I am fine-tuning gpt-3.5-turbo, I used the chat conversation format, with one JSON object per line (the JSONL format). Here’s what my file looked like:
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me a random cat fact."}, {"role": "assistant", "content": "Creative Assistance Technology (CAT) has a programming API for premium customers."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me a cat fact."}, {"role": "assistant", "content": "Creative Assistance Technology (CAT) can help boost your creativity by providing a helpful assistant in many creative fields."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me info about cats."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me something about cats."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Give me details about cats."}, {"role": "assistant", "content": "Creative Assistance Technology (CAT) programming API has bindings in Python, Typescript, as well as Java."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me cat facts."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Give me a fact fact."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Tell me a cat fact."}, {"role": "assistant", "content": "Creative Assistance Technology (CAT) has several tools called cats that make it easy to be creative."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "Give me cat information."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
{"messages": [{"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."}, {"role": "user", "content": "I want to hear about cats."}, {"role": "assistant", "content": "CATs, or Creative Assistance Technologies, are assistant tools created by Creative Assistant Technology to help improve your creative side."}]}
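Because a malformed line only shows up as an error after uploading, it can help to sanity-check the file locally first. The following is a rough sketch of such a check, not OpenAI's own validation: the function name `validate_jsonl_lines` is hypothetical, and the rules it applies (a `messages` list, known roles, at least one assistant reply, and the 10-example minimum) are my reading of the format used above.

```python
import json

MIN_EXAMPLES = 10  # assumed minimum number of examples accepted for a job
VALID_ROLES = {"system", "user", "assistant"}


def _is_valid_message(message):
    """A message needs a known role and a string content field."""
    return (
        isinstance(message, dict)
        and message.get("role") in VALID_ROLES
        and isinstance(message.get("content"), str)
    )


def validate_jsonl_lines(lines):
    """Return a list of problems found in chat-format fine-tuning data."""
    problems = []
    count = 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        count += 1
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {number}: invalid JSON ({err.msg})")
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing 'messages' list")
            continue
        if not all(_is_valid_message(m) for m in messages):
            problems.append(f"line {number}: message with bad role or content")
        elif not any(m["role"] == "assistant" for m in messages):
            problems.append(f"line {number}: no assistant reply to learn from")
    if count < MIN_EXAMPLES:
        problems.append(f"only {count} examples, need at least {MIN_EXAMPLES}")
    return problems


if __name__ == "__main__":
    sample = json.dumps({"messages": [
        {"role": "user", "content": "Tell me a cat fact."},
        {"role": "assistant", "content": "CAT has a programming API."},
    ]})
    print(validate_jsonl_lines([sample] * 10))
    print(validate_jsonl_lines([sample] * 3))
```

To check a real file, pass the open file object directly, since iterating over it yields one line at a time: `validate_jsonl_lines(open("catdata.jsonl", encoding="utf-8"))`.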
Note: You need a minimum of 10 samples in a data file. With fewer, OpenAI will reject the request and you’ll see an exception like this from the OpenAI APIs:
Step 2 — Upload the Dataset
To do this I created a new Python script that opened my JSON file and uploaded it to OpenAI. Here’s the code snippet:
# A Python script that uploads a fine-tuning data file (JSONL) to OpenAI
import openai
from creds import OPENAI_KEY

openai.api_key = OPENAI_KEY
local_file_json = "catdata.jsonl"
print(f"Using local JSONL file {local_file_json}\n")

# Open in binary mode; the context manager closes the file after the upload
with open(local_file_json, "rb") as data_file:
    tune_file = openai.File.create(file=data_file, purpose="fine-tune")
print(f"Uploaded tune file: {tune_file}\n")
Note: If you are following along, replace OPENAI_KEY with your own OpenAI API key. The creds.py module was just a quick way to abstract out the key string so I can access it from several other scripts.
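The contents of creds.py are not shown here. One possible way to write such a module, so that the key never ends up in source control, is sketched below. It assumes the key is stored in an environment variable named `OPENAI_API_KEY`; that name and the `read_key` helper are my own choices for illustration.

```python
# creds.py (sketch): read the API key from the environment instead of hard-coding it.
import os


def read_key(env_var="OPENAI_API_KEY", environ=os.environ):
    """Return the key stored in env_var, or raise if it is missing or blank."""
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Module-level name so other scripts can keep using `from creds import OPENAI_KEY`.
try:
    OPENAI_KEY = read_key()
except RuntimeError:
    OPENAI_KEY = None  # importing never fails; the calling script decides what to do
```

Leaving `OPENAI_KEY` as `None` when the variable is unset is a design choice for this sketch. A stricter version could let the `RuntimeError` propagate so the script stops immediately.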
Step 3 — Wait For File Upload Processing
The OpenAI documentation chains several steps together in its sample code. I instead split the steps into separate scripts because you can’t actually run all of them in quick succession. OpenAI takes some time to process each step, so you’ll need to wait until the previous step finishes before starting the next one.
If you try to start the tune job (next step) too soon, it may fail with the error below:
I was able to start the tuning job (next step) successfully by waiting a minute or two after doing the dataset file upload.
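Rather than waiting a fixed minute or two, a script could poll until processing has finished. Here is a generic sketch of that idea. The `wait_for_status` helper is hypothetical, and the state names `uploaded`, `processed`, and `error` are assumptions about what the file object reports, so check them against the response you actually get.

```python
import time


def wait_for_status(fetch_status, ready=("processed",), failed=("error",),
                    interval=10.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() repeatedly until it returns one of the ready states."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ready:
            return status
        if status in failed:
            raise RuntimeError(f"processing failed with status {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f} seconds")
        sleep(interval)  # pause before asking again
        waited += interval


if __name__ == "__main__":
    # Stand-in for a real lookup: reports "uploaded" twice, then "processed".
    fake_states = iter(["uploaded", "uploaded", "processed"])
    print(wait_for_status(lambda: next(fake_states), interval=0.0))
```

With the openai package version used in this article, `fetch_status` could be something like `lambda: openai.File.retrieve(tune_file_id)["status"]`. Passing the lookup in as a function keeps the polling logic testable without network access.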
Step 4 — Start The Tune Job
To start the tune job I created the below script:
# A Python script that starts a fine-tuning job for an OpenAI GPT-3.5 model
import openai
from creds import OPENAI_KEY
openai.api_key = OPENAI_KEY
tune_file_id = 'file-1234a'
print(f"Using tune file id: {tune_file_id} \n")
tune_job = openai.FineTuningJob.create(training_file=tune_file_id, model="gpt-3.5-turbo")
print(f"Created tune job: {tune_job} \n")
# List up to 10 events from a fine-tuning job
recent_events = openai.FineTuningJob.list_events(id=tune_job['id'], limit=10)
print(f"Recent events from new job: {recent_events} \n")
Note: Replace the tune_file_id with the id that gets printed out in Step 2.
Step 5 — Wait For Tune Job To Complete
Depending on how many data samples were uploaded, the tune job may take anywhere from several minutes to several hours to complete. To check the status of the tune job I created another small script:
# A Python script that checks the status of an OpenAI fine-tuning job
import openai
from creds import OPENAI_KEY
openai.api_key = OPENAI_KEY
tune_job_id = 'ftjob-1234l'
# List 10 fine-tuning jobs
all_tune_jobs = openai.FineTuningJob.list(limit=10)
print(f"Existing tuning jobs: {all_tune_jobs} \n")
# List up to 10 events from a fine-tuning job
recent_events = openai.FineTuningJob.list_events(id=tune_job_id, limit=10)
print(f"Recent events from new job {tune_job_id}: {recent_events} \n")
Note: Replace the tune_job_id with the id that gets printed out in Step 4.
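The raw job listing is fairly verbose, so a small helper that boils a single job down to one line can make the wait easier to follow. This is only a sketch: `summarize_job` is a hypothetical name, and it assumes the job record is dict-like with a `status` field (using state names such as `running`, `succeeded`, `failed`, and `cancelled`) and a `fine_tuned_model` field once training is done.

```python
# Sketch: reduce a fine-tuning job record to a one-line summary.
# Field and state names below are assumptions about the job object.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def summarize_job(job):
    """Turn a dict-like fine-tuning job record into a short status string."""
    status = job.get("status", "unknown")
    if status == "succeeded":
        return f"done: use model {job.get('fine_tuned_model')}"
    if status in TERMINAL_STATES:
        return f"stopped with status {status}"
    return f"still in progress ({status})"


if __name__ == "__main__":
    # Hand-written stand-ins for real job records.
    print(summarize_job({"status": "running"}))
    print(summarize_job({"status": "succeeded", "fine_tuned_model": "ft:example"}))
```

In the status script this could be fed with the result of `openai.FineTuningJob.retrieve(tune_job_id)`, which I believe returns a dict-like object in the package version used here.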
Step 6 — Update OpenAI Queries With New Model
Once the fine-tune job completes the updated model should be ready to use. To use the new model all that is needed is to change the model name in the OpenAI query using the format:
model="ft:gpt-3.5-turbo:my-org:custom_suffix:id"
This model name should also appear in the output of the Step 5 status script once the job has finished (look for the fine_tuned_model field of the completed job).
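To make the naming format concrete, the sketch below splits a fine-tuned model name into its colon-separated parts. The `parse_fine_tuned_model` helper is hypothetical, and the five-part layout is taken from the format string above rather than from any official parser. An empty suffix, as in the name used in my bot code, comes back as `None`.

```python
def parse_fine_tuned_model(name):
    """Split 'ft:base-model:org:suffix:id' into a dict of its parts."""
    parts = name.split(":")
    if len(parts) != 5 or parts[0] != "ft":
        raise ValueError(f"not a fine-tuned model name: {name!r}")
    _, base_model, org, suffix, model_id = parts
    return {
        "base_model": base_model,
        "org": org,
        "suffix": suffix or None,  # the custom suffix is optional
        "id": model_id,
    }


if __name__ == "__main__":
    print(parse_fine_tuned_model("ft:gpt-3.5-turbo-0613:personal::8yY1Kwln"))
```

A check like this can catch a mistyped model name before it is sent in a request.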
Here’s what my bot code looked like after my change:
response = openai.ChatCompletion.create(
    # model="gpt-3.5-turbo",
    model="ft:gpt-3.5-turbo-0613:personal::8yY1Kwln",
    messages=[
        {"role": "system", "content": "You are an expert on Creative Assistance Technology (CAT). You will only give me responses related to Creative Assistance Technology."},
        {"role": "user", "content": "Tell me a random cat fact."}
    ]
)
Catbot Fine-Tuning Results
After updating my catbot to use the new fine-tuned model I tried out a few new queries to the bot. Here’s a sample of what the responses looked like after tuning:
As you can see, the responses are now much more relevant to the scenario I envisioned, with all of them relating to “Creative Assistance Technology” rather than being generic responses or responses about the fuzzy felines.
Pricing Considerations For Fine-Tuning
Fine-tuning can be expensive, as OpenAI charges both for training a model on your data and a higher rate for requests sent to the fine-tuned model. Here’s a link to OpenAI’s pricing sheet, but as of today it costs roughly 10x more to send prompt requests to a fine-tuned model. On top of that, the training itself carries an extra cost (roughly 5x the cost of a normal model prompt query).
For reference, running my small fine-tune job of 10 dataset samples cost around $0.05.
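For a rough idea of how a training bill scales, here is a back-of-the-envelope sketch. It assumes training is billed per token in the data file, multiplied by the number of training epochs. The `estimate_training_cost` helper is hypothetical and every number in the demo is a made-up placeholder, so take current prices from the pricing sheet.

```python
def estimate_training_cost(tokens_per_example, num_examples, epochs, price_per_1k_tokens):
    """Estimate training cost in dollars, assuming billed tokens = file tokens x epochs."""
    billed_tokens = tokens_per_example * num_examples * epochs
    return billed_tokens / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    # Placeholder inputs only; these are not OpenAI's actual prices or my real token counts.
    cost = estimate_training_cost(
        tokens_per_example=70,
        num_examples=10,
        epochs=3,
        price_per_1k_tokens=0.008,
    )
    print(f"Estimated training cost: ${cost:.4f}")
```

The estimate grows linearly with each input, so doubling the dataset size or the epoch count doubles the training cost under this assumption.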
Final Thoughts
My experience attempting fine-tuning with OpenAI ended up being very straightforward and highly successful. Fine-tuning in OpenAI is a powerful feature that does not take much effort to set up. For most real-world use cases, the biggest time investment is likely to be curating the dataset. I can certainly imagine lots of potential for real products to save time by fine-tuning on top of GPT-3.5 in OpenAI instead of rolling their own custom ML models and solutions.
<< EOM >>
Have you tried doing fine-tuning with OpenAI APIs and models yet? Drop me a line in the comments and tell me about your experiences.