My Journey of Fine-Tuning GPT-3 with a Custom Dataset

Geethu Suresh
Published in Version 1 · Feb 24, 2023

Photo by Jonathan Kemper on Unsplash

Riding the Wave of GPT-3 Hype

OpenAI’s GPT-3 is everywhere you turn: the massive language model has taken the AI community by storm, and everyone is eager to see what it is capable of. As an AI enthusiast, I wanted to experience the potential of GPT-3 for myself. If you haven’t already jumped on the GPT-3 bandwagon, let’s dive into my journey of fine-tuning GPT-3 and see what I learned along the way.

As with any pre-trained model, GPT-3 has some limitations when it comes to tackling domain-specific tasks. This is where fine-tuning comes into play: fine-tuning customizes the pre-trained model to fit your specific needs and makes it more domain-specific.

Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, letting you achieve better results on a wide range of tasks. Once a model has been fine-tuned, you won’t need to provide examples in the prompt anymore, which saves costs and enables lower-latency requests.

Elevating the Game

Upon delving into the concept of fine-tuning, I became interested in exploring its potential application as a question-answering system and evaluating the practicality of training GPT-3 using a custom dataset.
Here is the step-by-step procedure I followed for fine-tuning:

Step 1:
Prepare the custom dataset

I used the information publicly available on the Version 1 website to fine-tune GPT-3.

To suit the requirements of GPT-3, the dataset for fine-tuning should be in a particular JSONL file format. Each line of the file comprises a “prompt” followed by its corresponding “completion”. The format is illustrated below:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
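The JSONL file can also be generated with a short script rather than by hand. A minimal Python sketch, where the prompt/completion pairs are invented placeholders rather than the actual Version 1 data:

```python
import json

# Invented placeholder pairs -- substitute your own domain data.
examples = [
    {"prompt": "What services does the company offer?",
     "completion": " The company offers digital, cloud and data services."},
    {"prompt": "Where is the company headquartered?",
     "completion": " The company is headquartered in Dublin."},
]

# JSONL: one JSON object per line, exactly the format shown above.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")
```

Note the leading space in each completion, which OpenAI’s data-preparation guidelines recommended at the time.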

I selected some of the data from the website and used ChatGPT to help me with the process of preparing the dataset as shown below:

ChatGPT converting custom data to JSONL format

This is the training dataset used:

The dataset used for Fine-Tuning

Step 2:
Generate the API Key

Navigate to https://platform.openai.com/account/api-keys and select Create new secret key to create an API key, if you don’t already have one. This key is used when calling the OpenAI endpoints in steps 3 and 4.
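To avoid pasting the key into every command, it can be exported once as an environment variable (the key below is a placeholder, not a real secret):

```shell
# Placeholder key -- replace with the secret key generated in the dashboard.
export OPENAI_API_KEY="sk-your-secret-key"

# The curl commands in steps 3 and 4 can then reference it as:
#   --header "Authorization: Bearer $OPENAI_API_KEY"
echo "Key of length ${#OPENAI_API_KEY} configured"
```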

Step 3:
Upload the Dataset

Once the dataset was ready, I used OpenAI’s Files endpoint to upload the document. The endpoint allows you to upload a file containing documents for use across different endpoints/features. The purpose of the uploaded file must be specified, with “fine-tune” being the value used for fine-tuning.

curl --location --request POST 'https://api.openai.com/v1/files' \
--header 'Authorization: Bearer {API_Key}' \
--form 'file=@"{file_location}"' \
--form 'purpose="fine-tune"'
Uploading the dataset using the Files endpoint
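Since a single malformed line can make the upload (or the later fine-tune job) fail, it can be worth validating the file locally first. A small sketch; the filename is an assumed placeholder:

```python
import json

def validate_jsonl(path):
    """Check that every line is valid JSON with exactly a prompt and a completion."""
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            record = json.loads(line)  # raises ValueError on malformed JSON
            if set(record) != {"prompt", "completion"}:
                raise ValueError(f"line {number}: unexpected keys {set(record)}")
    return True

# e.g. validate_jsonl("training_data.jsonl") before calling the Files endpoint
```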

Step 4:
Fine-tune the Model

Now that the dataset was uploaded, the next step was to fine-tune the model using the uploaded file. The Fine-tunes endpoint requires the file id obtained when uploading the file. I also supplied the base model (davinci) and a custom suffix, which appears in the name of the resulting fine-tuned model (e.g. davinci:ft-your-org:my-suffix-date).

curl --location --request POST 'https://api.openai.com/v1/fine-tunes' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {API_Key}' \
--data-raw '{
"training_file": "{file_id}",
"model":"davinci",
"suffix":"{my-suffix}"
}
'

The time needed to fine-tune the model can vary depending on factors such as the size of the training dataset, task complexity, and computational resources available. Using the API endpoint saves time and computational resources, since OpenAI distributes the training across multiple machines.
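The same call can be made from Python with only the standard library. This sketch builds the request that the curl command above sends; the file id and suffix are placeholders, and the request is only dispatched when an API key is present in the environment:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("OPENAI_API_KEY", "")  # set this before running for real

payload = {
    "training_file": "file-abc123",  # placeholder: the file id returned in step 3
    "model": "davinci",
    "suffix": "my-suffix",           # placeholder custom suffix
}

request = urllib.request.Request(
    "https://api.openai.com/v1/fine-tunes",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Only send the request when a key is actually configured.
if API_KEY:
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```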

Step 5:
Test the Model

The steps followed were:

  • Navigate to the Open AI Playground.
  • In the “Model” dropdown menu, select the fine-tuned model you want to test.
  • In the “Prompt” box, enter your question and click the “Submit” button.
Testing the model using playground
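Beyond the Playground, the fine-tuned model can also be queried programmatically through the Completions endpoint. A minimal sketch using the standard library; the model name is a placeholder for the one returned by the fine-tune job in step 4:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("OPENAI_API_KEY", "")

def ask(question, model="davinci:ft-your-org:my-suffix-2023-02-24"):
    """Send a question to the fine-tuned model and return the first completion."""
    body = json.dumps({
        "model": model,     # placeholder fine-tuned model name
        "prompt": question,
        "max_tokens": 100,
        "temperature": 0,   # deterministic answers suit Q&A better
    }).encode("utf-8")
    request = urllib.request.Request(
        "https://api.openai.com/v1/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_KEY}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["choices"][0]["text"]
```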

Although the model answered most of the queries correctly, it did produce confabulations. This occurs because the language model generates responses based on patterns learned from its extensive training corpus, which can sometimes lead to erroneous responses.

Sample responses from the model

The Hurdles

While fine-tuning GPT-3 was a fantastic experience, it is not without its challenges. Some of the major challenges I faced were:

  1. Quality of data: Fine-tuning typically requires a substantial amount of high-quality training data, which can be challenging to acquire and clean. Poor quality data can negatively impact the accuracy and performance of the model.
  2. Training Time: GPT-3 is a large and complex language model, and training it on a custom dataset can take a significant amount of time, depending on the size of the data and the computational resources available.
  3. Pricing: Fine-tuning can be expensive. The pricing has two parts: training and usage. During training, the total tokens used are billed according to the training rates. The total training tokens depend on the tokens in the training dataset and the number of training epochs. Once the model is fine-tuned, you’ll only be billed for the tokens consumed during usage, at the usage rates.
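The training part of the bill can be estimated up front. The numbers below are invented for illustration (check OpenAI’s pricing page for real rates); only the formula, trained tokens = dataset tokens × epochs, comes from the description above:

```python
# Illustrative numbers only -- not OpenAI's actual rates.
tokens_in_dataset = 20_000   # total tokens across the JSONL file
epochs = 4                   # OpenAI's default number of epochs at the time
rate_per_1k_tokens = 0.03    # hypothetical training rate in dollars

trained_tokens = tokens_in_dataset * epochs
training_cost = trained_tokens / 1000 * rate_per_1k_tokens
print(f"{trained_tokens} trained tokens -> ${training_cost:.2f}")
```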

Final Verdict

The GPT-3 responses are generated based on the patterns and information present in the dataset it has been trained on. While GPT-3 can be a powerful tool for generating natural language text, it is not a substitute for a comprehensive and structured knowledge base. A true knowledge base typically contains a wide range of interlinked information and data, and is designed to be queried and searched in a more systematic and targeted way.

Thus it’s important to carefully consider the specific requirements and constraints of the system and evaluate whether fine-tuning GPT-3 is the right approach. Other approaches, such as using a more specialized model or combining multiple models, may be more appropriate in some cases.

In conclusion, fine-tuning GPT-3 with a custom dataset was a good experiment. While there were challenges along the way, the experience was well worth it and gave me a deeper appreciation for the power of NLP and language models.

Geethu Suresh is a Microsoft .NET Consultant at the Version 1 Innovation Labs.
