OpenAI: Customizing GPT-3 For Your Application

M. Haseeb Hassan
4 min read · Jul 23, 2022


OpenAI now offers customized GPT-3 models that improve the reliability of output, giving you more consistent results you can count on for production use cases.
Fine-tuning GPT-3 and customizing the model for your use case

Since the launch of the OpenAI API, thousands of developers and applications have been using GPT-3 and building on OpenAI's platform. GPT-3 is a milestone in the world of Natural Language Processing: given any text input (a prompt) such as a phrase or a sentence, it returns a text completion in natural language. GPT-3 can be "programmed" with just a few examples of the task. The OpenAI API is designed to be simple enough for anyone to use, yet flexible enough to make machine learning teams more productive.
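To make this concrete, here is a minimal sketch of a completion request using the openai Python package (the 0.x versions contemporaneous with this post); the model choice, prompt text, and parameter values are illustrative, not prescriptive:

import os
import openai

# Authenticate with the API key stored in the environment (see Installation below)
openai.api_key = os.getenv("OPENAI_API_KEY")

# Ask the model to complete a natural-language prompt
response = openai.Completion.create(
    model="text-davinci-002",  # illustrative model choice
    prompt="Write a tagline for an ice cream shop.",
    max_tokens=32,
    temperature=0.7,
)

print(response["choices"][0]["text"].strip())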

GPT-3 Models and Applications

GPT-3 models are state-of-the-art architectures in NLP: they can both understand and generate natural language. OpenAI offers four main models with different levels of capability suited to different tasks; Davinci is the most capable and Ada is the fastest. The specifications of the different GPT-3 models are summarized below:

GPT-3 model specifications

Davinci is generally the most capable model, while the other models can perform certain tasks extremely well with significant speed or cost advantages. For example, Curie can perform many of the same tasks as Davinci, but faster and at roughly one-tenth of the cost.

The GPT-3 models can be used for a variety of use cases:

  • Text Completion
  • Code Completion
  • Embeddings
  • Fine-Tuning

There are separate guides for each of these use cases and their sub-applications. To date, hundreds of apps use GPT-3 across a range of categories and industries, from creativity and education to productivity and games. These applications draw on GPT-3's diverse capabilities and, in turn, help the OpenAI platform evolve.

Customizing/Fine-Tuning GPT-3

Fine-Tuning lets you get more out of the models available through the API by providing:

  1. Higher quality results than prompt design
  2. Token savings due to shorter prompts
  3. Ability to train on more examples than can fit in a prompt
  4. Lower latency requests

GPT-3 has been pre-trained on a vast amount of text from the open internet, so it can be prompted in several ways:

  • Zero-shot learning: the input is a direct instruction, with no examples
  • One-shot learning: the input includes one example of the task
  • Few-shot learning: the input includes a handful of examples of the task

Given such a prompt, GPT-3 generates a plausible completion. Fine-tuning goes a step beyond few-shot learning by training on many more examples than can fit in the prompt, which helps achieve better results on a wide range of tasks. Once a model has been fine-tuned, you no longer need to provide examples in the prompt, making it a cost-efficient solution with lower-latency requests.
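To make the few-shot idea concrete, here is a rough sketch using the openai Python package; the task, prompt text, and parameters are made up for illustration:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Few-shot: the task is demonstrated with a handful of examples inside the prompt itself
few_shot_prompt = (
    "Decide whether the sentiment of each tweet is Positive or Negative.\n\n"
    "Tweet: I loved the new Batman movie!\nSentiment: Positive\n\n"
    "Tweet: The service was awful.\nSentiment: Negative\n\n"
    "Tweet: What a beautiful day!\nSentiment:"
)

response = openai.Completion.create(
    model="text-davinci-002",  # illustrative base model
    prompt=few_shot_prompt,
    max_tokens=1,
    temperature=0,
)
print(response["choices"][0]["text"].strip())

# A fine-tuned model would not need the examples above; the bare input
# (plus whatever separator was used during training) is enough.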

Installation

Using the OpenAI command-line interface (CLI) is recommended; it can be installed with a single command:

pip install --upgrade openai

The OpenAI API authenticates requests with an API key. Set the OPENAI_API_KEY environment variable by adding the following line to your shell initialization script, or by running it in the terminal before the fine-tuning commands:

export OPENAI_API_KEY="<OPENAI_API_KEY>"

Preparing Training Data

The training data teaches GPT-3 the context and behavior you want for your use case. The data format is JSONL, where each line holds an input prompt and its ideal completion. The CLI data-preparation tool can be used to produce data in the following format:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...

The prompts used for fine-tuning differ from those used in one-shot or few-shot learning: each training example consists of a single input prompt and its associated completion, without detailed instructions or multiple examples packed into the same prompt.
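If your examples already live in code, a short script can write them in this format. This is only a sketch; the file name and the prompt/completion pairs are made up:

import json

# Hypothetical prompt/completion pairs, for illustration only
examples = [
    {"prompt": "Great product, works as advertised ->", "completion": " positive"},
    {"prompt": "Broke after two days, very disappointed ->", "completion": " negative"},
]

# Write one JSON object per line (JSONL), the format expected by the fine-tuning tooling
with open("training_data.jsonl", "w") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")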

CLI Data-Preparation Tool

OpenAI provides a tool for data preparation. The tool accepts different formats with only one requirement: the input must contain a prompt and its corresponding completion. You can pass CSV, TSV, XLSX, JSON, or JSONL files, and after a few guided steps and suggestions the output is saved as JSONL. Here's the command:

openai tools fine_tunes.prepare_data -f <LOCAL_FILE>

Create Fine-Tuned Model

After preparing the training data, the fine-tuning job can be started using:

openai api fine_tunes.create -t <TRAIN_FILE_PATH> -m <BASE_MODEL>

Here, BASE_MODEL can be Davinci, Curie, Babbage, or Ada, and the name of the fine-tuned model can also be customized with the suffix parameter. Running the command does several things (a rough Python equivalent is sketched after the list):

  1. Uploads the file using the files API
  2. Creates a fine-tune job
  3. Streams events
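
For reference, roughly the same steps can also be performed from Python with the 0.x openai library. This is a sketch under that assumption, not an exact equivalent of the CLI command; the file name is a placeholder:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# 1. Upload the prepared JSONL file via the Files API
upload = openai.File.create(
    file=open("training_data_prepared.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Create the fine-tune job against a base model
job = openai.FineTune.create(
    training_file=upload["id"],
    model="curie",
)

print(job["id"])  # keep this ID to follow or resume the event stream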

Once started, the job takes time to complete depending on the system's queue, the size of the dataset, and so on. If the event stream is interrupted, it can be resumed by running:

openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>
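
When the job finishes, the name of the fine-tuned model is shown in the event stream, and it can then be used like any other model. A minimal sketch, with a placeholder model name and an illustrative prompt:

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

# Placeholder: the real name is printed when the fine-tune job completes
FINE_TUNED_MODEL = "<YOUR_FINE_TUNED_MODEL>"

response = openai.Completion.create(
    model=FINE_TUNED_MODEL,
    prompt="Broke after two days, very disappointed ->",
    max_tokens=1,
    temperature=0,
)
print(response["choices"][0]["text"])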

For detailed documentation, check out the official guide.

Customizing vs Prompt Design

OpenAI's research shows that GPT-3's performance improves with as few as 100 examples and continues to improve as more data is added; each doubling of the number of examples tends to improve quality linearly.

On one of OpenAI's most challenging research datasets, grade-school math problems, fine-tuning GPT-3 improves accuracy by 2 to 4x over what is possible with prompt design.

Performance Comparison of Customized GPT-3 vs Prompt Design

Conclusion

GPT-3 can perform text generation, summarization, classification, and other natural language tasks out of the box, and customizing it improves performance further. A customized GPT-3 model produces more reliable output, offering consistent results that you can count on for production use cases.

I hope you have enjoyed this overview of customizing GPT-3. Follow me for more overviews like this, and check out my profile for more blogs. Stay tuned!
