Should teachers bother fine-tuning GPT?
GPT-3 (Generative Pre-trained Transformer 3) is a large-scale language model developed by OpenAI. It was trained on a dataset of 45 TB of text comprising some 8 million web pages and hundreds of books. GPT-3 uses a transformer-based architecture to generate human-like text, and it can be used for a variety of tasks including summarization, question answering, translation, and text generation. This is the model I am using. ChatGPT uses GPT-3.5, and we are currently anticipating GPT-4, which will be trained on even larger datasets.
Fine-tuning is the process of adjusting the parameters of a pre-trained model on a new dataset in order to further improve its accuracy on a particular task. It is a form of transfer learning: instead of training a model from scratch, you update the parameters of an existing model to better suit a new task.
An experiment that we tried during sessions #6 and #7 of Tech Talk for Teachers was to test whether:
- Fine-tuning was possible at all, given that we had not deliberately built data pipelines to facilitate it. Note: this was also towards the tail-end of the December holidays.
- Assuming we had the data, fine-tuning could be meaningful: what would the output of a GPT model sound like after training it?
- The cost was worthwhile: beyond committing time, there is an explicit cost associated with fine-tuning on a large dataset using the tools offered by OpenAI, as you might have seen in the thumbnail above.
The methodology adopted was simply to use existing questions and answers that were available to me. As an H2 Economics teacher, I could choose between Paper 1: Case Study and Paper 2: Essays. The latter was the clear choice, since essays would be transferable from a humanities perspective.
1 Is fine-tuning possible?
Yes. Anyone can do it, and all you need is a CSV file of prompts and responses. You can read more about preparing your data in the documentation here.
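For those curious what that preparation step can look like, here is a minimal sketch that converts a CSV of prompt/response pairs into the JSONL format that OpenAI's legacy fine-tuning endpoint expects. The column names `prompt` and `response` are assumptions about how your CSV is laid out, and the `\n\n###\n\n` separator and ` END` suffix follow OpenAI's data-preparation conventions:

```python
import csv
import json

def csv_to_jsonl(csv_path: str, jsonl_path: str) -> None:
    """Convert a CSV of (prompt, response) pairs into the JSONL
    format used by OpenAI's legacy fine-tuning endpoint.

    Assumes the CSV has 'prompt' and 'response' column headers.
    """
    with open(csv_path, newline="", encoding="utf-8") as f_in, \
         open(jsonl_path, "w", encoding="utf-8") as f_out:
        for row in csv.DictReader(f_in):
            record = {
                # A fixed separator marks the end of the prompt,
                # and completions start with a space and end with
                # a stop sequence, per OpenAI's guidance.
                "prompt": row["prompt"].strip() + "\n\n###\n\n",
                "completion": " " + row["response"].strip() + " END",
            }
            f_out.write(json.dumps(record) + "\n")
```

OpenAI's own CLI can then sanity-check the file for you with `openai tools fine_tunes.prepare_data -f qa.jsonl`, which flags formatting issues before you spend anything on training.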
2 Was fine-tuning meaningful?
Surprisingly, the fine-tuned model started to reference points on a graph, much in the style of the examiners' reports that were used as training input.
Of course, no Economics graphs were generated.
That said, I personally did not find this very helpful. There are clearly still some gaps that make this not quite an 'A'-grade essay answer.
3 Was it worth the explicit costs?
If this came out of one’s own pocket, it would be a little tricky.
At this point, the mini experiment suggests that the results do not justify either the explicit cost of the fine-tuning itself or the implicit cost of the time spent preparing and cleaning the data.
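For a rough sense of how that explicit cost scales, a back-of-envelope estimate is easy to sketch. The per-token price, the four-characters-per-token approximation, and the epoch count below are all placeholders rather than quoted figures; check OpenAI's current pricing page before budgeting:

```python
# Back-of-envelope estimate of explicit fine-tuning cost.
# All three constants are assumptions, not quoted prices:
PRICE_PER_1K_TOKENS = 0.03  # USD, hypothetical training price
CHARS_PER_TOKEN = 4         # rough rule of thumb, not a tokeniser
N_EPOCHS = 4                # training passes over the dataset

def estimate_cost(total_chars: int) -> float:
    """Estimate training cost in USD for a dataset of total_chars."""
    tokens = total_chars / CHARS_PER_TOKEN
    return tokens / 1000 * PRICE_PER_1K_TOKENS * N_EPOCHS

# e.g. 100 essays of ~5,000 characters each comes to about US$15
# under these assumed numbers.
print(round(estimate_cost(100 * 5_000), 2))
```

Even at hypothetical prices, the point stands: the dollar cost is modest but non-zero, while the data-cleaning time dominates.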
Again, this is from the perspective of an average teacher, and I am sure there are other use cases for a fine-tuned model.
You can test and play with my fine-tuned model here. Note: you will need an OpenAI account (which you can create for free :))
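If you would rather query a fine-tuned model from code than from the playground, it is an ordinary completion request against your fine-tuned model id. The sketch below only builds the request parameters; the model id is a placeholder, and the separator and stop sequence are assumptions that should mirror whatever you used in your training data:

```python
def build_request(question: str, model_id: str) -> dict:
    """Build keyword arguments for a completion call against a
    fine-tuned model. model_id is whatever id your fine-tune job
    returned (the one below in the usage note is hypothetical)."""
    return {
        "model": model_id,
        # End the prompt with the same separator used in training
        "prompt": question.strip() + "\n\n###\n\n",
        "max_tokens": 400,
        "temperature": 0.2,
        # Stop at the same suffix the training completions ended with
        "stop": [" END"],
    }
```

With the `openai` Python package (v0.x interface) and an API key set, you would then send it with something like `openai.Completion.create(**build_request("Discuss price elasticity.", "davinci:ft-your-org-2023-01-01"))`, where the model id is again a made-up example.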