LLM Pitfalls
An introduction to some of the key components that turn LLMs into production-grade applications
Introduction
Since the rise of ChatGPT, Large Language Models (LLMs) have become increasingly popular, even among non-technical people. On their own, however, LLMs cannot yet provide a full product ready to be served to a vast audience. In this article, we will cover some of the key elements used to make LLMs production-ready.
Fine-tuning
Datasets
Models like LLaMA can predict the next token in a sequence, but that alone doesn’t necessarily make them well suited for tasks such as question answering. Therefore, in order to adapt these models, different types of datasets can be used:
- Raw completion: if the goal is next-token prediction, we provide some input text and let the model progressively generate the continuation.
- Fill-in-the-middle objective: in this case we have some starting and ending text, and the model learns to fill the gap. This approach is quite popular for building code-completion models like Codex.
- Instruction datasets: the goal here is to teach the model how to answer questions. We have questions (instructions) as…
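To make the three dataset formats above concrete, here is a minimal sketch of what individual training records might look like. The field names and the sentinel tokens (`<PRE>`, `<SUF>`, `<MID>`) are illustrative assumptions, not any particular library's format; real fill-in-the-middle tokenizers define their own special tokens.

```python
def raw_completion_record(text: str) -> dict:
    # Raw completion: the model simply learns to continue this text.
    return {"text": text}


def fill_in_middle_record(prefix: str, middle: str, suffix: str) -> dict:
    # Fill-in-the-middle: the prefix and suffix are given in the prompt
    # and the model learns to produce the missing middle span.
    # The sentinel tokens below are assumptions for illustration.
    prompt = f"<PRE>{prefix}<SUF>{suffix}<MID>"
    return {"prompt": prompt, "completion": middle}


def instruction_record(instruction: str, answer: str) -> dict:
    # Instruction tuning: pairs of instructions (questions) and answers.
    return {"instruction": instruction, "output": answer}


if __name__ == "__main__":
    rec = fill_in_middle_record(
        "def add(a, b):\n", "    return a + b\n", "\nprint(add(1, 2))"
    )
    print(rec["prompt"])
```

In practice, a fine-tuning pipeline would serialize many such records (for example, as JSONL) and tokenize them with the model's own special tokens before training.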