LLM customization with Low-Rank Adaptation (LoRA)

Valentina Alto
9 min read · Mar 19, 2024

Large Language Models (LLMs) such as GPT-4, Llama 2 and Mistral feature a huge number of parameters (from billions up to hundreds of billions) and are designed to be general purpose. If you want to adapt them to a specific task, the most straightforward way is to "ask" them to perform it. For example, if you want your LLM to be your sentiment analyzer, you might instruct it with a prompt like "Act as a sentiment analyzer and classify reviews into positive, neutral or negative". There are more advanced techniques involving prompt engineering and Retrieval Augmented Generation (RAG), yet they all have one thing in common: the underlying model remains unmodified.
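To make this concrete, here is a minimal sketch of prompt-based task adaptation using the OpenAI Python client. The model name and the sample review are illustrative assumptions; the point is that no model weights are touched, only the instructions change.

```python
# A minimal sketch: adapting a general-purpose LLM to sentiment analysis
# purely via prompting. Model name and review text are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": "Act as a sentiment analyzer and classify reviews "
                       "into positive, neutral or negative.",
        },
        {
            "role": "user",
            "content": "The battery lasts two days, but the screen scratches easily.",
        },
    ],
)
print(response.choices[0].message.content)  # e.g. "neutral"
```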

However, there are scenarios where the approaches above are not enough, and we may want a model specialized for our task while still leveraging the broad capabilities of LLMs. To do so, we can turn to fine-tuning, a powerful technique that allows practitioners to adapt pre-trained models to specific downstream tasks.

Traditional fine-tuning involves further training a pre-trained LLM on task-specific data so that it learns task-specific patterns. While effective, traditional full fine-tuning can still be impractical, due to the huge number of parameters to adjust and then store. Even if we decide to "freeze" some parameters and fine-tune just a subset…
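This is where LoRA comes in: instead of updating the full weight matrices, it freezes the pre-trained weights and learns a low-rank decomposition of the update, so only a tiny fraction of parameters is trained and stored. Below is a minimal sketch using Hugging Face's peft library; the base model name and the hyperparameters (rank, alpha, target modules) are illustrative assumptions, not a definitive recipe.

```python
# A minimal sketch of applying LoRA with Hugging Face's peft library.
# Base model and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Wrap the frozen base model with trainable low-rank adapters.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
# e.g. trainable params: ~3.4M || all params: ~7B || trainable%: ~0.05
```

With this setup, only the small adapter matrices are updated during training, while the original 7-billion-parameter weights stay frozen; the resulting adapter checkpoint is a few megabytes rather than many gigabytes.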
