How to Fine-tune, Quantize, and Run Microsoft phi-1.5

A model pre-trained for many tasks

Benjamin Marie
9 min read · Oct 4, 2023

Microsoft released phi-1.5, a new large language model (LLM) with 1.3 billion parameters.

It’s 5.4 times smaller than the smallest Llama 2 model (Llama 2 7B). Yet, according to the evaluation conducted by Microsoft and published on arXiv, phi-1.5 significantly outperforms Llama 2 on several tasks.

Given its relatively small size and the claimed performance, phi-1.5 is a good candidate LLM for affordable AI.

In this article, I look at what could explain this performance: how the model was trained and what training data was used. I also show how to fine-tune, quantize, and run the model, and I benchmark its memory consumption and inference speed.
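Before getting into the details, here is a minimal sketch, not the article’s exact code, of what loading and running phi-1.5 looks like with Hugging Face Transformers. The 4-bit quantization via bitsandbytes and the generation settings are illustrative assumptions; the model ID microsoft/phi-1_5 is the public Hub checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/phi-1_5"  # public Hub checkpoint for phi-1.5

# Optional: load the weights in 4-bit with bitsandbytes to reduce memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # phi-1.5 shipped with custom modeling code at release
)

# phi-1.5 was trained on a lot of code, so a code prompt is a natural test.
prompt = 'def print_primes(n):\n    """Print all prime numbers up to n."""'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Dropping the quantization_config argument loads the model in full precision, which is a useful baseline when benchmarking memory consumption and inference speed.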

This article was originally published in The Kaitchup, my newsletter.

To read more articles like this and support my work, consider subscribing to The Kaitchup.

phi-1.5: The Power of Distillation
