How to Fine-tune, Quantize, and Run Microsoft phi-1.5

A model pre-trained for many tasks

Benjamin Marie
9 min read · Oct 4, 2023

Microsoft released phi-1.5, a new large language model (LLM) with 1.3 billion parameters.

It’s 5.4 times smaller than the smallest Llama 2 model (Llama 2 7B). Yet, according to the evaluation conducted by Microsoft and published on arXiv, phi-1.5 significantly outperforms Llama 2 on several tasks.

Given its relatively small size and the claimed performance, phi-1.5 is a good candidate LLM for affordable AI.

In this article, I look at what could explain this performance: how the model was trained and what training data was used. I also show how to fine-tune, quantize, and run the model, and I benchmark its memory consumption and inference speed.
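Before getting into the details, here is a minimal sketch, not the article’s exact code, of what loading and running phi-1.5 looks like with Hugging Face Transformers. The 4-bit quantization via bitsandbytes and the generation settings are illustrative assumptions; the model ID microsoft/phi-1_5 is the public Hub checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/phi-1_5"  # public Hub checkpoint for phi-1.5

# Optional: load the weights in 4-bit with bitsandbytes to reduce memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # phi-1.5 shipped with custom modeling code at release
)

# phi-1.5 was trained on a lot of code, so a code prompt is a natural test.
prompt = 'def print_primes(n):\n    """Print all prime numbers up to n."""'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Dropping the quantization_config argument loads the model in full precision, which is a useful baseline when benchmarking memory consumption and inference speed.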

This article was originally published in The Kaitchup, my newsletter.

To read more articles like this and support my work, consider subscribing to The Kaitchup.

phi-1.5: The Power of Distillation
