Phi-3: a new set of small language models from Microsoft

Catherine Chef · Published in Startup Reviews · May 6, 2024 · 2 min read

At the end of April, Microsoft announced the release of its new family of compact AI models, Phi-3. Despite their small size, these small language models (SLMs) match or exceed the capabilities of some of their larger counterparts.

The set includes three models: Phi-3-mini (3.8 billion parameters), Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters).

Phi-3-mini is a 3.8-billion-parameter model trained on 3.3 trillion tokens. Its default context window is 4K tokens and its vocabulary is 32K tokens; in the extended Phi-3-mini-128K version, the window is expanded to 128K tokens. The model scores 69% on the MMLU benchmark and 8.38 on MT-bench. According to the company’s blog, Phi-3-mini-128K is on par with Mixtral 8x7B from Mistral AI and GPT-3.5 from OpenAI at understanding and generating natural-language text, despite being small enough to deploy even on a smartphone.

Phi-3-small has 7 billion parameters and was trained on 4.8 trillion tokens. Its context window is 8K tokens and its vocabulary is 100K tokens.

Phi-3-medium has 14 billion parameters and was trained on 4.8 trillion tokens.

So far, the company has released only Phi-3-mini; the other two models are still in training and will follow soon.

What sets the mini model apart is not only its benchmark performance but also its ability to run on devices of varying power without compromising output quality. Smaller models process data faster while demanding lower operating costs and fewer computing resources.

Phi-3-mini is currently available in both the 4K and 128K variants on the Microsoft Azure cloud platform, as well as on Hugging Face and through Ollama for local use.
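For readers who want to try the released model, here is a minimal sketch using the Hugging Face `transformers` library. The repository ID is the 4K instruct variant Microsoft published on Hugging Face (swap in `microsoft/Phi-3-mini-128k-instruct` for the long-context version); the generation settings are illustrative, not prescriptive.

```python
# Sketch: querying Phi-3-mini locally via Hugging Face transformers.
# Requires `pip install transformers torch`; the first generate() call
# downloads the model weights from the Hub.
MODEL_ID = "microsoft/Phi-3-mini-4k-instruct"  # or "microsoft/Phi-3-mini-128k-instruct"

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat format the instruct model expects."""
    return [{"role": "user", "content": user_prompt}]

def generate(user_prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and generate a reply (heavy deps imported lazily)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Apply the model's chat template so the prompt matches its training format.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate("Explain what an SLM is in one sentence.")` on a machine with enough RAM is all it takes; because the model fits in a few gigabytes, no GPU cluster is required.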

The developers specifically emphasized the benefits that small language models can offer when trained well. In particular, Phi-3-mini could pave the way for higher-performing compact models that run on mobile devices. At this stage of development, their performance does not match that of large competitors such as GPT-4, but the gap between large and small models may narrow significantly in the foreseeable future.
