A Review of Microsoft Phi-3: A Pocket-Sized GPT-3.5

Gökhan Gerdan
bimser.tech

--

In today’s rapidly escalating AI race, language models have become the most intriguing advancement in the field. Given the very high processing power requirements of Large Language Models (LLMs), Small Language Models (SLMs) are becoming increasingly popular in the AI community. After the success of Google’s Gemma and Microsoft’s Phi-2 models, Microsoft recently published its latest models: the Phi-3 series.

At Bimser, we are also taking advantage of Microsoft’s latest Phi-3 models to develop AI tools for our privacy-sensitive use cases. In this article we share our experience with the Phi-3 models and reveal some of our use cases.

About Phi-3 Series Models

As we write this review, Microsoft’s latest Phi-3 series offers several options, as you can see below. There are small and large context-size variants with different parameter counts: 3.8B for mini, 7B for small, and 14B for medium. One also cannot ignore Microsoft’s most intriguing addition to this list: a vision model with image-understanding capabilities.

Phi-3 series model variants

From the mini version to the rest of the Phi-3 models, all of them outperform their similarly sized alternatives. On top of that, these models are even comparable to OpenAI’s latest models in their reasoning capabilities.

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Advantages Of Using Phi-3-mini Over Its Larger Alternatives

In this part we focus on the smallest variant of the Phi-3 series, Phi-3-mini, due to its lower processing power requirement.

It is always easier to maintain and, in many cases, even cheaper to use LLMs as paid third-party services, such as OpenAI’s API. But in some cases you may not be comfortable sending any data over the internet. This is where smaller language models with strong reasoning capabilities, like Phi-3-mini, shine: their lower processing power requirements make it feasible to deploy them locally.

But Phi-3-mini is not perfect either. Despite all the benefits discussed above, the model is limited by its size in how much factual knowledge it can store. This issue can be mitigated by using a search engine to retrieve relevant context and feeding the model context-infused prompts, a pattern commonly known as retrieval-augmented generation (RAG).
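The context-infused prompting idea can be sketched in a few lines. This is a minimal, hypothetical example: the retrieval step is a naive keyword overlap over an in-memory document list, standing in for a real search engine or vector store.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by naive keyword overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, documents):
    """Assemble a context-infused prompt for the model."""
    context = "\n".join(retrieve(query, documents))
    return (
        f"Context:\n{context}\n\n"
        f"Answer the question below using the context above:\nQ: {query}"
    )

docs = [
    "The flow start object specifies the starting point of the flow.",
    "Quarterly sales figures are stored in the finance database.",
]
print(build_prompt("How to use the flow start object?", docs))
```

In a production setup the `retrieve` step would be replaced by an actual search index, but the prompt-assembly pattern stays the same.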

Some Use Cases With Comparisons To GPT-3.5

Translator

In this case we compared Phi-3-mini to GPT-3.5-turbo-1106 on a Turkish-to-English translation task. We gave the model some UI component names from Bimser’s Synergy CSP application to translate from Turkish to English, as shown in the prompt below:

Translate following flow item names from Turkish to English:
Akış Başlangıcı
Ünvan
Akış Bitişi
Akış Devam Ettirici
Akış Tetikleyici
Atama
Karşılaştırma
Departman
Değişken
Akış Durumu
Paralel Akış
Ve Operatörü
Veya Operatörü
Düğüm Girişi
Düğüm Çıkışı
Yorum

Answer of GPT-3.5-turbo-1106:

Flow Start
Title
Flow End
Flow Continuer
Flow Trigger
Assignment
Comparison
Department
Variable
Flow Status
Parallel Flow
And Operator
Or Operator
Node Input
Node Output
Comment

Answer of Phi-3-mini-4k-instruct:

Source flow item names in Turkish:
Akış Başlangıcı - Start of Flow
Ünvan - Title/Section
Akış Bitişi - End of Flow
Akış Devam Ettirici - Continuation of Flow
Akış Tetikleyici - Pause in Flow
Atama - Decision
Karşılaştırma - Comparison
Departman - Department/Section (in the context of a flow diagram)
Değişken - Variable
Akış Durumu - State
Paralel Akış - Parallel Flow
Ve Operatörü - And Operator
Veya Operatörü - Or Operator
Düğüm Girişi - Loop Start
Düğüm Çıkışı - Loop End
Yorum - Comment/Note

Please note that "Departman" in this context is likely referring to a section or module within the flow diagram, rather than an actual department. The translation can vary based on specific usage scenarios and domain-specific jargon.

Coding

In this case we compared Phi-3-mini to GPT-3.5-turbo-1106 on a code editing task. We gave the model a piece of Python code and described the change we wanted, as shown in the prompt below:

Edit this Python code to use list comprehension instead of a for loop:
numbers = [1,2,3,4,5,6,7,8]
odd_numbers = []
for item in numbers:
    if item % 2 == 1:
        odd_numbers.append(item)
print(odd_numbers)

Answer of GPT-3.5-turbo-1106:

Here's the Python code using list comprehension instead of a for loop:

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8]
odd_numbers = [item for item in numbers if item % 2 == 1]
print(odd_numbers)
```

Answer of Phi-3-mini-4k-instruct:

You can use list comprehension to achieve the same result with a more concise and efficient code as follows:

```python
numbers = [1,2,3,4,5,6,7,8]
odd_numbers = [item for item in numbers if item % 2 == 1]
print(odd_numbers)
```

JSON Outputs

In this case we compared Phi-3-mini to GPT-3.5-turbo-1106 on a JSON output task. We combined this task with the translation task and asked the model to return the translations in JSON format, as shown in the prompt below:

Translate following flow item names from Turkish to English create a json with translations as answer:
Akış Başlangıcı
Ünvan
Akış Bitişi
Akış Devam Ettirici
Akış Tetikleyici
Atama
Karşılaştırma
Departman
Değişken
Akış Durumu
Paralel Akış
Ve Operatörü
Veya Operatörü
Düğüm Girişi
Düğüm Çıkışı
Yorum

Answer of GPT-3.5-turbo-1106:

{
"Akış Başlangıcı": "Flow Start",
"Ünvan": "Title",
"Akış Bitişi": "Flow End",
"Akış Devam Ettirici": "Flow Continuer",
"Akış Tetikleyici": "Flow Trigger",
"Atama": "Assignment",
"Karşılaştırma": "Comparison",
"Departman": "Department",
"Değişken": "Variable",
"Akış Durumu": "Flow Status",
"Paralel Akış": "Parallel Flow",
"Ve Operatörü": "And Operator",
"Veya Operatörü": "Or Operator",
"Düğüm Girişi": "Node Input",
"Düğüm Çıkışı": "Node Output",
"Yorum": "Comment"
}

Answer of Phi-3-mini-4k-instruct:

{
"StartTransaction": "Akış Başlangıcı",
"Section": "Ünvan",
"EndTransaction": "Akış Bitişi",
"RollbackTransaction": "Akış Devam Ettirici",
"Commit Transaction": "Akış Tetikleyici",
"Abort": "Atama",
"Search for Matches": "Karşılaştırma",
"Department": "Departman",
"Variable": "Değişken",
"Transaction Status": "Akış Durumu",
"Parallel Transaction": "Paralel Akış",
"And Operator": "Ve Operatörü",
"Or Operator": "Veya Operatörü",
"Start Rollback": "Düğüm Girişi",
"End Rollback": "Düğüm Çıkışı"
}
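When a model’s JSON output may be incomplete or have its keys reversed, as in the Phi-3-mini answer above, it helps to validate the response before using it. A minimal sketch, assuming the expected keys are the original Turkish terms (the function names here are illustrative, not part of any library):

```python
import json

def validate_translations(raw_response, expected_keys):
    """Parse a model's JSON response and report missing or unexpected keys."""
    data = json.loads(raw_response)  # raises ValueError on malformed JSON
    missing = [k for k in expected_keys if k not in data]
    extra = [k for k in data if k not in expected_keys]
    return data, missing, extra

expected = ["Akış Başlangıcı", "Yorum"]
response = '{"Akış Başlangıcı": "Flow Start"}'
data, missing, extra = validate_translations(response, expected)
print(missing)  # keys the model left out
```

If validation fails, the caller can re-prompt the model with the list of missing keys, which is one way the prompt-tuning fix mentioned in the conclusion could be automated.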

Chatting With Your Local Knowledge Base

In this case we compared Phi-3-mini to GPT-3.5-turbo-1106 on a Q/A task. We gave the model a question and a context to answer from, as shown in the prompt below:

Context:
The flow start object is the object that specifies the starting point of the flow of the designed process. This object should be at the top of every flow design. Other objects to be used in the flow design are ordered after this flow start object and relate to each other with the connection object. Only 1 in-stream object is allowed on a flow screen.
While the authoring interface is active screen flow, the Toolbox panel lists flow objects. Sliding a Start Stream object from the Toolbox panel and dropping it onto the flow screen adds the object to the stream. Clicking the start stream object will list the properties that the object has in the Property Viewer panel.

Answer questions below using the context above:
Q1: How to use the Flow Start object?

Answer of GPT-3.5-turbo-1106:

To use the Flow Start object, you can drag it from the Toolbox panel and drop it onto the flow screen. This will add the object to the stream. Once added, you can click on the start stream object to view its properties in the Property Viewer panel. The Flow Start object should be placed at the top of every flow design as it specifies the starting point of the flow.

Answer of Phi-3-mini-4k-instruct:

A1: To use the Flow Start Object, you should follow these steps while working within the active screen flow and Toolbox panel of an authoring interface:

1. Ensure that your design process has a single in-stream (start stream) on each flow screen as per the guidelines provided. This start stream object will serve as the starting point for your designed process.
2. Locate the Flow Start Object within the Toolbox panel, which lists all available flow objects you can use in your design.
3. Select the Flow Start Object by clicking on it to add it to a flow screen. Once selected, drag and drop it onto an empty area of a flow screen, effectively adding it as the starting point for your process.
4. After placing the Flow Start Object on the flow screen, click on it to open its properties in the Property Viewer panel. This will allow you to review or modify any settings related to this object according to your design needs.
5. Continue designing your process by adding other objects listed in the Toolbox panel, which can be connected using Connection Objects and must follow the Flow Start Object as their starting point.

Remember that each flow screen should only contain one in-stream (Flow Start) object to ensure a well-organized and properly functioning designed process.

Performance of Locally Deployed Phi-3-mini Model On Legacy Hardware

We deployed the int4-quantized version of the Phi-3-mini model on a machine with an Intel® Core™ i7–8850H CPU using the llama.cpp Python bindings. For the example use cases above, token counts and response times are shown in the Response Time Table below:

Response Time Table
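For reference, a minimal sketch of what such a local deployment might look like with the llama-cpp-python bindings. The model path and generation parameters are illustrative, and we assume a GGUF build of Phi-3-mini using the instruct chat template described in the model card (`<|user|> … <|end|> <|assistant|>`):

```python
def phi3_prompt(user_message):
    """Wrap a user message in the Phi-3 instruct chat template (assumed format)."""
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>"

def main():
    # Requires: pip install llama-cpp-python, plus a local GGUF model file.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Phi-3-mini-4k-instruct-q4.gguf",  # hypothetical local path
        n_ctx=4096,   # matches the 4k-context variant
        n_threads=6,  # tune to the available CPU cores
    )
    output = llm(
        phi3_prompt("Translate 'Akış Başlangıcı' from Turkish to English."),
        max_tokens=64,
        stop=["<|end|>"],
    )
    print(output["choices"][0]["text"])

# main()  # uncomment to run against a locally downloaded model
```

Keeping generation CPU-only, as in our tests, trades latency for the ability to run on commodity hardware without a GPU.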

Conclusion

As seen in the use case results, Phi-3-mini does not perform as well as GPT-3.5 on the Turkish-to-English translation task. This may indicate a lack of Turkish examples in the training data, and results could likely be improved by fine-tuning the model on more Turkish examples.

Even though the coding example is fairly simple, we cannot ignore that Phi-3-mini’s answer is essentially identical to GPT-3.5’s, which is very promising.

In the JSON output example, the last item is missing and the key-value pairs are reversed (English terms as keys, Turkish as values), but this can likely be resolved with prompt tuning.

In the knowledge-base Q/A example, we received a reasonably detailed and on-point answer from Phi-3; this answer could be improved with prompt tuning as well.

As seen in the Response Time Table, response times are very reasonable considering that generation was done using only the CPU.

References

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
