Microsoft’s Orca 2: A Leap Forward in AI Reasoning

Akash Nagarkar
Published in ML@ABEJA
Jan 26, 2024

Introduction

In June 2023, Microsoft made waves in the AI community with the release of a groundbreaking paper, "Orca: Progressive Learning from Complex Explanation Traces of GPT-4". In that paper, the researchers showed that a smaller 13-billion-parameter model can outperform much bigger models, including ChatGPT and Bard, on certain benchmarks. Recently, Microsoft announced Orca 2, releasing not only a research paper but also the model weights, which is great news for the open-source community. In this blog post, we will go through the technical details of Orca 2, the second version of Microsoft's AI model, equipped with remarkable reasoning capabilities.

Orca 2: A Closer Look

The key highlight of Orca 2 is its ability to outperform much larger models. With only 13 billion parameters, Orca 2 consistently outperforms 70-billion-parameter models on reasoning benchmarks, showcasing its impressive prowess.

Understanding Orca 2’s Training Approach

The paper, titled "Orca 2: Teaching Small Language Models How to Reason," focuses on enhancing the reasoning abilities of smaller models. Researchers at Microsoft contend that excessive emphasis on imitating large models may restrict the potential of smaller models. Unlike traditional approaches that rely on imitation learning, Orca 2 trains the model on a variety of reasoning techniques, including step-by-step reasoning, recall-then-generate, direct answer generation, and solution strategies tailored to different tasks. The emphasis is on teaching smaller models to choose an effective solution strategy for each task rather than simply imitating larger models, as illustrated in the sketch below.
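To make this idea concrete, here is a minimal, hypothetical sketch of how strategy-specific training examples could be organized. The field names, instructions, and tasks below are purely illustrative and are not the actual Orca 2 training data format described in the paper.

# Illustrative only: each example pairs a task with a strategy-specific
# system instruction; this is NOT the real Orca 2 training format.
training_examples = [
    {
        "strategy": "step-by-step",
        "system_instruction": "Think through the problem step by step before answering.",
        "task": "If a train travels 60 km in 45 minutes, what is its average speed in km/h?",
    },
    {
        "strategy": "direct-answer",
        "system_instruction": "Answer concisely without showing intermediate reasoning.",
        "task": "What is the capital of Japan?",
    },
    {
        "strategy": "recall-then-generate",
        "system_instruction": "First recall the relevant facts, then compose the answer.",
        "task": "Summarize the main causes of the 2008 financial crisis.",
    },
]

# At inference time the strategy hint is not given, so the smaller model
# must learn to pick a suitable strategy on its own.
for example in training_examples:
    print(example["strategy"], "->", example["task"])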

Benchmarks and Performance

Orca 2's performance speaks volumes about its training methodology. The benchmark results reported in the paper showcase its superior reasoning capabilities: the 13-billion-parameter Orca-2 model consistently outperforms the 70-billion-parameter LLaMA-2 model on almost all of the reasoning benchmarks. The 70-billion-parameter WizardLM outperforms Orca-2 on the GSM8K benchmark, but Orca-2-13B is on par on the other metrics. These results highlight the effectiveness of the unique training strategies employed by Microsoft.

Running Orca 2 Locally

For those eager to explore Orca 2 locally, the process is simple. Using the Transformers package in Python, you can load the model weights and unleash its capabilities. Alternatively, tools such as LM Studio provide a user-friendly interface for testing large language models. In this blog, let us walk through the step-by-step process of running inference with the Hugging Face library.

Step 1: Import and set-up

import torch
import transformers

# Run on the GPU if one is available, otherwise fall back to the CPU.
if torch.cuda.is_available():
    torch.set_default_device("cuda")
else:
    torch.set_default_device("cpu")

Step 2: Load the Orca-2-13b language model using the AutoModelForCausalLM class from the Hugging Face Transformers library.

model = transformers.AutoModelForCausalLM.from_pretrained("microsoft/Orca-2-13b", device_map='auto')
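Note that the full-precision 13-billion-parameter checkpoint requires a substantial amount of GPU memory. If memory is tight, one common option (assuming a recent version of Transformers) is to load the weights in half precision, as in the sketch below.

# Optional: load the weights in half precision to roughly halve memory use
# compared to float32. The values here are illustrative, not prescriptive.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "microsoft/Orca-2-13b",
    device_map="auto",
    torch_dtype=torch.float16,
)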

Step 3: Load the tokenizer associated with the Orca-2-13b model. The use_fast=False argument ensures that the slow tokenizer is used.

tokenizer = transformers.AutoTokenizer.from_pretrained(
    "microsoft/Orca-2-13b",
    use_fast=False,
)

Step 4: Create a conversation prompt by combining system, user, and assistant messages.

system_message = "You are an AI language model. You are a cautious and polite \
assistant. Please carefully follow the user's instructions."

user_message = "What came first, the chicken or the egg?"

prompt = f"system\n{system_message}\nuser\n{user_message}\nassistant"

Step 5: Tokenize the prompt and generate a response from the Orca-2–13b model.

# Tokenize the prompt and generate a completion from the model.
inputs = tokenizer(prompt, return_tensors='pt')
output_ids = model.generate(inputs["input_ids"])

# Decode the generated token IDs back into text.
answer = tokenizer.batch_decode(output_ids)[0]

print(answer)
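By default, generate falls back to the model's default generation settings. The sketch below shows the same step with a few explicit parameters; the values are illustrative choices, not recommendations from the Orca 2 authors.

# Generation with explicit settings; the values below are illustrative.
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    inputs["input_ids"],
    max_new_tokens=256,   # cap the length of the generated answer
    do_sample=False,      # greedy decoding for reproducible output
)

# Decode only the newly generated tokens and drop special tokens such as <|im_end|>.
new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
answer = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(answer)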

In summary, the code above demonstrates how to use the Microsoft Orca-2-13b model for conversational AI tasks, handling a conversation and generating responses based on user input. From philosophical inquiries like the chicken-and-egg question to practical questions about AI investments, Orca 2 provides insightful and well-reasoned responses; its ability to analyze market criteria and offer potential investment considerations demonstrates its practical utility.

Orca 2’s Licensing and Availability

It's crucial to note that Orca 2 is released under the Microsoft Research License for research purposes only. While Orca 2 itself is not licensed for commercial use, the base Llama 2 model, on which Orca 2 is fine-tuned, is licensed for commercial purposes. Understanding the licensing terms will ensure compliance with usage guidelines.

Conclusion

Microsoft’s Orca 2 emerges as a game-changer in the realm of AI reasoning, pushing the boundaries of what smaller models can achieve. The combination of innovative training techniques, impressive benchmarks, and user-friendly accessibility makes Orca 2 a valuable asset for researchers and developers alike. As we explore the landscape of AI advancements, Orca 2 stands as a testament to Microsoft’s commitment to pushing the frontiers of artificial intelligence.

References

  1. Microsoft Research (2023). "Orca: Progressive Learning from Complex Explanation Traces of GPT-4." arXiv:2306.02707.
  2. Microsoft Research (2023). "Orca 2: Teaching Small Language Models How to Reason." arXiv:2311.11045.
