AI Copilot with Phi-3 4bit ONNX Models using wxPython

Alex Buzunov · Published in CodeX · 4 min read · 4 days ago

The Phi-3 ONNX Copilot is a tool for developers and data scientists that merges the power of AI with practical coding assistance, designed to enhance productivity and streamline code development. Whether you’re debugging complex functions or refining your code to adhere to best practices, the Phi-3 ONNX Copilot stands as a reliable companion in your coding journey. This article introduces its key features, operational guidelines, and the impact it can have on coding efficiency and accuracy. Welcome to the future of intelligent coding assistance.

After the required configuration, we can use a simple wxPython UI to access the chat and copilot functionality.
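The three-tab layout described in the sections below (Chat, Hist, Copilot) could be sketched with a `wx.Notebook`. This is a minimal illustration, not the application's actual code; the frame title, size, and tab names here are assumptions based on the article.

```python
# Minimal sketch of a three-tab wxPython layout: Chat / Hist / Copilot.
TABS = ("Chat", "Hist", "Copilot")

def build_ui():
    # wx is imported lazily so the module can be inspected without a display.
    import wx

    app = wx.App(False)
    frame = wx.Frame(None, title="Phi-3 ONNX Copilot", size=(900, 600))
    notebook = wx.Notebook(frame)
    for name in TABS:
        # Each tab gets an empty panel; the real app would add
        # input fields, a response view, and the model settings.
        notebook.AddPage(wx.Panel(notebook), name)
    frame.Show()
    app.MainLoop()
```

Calling `build_ui()` starts the event loop; the real application would populate each panel with its input and output widgets.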

Chat

This tab lets you chat with the model without conversation history.

Model Settings and Response

Below the input field, the model settings are displayed, which include parameters such as:

  • do_sample
  • max_length
  • min_length
  • top_p
  • top_k
  • temperature
  • repetition_penalty

The values for these parameters are set as follows:

  • do_sample: False
  • max_length: 2048
  • min_length: 1
  • top_p: 0.9
  • top_k: 50
  • temperature: 1.2
  • repetition_penalty: 1.1
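The defaults above can be captured in a plain dict and merged with user overrides before a generation call. With onnxruntime-genai, such a dict would typically be unpacked into `GeneratorParams.set_search_options(**opts)`; that wiring is an assumption here and depends on the library version, so this sketch only shows the settings handling itself.

```python
# Default generation settings, mirroring the values shown in the UI.
DEFAULT_SETTINGS = {
    "do_sample": False,
    "max_length": 2048,
    "min_length": 1,
    "top_p": 0.9,
    "top_k": 50,
    "temperature": 1.2,
    "repetition_penalty": 1.1,
}

def search_options(overrides=None):
    """Merge user overrides into the defaults, rejecting unknown keys."""
    opts = dict(DEFAULT_SETTINGS)
    for key, value in (overrides or {}).items():
        if key not in opts:
            raise KeyError(f"unknown setting: {key}")
        opts[key] = value
    return opts
```

For example, `search_options({"temperature": 0.7})` returns the full settings dict with only the temperature changed.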

Model List

Here is the list of models available in the Phi-3 ONNX Copilot application:

  1. mini_dml_4k\directml\directml-int4-awq-block-128
  2. mini_dml_128k\directml\directml-int4-awq-block-128
  3. medium_dml_4k\directml-int4-awq-block-128
  4. medium_dml_128k\directml-int4-awq-block-128

These models differ in size and capability: both “mini” and “medium” versions are available in 4k and 128k context configurations, each using DirectML with int4 AWQ (activation-aware weight quantization) in blocks of 128.
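A model picker in the UI could map these entries to on-disk directories. The relative paths below are copied verbatim from the list above; the base directory is a hypothetical placeholder, not a path from the article.

```python
from pathlib import PureWindowsPath

# Relative paths exactly as listed above (Windows-style separators).
MODELS = {
    "mini_dml_4k": r"mini_dml_4k\directml\directml-int4-awq-block-128",
    "mini_dml_128k": r"mini_dml_128k\directml\directml-int4-awq-block-128",
    "medium_dml_4k": r"medium_dml_4k\directml-int4-awq-block-128",
    "medium_dml_128k": r"medium_dml_128k\directml-int4-awq-block-128",
}

def model_dir(name, base=r"C:\models"):
    """Resolve a friendly model name to a full directory path.

    The base directory is an assumption for illustration only.
    """
    return str(PureWindowsPath(base) / MODELS[name])
```

`PureWindowsPath` keeps the backslash separators consistent even when the resolution code itself runs on another platform.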

Chat History Support

The ‘Hist’ tab is designed to keep a record of all previous interactions with the model. This allows users to refer back to previous questions and answers, making it easier to track the flow of conversation and revisit important information. Here’s how it works:

  • Session Logging: Every user query and the corresponding model response is logged and displayed in the ‘Hist’ tab.
  • Indexed Responses: Responses are numbered, allowing users to reference specific answers, as in the query “tell me more about #1”.
  • Retrieval: Users can quickly retrieve and review previous interactions, which is particularly useful for long-term projects or detailed inquiries.
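The logging-and-reference behavior above can be sketched as a small history class. The class name, internal layout, and the “#N” parsing rule are assumptions modeled on the example query, not the application's actual implementation.

```python
import re

class ChatHistory:
    """Sketch of indexed session logging, as the 'Hist' tab might do it."""

    def __init__(self):
        self._turns = []  # list of (question, answer) pairs

    def log(self, question, answer):
        """Record one exchange; return its 1-based index for display."""
        self._turns.append((question, answer))
        return len(self._turns)

    def resolve_reference(self, text):
        """Expand a '#N' reference (e.g. 'tell me more about #1')
        into the answer it points at, or None if it can't be resolved."""
        match = re.search(r"#(\d+)", text)
        if not match:
            return None
        idx = int(match.group(1)) - 1
        if 0 <= idx < len(self._turns):
            return self._turns[idx][1]
        return None
```

A follow-up query containing “#1” would then be expanded with the first logged answer before being sent back to the model.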

Copilot

The ‘Copilot’ tab in the Phi-3 ONNX Copilot application is specifically tailored to assist users with Python coding queries related to a particular file, in this case, ‘common.py’. Users can ask questions about the code, and the AI model provides detailed feedback and suggestions for improvements. This functionality is highly beneficial for code review, debugging, and adhering to best coding practices. The interface effectively combines the code display with the interactive chat, allowing for a seamless code analysis experience.
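To ground the model's feedback in a specific file such as ‘common.py’, the Copilot tab presumably combines the file contents with the user's question into a single prompt. The template wording below is an assumption for illustration, not the app's actual prompt.

```python
def build_copilot_prompt(file_path, question):
    """Combine a source file with the user's question into one prompt.

    The prompt template here is a hypothetical example, not the
    application's actual wording.
    """
    with open(file_path, encoding="utf-8") as fh:
        code = fh.read()
    return (
        "You are a Python code reviewer.\n"
        f"File: {file_path}\n"
        "```python\n"
        f"{code}\n"
        "```\n"
        f"Question: {question}\n"
    )
```

For long files, a real implementation would also need to truncate or chunk the code so the prompt fits within the chosen context window (4k or 128k).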

Top_k

Clicking the ‘top_k’ text label (highlighted in yellow) opens a help window that explains the ‘top_k’ model input parameter.

Help text is defined for each of these parameters:

  • do_sample
  • max_length
  • min_length
  • top_p
  • top_k (highlighted in yellow and clicked)
  • temperature
  • repetition_penalty

This inline help enhances the user experience by offering immediate, context-specific guidance on the parameters that can be adjusted to tailor the model’s responses.
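The click-for-help behavior can be sketched as a lookup table of parameter descriptions. The help wording below is my own summary of these standard generation parameters, not the text shown in the application's help windows.

```python
# Hypothetical help text for each generation parameter (not the app's wording).
PARAM_HELP = {
    "do_sample": "Sample from the token distribution instead of greedy decoding.",
    "max_length": "Maximum number of tokens in the generated output.",
    "min_length": "Minimum number of tokens before generation may stop.",
    "top_p": "Nucleus sampling: keep the smallest token set whose probability mass exceeds p.",
    "top_k": "Sample only from the k most likely next tokens.",
    "temperature": "Scales the logits; higher values make output more random.",
    "repetition_penalty": "Penalizes tokens that have already appeared in the output.",
}

def show_help(param):
    """Return the help text for a parameter, with a fallback message."""
    return PARAM_HELP.get(param, f"No help available for '{param}'.")
```

In wxPython, this could be wired up by binding a click handler to the label, e.g. `label.Bind(wx.EVT_LEFT_DOWN, lambda evt: wx.MessageBox(show_help("top_k")))`.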

Source

Conclusion

As you embark on this technical journey, keep in mind the importance of efficient memory management, optimized inference, and effective tokenization strategies. Experiment with different model architectures, fine-tune hyperparameters, and explore techniques such as transfer learning and few-shot learning to enhance the performance and adaptability of your AI-powered application.

The field of AI and deep learning is rapidly evolving, and the integration of Phi-3 ONNX models with wxPython showcases the potential for creating interactive and intelligent applications. By leveraging the power of these advanced models and the flexibility of the wxPython GUI framework, developers can unlock new frontiers in natural language processing, code generation, and real-time chat interactions. Embrace the technical intricacies, explore the vast possibilities, and contribute to the advancement of AI-powered applications that redefine the way we interact with and harness the potential of artificial intelligence.
