Part II (POC) — Beyond the Buzz: Highlighting the Impact of AI in Modernizing Applications

Straw man with ideas (courtesy: Pixabay)

In this second part of the blog, I put the approach I discussed in Part 1 into action. I kept a few important considerations in mind while working on this Proof of Concept (POC):

  1. Self-contained Setup: I wanted the entire system to be self-contained, meaning it could be deployed and run on my personal computer without relying on external services.
  2. Lightweight Frameworks: I chose lightweight frameworks to keep the technology stack simple and efficient, ensuring the system doesn’t demand excessive resources.
  3. Easy Deployment and Demonstration: I aimed for a setup that could be deployed and demonstrated quickly, ideally in under an hour, making it convenient for showcasing the proof of concept to others.
  4. Simplicity is Key: I intentionally kept things uncomplicated, using only the essential technologies needed to make it work and avoiding unnecessary complexity. This keeps the solution straightforward and easy to understand.

With these considerations in mind, I successfully created a basic proof-of-concept, often referred to as a “straw man.” I should note that this version may involve some unconventional methods or “hacks,” and I’ve provided a detailed explanation of each step in the blog post. As always, fasten your seatbelts and enjoy the journey through the details of the implementation.

Implementation Details and Frameworks Used:

In this Proof of Concept (POC), I utilized the following frameworks and tools to bring the project to life:

  1. Python: Python served as the main programming language for this project. It is an interpreted, object-oriented, and high-level programming language known for its simplicity and versatility.
  2. Llama.cpp: Llama.cpp is a C/C++ port of Llama. It allows the local execution of Llama 2 using 4-bit integer quantization on Macs. Additionally, Llama.cpp offers support for Linux and Windows operating systems.
  3. Langchain: LangChain is a framework designed for developing applications powered by language models. It adds context awareness and reasoning capabilities to AI applications, enhancing their functionality.
  4. Flask-RESTful: This is an extension of Flask, a web framework for Python. Flask-RESTful provides the necessary components for building robust and effective REST APIs (Application Programming Interfaces).
  5. Streamlit: Streamlit is an open-source, Python-based framework built for machine learning engineers. It facilitates the rapid development and sharing of web applications, making it a valuable tool for projects like this.

These frameworks and tools were chosen with the POC’s goals in mind, prioritizing efficiency, compatibility, and ease of integration. Each played a crucial role in implementing and demonstrating the proof of concept. As a first taste of how they fit together, the sketch below shows Llama 2 running locally through LangChain and Llama.cpp.
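The following is a minimal sketch only, assuming the llama-cpp-python package is installed and a 4-bit quantized Llama 2 model has been downloaded locally; the model path and parameter values are placeholders, not the POC’s exact configuration.

```python
# A minimal sketch: running Llama 2 locally through LangChain's LlamaCpp
# wrapper (backed by llama.cpp). The model path and settings below are
# illustrative placeholders, not the exact configuration used in the POC.
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="./models/llama-2-7b-chat.Q4_K_M.gguf",  # local 4-bit quantized model
    n_ctx=2048,       # context window size
    temperature=0.1,  # low temperature for more deterministic answers
)

print(llm("In one sentence, what is application modernization?"))
```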

From the implementation perspective, the steps outlined in Part 1 were followed as closely as possible for this Proof of Concept (POC). Here’s a breakdown of the process:

  1. Select a Pre-trained Open Source Model: I opted for Llama 2 due to its popularity and the added advantage of local CPU execution via Llama.cpp.
  2. Fine-tune the Model ONCE with Enterprise Data: I chose to SKIP this step for this version of the POC due to the time commitment and work involved. Skipping it had significant consequences, which I addressed through post-processing hacks, explained in detail later in the blog.
  3. Customize Existing APIs for Real-time Business Data: I developed a pseudo NLP processing function to retrieve parameters for business services. I also modified the return type of the business services to align with the LLM prompt.
  4. Modernize Web and Mobile User Interface: I constructed a conversational UI using the Streamlit framework (a minimal sketch follows this list).
  5. Deploy the Model for Consumption: For this POC, I deployed the model locally using the Streamlit and Flask-RESTful frameworks.
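To make step 4 concrete, here is a minimal sketch of a conversational Streamlit front end that forwards the user’s question to a local REST backend. The endpoint URL and JSON field names are illustrative assumptions on my part, not the repository’s actual contract.

```python
# A minimal sketch of a conversational UI in Streamlit that forwards the
# user's question to a local REST backend. The endpoint URL and JSON field
# names are illustrative assumptions, not the POC's actual contract.
import requests
import streamlit as st

st.title("App Modernization POC")
use_rag = st.checkbox("Use RAG")

question = st.text_input("Ask a question")
if st.button("Ask") and question:
    response = requests.post(
        "http://localhost:5000/ask",  # hypothetical Flask-RESTful endpoint
        json={"question": question, "use_rag": use_rag},
        timeout=120,
    )
    st.write(response.json().get("answer", "No answer returned."))
```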

These steps represent the sequence of actions taken to implement the POC. Now that the approach is established, let’s delve into the flowchart and architecture for a more detailed understanding.

Image 1: Implementation flowchart

The above flowchart illustrates the sequence of events executed in the POC. As depicted, if the “Use RAG” option is not selected, the backend business service is bypassed, and the question is handled directly by the LLM. On the other hand, if the “Use RAG” option is selected, the question is first pre-processed to extract the parameters to pass to the backend business service, which is then invoked. The information returned from the backend service is used to enrich the LLM prompt with additional context, and then the LLM is invoked.
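In code, that branching might look like the sketch below; every function and endpoint name here is an illustrative stand-in for the POC’s actual components.

```python
# A sketch of the flowchart's branching logic. All names here are
# illustrative stand-ins, not the POC's actual functions or endpoints.
import requests

def extract_parameters(question: str) -> dict:
    # Placeholder pre-processing; parameter extraction is discussed below.
    return {"question": question}

def answer_question(question: str, use_rag: bool, llm) -> str:
    if not use_rag:
        # "Use RAG" off: bypass the backend and send the question
        # straight to the LLM.
        return llm(question)
    # "Use RAG" on: pre-process the question, invoke the backend business
    # service, and enrich the prompt with the returned business data.
    params = extract_parameters(question)
    context = requests.get("http://localhost:5000/payment", params=params).json()
    prompt = f"Answer using this business data: {context}\nQuestion: {question}"
    return llm(prompt)
```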

It’s worth noting that I’ve implemented a post-processing step, a workaround to address quality issues resulting from skipping the fine-tuning step. Following post-processing, the response is displayed to the user. This additional step ensures a more refined and improved output for a better user experience.

Now, let’s delve into the crucial aspects of pre-processing, backend business service, and post-processing implementation.

The pre-processing and business service components are implemented in the restservice.py file. In the pre-processing phase, I've incorporated straightforward logic based on a list of actions (services), receiver names, and subscription services. Depending on the breadth of enterprise use cases, it's advisable to implement a more sophisticated NLP function using libraries like nltk or spaCy. For illustration, I've included a sample nltk function in the ProcessUserPromptWithNLP class.
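To give a flavor of what such token-based extraction can look like, here is a minimal sketch using nltk; the action and receiver lists are illustrative, not the exact values in restservice.py.

```python
# A minimal sketch of parameter extraction with nltk, in the spirit of the
# ProcessUserPromptWithNLP class. The action and receiver lists below are
# illustrative, not the exact values used in restservice.py.
import nltk

nltk.download("punkt", quiet=True)  # tokenizer data; newer nltk may also need "punkt_tab"

ACTIONS = {"pay", "transfer", "subscribe", "cancel"}
RECEIVERS = {"alice", "bob", "charlie"}

def extract_parameters(question: str) -> dict:
    tokens = [t.lower() for t in nltk.word_tokenize(question)]
    return {
        "action": next((t for t in tokens if t in ACTIONS), None),
        "receiver": next((t for t in tokens if t in RECEIVERS), None),
    }

print(extract_parameters("Did my transfer to Bob go through?"))
# -> {'action': 'transfer', 'receiver': 'bob'}
```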

The actual business service is a simulated information generator. Based on the parameters provided, it returns a JSON response with the outcome. To emulate real-life scenarios, I use various lists in the code to generate diverse combinations, simulating a rich context that is then fed into the prompt before sending it to the LLM for processing.
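A stripped-down sketch of such a simulated service, built with Flask-RESTful, might look like this; the resource name, fields, and sample values are my own illustrative choices, not the repository’s actual code.

```python
# A stripped-down sketch of a simulated business service with Flask-RESTful.
# The resource name, fields, and sample values are illustrative assumptions.
import random
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

STATUSES = ["completed", "pending", "declined"]
RECEIVERS = ["Alice", "Bob", "Charlie"]

class PaymentService(Resource):
    def get(self):
        # Combine list entries at random to simulate varied, real-looking data.
        return {
            "receiver": request.args.get("receiver", random.choice(RECEIVERS)),
            "status": random.choice(STATUSES),
            "amount": round(random.uniform(10, 500), 2),
        }

api.add_resource(PaymentService, "/payment")

if __name__ == "__main__":
    app.run(port=5000)
```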

I added a post-processing step as a workaround because I didn’t fine-tune the model. Foundation models are trained on a broad range of data, including safeguards for spotting scams and threats. Since our focus is on financial services, Llama 2 may wrongly flag legitimate requests as scams and respond with caution or warning messages. Fine-tuning the model up front, as I explained in Part 1 of this blog series, would prevent this. For this POC, however, I used a hack: it looks for signs of warning or caution messages and overrides them with the original response from the backend business service. The post-processing is implemented in the app.py file under the Process_llm_output function.
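As a rough sketch, the override logic could be as simple as the following; the marker phrases are my guesses at typical Llama 2 caution language, not the exact strings checked in app.py.

```python
# A rough sketch of the post-processing workaround: if the LLM's output
# contains caution or warning language, fall back to the backend service's
# original response. The marker phrases are illustrative guesses, not the
# exact strings checked in app.py.
WARNING_MARKERS = ["scam", "fraud", "caution", "warning", "suspicious"]

def process_llm_output(llm_response: str, service_response: str) -> str:
    lowered = llm_response.lower()
    if any(marker in lowered for marker in WARNING_MARKERS):
        # Override the cautionary answer with the trusted backend data.
        return service_response
    return llm_response
```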

Key Learnings:

The following are some of the key insights I gained from this POC:

  1. Essential Need for Fine-Tuning: Fine-tuning is crucial when adopting this approach. Foundation models are designed to be generalists, and they might not align well with most enterprise needs. Alternatively, one can explore domain-specific models, but finding open-source options can be challenging.
  2. Dynamic Ecosystem Requires Adaptability: The AI ecosystem evolves rapidly, necessitating a proactive approach to keeping up with frequent code refinements. For example, the sample code and blogs I referred to during this exercise became outdated within a few months, rendering them non-functional.
  3. Challenges in Pre-processing Implementation: Implementing a pre-processing function to extract parameters from user questions may become more complex depending on the range of use cases you want to address with the LLM. It’s advisable to utilize a Natural Language Processing (NLP) library to ease this process. Once implemented, ongoing improvements can be made with relative ease.

Try it Yourself!

Congratulations on making it this far! If you’re feeling adventurous and want to give it a spin yourself, fear not! Head over to my GitHub repository at https://github.com/AhilanPonnusamy/LLM-and-AppModernization/tree/main for a detailed guide and all the source code magic. Enjoy the journey, and I’m eagerly looking forward to hearing about your experiences!

Summary:

In this exploration, I delved into the practical implementation of a Proof of Concept (POC) for integrating Artificial Intelligence (AI) into application modernization, with a focus on leveraging existing backend business services. The process involved crucial steps, including model selection, API customization, and UI modernization, using Python, Llama 2, LangChain, Flask-RESTful, and Streamlit as foundational technologies. While emphasizing the significance of fine-tuning for optimal model performance, the POC showcased a post-processing workaround to compensate for the skipped fine-tuning step.

I also shared the insights I gathered along the way, underscoring the importance of staying abreast of the rapidly evolving AI ecosystem and recommending Natural Language Processing (NLP) libraries to navigate the complexities of pre-processing. As a core aspect of the POC, the integration strategy centered on the strategic reuse of existing backend business services.

