MarketMaestro: Building and Aligning a Local AI Stock Advisor Agent with InstructLab, Podman AI Lab and LangChain (Part 1: Agent Setup)

9 min readAug 4, 2024

As AI systems become more integrated into sensitive domains like finance, healthcare, and legal services, ensuring their alignment with human values and safety considerations becomes crucial. The way we approach this alignment can vary greatly depending on how and where these AI systems are deployed. For instance, an AI system directly providing advice to customers would need different guardrails compared to an internal AI system supporting professional advisors. How should we then approach the design, build and evaluation of AI systems that prioritise safety and ethical considerations?

Enter MarketMaestro, an experimental Financial AI agent which acts as a demonstrative project designed to explore the process of AI alignment in the context of stock recommendations and investment. While focusing on a stock recommendation scenario, the intent is to illustrate key principles of AI alignment that are applicable across various fields where AI assistants are being integrated.

MarketMaestro is built to run locally using small language models and open-source tools, an approach that offers several key benefits:

Enhanced Privacy: By processing data locally, we can ensure that sensitive financial information never leaves the user’s device.
Greater Transparency: Smaller models and open-source components are often more interpretable, allowing us to better understand and explain their decision-making processes.
Improved Control: Local deployment allows for easier fine-tuning and adjustment of the model’s behaviour to align with specific ethical guidelines, potentially even tailoring them to the analyst style.
Reduced Resource Requirements: Small models can run efficiently on personal computers, democratising access to AI-powered analysis.
Openness and Auditability: The use of open-source tools allows for thorough code review and community-driven improvements, crucial for systems handling sensitive data.

Through MarketMaestro and this series of articles, we’ll explore how to create not just functional AI systems, but aligned AI systems with safety and ethical guardrails. We’ll also discuss how the system architecture and alignment process would evolve if this were a real system deployed in various scenarios — from a professional advisory tool to a direct consumer application.

In this first article, we’ll dive into the architecture, implementation and the evaluation framework behind MarketMaestro, from setting up the development environment to implementing ethical guidelines and evaluating the system’s adherence to them. In later articles, we’ll then unpack the process of aligning the AI agent to improve safety evaluation and the technical evolution of the system to support extended use-cases.

Whether you’re a developer interested in AI safety, a data scientist exploring ethical AI development, or a professional curious about the implications of AI in sensitive domains, this journey through the creation and alignment of MarketMaestro can hopefully offer valuable insights into responsible AI development.

Project Overview

MarketMaestro is designed as a modular system, leveraging several open-source technologies to create a locally-run, aligned AI stock advisor. Let’s break down the key components of the system and understand their purposes and selection criteria, following the simple FlowChart below.

Components

InstructLab

InstructLab is used specifically for model alignment in our project. It employs a method called Large-scale Alignment for chatBots (LAB), which enhances LLMs using far less human-generated information and fewer computing resources than typical retraining methods. InstructLab’s approach includes taxonomy-driven data curation, large-scale synthetic data generation, and iterative, large-scale alignment tuning. This allows us to improve our model’s alignment efficiently, making it more suitable for specific contexts and use cases of providing stock recommendations.

Podman AI Lab

Podman AI Lab serves as our local environment for working with Large Language Models. Podman AI Lab provides by default a recipe catalog with common AI use cases, a curated set of open source models, and a playground for learning, prototyping, and experimentation. In our case, we use Podman AI Lab to serve our InstructLab merlinite-7b-lab model locally, ensuring data privacy and security. This approach also allows us to quickly get started with AI in our application without depending on infrastructure beyond our laptop.

LangChain

LangChain is the backbone of our natural language processing (NLP) pipeline, including the creation of summarisation and recommendation chains. It offers a flexible and powerful framework for building language model applications, allowing us to easily create complex NLP chains essential for our AI agent to process financial reports and generate recommendations. LangChain’s modular nature facilitates easier alignment and adjustment of the AI’s behaviour. Its comprehensive toolkit for agent creation allows us to define the agent’s behaviour, decision-making process, and interaction with other components.

We also leverage LangChain for building the agent custom evaluator. For evaluation, LangChain’s flexible prompt templates and output parsers enable us to create a robust system for assessing the AI’s performance and alignment with our model guidelines including safety requirements.

ChromaDB

ChromaDB serves as our vector database for implementing Retrieval-Augmented Generation (RAG). It provides an efficient way to store and retrieve relevant information from financial reports, which is crucial for providing context to our language model, allowing it to make more informed recommendations from recent financial data, in this case 10-K and 10-Q filings from the US Stock Exchange and Securities Commission. We use ChromaDB with its in-memory mode and load financial reports dynamically as part of the recommendation process for ease of deployment and repeatability.

PyPDFLoader

We use PyPDFLoader for loading and processing PDF documents, specifically annual reports. Many financial reports are available in PDF format, and PyPDFLoader provides a straightforward way to extract text from these documents, which is necessary for our information retrieval system.

System Flow

The MarketMaestro system operates as follows:

Annual reports are loaded and processed using PyPDFLoader.
Processed text is embedded and stored in ChromaDB.
When a user query is received, the AI agent (built with LangChain) performs the following steps: a. Retrieves relevant information from ChromaDB. b. Generates a summary of the retrieved information. c. Uses the summaries to generate a response, including stock recommendation.
The recommendation is then evaluated by the custom evaluator (also built with LangChain) for relevance, specificity, justification, diversity and risk awareness.
The final recommendation including evaluation scores and the basis for evaluation are provided to the user.

This architecture allows for a flexible, locally-run AI system that can provide stock recommendations while supporting privacy, transparency, and ethical alignment. In the following sections, we’ll dive deeper into the system setup and key design considerations in the model evaluation.

Setting Up the Environment

To get MarketMaestro up and running on your local machine, you’ll need to set up several components. This section will guide you through the process of installing and configuring the necessary tools and libraries.

Prerequisites

Before we begin, ensure you have the following installed on your system:

Python 3.8 or higher
pip (Python package installer)
git

Step 1: Set Up InstructLab

InstructLab is used for model alignment in our project. To set it up:

Follow the installation instructions in the Getting Started section of the InstructLab README in GitHub to set up the environment and dependencies.
Make sure you download the merlinite-7b-lab model, which is an InstructLab-enhanced version of the Mistral model.

Step 2: Set Up Podman AI Lab

Podman AI Lab serves as our local environment for working with Small Language Models.

Install Podman Desktop from the official website: https://podman-desktop.io/
Install the Podman AI Lab extension within Podman Desktop. For this, open Podman Desktop, navigate to the Extensions section and install the “Podman AI Lab” extension.
Use Podman AI Lab to serve the InstructLab merlinite-7b-lab model locally:

Open Podman AI Lab within Podman Desktop
Navigate to the Models section
Import the merlinite-7b-lab model you downloaded in Step 1, via its GGUF file.

Start a model service for merlinite-7b-lab. This can be easily achieved by creating a playground (which will create and start a service) which also allows you to test the working condition of your model by chatting via its endpoint.
Finally look at the services details as you need the server endpoint to configure the agent.

Step 3: Set Up the MarketMaestro Agent & Evaluator

Now that we have InstructLab and Podman AI Lab set up, let’s set up our MarketMaestro agent.

Clone the MarketMaestro repository:

git clone https://github.com/caldeirav/MarketMaestro

Follow the installation instructions in the repository README file
You should now be able to run the agent and evaluator using the dedicated python runner scripts.

python run_agent.py

python run_evaluator.py

Model Evaluation Methodology

Approach to Evaluation

MarketMaestro employs a custom evaluation approach using an AI model to assess the quality and safety of its own stock recommendations and provide a scoring on each evaluation criteria. This self-evaluation mechanism is designed to provide a more nuanced and context-aware assessment compared to traditional rule-based evaluation systems.

Our evaluation methodology currently focuses on five key criteria:

Relevance: How well the recommendation addresses the user’s query
Specificity: The level of detail and precision in the recommendation
Justification: The quality and depth of reasoning behind the recommendation
Diversity: The range of stocks or sectors covered in the recommendation
Risk Awareness: The acknowledgment and explanation of potential risks

The evaluator analyses the agent’s responses against these criteria, providing a score and detailed feedback for each, using a list of pre-defined questions. Final evaluation results based onthe average scores across the list of assessment questions is provided as the final result of the evaluation. Furthermore, the criteria and list of questions used in the assessment can be easily modified in our code configuration.

Advantages of AI-based Evaluation

Using an AI model for evaluation offers several advantages:

Contextual Understanding: The AI can interpret nuances and context in the recommendations, leading to more accurate evaluations.
Scalability: The system can handle a large volume of evaluations without human intervention.
Consistency: The AI applies the same criteria uniformly across all evaluations.
Adaptability: The evaluation model can be fine-tuned to adapt to changing market conditions or evaluation requirements.
Continuous Improvement: Insights from the evaluations can be used to further refine the recommendation agent.

However it should be noted that at this point we use the same underlying model (merlinite-7b-lab) for both the recommendation agent and the evaluator. This approach has both potential advantages and limitations:

Advantages:

Consistency in language understanding and generation
Simplified deployment and maintenance
Potential for the model to leverage its own “understanding” to provide more insightful evaluations

Limitations:

Potential for shared biases between the agent and evaluator
Risk of the evaluator being “blind” to certain types of errors that the agent might make

Current Challenges and the Need for Further Alignment Work

Despite the sophisticated evaluation system, we’ve identified several challenges with the current model responses. These challenges highlight the need for further alignment efforts to improve MarketMaestro’s performance, reliability, and safety:

Specificity in Recommendations: The model occasionally lacks specificity in its stock recommendations. Further alignment work is needed to enhance the model’s ability to provide more detailed and actionable recommendations, ensuring users receive clear and precise advice.
Risk Assessment Consistency: We’ve observed inconsistent risk assessments across different queries. Improving the model’s ability to consistently and effectively communicate potential risks associated with its recommendations is crucial for responsible financial advising.
Stock Selection Bias: There’s a tendency for the model to favour well-known stocks over potentially more suitable lesser-known options. Broadening the model’s knowledge and consideration of a wider range of stocks could improve the diversity and potentially the quality of recommendations.
Market Analysis Currency: The model struggles to provide up-to-date market analysis due to its knowledge cutoff. Developing methods to keep the model current with evolving market trends and conditions, possibly through regular fine-tuning or improved integration with real-time data sources, is essential for maintaining its relevance.
Factual Accuracy: We’ve noticed occasional generation of plausible-sounding but factually incorrect information. Enhancing the model’s ability to distinguish between its confident knowledge and speculation, and improving its fact-checking capabilities, is crucial for maintaining user trust.
Ethical Considerations: Ensuring adherence to ethical guidelines in financial advising across all interactions remains a challenge. While currently the agent may have sufficient guardrails for an internal usage as a productivity assistant to a professional advisor, extending the role of the agent to consumer-facing systems with transactional capabilities will require strengthening the model’s understanding and application of ethical principles in financial advice, possibly through focused ethical training datasets, as a key area for improvement.

Conclusion

By the end of this article, we hope that readers should start grasping the fundamental challenges of aligning AI systems for safety in sensitive domains such as finance. In this regard, MarketMaestro serves as an educational tool, helping intelligent system developers better understand the practical implications of AI alignment theories and the iterative nature of developing safe AI systems.

The next article will delve deeper into MarketMaestro’s decision-making process to better adhere to ethical guidelines and safety constraints., starting with the model alignment process, and providing readers with insights into the cutting edge of AI alignment research and its real-world applications.