Designing LLM-Based Agents: Key Principles — Part 1

Craig Li, Ph.D · Published in Binome · Jul 4, 2024 · 9 min read

In the evolving landscape of artificial intelligence, agent frameworks have unlocked new potential for developing robust, scalable, and intelligent systems. This article explores the agent design pattern, delves into the agent's components and their features, and reviews several popular frameworks such as LangChain, AutoGen, CrewAI, and PhiData. A more comprehensive list can be found here — A list of AI autonomous agents. Finally, we will walk through a practical example using PhiData to create an agent that generates weekly news for Raspberry Pi enthusiasts.

Understanding the Agent

In the world of GenAI development, an agent is a self-contained unit capable of performing tasks autonomously, driven by specific instructions and contextual understanding. Imagine a sophisticated entity that seamlessly integrates intelligence and functionality to execute tasks with precision. The attached diagram offers a high-level overview of this agent design pattern. Let’s delve into the components that make this pattern so effective and versatile.

When viewed from a distance, an agent has at least two interfaces: input and output. Upon closer inspection, besides the input and output, we expect to see: a large language model (LLM), which acts as the brain or decision maker of the agent; an agent workflow, which orchestrates the data flows inside the agent; and a config that describes the agent. Additionally, there are components that make the agent more advanced (a minimal code sketch of these pieces follows the list below):

  • Input and Output Proxies: These components handle the conversion of input and output schemas to and from the agent. They ensure that the data received and sent by the agent is in the correct format and enriched with necessary metadata.
  • Tools: A set of functions and built-in utilities that the LLM can suggest calling to perform specific tasks. These tools can interact with external APIs, execute SQL queries, retrieve knowledge from databases, and more.
  • Memory: Stores the prompt, context, and history of conversations. It allows the agent to maintain continuity and coherence across interactions.
  • Knowledge: A vector database that stores information extracted from interactions and external sources, helping the agent build a comprehensive understanding over time.
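
To make the pattern concrete, here is a minimal, framework-agnostic sketch of these components in plain Python; all names (AgentConfig, Agent, and so on) are illustrative assumptions for this article, not the API of any particular framework:

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentConfig:
    """Describes the agent: its identity and task-specific instructions."""
    name: str
    description: str
    instructions: list[str]


@dataclass
class Agent:
    """Self-contained unit: LLM (decision maker), tools, memory, and knowledge."""
    config: AgentConfig
    llm: Callable[[str], str]                                  # the "brain": prompt in, text out
    tools: dict[str, Callable] = field(default_factory=dict)   # callable utilities the LLM may use
    memory: list[str] = field(default_factory=list)            # prompt, context, conversation history
    knowledge: list[str] = field(default_factory=list)         # stand-in for a vector database

    def run(self, user_input: str) -> str:
        # Agent workflow: enrich the input with config, memory, and knowledge,
        # let the LLM decide, then record the exchange in memory.
        prompt = "\n".join(
            [self.config.description, *self.config.instructions, *self.memory, user_input]
        )
        response = self.llm(prompt)
        self.memory.extend([user_input, response])
        return response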

From the above diagram, it is evident that agents can range from very simple to highly complex. Based on our experience developing agents for various tasks, as well as insights from research papers and tech blogs (Multi AI Agent 101, Building AI Agents), we consider it crucial for an agent to be designed for a sufficiently specific task in order to perform well. It is important to remember that here we want to use agents for business-critical tasks, not just, for instance, to generate a piece of text; business requirements allow for a much smaller margin of error.

As mentioned above, our research as well as the wider body of academic research suggests that when an LLM receives a prompt tailored to a specific task, it performs significantly better than with a generic prompt. A survey on efficient prompting methods for LLMs on arXiv discusses the diversity and detail of prompts for specific tasks and the challenges of long natural language prompts. Further insights into the mixed effects of naive prompts on LLM performance, and the importance of delivering tailored responses for tasks requiring empathy and analytical precision, are available in a research article on arXiv.
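
As a quick illustration, compare a generic prompt with a task-specific one for the same request; both strings below are hypothetical examples, not prompts taken from the cited papers:

# Hypothetical prompts illustrating the difference in specificity.
generic_prompt = "Summarise the latest news."

task_specific_prompt = (
    "You write a weekly newsletter for Raspberry Pi enthusiasts. "
    "Summarise the most significant Raspberry Pi news from the past week: "
    "hardware releases, software updates, and notable community projects. "
    "For each item give a one-sentence summary, its date, and a source link."
)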

Consequently, we believe that all components of the agent should also be task-specific. This includes the configuration description, knowledge base, and memory, which should all be relevant to the particular task. Below, we will quickly review some of the most well-known agent frameworks, as well as one lesser-known framework (PhiData) that aligns closely with the above diagram.

Comparing Agent Frameworks

LangChain

LangChain focuses on building language model applications that are tightly integrated with various data sources. It excels in applications requiring complex language understanding and generation capabilities.

  • Strengths: Strong language model integration, versatile data handling.
  • Use Cases: Conversational AI, content generation, and complex data querying.

AutoGen

AutoGen emphasizes automation and ease of use, providing tools to quickly build and deploy agents with minimal coding. It targets developers who need rapid prototyping and deployment capabilities.

  • Strengths: Quick setup, user-friendly interface, extensive automation.
  • Use Cases: Prototyping, automated workflows, simple task automation.

CrewAI

CrewAI is designed for collaborative environments where multiple agents work together to achieve a common goal. It focuses on coordination, communication, and synergy between agents.

  • Strengths: Collaboration-focused, robust communication protocols.
  • Use Cases: Multi-agent systems, collaborative tasks, distributed problem-solving.

PhiData

PhiData provides a comprehensive framework for building sophisticated agents with advanced memory and knowledge management. It is ideal for applications requiring deep contextual understanding and long-term learning.

  • Strengths: Advanced memory management, rich knowledge base, strong contextual understanding.
  • Use Cases: Long-term projects, knowledge-intensive tasks, personalized user interactions.

Building an Agent with PhiData

Let’s create a simple example using PhiData. Our goal is to build a simple agent that generates weekly news for Raspberry Pi enthusiasts. We will use the OpenAI API with the default gpt-4o model, and the DuckDuckGo search API as a tool.

Step 1: Add Dependencies

First, install PhiData and set up your project environment:

pip install phidata openai duckduckgo-search

Step 2: Defining the Agent

Define the structure of your agent, including config and tools.

Config class

from phi.llm.base import LLM


class PiNewsLetterConfig:
    def __init__(
        self,
        name: str,
        description: str,
        instructions: list[str],
        llm: LLM,
    ) -> None:
        self.name = name
        self.description = description
        self.instructions = instructions
        self.llm = llm

Agent class

from phi.assistant import Assistant


class PiNewsLetterAgent:
    def __init__(
        self,
        config: PiNewsLetterConfig,
        tools: list,
    ):
        self.config = config
        self.tools = tools

    @property
    def agent(self):
        return Assistant(
            llm=self.config.llm,
            tools=self.tools,
            description=self.config.description,
            instructions=self.config.instructions,
        )

Step 3: Write Executable Code

To simplify the process, combine the above classes into a single Python script, main.py, and replace the placeholder with your OpenAI API key.

from phi.assistant import Assistant
from phi.llm.base import LLM
from phi.llm.openai import OpenAIChat
from phi.tools.duckduckgo import DuckDuckGo


class PiNewsLetterConfig:
    def __init__(
        self,
        name: str,
        description: str,
        instructions: list[str],
        llm: LLM,
    ) -> None:
        self.name = name
        self.description = description
        self.instructions = instructions
        self.llm = llm


class PiNewsLetterAgent:
    def __init__(
        self,
        config: PiNewsLetterConfig,
        tools: list,
    ):
        self.config = config
        self.tools = tools

    @property
    def agent(self):
        return Assistant(
            llm=self.config.llm,
            tools=self.tools,
            description=self.config.description,
            instructions=self.config.instructions,
        )


config = PiNewsLetterConfig(
    name="pi weekly newsletter",
    description="""
    Your task is to provide a concise summary of the most recent and relevant weekly news related to Raspberry Pi.
    This includes any new hardware releases, software updates, community projects, educational resources, and significant discussions in forums or social media.
    The summary should capture essential information and insights that would be valuable for Raspberry Pi enthusiasts and developers.
    """,
    instructions=[
        "Identify the most significant news items related to Raspberry Pi from the past week.",
        "Summarize each news item, focusing on key developments and their implications.",
        "Ensure that the summary is clear, concise, and free of jargon.",
        "Provide context for each news item to help understand its relevance and impact.",
        "Highlight any resources or links where readers can find more detailed information.",
    ],
    llm=OpenAIChat(
        api_key="<your open ai api key. also can use os env variable>"
    ),
)

agent = PiNewsLetterAgent(config, [DuckDuckGo()])
agent.agent.print_response("Summarise the last week news for raspberry pi.", markdown=True)
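
As noted in the placeholder, the key can also come from an environment variable rather than being hard-coded; a minimal variant, assuming the key is exported as OPENAI_API_KEY, would be:

import os

# Read the key from the environment instead of hard-coding it in the script.
llm = OpenAIChat(api_key=os.getenv("OPENAI_API_KEY"))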

Step 4: Running the Agent

Run your agent to generate the weekly news:

python main.py

Result

Today is June 22, 2024, and the results were generated at 9 am BST. The following results list several impressive news items about Raspberry Pi, each including the publish date, a summary of the article, and the source link (the links disappeared when copied from the terminal). However, you may notice that they are not entirely satisfactory. This is because we used a generic prompt and search tool instead of ones tailored to my specific interests, such as DIY projects with camera modules and new hardware releases. In upcoming posts in this series, we will explore various ways to improve the results, including more targeted prompts and specialised tools.

╭──────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Message │ Summarise the last week news for raspberry pi. │
├──────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Response │ Weekly Raspberry Pi News Summary │
│ (9.8s) │ │
│ │ 1. Raspberry Pi for Beehive Health Monitoring │
│ │ │
│ │ • Date: June 21, 2024 │
│ │ • Summary: Researchers at the Federal University of Itajubá in Brazil, led by Prof. José Alberto Ferreira │
│ │ Filho and master's student José Anderson Reis, are utilizing a Raspberry Pi Zero 2 W combined with AI to │
│ │ monitor and maintain the health of bee hives. This project aims to support local bee populations, │
│ │ crucial for ecosystem balance. │
│ │ • Source: Yahoo │
│ │ │
│ │ 2. Candy Sorting with Raspberry Pi Pico │
│ │ │
│ │ • Date: June 20, 2024 │
│ │ • Summary: A new project by Techtronic3D uses a Raspberry Pi Pico to create a candy sorter that can │
│ │ automatically classify candies by color and distribute them into corresponding bins. This project │
│ │ showcases the versatility of the Raspberry Pi Pico in creating fun and educational DIY electronics. │
│ │ • Source: Yahoo │
│ │ │
│ │ 3. Choosing Between Arduino and Raspberry Pi │
│ │ │
│ │ • Date: June 19, 2024 │
│ │ • Summary: An informative piece discussing scenarios in which Arduino might be a better choice over │
│ │ Raspberry Pi. While Raspberry Pi offers more complex computing capabilities, Arduino is favored for │
│ │ simplicity and specific applications, particularly in embedded systems. │
│ │ • Source: MSN / SlashGear │
│ │ │
│ │ 4. Raspberry Pi with Neural Processors │
│ │ │
│ │ • Date: June 18, 2024 │
│ │ • Summary: Jeff Geerling has modified a Raspberry Pi 5 into a powerful mini-PC equipped with eight neural │
│ │ processors. This development highlights the continuous advancements and customizability in the Raspberry │
│ │ Pi ecosystem, pushing the boundaries of what's possible with DIY computing. │
│ │ • Source: Geeky Gadgets │
│ │ │
│ │ 5. Raspberry Pi's Stock Market Debut │
│ │ │
│ │ • Date: June 18, 2024 │
│ │ • Summary: Raspberry Pi's shares experienced a successful start as they soared on the first day of full │
│ │ trading. Since its founding by Eben Upton in 2012, the company has evolved from a mission to make │
│ │ computing accessible for youngsters to a significant player in the tech industry. │
│ │ • Source: MSN / Cambridge Independent │
│ │ │
│ │ These updates highlight the ongoing innovations, practical applications, and significant milestones within │
│ │ the Raspberry Pi community. │
╰──────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Conclusion

The agent design pattern is a powerful approach that ensures agents are scalable, autonomous, and self-contained base units. By focusing on task-specific design, agents can be fine-tuned to meet precise business requirements, reducing the margin for error and enhancing overall efficiency. For agent developers, this design isolates issues, simplifies debugging, and improves version control. This approach is especially vital in multi-agent systems, where performance and coordination are crucial, as small errors can be amplified throughout the communication chain.

In our comparison of LangChain, AutoGen, CrewAI, and PhiData, we highlighted the unique capabilities of each framework, demonstrating how they can be leveraged to build effective agents. Using PhiData, we showcased a simple yet practical example of an agent that generates weekly news for Raspberry Pi enthusiasts.

Looking ahead, we will continue to explore the capabilities and structures of agents and multi-agent systems. Our future publications will delve into more complex agents, integrating various tools to tackle a range of intriguing tasks. As AI technology advances, the role of well-designed agent frameworks will also evolve. We are committed to ongoing investigation and innovation in this field.
