Design LLM-Based Agents: Key Principles — Part 2
In this post we continue our discussion of LLM-based AI Agent design patterns, building on the foundations laid in Designing LLM-Based Agents: Key Principles — Part 1. Previously, we introduced high-level AI Agent design patterns and highlighted the concept of agentic workflows. Now, we will take a closer look at the agentic workflow and demonstrate how to integrate software design patterns to make AI Agents robust, scalable, and observable.
Before diving into the design patterns, let’s clarify three important terms that will be used throughout this post:
- Agentic AI: A type of AI that can make decisions and take actions autonomously. It’s designed to adapt to changing environments and events, and can work with limited human supervision. Agentic AI includes complex behaviours like collaboration, problem-solving, and decision-making in unpredictable or open-ended environments. Think of Agentic AI as an estate agency: a system that reaches out to potential clients, handles a variety of buying and selling scenarios, and interacts fluidly with the market.
- AI Agent: An AI Agent is an autonomous software entity designed to perceive its environment, reason about it, and take actions to achieve specific goals. Think of an AI Agent as an employee in that estate agency — independently picking up calls, making decisions, and working to produce profit for the agency.
- Agentic Workflow: The Agentic Workflow is the structured process by which AI Agents operate and achieve their goals. It’s akin to the runbook an employee follows at the estate agency. For example, when they pick up a call, there’s a procedure to understand the client’s needs before providing a suitable response.
In this post, we will focus on the agentic workflow because it forms the core of a successful AI Agent.
The Components of an Agentic Workflow and Its Design Patterns
The workflow orchestrates a set of task nodes that respond to incoming requests. From a software engineering standpoint, a good agentic workflow — particularly one that leverages LLMs — should be:
- Observable: Engineers should be able to see what is happening within the workflow at every step.
- Flexible: Workflow components should be decoupled, allowing developers to modify or replace them easily.
- Restorable: Because LLM calls and tool integrations can be expensive or occasionally fail, we want to be able to restore the workflow from the point of failure to save costs and simplify debugging.
From a software architecture standpoint, three key patterns help meet these requirements:
- Event-Driven Architecture: The workflow operates on a publish/subscribe (pub/sub) model, ensuring a decoupled structure. This design supports scalability, resilience, and modularity — critical qualities for modern systems.
- Event Sourcing Pattern: Every decision that a workflow component makes is recorded in an event store. This store forms an immutable log of all events — an invaluable resource for both observability and restorability. It also serves as the “memory” layer for the workflow.
- Command Pattern: By decoupling orchestration (which determines what needs to be done) from execution (the actual task), the system becomes significantly more reusable, testable, and flexible. For example, an LLM Command encapsulates the invocation of an LLM tool, such as OpenAI, Bedrock, or Ollama, based on the provided command context (see the sketch below).
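The class and method names in this sketch are illustrative assumptions, not the framework’s actual API; the goal is only to show the shape of the pattern.

from abc import ABC, abstractmethod

# A command encapsulates *what* to do, so the orchestrator never needs
# to know *how* a particular LLM provider is invoked.
class Command(ABC):
    @abstractmethod
    def execute(self, context: dict) -> dict: ...

class LLMCommand(Command):
    def __init__(self, llm_tool):
        self.llm_tool = llm_tool  # e.g. an OpenAI, Bedrock, or Ollama wrapper

    def execute(self, context: dict) -> dict:
        # Delegate the actual call to whichever tool was injected.
        return self.llm_tool.generate(context["messages"])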
At the heart of the event-driven architecture is a workflow constructed from nodes and topics:
- Nodes: These are the functional units of the system. A node executes a specific tool through a command in response to an event.
- Topics: These act as communication channels. Nodes produce messages to topics and consume messages from topics, enabling seamless communication.
In this pub/sub event-driven architecture, a node publishes a message to a topic upon completing a command. Any node subscribed to that topic consumes the message and, when applicable, executes its own command. This decoupled design forms the foundation of the system’s scalability, resilience, and modularity.
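Here is a similarly simplified sketch of the node/topic mechanics, again with illustrative names rather than the framework’s real implementation:

class Topic:
    def __init__(self, name: str):
        self.name = name
        self.subscribers = []  # nodes that consume messages from this topic

    def publish(self, message: dict):
        # Fan the message out to every subscribed node.
        for node in self.subscribers:
            node.on_event(self.name, message)

class Node:
    def __init__(self, name: str, command, publish_to: list):
        self.name = name
        self.command = command        # what to execute (command pattern)
        self.publish_to = publish_to  # topics to publish results to

    def subscribe(self, topic: Topic):
        topic.subscribers.append(self)

    def on_event(self, topic_name: str, message: dict):
        result = self.command.execute(message)  # run the node's command
        for topic in self.publish_to:
            topic.publish(result)               # emit the result downstream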
Putting all of the above together, here is the workflow design:
Memory — A “Side Effect” of the Event Sourcing Pattern
In this post, “memory” refers specifically to how the agent retains the history of its conversation with the caller — distinct from the memory mechanisms typically found in transformer architectures.
One of the most powerful features of event sourcing is its immutable log of events, which effectively serves as a memory layer. Through this log, nodes can retrieve context-relevant information and maintain a coherent conversation across multiple interactions:
- Instant Memory (Per Request): A node can look up relevant context from events generated within the current request. Each node in the directed graph aggregates the data from all of its ancestor nodes — i.e., any nodes from which it can be reached by following the graph’s edges backward — so that when the node executes, it has the complete set of data from every path leading into it.
- Short-Term Memory (Per Conversation): In conversational AI (e.g., chatbots), the system can maintain context by referencing the sequence of events that occurred in the current conversation.
- Long-Term Memory (Knowledge Base): A long-term knowledge base can also be generated from events; this is another great topic we will address in a future post.
Note that the event store serves as the source of truth for the agent’s memory. Since each event is recorded at the tool level, an agent can assemble a relevant history of events to feed into an LLM node.
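As a sketch of how the event log doubles as memory (the store interface and field names here are assumptions for illustration):

def build_llm_history(event_store, conversation_id, request_id=None):
    """Assemble an LLM message history from the immutable event log.

    With request_id set, this yields instant memory (events from the
    current request only); without it, short-term memory (the whole
    conversation so far).
    """
    events = event_store.query(conversation_id=conversation_id)  # hypothetical query API
    if request_id is not None:
        events = [e for e in events if e.request_id == request_id]
    # The log is append-only, so sorting by offset reproduces the exact
    # order in which the workflow emitted the events.
    return [
        {"role": e.role, "content": e.content}
        for e in sorted(events, key=lambda e: e.offset)
    ]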
Adopting this design not only accelerates business growth through agile agent development, but also streamlines the engineering process by treating the agent as a standard service (Multi-Agent-as-a-Service — A Senior Engineer’s Overview) within a broader software ecosystem. By aligning with an established SDLC (software development life cycle), teams can iterate quickly, integrate new requirements seamlessly, and maintain a robust, scalable solution that meets evolving market demands. The good news is that we are already developing this agent framework. Here’s a quick look at what we’ve accomplished so far.
Build a ReAct Agent with the Binome Agent Framework
The ReAct agent is a well-known agent design pattern. For a detailed explanation, refer to our post AI Agent Workflow Design Patterns — An Overview. Below is a short code snippet illustrating how you might build a ReAct Agent using the Binome Agent Framework:
# Create workflow builder
workflow_builder = EventDrivenWorkflow.Builder().name(
    "ReActAssistantWorkflow"
)
# Create thought result topic
thought_result_topic = Topic(name="thought_result")
# Create observation result topic
observation_result_topic = Topic(name="observation_result")
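# Note: agent_input_topic and agent_output_topic are assumed to be provided
# by the surrounding agent, which feeds user requests into the workflow and
# consumes its final output.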
"""
Build thought node, which either agent_input_topic or observation_result_topic
can trigger this node, then publish to thought_result_topic.
Given inputs, this node would use OpenAI to provide "thought".
"""
thought_node = (
    LLMNode.Builder()
    .name("ThoughtNode")
    .subscribe(
        SubscriptionBuilder()
        .subscribed_to(agent_input_topic)
        .or_()
        .subscribed_to(observation_result_topic)
        .build()
    )
    .command(
        LLMResponseCommand.Builder()
        .llm(
            OpenAITool.Builder()
            .name("ThoughtLLMTool")
            .api_key(self.api_key)
            .model(self.model)
            .system_message("""
                You are an AI assistant tasked with analyzing the user's question and considering the provided observation to determine the next logical step required to answer the question.
                Your response should describe the most effective action to take based on the information gathered.
                If the information is sufficient to answer the question, return the answer and confirm that it is ready.
            """)
            .build()
        )
        .build()
    )
    .publish_to(thought_result_topic)
    .build()
)
# Add the thought node to the workflow.
workflow_builder.node(thought_node)
# Create the action result topics; each has its own publish condition.
action_result_search_topic = Topic(
    name="action_search_result",
    condition=lambda msgs: msgs[-1].function_call is not None,
)
action_result_finish_topic = Topic(
    name="action_finish_result",
    condition=lambda msgs: msgs[-1].content is not None
    and msgs[-1].content.strip() != "",
)
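# With these conditions, a message carrying a function call is routed to the
# search branch, while a message with non-empty content is routed to the
# finish branch.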
"""
Create action node, which subscribed to thought_result_topic, and publish to
two action result topics.
This node uses OpenAI to decide finish the react loop or continue search the
website to find the user's question.
"""
action_node = (
    LLMNode.Builder()
    .name("ActionNode")
    .subscribe(thought_result_topic)
    .command(
        LLMResponseCommand.Builder()
        .llm(
            OpenAITool.Builder()
            .name("ActionLLMTool")
            .api_key(self.api_key)
            .model(self.model)
            .system_message("""
                You are an AI assistant responsible for executing actions based on a given plan to retrieve information.
                Specify the appropriate action to take, such as performing a search query or accessing a specific resource, to gather the necessary data.
                If the answer is ready, return **FINISH REACT**.
            """)
            .build()
        )
        .build()
    )
    .publish_to(action_result_search_topic)
    .publish_to(action_result_finish_topic)
    .build()
)
# Add action node to workflow
workflow_builder.node(action_node)
# Create search result topic
search_function_result_topic = Topic(name="search_function_result")
# Create the search function node, which subscribes to action_result_search_topic.
# This node will call the search function.
search_function_node = (
    LLMFunctionCallNode.Builder()
    .name("SearchFunctionNode")
    .subscribe(action_result_search_topic)
    .command(
        FunctionCallingCommand.Builder().function_tool(self.search_tool).build()
    )
    .publish_to(search_function_result_topic)
    .build()
)
# Add the search function node to the workflow.
# The function information will be registered to the action node.
workflow_builder.node(search_function_node)
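# (self.search_tool is assumed to be a function tool defined elsewhere,
# e.g. a wrapper around a web search API.)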
"""
Create observation node, which subscribed to the search function result topic,
and publish to observation result topic.
This node use OpenAI to generate "observation" from user input, thought -> action results.
"""
observation_node = (
    LLMNode.Builder()
    .name("ObservationNode")
    .subscribe(search_function_result_topic)
    .command(
        LLMResponseCommand.Builder()
        .llm(
            OpenAITool.Builder()
            .name("ObservationLLMTool")
            .api_key(self.api_key)
            .model(self.model)
            .system_message("""
                You are an AI assistant that records and reports the results obtained from executed actions.
                After performing an action, provide a clear and concise summary of the findings relevant to the user's question.
            """)
            .build()
        )
        .build()
    )
    .publish_to(observation_result_topic)
    .build()
)
# Add observation node to workflow
workflow_builder.node(observation_node)
"""
Create the summaries node, which use OpenAI to summarize all the results from all above
This node will produce the message to agent_output_topic, which will be consumed by agent
and then respond to user.
"""
summaries_node = (
    LLMNode.Builder()
    .name("SummariesNode")
    .subscribe(action_result_finish_topic)
    .command(
        LLMResponseCommand.Builder()
        .llm(
            OpenAITool.Builder()
            .name("SummariesLLMTool")
            .api_key(self.api_key)
            .model(self.model)
            .system_message("""
                You are an AI assistant tasked with summarizing the findings from previous observations to provide a clear and accurate answer to the user's question.
                Ensure the summary directly addresses the query based on the information gathered.
            """)
            .build()
        )
        .build()
    )
    .publish_to(agent_output_topic)
    .build()
)
# Add summaries_node to workflow
workflow_builder.node(summaries_node)
# Finally build the workflow
workflow = workflow_builder.build()
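With the workflow built, the surrounding agent can drive it end to end. The invocation API below is a hypothetical sketch; the actual entry point may differ once the framework is open-sourced.

# Hypothetical driver (names are illustrative, not the framework's API):
# the agent publishes the user's question to agent_input_topic, runs the
# event loop, and returns the message that arrives on agent_output_topic.
answer = workflow.invoke({"question": "What is the capital of France?"})
print(answer)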
Below is the output workflow visualisation. Orange boxes represent topics, while navy-blue boxes represent nodes.
Observability is crucial for any AI system. Below is an example of how we use Phoenix by Arize to monitor real-time metrics and gain insights into our agents. The following screenshots illustrate the input of a “summarise” node, which includes historical data generated by various nodes. In sequence, the data points are: system message, user input, thought node, action node, search tool node, observation node, thought node, action node, and finally, the summarise node.
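For readers who want to reproduce this setup, wiring OpenAI calls into Phoenix takes only a few lines. The snippet below is a minimal sketch based on the publicly documented Phoenix and OpenInference packages; verify the exact API against the docs for your installed version.

import phoenix as px
from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

px.launch_app()  # start the local Phoenix UI
tracer_provider = register(project_name="react-assistant")  # project name is illustrative
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)  # trace every OpenAI call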
We hope you enjoyed this glimpse of our work. We are excited to announce that we plan to open-source our Agent Framework soon. Currently, we are polishing the codebase and adding more documentation, so stay tuned for future updates!
Conclusion
By combining event-driven architecture, event sourcing, and the command pattern, we create an agentic workflow that is robust, transparent, and easy to extend — key qualities for any modern AI system. This approach not only addresses practical challenges such as observability, scalability, and state restoration, but it also fosters agile development by treating AI Agents as standard services within a larger software ecosystem.
As we continue to refine and open-source our agent framework, we look forward to sharing more detailed examples, best practices, and real-world applications. Stay tuned for our upcoming posts, where we will dive deeper into advanced topics like long-term memory, conversation strategies, and more.