How I built a Swarm of AI Agents with LangChain
This blog explains in detail how I built my weekend project (which you can too!): an Emergency Travel Response System. The system leverages LangGraph to create a swarm of specialized agents that collaborate to handle complex travel emergencies, LangSmith to keep track of them, and OpenAI's models for the underlying reasoning.
Code: here (a ⭐ on the repo is appreciated!)
YouTube playlist: here
Let's hit the architecture first.
Agent Swarm Architecture
To visualize how the agents interact within the LangGraph swarm, consider the following diagram:
Diagram Description:
1. User Input: The emergency response process starts when a user provides input describing their emergency situation. This input could be a text message, voice input, or any other form of communication.
2. Emergency Coordinator (EC): The `EmergencyCoordinator` agent acts as the central hub and orchestrator of the entire system. It’s the first agent to receive the user’s input. Think of it as the 911 operator for travel emergencies.
3. Triage and Delegation: The EC’s primary responsibility is to analyze the user’s input (triage) and determine which specialized agents are best suited to handle different aspects of the emergency. It then delegates tasks to these specialist agents. For example:
- If the user reports chest pain, the EC will delegate to the `Medical Evacuation Specialist` (MED).
- If the user is in a disaster zone, the EC will delegate to the `Disaster Response Expert` (DE).
- If the user is concerned about security threats, the EC will delegate to the `Security Analyst` (SA).
The arrows labeled “Triage & Delegate” in the diagram represent this delegation process.
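To make the triage idea concrete, here is a pure-Python illustration of the routing described above. In the real system GPT-4o makes this decision by choosing a handoff tool; the keyword map below is invented purely to show the mapping and is not actual project code.

```python
# Illustration only: the real EC uses the LLM plus handoff tools to route
# requests; this keyword map just mirrors the delegation rules above.
ROUTING_EXAMPLES = {
    "chest pain": "MedicalEvacuationSpecialist",
    "earthquake": "DisasterResponseExpert",
    "security threat": "SecurityAnalyst",
}

def triage_example(user_input: str) -> str:
    """Return the specialist a message like this would likely be routed to."""
    text = user_input.lower()
    for keyword, agent in ROUTING_EXAMPLES.items():
        if keyword in text:
            return agent
    # No clear match: the Emergency Coordinator keeps handling the request.
    return "EmergencyCoordinator"
```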
4. Specialized Agents: These are the agents with specific expertise. In the diagram, we see examples like:
Medical Evacuation Specialist (MED): Focuses on medical emergencies, assessing medical needs, arranging evacuations, etc.
Disaster Response Expert (DE): Specializes in natural disasters, providing guidance on evacuations, safety, etc.
Security Analyst (SA): Expert in security threats, providing safety advice and risk assessments.
There are other specialized agents in the GitHub repo (Documentation Expert, Accommodation Finder, etc.), represented by “…”.
You can design whatever agent you want; just make sure to follow the syntax and keep an eye on your API credits!
5. Specialist Assistance: Once a specialist agent receives a delegated task, it uses its specific tools and knowledge to address that aspect of the emergency. For example:
- The MED agent might use tools to assess medical urgency and arrange medical transport.
- The DE agent might use tools to check travel advisories and identify evacuation routes.
- The SA agent might use tools to assess security risks and provide safety recommendations.
The arrows labeled “Medical Assistance”, “Disaster Guidance”, and “Safety Advice” represent the specialist agents providing their expertise and guidance back to the user (indirectly, usually through the EC).
6. Handoff and Collaboration: Agents in the swarm can also hand off tasks or information to each other. For example, a specialist agent might realize that another agent’s expertise is needed. The arrow “Handoff to Specialist” represents this inter-agent communication. Often, agents hand back to the `Emergency Coordinator` (“… → EC”) to coordinate the overall response and provide a consolidated answer to the user.
7. User Output: Finally, the system provides a coordinated response back to the user. This response is usually a summary of information and guidance gathered from the various specialist agents, orchestrated by the `Emergency Coordinator`. The user receives a comprehensive and well-informed answer to their emergency situation.
I have also added a follow-up query that the user can enter to test the AI's replies; however, to control the number of API calls, I added a max-retries cap in utils/invocation.py. Feel free to fork the repo and make changes!
In essence, the diagram illustrates a hierarchical and collaborative agent architecture. The `Emergency Coordinator` acts as the central point of contact and orchestrator, while specialized agents provide deep expertise in their respective domains. LangGraph manages the communication and workflow between these agents, enabling a robust and efficient emergency response system.
By understanding this detailed breakdown of `main.py` and the agent swarm architecture, you should now have a comprehensive grasp of how the Emergency Travel Response System is built and how it works. Now let's get into the code!
Core Components
main.py
Before diving into the code, let’s outline the key components and libraries that make up main.py. Understanding these will make the code walkthrough much clearer:
LangChain: Think of LangChain as the toolbox for building applications powered by language models. This project uses it extensively for:
- Integrating with OpenAI’s models.
- Creating the structure for our agents (defining their prompts and tools).
- Providing a high-level interface to work with language models.
LangGraph: Imagine LangGraph as the conductor of an orchestra. It’s built on top of LangChain and is specifically designed for creating complex, multi-agent systems. It helps us:
- Define workflows that involve multiple agents.
- Manage the state and interactions between agents in a structured and reliable way.
- Handle long-running conversations and agent collaborations.
This playlist was extremely resourceful for me to understand the default implementations of the create_react_agent used in the code: https://youtu.be/5h-JBkySK34?si=O8gqZxhW1Q_ADc6B
OpenAI: This is the “brain” of our agents. We’re using OpenAI’s GPT-4o model because it’s very good at (and relatively cheap for):
- Understanding complex instructions.
- Reasoning through problems.
- Generating human-quality text responses.
- Following instructions to use tools effectively.
Agent Definitions (`agents/agent_definitions.py`)
This is where we define each specialized agent in our system. Each agent is like a specialist doctor — they have their own area of expertise and tools. You can define them however you want; the handoff tools that let agents transfer control to one another are set up in this block:
handoff_tools = {
    "coordinator": create_handoff_tool(
        agent_name="EmergencyCoordinator",
        description="Return to the main coordinator for further assistance or to handle another aspect of the emergency"),
    "medical": create_handoff_tool(
        agent_name="MedicalEvacuationSpecialist",
        description="Transfer to the medical evacuation specialist for help with medical transport or evacuation"),
    "disaster": create_handoff_tool(
        agent_name="DisasterResponseExpert",
        description="Transfer to the disaster response expert for help with natural disasters, evacuations, and danger assessment"),
    "business": create_handoff_tool(
        agent_name="BusinessContinuityAgent",
        description="Transfer to the business continuity agent for urgent business travel arrangements"),
    "security": create_handoff_tool(
        agent_name="SecurityAnalyst",
        description="Transfer to the security analyst for risk assessment and safety recommendations"),
    "logistics": create_handoff_tool(
        agent_name="LogisticsOperator",
        description="Transfer to the logistics operator for complex transportation planning"),
    "documentation": create_handoff_tool(
        agent_name="DocumentationExpert",
        description="Transfer to the documentation expert for emergency visa/passport assistance"),
    "accommodation": create_handoff_tool(
        agent_name="AccommodationFinder",
        description="Transfer to the accommodation finder for emergency lodging assistance"),
    "medical_advisor": create_handoff_tool(
        agent_name="MedicalAdvisor",
        description="Transfer to the medical advisor for health guidance for travelers"),
    "communication": create_handoff_tool(
        agent_name="CommunicationCoordinator",
        description="Transfer to the communication coordinator for establishing reliable communication channels"),
    "insurance": create_handoff_tool(
        agent_name="InsuranceSpecialist",
        description="Transfer to the insurance specialist for emergency claims and coverage verification"),
    "local_resources": create_handoff_tool(
        agent_name="LocalResourceLocator",
        description="Transfer to the local resource locator for connecting with local emergency services")
}
Each agent definition specifies:
- Role and responsibilities (defined in their prompts).
- Tools they can use.
- Name and personality.
Emergency Scenarios (`scenarios/emergency_scenarios.py`):
To test our system, we need example emergencies. This module provides predefined scenarios, like “medical emergency in Japan” or “lost passport in Italy”. These scenarios help us:
- Demonstrate how the system works in different situations.
- Test if all agents are working correctly together.
The user can choose which scenario to use as a starting point for interacting with the system.
Utility Functions (`utils/`):
These are helper functions that make our code cleaner and more robust. We have modules for:
- invocation.py: Contains invoke_with_retry, a function that helps handle temporary errors when talking to the language model. If something goes wrong (like a network hiccup), it automatically tries again.
- formatting.py: Contains functions like pretty_print_response, print_scenario_menu, etc. These functions are all about making the output look nice and easy to read in the console.
- `.env` (in the project root, not `utils/`): a configuration file where we store sensitive information like API keys. It’s important to keep API keys secret and not hardcode them directly into our code. Using `.env` is a standard security practice.
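The repo has the real implementation, but a minimal sketch of what `invoke_with_retry` might look like follows. The function body and backoff parameters are my assumption, not copied from the repo:

```python
import time

def invoke_with_retry(app, payload, config, max_retries=3, base_delay=2.0):
    """Invoke the LangGraph app, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return app.invoke(payload, config)
        except Exception as exc:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            wait = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {wait:.0f}s...")
            time.sleep(wait)
```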
Step-by-Step Breakdown of `main.py`
Let’s now walk through the code in `main.py` section by section, explaining each line and its purpose:
1. Imports and Environment Setup
import time
import os
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver
from langgraph_swarm import create_swarm
from langgraph.store.memory import InMemoryStore
from dotenv import load_dotenv
from agents.agent_definitions import create_agents
from utils.formatting import pretty_print_response, print_scenario_menu, print_scenario_header, print_followup_header
from utils.invocation import invoke_with_retry
from scenarios.emergency_scenarios import get_scenarios
load_dotenv()
- Imported Modules: `time` for pausing between API calls, `os` for interacting with the operating system, and `langchain_openai` for using OpenAI’s chat models.
- Checkpointing Mechanism: `InMemorySaver` from `langgraph.checkpoint.memory` is used to store conversation progress in memory.
- Multi-Agent System Setup: `create_swarm` from `langgraph_swarm` is used to build the agent swarm workflow, enabling task handoff between agents.
- Data Storage: LangGraph uses `InMemoryStore` for storing conversation data.
- Environment Variables: `load_dotenv` is used to load environment variables from a `.env` file.
- Agent Creation: `create_agents` function creates specialized agents for the system.
- Import Functionality: Imports helper functions from custom modules like `scenarios.emergency_scenarios`, `utils.formatting`, and `utils.invocation`.
- Load Environment Variables: Calls `load_dotenv()` to load variables from the `.env` file.
- Purpose of Imports: Uses LangChain and LangGraph for the agent framework, OpenAI for the language model, and custom modules for scenario handling and utility functions.
2. Initialize the Language Model and Agents
model = ChatOpenAI(model="gpt-4o", temperature=0.2)
agents = create_agents(model)
This line creates an instance of the `ChatOpenAI` class, which we imported earlier. This object is our connection to the OpenAI GPT-4o model.
GPT-4o is used for its reasoning and text generation capabilities. The low temperature (0.2) keeps responses focused, deterministic, and reliable for the emergency response system.
Here, we call the create_agents function that we imported from agents/agent_definitions.py.
- We pass the model object (our GPT-4 instance) as an argument to create_agents.
- The create_agents function uses this model to instantiate and configure all the individual agents in our swarm (like EmergencyCoordinator, MedicalEvacuationSpecialist, etc.). Each agent uses this same GPT-4o model to power its reasoning and response generation.
3. LangGraph Setup: Checkpoint, Store, and Swarm Workflow
checkpointer = InMemorySaver()
store = InMemoryStore()
workflow = create_swarm(
    list(agents.values()),
    default_active_agent="EmergencyCoordinator"
)
As explained earlier, LangGraph uses checkpointing to save the state of the workflow. `InMemorySaver()` creates a checkpoint saver that stores checkpoints in memory. This is suitable for development and examples but not for production because the checkpoints are lost if the program exits or crashes.
Similarly, `InMemoryStore()` creates an in-memory message store for LangGraph. This store holds the messages exchanged between agents and the overall conversation history. Like `InMemorySaver`, it’s for development purposes.
The workflow code used in my project is the core LangGraph function call that sets up our agent swarm.
Link to learn more about memories: https://langchain-ai.github.io/langgraphjs/concepts/memory/#what-is-memory
`default_active_agent="EmergencyCoordinator"`: This is a crucial parameter. It tells LangGraph which agent should start processing each new user request. We set it to `"EmergencyCoordinator"` because, in our system, the `EmergencyCoordinator` is designed to be the entry point for all emergency requests. It’s responsible for triaging the request and then delegating to the appropriate specialist agents.
Why LangGraph setup like this? LangGraph is designed to manage complex, stateful workflows. Checkpointing and message stores are fundamental to its operation. `create_swarm` is the specific function for setting up a multi-agent collaboration where agents can hand off tasks to each other. Setting the `default_active_agent` is essential for defining the starting point of the workflow.
4. Compile the LangGraph App
app = workflow.compile(checkpointer=checkpointer, store=store)
This line “compiles” our LangGraph workflow definition (`workflow`) into an executable application (`app`). It takes the workflow we defined in the previous step and prepares it for execution, setting up all the internal wiring and connections needed for the agents to interact according to the workflow.
Why compile the workflow? Compiling the workflow is a step that LangGraph requires to optimize and prepare the workflow for execution. It’s similar to compiling code in programming languages — it takes the high-level description of the workflow and turns it into something that can be efficiently run. This step also links the workflow to the specified checkpointing and storage mechanisms.
5. Load Emergency Scenarios and Present Menu
scenarios = get_scenarios()
print_scenario_menu(scenarios)
while True:
    choice = input("\nSelect a scenario number (1–6): ")
    if choice in scenarios:
        break
    print("Invalid choice. Please select a number between 1 and 6.")
selected = scenarios[choice]
config = {"configurable": {"thread_id": f"emergency-{choice}"}}
print_scenario_header(choice, selected)
The code retrieves predefined emergency scenarios, displays them to the user, and prompts for a selection. The selected scenario is then configured with a unique identifier for LangGraph, which manages conversation history and state. This approach simplifies testing and interaction with the emergency response system.
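For reference, `get_scenarios` plausibly returns a dict keyed by the menu-choice string, with each scenario carrying the `initial` and `followup` messages that the execution loop relies on. The scenario text below is invented; only the field names are taken from how `main.py` uses them:

```python
def get_scenarios() -> dict:
    """Hypothetical sketch: scenarios keyed by menu-choice string."""
    return {
        "1": {
            "title": "Medical emergency in Japan",
            "initial": "I'm in Tokyo and having severe chest pain. What should I do?",
            "followup": "Can you arrange medical transport to the nearest hospital?",
        },
        "2": {
            "title": "Lost passport in Italy",
            "initial": "I lost my passport in Rome and my flight home is in 48 hours.",
            "followup": "Where is the nearest embassy and what documents do I need?",
        },
    }

def print_scenario_menu(scenarios: dict) -> None:
    """Render the numbered menu shown to the user."""
    print("Available emergency scenarios:")
    for key, scenario in scenarios.items():
        print(f"  {key}. {scenario['title']}")
```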
6. Scenario Execution and User Interaction Loop
try:
    turn_1 = invoke_with_retry(
        app,
        {"messages": [{"role": "user", "content": selected['initial']}]},
        config
    )
    pretty_print_response(1, turn_1)

    print_followup_header(selected['followup'])
    turn_2 = invoke_with_retry(
        app,
        {"messages": [{"role": "user", "content": selected['followup']}]},
        config
    )
    pretty_print_response(2, turn_2)

    additional_input = input("\nWould you like to ask a follow-up question? (y/n): ")
    if additional_input.lower() == 'y':
        user_followup = input("\nEnter your follow-up question: ")
        print_followup_header(f"USER: {user_followup}")
        turn_3 = invoke_with_retry(
            app,
            {"messages": [{"role": "user", "content": user_followup}]},
            config
        )
        pretty_print_response(3, turn_3)
except Exception as e:
    print(f"Error processing scenario: {str(e)}")
    print("Try checking your API key, model availability, and network connection.")
The code executes a selected emergency scenario, simulating a multi-turn conversation with the user. It handles errors gracefully and provides informative messages to the user if something goes wrong.
7. Entry Point
if __name__ == "__main__":
    main()
This is a best practice in Python to make scripts reusable as modules. It allows functions and classes to be imported and used by other scripts, while also having a main execution block for direct execution. In our case, it ensures the `main()` function sets up and runs the emergency response system when `main.py` is run.
Step-by-Step Breakdown of Agents and Tools
1. Set Up the Tools
Build reusable tools that agents can call to perform specific tasks.
from datetime import datetime
def assess_medical_urgency(symptoms: str) -> dict:
    """Evaluate symptom severity and return an urgency assessment."""
    urgency = "CRITICAL" if "chest pain" in symptoms.lower() else "ROUTINE"
    return {
        "urgency_level": urgency,
        "assessment_time": datetime.now().isoformat(),
        "recommendation": f"{urgency} medical situation detected."
    }

def check_travel_advisory(country: str) -> dict:
    """Return the current (simulated) travel advisory for a country."""
    advisories = {"japan": "EXERCISE NORMAL PRECAUTIONS",
                  "egypt": "EXERCISE INCREASED CAUTION"}
    return {
        "country": country,
        "advisory_level": advisories.get(country.lower(), "UNKNOWN"),
        "as_of_date": datetime.now().strftime("%Y-%m-%d")
    }

def find_emergency_accommodation(location: str, num_people: int) -> dict:
    """Locate (simulated) emergency lodging for a group."""
    return {
        "location": location,
        "option": f"Emergency Shelter for {num_people} in {location}",
        "contact": "emergency@example.org"
    }

def check_visa_requirements(citizenship: str, destination: str, purpose: str) -> dict:
    """Check (simulated) visa requirements, expedited for medical travel."""
    return {
        "required": "Yes" if purpose == "medical" else "Standard",
        "processing_time": "24-48 hours" if purpose == "medical" else "5-10 days"
    }
These tools simulate real-world actions:
- assess_medical_urgency: Evaluates symptom severity.
- check_travel_advisory: Provides travel warnings.
- find_emergency_accommodation: Locates lodging.
- check_visa_requirements: Checks visa needs.
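Adding a tool of your own follows the same pattern: a plain function that returns a structured dict. Here is a hypothetical example (the tool and its flight numbers are invented for illustration; it is not part of the repo):

```python
from datetime import datetime

def check_flight_status(flight_number: str) -> dict:
    """Look up a (simulated) flight status for rebooking decisions."""
    delayed_flights = {"UA100", "BA249"}  # pretend these flights are delayed
    status = "DELAYED" if flight_number.upper() in delayed_flights else "ON TIME"
    return {
        "flight": flight_number.upper(),
        "status": status,
        "checked_at": datetime.now().isoformat(),
    }
```

Because `create_react_agent` converts plain functions into tools, a clear name, type hints, and a docstring all help the model decide when to call your tool.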
2. Define the Agent Factory
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langgraph_swarm import create_handoff_tool
from tools.emergency_tools import assess_medical_urgency, check_travel_advisory, find_emergency_accommodation, check_visa_requirements
def create_agents(model: ChatOpenAI) -> dict:
    # Handoff tool for returning to the coordinator
    handoff = create_handoff_tool(agent_name="EmergencyCoordinator",
                                  description="Return to coordinator for further assistance")

    # Emergency Coordinator
    coordinator = create_react_agent(
        model,
        [create_handoff_tool(agent_name="MedicalEvacuationSpecialist",
                             description="Handle medical emergencies"),
         create_handoff_tool(agent_name="SecurityAnalyst",
                             description="Handle security threats")],
        prompt="""You are the Emergency Coordinator. Triage requests, gather key details (e.g., location, symptoms),
        and delegate to specialists. Be quick and compassionate.""",
        name="EmergencyCoordinator"
    )

    # Medical Evacuation Specialist
    medical_agent = create_react_agent(
        model,
        [assess_medical_urgency, handoff],
        prompt="""You are the Medical Evacuation Specialist. Assess symptoms, arrange transport, and prioritize safety. Ask for symptoms and location.""",
        name="MedicalEvacuationSpecialist"
    )

    # Security Analyst
    security_agent = create_react_agent(
        model,
        [check_travel_advisory, handoff],
        prompt="""You are the Security Analyst. Evaluate threats, provide safety advice, and check travel advisories. Ask for location.""",
        name="SecurityAnalyst"
    )

    return {
        "EmergencyCoordinator": coordinator,
        "MedicalEvacuationSpecialist": medical_agent,
        "SecurityAnalyst": security_agent
    }
- Agent Creation: Creates three agents: Emergency Coordinator, Medical Evacuation Specialist, and Security Analyst.
- Agent Functionality: Each agent has specific roles and responsibilities, such as triaging requests, assessing medical urgency, and handling security threats.
- Handoff Mechanism: Agents can hand off tasks to each other using a handoff tool, ensuring seamless collaboration and information flow.
- Role of Security Analyst: Evaluates threats, provides safety advice, and checks travel advisories. ( Example agent )
- Location Requirement: Security Analyst needs the location for evaluation. ( Can be configured during production )
3. Integrate Agents into a Swarm
Purpose: Combine agents into a collaborative swarm.
from langchain_openai import ChatOpenAI
from langgraph_swarm import create_swarm
from langgraph.checkpoint.memory import InMemorySaver
from dotenv import load_dotenv
from agents.agent_definitions import create_agents
load_dotenv()
model = ChatOpenAI(model="gpt-4o", temperature=0.2)
agents = create_agents(model)

# Create and compile the swarm
workflow = create_swarm(
    list(agents.values()),
    default_active_agent="EmergencyCoordinator"
)
app = workflow.compile(checkpointer=InMemorySaver())
- Action: Update main.py with code to create a swarm of agents.
- Explanation: The create_swarm function links agents, starting with the Coordinator. InMemorySaver tracks conversation state.
4. Add More Agents and Tools (Optional)
Expand the swarm with additional capabilities if you would like. Just fork the repo and make your changes.
To summarize the above:
- Agents: Specialized roles (Coordinator, Medical, Security, etc.) with unique prompts and tools.
- Tools: Functions that return structured data (e.g., dictionaries) for agents to act on.
- Interaction: Coordinator delegates via handoff tools; specialists use domain-specific tools.
- Pitfalls:
- Vague prompts → Agents misinterpret tasks.
- Missing tools → Agents fail to act.
- No handoff → Agents can’t collaborate.
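To make the first pitfall concrete, compare a vague prompt with a specific one (both prompt strings are invented for illustration):

```python
# A vague prompt leaves the agent's role, actions, and required inputs to chance.
vague_prompt = "You help with emergencies."

# A specific prompt names the role, the expected actions, and the information
# the agent must gather before acting.
specific_prompt = (
    "You are the Medical Evacuation Specialist. Assess symptoms, arrange "
    "transport, and prioritize safety. Always ask for the traveler's "
    "symptoms and current location before recommending action."
)
```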
Repetition in Agents and Tools Code
You might notice that in `agents/agent_definitions.py` and `tools/emergency_tools.py`, there’s a pattern in how agents and tools are defined. This is intentional and reflects a modular and scalable design.
Why define agents and tools in a repetitive pattern?
- Specialization and modularity make the system manageable and extensible: agents specialize in specific areas of emergency response, and tools perform specific actions, so new capabilities can be added without modifying existing ones.
- Clear roles and responsibilities come from assigning each agent its own prompt and tools, making it easier to understand each agent's contribution.
- Scalability is achieved by adding more specialized agents and tools that follow the same modular design.
- Reusability is promoted by creating general-purpose tools that multiple agents can share.
For example, the repetitive structure in `agents/agent_definitions.py` where each agent is created using `create_react_agent` with a specific prompt, tools, and name, is a deliberate design choice to enforce modularity and make it easy to add, modify, or remove agents.
The same applies to the tools in `tools/emergency_tools.py`, where each tool is a separate function with a clear purpose and documentation.
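One payoff of this repetition is that agent creation can be made data-driven. The sketch below is a hypothetical refactor, with `make_agent` standing in for `create_react_agent` so the example runs without LangGraph installed:

```python
# Hypothetical refactor: each agent is just (name, prompt, tool names), so a
# factory loop replaces the repeated create_react_agent calls.
AGENT_SPECS = [
    ("EmergencyCoordinator", "Triage requests and delegate to specialists.", []),
    ("MedicalEvacuationSpecialist", "Assess symptoms and arrange transport.",
     ["assess_medical_urgency"]),
    ("SecurityAnalyst", "Evaluate threats and check advisories.",
     ["check_travel_advisory"]),
]

def make_agent(name: str, prompt: str, tools: list) -> dict:
    # Stand-in for create_react_agent(model, tools, prompt=prompt, name=name).
    return {"name": name, "prompt": prompt, "tools": tools}

def create_agents_from_specs(specs) -> dict:
    """Build the agents dict that create_swarm would consume."""
    return {name: make_agent(name, prompt, tools) for name, prompt, tools in specs}
```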
Conclusion
This project is the starting point for an emergency travel response system and a first step on your path to AI agents and workflow automation. It uses LangChain and LangGraph to bring together a team of specialized agents, each powered by OpenAI's GPT-4o model. These agents work together to handle tricky emergency situations.
LangGraph manages the workflows, saves checkpoints, and coordinates the specialized agents. Integrated with LangSmith, you can see what, where, and how something went wrong.
The system is modular and scalable, so it can be extended to handle a wide range of emergencies, and the modular design makes it easy to keep the system up to date and add new features. This detailed walkthrough showed you how the code works, what it's for, and the basics of building a multi-agent system with LangGraph. You should now have a solid understanding of how the AI swarm is built and how it works!