Understanding Model Context Protocol: A Deep Dive into Multi-Server LangChain Integration
Introduction
In the quest to harness the full capabilities of Large Language Models (LLMs), developers often encounter a significant challenge: accessing the myriad of unique APIs exposed by popular tools like Slack, GitHub, or your own local filesystem. While LLMs can connect to these APIs using various tools, the process typically involves writing custom code to establish each connection.
For those using desktop applications such as Cursor or Claude Desktop, the situation is even more restrictive, since you cannot manually add new tools.
Imagine a world where you could seamlessly access a set of pre-built tools designed to integrate with your existing desktop applications. Enter the Model Context Protocol (MCP), a revolutionary approach that lets you create customizable toolsets that plug directly into your apps.
MCP enables the specification of various commands for LLMs, such as "fetch repository" or "comment on PR" for GitHub, providing a frictionless experience. Whether through established desktop applications or your own software, MCP clients can easily communicate with servers to activate these tools.
As marketplaces for MCP servers rapidly emerge, countless companies are eager to introduce their solutions. This soon-to-be reality will empower you to choose from a diverse array of ready-made tools, enhancing your LLM workflows and driving efficiency in your projects.
What is Model Context Protocol (MCP)?
Model Context Protocol (MCP) is a stateful, context-preserving framework designed to power intelligent, multi-step interactions between humans and AI agents. Unlike traditional API calls that treat each request as an isolated event, MCP introduces a persistent, evolving context layer that allows AI systems to retain memory, learn dynamically, and act autonomously over time.
According to modelcontextprotocol.io, MCP is built on three pillars:
- Statefulness: Maintains session-specific and long-term memory.
- Interoperability: Works seamlessly across models, tools, and data sources.
- Agent-Centric Design: Prioritizes autonomous decision-making within defined boundaries.
Example:
An MCP-powered travel agent remembers your budget, allergies, and past trip feedback across conversations to plan a personalized itinerary; no need to repeat yourself!
Core Architecture of the Model Context Protocol (MCP)
Understanding the Connection Between Clients, Servers, and LLMs
The Model Context Protocol (MCP) features a flexible and extensible architecture that facilitates seamless communication between LLM applications and integrations. This section outlines the core architectural components and concepts.
Overview
MCP operates on a client-server architecture where:
- Hosts: LLM applications (such as Claude Desktop or Integrated Development Environments) that initiate connections.
- Clients: Maintain 1:1 connections with their corresponding servers, operating within the host application.
- Servers: Provide essential context, tools, and prompts to clients.
Server Process
The server process involves the following components:
- Host: An application that hosts the interaction.
- Transport Layer: The layer responsible for facilitating communication between clients and servers.
- MCP Client: The entity that communicates with the MCP server.
- MCP Server: The component that manages tools, context, and handles requests from the MCP Client.
This architecture ensures that LLMs can effectively access and utilize tools and resources, enhancing their capabilities in various applications.
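To make the host-client-server relationship concrete, here is a minimal sketch of how a host application might declare the servers its clients should connect to. The shape mirrors the MultiServerMCPClient configuration used in the implementation later in this post; the server names and paths are illustrative assumptions.
# Illustrative host-side server declaration (names/paths are hypothetical).
# Each entry becomes one 1:1 client-server connection inside the host.
servers = {
    "math": {                      # local tool server, spawned as a subprocess
        "command": "python",
        "args": ["math_server.py"],
        "transport": "stdio",
    },
    "weather": {                   # long-running server reached over HTTP + SSE
        "url": "http://localhost:8000/sse",
        "transport": "sse",
    },
}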
Core Components of the Model Context Protocol (MCP)
Protocol Layer
The protocol layer is responsible for message framing, linking requests with responses, and defining high-level communication patterns.
Key Classes Include:
- Protocol
- Client
- Server
Transport Layer
The transport layer manages the actual communication between clients and servers, supporting multiple transport mechanisms:
- Stdio Transport: Utilizes standard input/output for communication, making it ideal for local processes.
- HTTP with SSE Transport: Employs Server-Sent Events for server-to-client messages and HTTP POST for client-to-server communications.
All transport mechanisms leverage JSON-RPC 2.0 for message exchanges. Refer to the specification for detailed information about the Model Context Protocol message format.
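For a concrete sense of how a server picks its transport, here is a minimal sketch using the FastMCP helper that also appears in the implementation later in this post; the server name is illustrative.
# Choosing a transport with FastMCP (server name "demo" is illustrative)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo")

# Local subprocess communication over standard input/output:
mcp.run(transport="stdio")

# Alternatively, serve over HTTP and push server-to-client messages via SSE:
# mcp.run(transport="sse")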
Message Types
MCP defines several key message types:
- Requests: Expect a response from the receiving side.
- Results: Indicate successful responses to requests.
- Errors: Indicate that a request has failed.
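Because every transport frames these messages as JSON-RPC 2.0, a request and its successful result look roughly like the sketch below; the method and payloads are illustrative, and the exact schemas live in the MCP specification.
# Rough shape of a JSON-RPC 2.0 request/result pair (illustrative payloads)
request = {
    "jsonrpc": "2.0",
    "id": 1,                     # the id links this request to its response
    "method": "tools/call",      # MCP method for invoking a tool
    "params": {"name": "add", "arguments": {"a": 3, "b": 5}},
}

result = {
    "jsonrpc": "2.0",
    "id": 1,                     # echoes the request id
    "result": {"content": [{"type": "text", "text": "8"}]},
}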
How MCP Works: A Technical Deep Dive
1. Context Window Management
MCP uses a dynamic context window that grows with each interaction, storing:
- User preferences (e.g., language, tone).
- Conversation history (prior queries/responses).
- Environmental data (e.g., device type, location).
2. Context Embedding & Compression
To avoid overload, MCP compresses non-critical data into embeddings (e.g., summarizing a 10-message chat into an intent vector) while retaining key details.
3. Stateful Workflows
MCP enables multi-step workflows where agents:
- Remember past actions (e.g., "User already uploaded their ID").
- Adapt strategies (e.g., switching from email to SMS if the user is offline).
- Self-correct using feedback (e.g., "User disliked Option A; prioritize Option B").
# Hypothetical MCP-style stateful workflow (illustrative sketch, not real SDK code)
class TravelAgent(MCPAgent):
    def __init__(self, user_id):
        # load_context is an assumed helper that restores past interactions
        self.context = load_context(user_id)
        self.preferences = self.context.get("preferences", {})

    def book_flight(self, query):
        # Adapt the search to remembered preferences
        if self.preferences.get("class") == "economy":
            return search_flights(query, budget=True)  # assumed search helper
        return search_flights(query)
Why Not Just Give the LLM Access to the API?
A common question regarding the Model Context Protocol (MCP) is: "Why do we need a custom protocol? Can't LLMs simply learn to use APIs on their own?"
In theory, the answer is yes. Most public APIs come with documentation that outlines their functionality, and one could hand this documentation to an LLM so it derives the steps needed to achieve its objectives.
However, in practice, this approach is often inefficient. As developers focused on user experience, we must prioritize speed and responsiveness. By presenting tools to the LLM in an easily consumable format, we significantly reduce latency and streamline the overall process, ensuring smoother interactions and quicker results for users.
MCP vs. Traditional API Calls: A Game Changer
Why the shift? APIs are like snapshots: great for static tasks. MCP is a video, capturing the full narrative of user intent.
Isn't This Just Tool Calling?
A question I often encounter regarding the Model Context Protocol (MCP) is, "How does this differ from tool calling?"
Tool calling refers to the mechanism by which LLMs invoke functions to perform tasks in the real world. In this setup, the LLM operates alongside a tool executor that calls the specified tools and returns the results. The typical process looks like this:
- LLM → Tool Executor: describes the tool to be called.
- Tool Executor → LLM: executes the tool and sends back the result.
However, this interaction typically occurs within the same environment, whether that be a single server or a specific desktop application.
In contrast, the MCP framework enables LLMs to access tools from a separate process, which can be either local or hosted on a remote server. The structure is as follows:
- LLM → MCP Client: describes the tool to be called.
- MCP Client → MCP Server (over the MCP protocol): calls the tool.
- MCP Server → Tool Executor: executes the tool and returns the result.
- MCP Client → LLM: sends the result back.
The key distinction lies in the complete decoupling of the MCP server from the client. This separation offers greater flexibility and scalability, enhancing how LLMs interact with external tools.
Why MCP is Revolutionary for Agentic AI
Agentic frameworks require AI to act autonomously, not just respond. MCP's importance lies in:
1. Enabling True Autonomy
Agents can now:
- Make decisions based on historical data (e.g., a healthcare agent recalling a patient's allergy list).
- Chain tasks without human intervention (e.g., "Research → Draft → Edit → Publish" for blog posts).
2. Collaborative Intelligence
MCP allows agents to share context with:
- Other agents (e.g., a customer service bot escalating to a human agent).
- External tools (e.g., pulling real-time stock data into a financial advisor's response).
3. Ethical Guardrails
- Auditability: Full context history helps trace biased/inaccurate outputs.
- Privacy: Sensitive data (e.g., medical records) is compartmentalized.
Without MCP, agents would lack continuity, like a chef forgetting recipe steps midway!
4. Enables Long-Term Autonomy
- Persistent Memory: Agents remember user preferences (e.g., "Alex hates spam emails").
- Goal Chaining: Execute multi-step tasks (e.g., Research → Negotiate → Book a business trip).
Connection Lifecycle
The connection lifecycle within MCP is essential for managing the states and transitions of interactions between clients and servers, ensuring robust communication and functionality throughout the process.
This structured approach provides a clear framework for efficient communication and integration, empowering LLM applications to thrive.
1. Initialization
During the initialization phase, the following steps occur between the server and client:
- Client sends an initialize request that includes the protocol version and its capabilities.
- Server responds with its own protocol version and capabilities.
- Client sends an initialized notification to acknowledge successful connection establishment.
- The connection is now ready for use, and normal message exchange begins.
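On the wire, this handshake is a pair of JSON-RPC messages plus a one-way notification. A simplified sketch follows; the field values are illustrative, and the authoritative schemas are in the MCP specification.
# Simplified initialization handshake (illustrative field values)
initialize_request = {
    "jsonrpc": "2.0",
    "id": 0,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},                    # what the client supports
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}

initialize_response = {
    "jsonrpc": "2.0",
    "id": 0,
    "result": {
        "protocolVersion": "2024-11-05",
        "capabilities": {"tools": {}},         # what the server offers
        "serverInfo": {"name": "example-server", "version": "0.1.0"},
    },
}

# One-way acknowledgement; notifications carry no id and expect no response.
initialized_notification = {
    "jsonrpc": "2.0",
    "method": "notifications/initialized",
}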
2. Message Exchange
After the initialization phase, the MCP supports the following communication patterns:
- Request-Response: Either the client or server can send requests, to which the other party will respond.
- Notifications: Either side may send one-way messages without expecting a response.
3. Termination
The connection can be terminated by either party, which can occur in several ways:
- Clean Shutdown: Achieved through the close() method.
- Transport Disconnection: Occurs when communication channels are lost.
- Error Conditions: Termination may also result from encountering errors.
Error Handling
MCP defines a set of standard error codes to effectively manage issues that may arise:
enum ErrorCode {
// Standard JSON-RPC error codes
ParseError = -32700,
InvalidRequest = -32600,
MethodNotFound = -32601,
InvalidParams = -32602,
InternalError = -32603
}
Additionally, SDKs and applications can define their own custom error codes starting from -32000.
Error Propagation:
Errors are communicated through:
- Error Responses: Returned in response to requests with issues.
- Error Events: Triggered on transports to notify of errors.
- Protocol-Level Error Handlers: Manage errors at the MCP level.
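As a sketch, a JSON-RPC error response for a request that named an unknown method might look like this; the id and offending method are illustrative.
# A JSON-RPC 2.0 error response for a request naming an unknown method
error_response = {
    "jsonrpc": "2.0",
    "id": 7,                                   # id of the failed request
    "error": {
        "code": -32601,                        # MethodNotFound (see enum above)
        "message": "Method not found",
        "data": {"method": "tools/teleport"},  # optional diagnostic payload
    },
}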
This structured lifecycle ensures robust and efficient communication between clients and servers while gracefully handling errors as they arise.
Code Implementation
The implementation showcases a multi-server setup that handles different types of operations through different transport protocols: a math server running locally over stdio and a weather server exposed over HTTP with SSE.
Install the required dependencies:
pip install mcp httpx langchain langchain-core langchain-community langchain-groq langchain-ollama langchain-mcp-adapters
Set up your Groq API key (GROQ_API_KEY) in a .env file, then load it:
import os
from dotenv import load_dotenv
load_dotenv()
Create Servers
Math Server
# math_server.py
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("Math")
@mcp.tool()
def add(a: int, b: int) -> int:
"""Add two numbers"""
return a + b
@mcp.tool()
def multiply(a: int, b: int) -> int:
"""Multiply two numbers"""
return a * b
if __name__ == "__main__":
mcp.run(transport="stdio")
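Before wiring the servers into LangChain, you can smoke-test the math server on its own with the MCP Python SDK's stdio client. A minimal sketch, assuming math_server.py sits in the current directory:
# Quick smoke test for math_server.py over stdio (no LLM involved)
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(command="python", args=["math_server.py"])

async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()             # MCP handshake
            tools = await session.list_tools()     # discover available tools
            print([t.name for t in tools.tools])   # expect ['add', 'multiply']
            result = await session.call_tool("add", {"a": 3, "b": 5})
            print(result.content)

asyncio.run(main())
Running it should print the two tool names and the result of add(3, 5).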
Weather Server (weather.py)
from typing import Any
import httpx
from mcp.server.fastmcp import FastMCP
# Initialize FastMCP server
mcp = FastMCP("weather")
# Constants
NWS_API_BASE = "https://api.weather.gov"
USER_AGENT = "weather-app/1.0"
async def make_nws_request(url: str) -> dict[str, Any] | None:
"""Make a request to the NWS API with proper error handling."""
headers = {
"User-Agent": USER_AGENT,
"Accept": "application/geo+json"
}
async with httpx.AsyncClient() as client:
try:
response = await client.get(url, headers=headers, timeout=30.0)
response.raise_for_status()
return response.json()
except Exception:
return None
def format_alert(feature: dict) -> str:
"""Format an alert feature into a readable string."""
props = feature["properties"]
return f"""
Event: {props.get('event', 'Unknown')}
Area: {props.get('areaDesc', 'Unknown')}
Severity: {props.get('severity', 'Unknown')}
Description: {props.get('description', 'No description available')}
Instructions: {props.get('instruction', 'No specific instructions provided')}
"""
@mcp.tool()
async def get_alerts(state: str) -> str:
"""Get weather alerts for a US state.
Args:
state: Two-letter US state code (e.g. CA, NY)
"""
url = f"{NWS_API_BASE}/alerts/active/area/{state}"
data = await make_nws_request(url)
if not data or "features" not in data:
return "Unable to fetch alerts or no alerts found."
if not data["features"]:
return "No active alerts for this state."
alerts = [format_alert(feature) for feature in data["features"]]
return "\n---\n".join(alerts)
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
"""Get weather forecast for a location.
Args:
latitude: Latitude of the location
longitude: Longitude of the location
"""
# First get the forecast grid endpoint
points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
points_data = await make_nws_request(points_url)
if not points_data:
return "Unable to fetch forecast data for this location."
# Get the forecast URL from the points response
forecast_url = points_data["properties"]["forecast"]
forecast_data = await make_nws_request(forecast_url)
if not forecast_data:
return "Unable to fetch detailed forecast."
# Format the periods into a readable forecast
periods = forecast_data["properties"]["periods"]
forecasts = []
for period in periods[:5]: # Only show next 5 periods
forecast = f"""
{period['name']}:
Temperature: {period['temperature']}°{period['temperatureUnit']}
Wind: {period['windSpeed']} {period['windDirection']}
Forecast: {period['detailedForecast']}
"""
forecasts.append(forecast)
return "\n---\n".join(forecasts)
if __name__ == "__main__":
# Initialize and run the server
mcp.run(transport='sse')
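Note: when started with transport='sse', the FastMCP server exposes an SSE endpoint over HTTP; in the setup used here it listens on http://localhost:8000/sse, which is the URL the client below connects to. If your FastMCP version defaults to a different host or port, adjust the client configuration accordingly.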
Create client
langchain_mcp_multiserver.py
import asyncio

from langchain_core.messages import AIMessage  # used by the optional streaming block below
from langchain_mcp_adapters.client import MultiServerMCPClient
from langgraph.prebuilt import create_react_agent
from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama
from dotenv import load_dotenv

load_dotenv()

# Pick one model backend: Groq (hosted) or Ollama (local)
#model = ChatGroq(model="llama-3.3-70b-versatile", temperature=0.5)
model = ChatOllama(model="llama3.2:1b", temperature=0.0, max_new_tokens=500)
async def run_app(user_question):
async with MultiServerMCPClient(
{
"weather": {
"url": "http://localhost:8000/sse",
"transport": "sse",
},
"math": {
"command": "python",
# Make sure to update to the full absolute path to your math_server.py file
"args": ["math_server.py"],
"transport": "stdio",
},
}
) as client:
agent = create_react_agent(model, client.get_tools())
agent_response = await agent.ainvoke({"messages": user_question})
print(agent_response['messages'][-1].content)
# # Stream the response chunks
# async for chunk in agent.astream({"messages": user_question}):
# # Extract the message content from the AddableUpdatesDict structure
# if 'agent' in chunk and 'messages' in chunk['agent']:
# for message in chunk['agent']['messages']:
# if isinstance(message, AIMessage):
# # Handle different content formats
# if isinstance(message.content, list):
# # For structured content with text and tool use
# for item in message.content:
# if isinstance(item, dict) and 'text' in item:
# print(f"**AI**: {item['text']}")
# else:
# # For simple text content
# print(f"**AI**: {message.content}")
# elif 'tools' in chunk and 'messages' in chunk['tools']:
# for message in chunk['tools']['messages']:
# if hasattr(message, 'name') and hasattr(message, 'content'):
# # Display tool response
# print(f"**Tool ({message.name})**: {message.content}")
return agent_response['messages'][-1].content
if __name__ == "__main__":
#user_question = "what is the weather in california?"
#user_question = "what's (3 + 5) x 12?"
#user_question = "what's the weather in seattle?"
user_question = "what's the weather in NYC?"
response = asyncio.run(run_app(user_question=user_question))
print(response)
Before invoking the client, make sure the weather server is up and running:
python weather.py
Invoke the client
python langchain_mcp_multiserver.py
Response to "what's (3 + 5) x 12?":
The result of (3 + 5) is 8, and 8 x 12 is 96.
Response to "what's the weather in NYC?":
It appears you've provided a list of weather alerts from the National Weather Service (NWS) for various regions in New York State, Vermont, and parts of Massachusetts.
Here's a breakdown of what each alert is saying:
**Flooding Alerts**
* The NWS has issued several flood watches across New York State, including:
+ Northern St. Lawrence; Northern Franklin; Eastern Clinton; Southeastern St. Lawrence; Southern Franklin; Western Clinton; Western Essex; Southwestern St. Lawrence; Grand Isle; Western Franklin; Orleans; Essex; Western Chittenden; Lamoille; Caledonia; Washington; Western Addison; Orange; Western Rutland; Eastern Franklin; Eastern Chittenden; Eastern Addison; Eastern Rutland; Western Windsor; Eastern Windsor
+ Northern Herkimer; Hamilton; Southern Herkimer; Southern Fulton; Montgomery; Northern Saratoga; Northern Warren; Northern Washington; Northern Fulton; Southeast Warren; Southern Washington; Bennington; Western Windham; Eastern Windham
* The NWS has also issued a flood watch for parts of Vermont, including:
+ Northern New York and northern and central Vermont
**Ice Jam Alerts**
* The NWS has warned about the possibility of ice jams in several areas, including:
+ Bennington; Western Windham; Eastern Windham
+ Southern Vermont, Bennington and Windham Counties
+ Central New York, Herkimer County
+ Northern New York, Hamilton, Montgomery, Fulton, Herkimer, Warren, Washington Counties
**Other Alerts**
* The NWS has issued several warnings about heavy rainfall and snowmelt leading to minor river flooding.
* There are also alerts for isolated ice jams that could further increase the flood risk.
It's essential to stay informed about weather conditions in your area and follow the instructions of local authorities. If you're planning outdoor activities, be prepared for changing weather conditions and take necessary precautions to stay safe.
Note that the MCP client connected to the appropriate server for each question; we did not write any explicit routing logic, as the agent selected the right tools on its own.
The Future of MCP
Per the official docs, upcoming features include:
- Cross-Platform Context Sync: Unify context across apps (e.g., your email, Slack, and CRM).
- Context-Aware Security: Dynamically adjust permissions based on user behavior.
- Self-Optimization: Agents refine their own context management rules.
Key Takeaways
- MCP > APIs for complex, evolving tasks.
- Critical for Agents to act proactively, not just reactively.
- Balances Power & Safety via auditable, compartmentalized context.
Use Cases
This architecture is particularly useful for:
- Multi-modal AI applications
- Complex workflow orchestration
- Distributed AI systems
- Real-time data processing applications
Conclusion
The Model Context Protocol implementation demonstrated here showcases a robust, scalable, and flexible architecture for building sophisticated AI applications. Its modular design and support for multiple transport protocols make it an excellent choice for complex AI systems.
Note: This experimentation was conducted by following publicly available online resources.