Multi-agent systems with MCP: Building AI teams that share tools
What MCP is, why agents that coordinate through shared infrastructure beat agents that communicate directly, and how to implement the pattern with real-world examples.
Mar 18, 2026 • 15 Minute Read
- The NxM problem with direct agent communication
- Solving the NxM problem through shared infrastructure
- Shared infrastructure vs direct communication: The hospital ER analogy
- What Model Context Protocol (MCP) actually does
- When to use MCP alone for your multi-agent system
- When to add A2A to your solution
- The Complete Stack: MCP + A2A Together
- Deciding between MCP and A2A: When to use each
- The supervisor pattern
- Parallel Execution
- The security reality with MCP
- Real-world validation of MCP architecture
- What you can do now
You're building a customer support system. You have a triage agent that categorizes incoming tickets, a billing agent that handles payment issues, and a technical support agent that troubleshoots product problems. You want them working together, so you wire the agents with direct communication. Now, the triage agent sends tickets to billing or tech support, and the billing agent sends payment info to tech support when needed.
Good so far, right? But then technical support needs billing context to understand why a customer is upset about charges, so now it needs to talk to the billing agent too. Then you add a sentiment agent to detect customer frustration in real time. Suddenly, every agent needs to know about every other agent. You add a fifth agent for escalation.
The end result? You now have a web of connections that nobody can maintain.
The NxM problem with direct agent communication
One of the big problems with the scenario above is that every time you update one agent, you risk breaking its connections to other agents. Testing becomes a nightmare because you cannot test one agent in isolation. Debugging requires tracing messages across multiple agent boundaries. This creates an ironic situation where our original system, which was supposed to reduce ticket resolution time, is now creating its own tickets.
Five agents need ten connections. Add one more and you need fifteen. This is the N×M problem. BCG found that integration complexity rises quadratically without standardization, making five-agent systems roughly twenty-five times harder to maintain than single-agent setups.
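The arithmetic behind those numbers is easy to check: with point-to-point wiring, every pair of agents needs its own channel.

```python
# Quick check of the quadratic growth claim: n fully connected agents
# need n * (n - 1) / 2 point-to-point channels.
def channels(n: int) -> int:
    """Bidirectional channels among n fully connected agents."""
    return n * (n - 1) // 2

print(channels(5))  # -> 10
print(channels(6))  # -> 15
```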
The consequences show up fast: context gets lost between handoffs, agents hallucinate about what other agents said, and teams spend more time debugging inter-agent communication than building actual functionality.
Microsoft Deputy CTO Sam Schillace captured this core problem: "To be autonomous you have to carry context through a bunch of actions, but the models are very disconnected and don't have continuity the way we do."
Solving the NxM problem through shared infrastructure
We've defined the problem, but what about the solution? The path forward is simple: your agents do not need to talk to each other, but should coordinate through shared infrastructure instead. Using Model Context Protocol (MCP) allows you to do just that.
In this article, I'll dive more into why agents coordinating through shared infrastructure beats direct communication, and provide working Python code you can use for the supervisor pattern. I'll also explain the difference between MCP and A2A, and real world security concerns you should keep in mind before things hit production.
Shared infrastructure vs direct communication: The hospital ER analogy
One way to think about shared infrastructure is how a hospital emergency room operates. The triage nurse does not personally coordinate with every specialist. That would take far too long! Instead, there's shared infrastructure:
- The patient chart shows symptoms and vitals.
- The hospital information system tracks bed availability.
- The lab system holds test results accessible to any authorized provider.
Each specialist focuses on their domain. The cardiologist handles heart issues, the orthopedist deals with fractures, the radiologist reads scans. They do not constantly talk to each other about each patient. Instead, they coordinate through the shared systems: reading charts, ordering tests, updating patient status, and so on.
The result? You get coordinated care without the chaos of every specialist calling every other specialist.
MCP works the same way for AI agents. It provides the shared infrastructure that lets specialized agents coordinate without direct communication. In the academic literature, Wooldridge calls this "coordination through the environment" where agents interact indirectly by modifying shared state rather than sending explicit messages. The pattern reduces coupling and allows agents to remain autonomous while still achieving collective goals.
What Model Context Protocol (MCP) actually does
Before diving into implementation, let me clarify something that many tutorials get wrong. MCP is not a protocol for agents to talk to each other. It is a protocol for agents to access tools and data.
MCP handles vertical integration. It connects agents to tools, databases, and APIs. Think of it as giving each agent arms to reach into external systems. When your triage agent needs to check a customer's history, it calls an MCP tool connected to your CRM. When your billing agent needs to process a refund, it writes to an MCP resource connected to Stripe. The agent reaches down into the infrastructure layer.
A2A handles horizontal integration. It lets agents talk directly to each other. Google released A2A in April 2025 specifically because MCP does not solve agent-to-agent communication. When your billing agent needs to negotiate with your escalation agent about whether to offer a discount, that is a horizontal conversation. When two agents need to debate or reach consensus, that is A2A territory.
The distinction matters because it shapes how you architect your system. You are not building a chat room for agents; you are building shared workspaces they all access. Most multi-agent coordination does not require direct agent communication at all.
When to use MCP alone for your multi-agent system
For the majority of multi-agent systems, MCP by itself is sufficient. Agents coordinate through shared state rather than direct messages. This is the pattern Block uses for Goose, the pattern Microsoft uses in Azure AI Agent Service, and the pattern that scales to thousands of users.
Here is how it works in practice. Your triage agent categorizes a ticket and writes the result to shared state. Your billing agent reads from that same state to see if the ticket involves payment issues. No direct communication happened, but the agents coordinated. The billing agent knows what triage determined because they both access the same shared infrastructure.
# Triage agent categorizes ticket and writes to shared state
await mcp_client.call_tool("save_ticket", {
    "ticket_id": ticket_id,
    "category": "billing",
    "priority": "high"
})

# Billing agent reads from shared state (no direct communication)
ticket = await mcp_client.call_tool("get_ticket", {"ticket_id": ticket_id})
resolution = await process_billing_issue(ticket)
await mcp_client.call_tool("save_resolution", {"ticket_id": ticket_id, "resolution": resolution})

# Escalation agent reads ticket (still no direct communication)
ticket = await mcp_client.call_tool("get_ticket", {"ticket_id": ticket_id})
if ticket["priority"] == "high":
    await mcp_client.call_tool("escalate", {"ticket_id": ticket_id})
The agents never exchange messages directly, but instead coordinate through the shared MCP layer. This is exactly how microservices coordinate through a shared database rather than constant RPC calls. The pattern reduces coupling and makes the system easier to debug, test, and scale.
When to add A2A to your solution
Sometimes you genuinely need agents to talk to each other, such as:
- Negotiation scenarios where two agents need to reach agreement.
- Debate scenarios where agents argue different positions.
- Delegation scenarios where one agent dynamically assigns tasks to another based on real-time conversation.
Google's A2A protocol handles these cases. It defines how agents discover each other, exchange messages, and maintain conversation state. The key primitives are Agent Cards for discovery, Tasks for work units, and Messages for communication.
# A2A: Agent discovery via Agent Card
agent_card = {
    "name": "EscalationAgent",
    "description": "Handles complex cases requiring approval",
    "capabilities": ["discount_approval", "refund_override", "supervisor_alert"],
    "endpoint": "https://agents.example.com/escalation"
}

# A2A: Direct agent-to-agent message
message = {
    "from": "BillingAgent",
    "to": "EscalationAgent",
    "task_id": "ticket-789",
    "content": "Customer requesting $500 refund outside policy. 3-year customer, $12K lifetime value. Recommend approval?",
    "reply_to": "https://agents.example.com/billing/inbox"
}

# A2A: Task delegation
task = {
    "id": "approval-456",
    "delegated_by": "BillingAgent",
    "assigned_to": "EscalationAgent",
    "description": "Approve or deny exception refund request",
    "deadline": "2025-01-21T15:00:00Z"
}
The pattern is different from MCP. With A2A, agents maintain ongoing conversations. They can ask follow-up questions, negotiate, disagree, and resolve conflicts. This is powerful, but it adds complexity.
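To make "ongoing conversation" concrete, here is a toy exchange loop in plain Python. The message shapes and agent functions are hypothetical; a real A2A deployment sends these messages over HTTP between agent endpoints rather than calling local functions.

```python
# Toy A2A-style exchange (message shapes are hypothetical; in real A2A
# these travel over HTTP between the agents' registered endpoints).
def billing_agent(message: dict) -> dict:
    if message["type"] == "question":
        # Answer the escalation agent's follow-up with customer context.
        return {"type": "answer", "content": "3-year customer, $12K lifetime value"}
    return {"type": "proposal", "content": "approve $500 refund"}

def escalation_agent(message: dict) -> dict:
    if message["type"] == "proposal":
        # Ask a clarifying question before deciding.
        return {"type": "question", "content": "What is the customer's history?"}
    # With the answer in hand, reach agreement.
    return {"type": "agreement", "content": "approved"}

# The back-and-forth loop that shared state alone cannot model:
msg = {"type": "proposal", "content": "approve $500 refund"}
while msg["type"] != "agreement":
    msg = escalation_agent(msg)
    if msg["type"] == "question":
        msg = billing_agent(msg)
print(msg["content"])  # -> approved
```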
The Complete Stack: MCP + A2A Together
For production multi-agent systems, you often want both. MCP provides the shared infrastructure layer, while A2A provides the communication layer when direct agent dialogue is necessary.
Google's documentation puts it simply: "Build with ADK or any framework, equip with MCP or any tool, and communicate with A2A." The protocols are complementary, not competing.
Here is what a complete architecture looks like:
class MultiAgentSystem:
    def __init__(self):
        # MCP for shared infrastructure
        self.mcp_client = MCPClient("support-pipeline")
        # A2A for agent communication when needed
        self.a2a_client = A2AClient()
        self.agent_registry = {}

    async def register_agent(self, agent_card):
        """Register agent for A2A discovery."""
        self.agent_registry[agent_card["name"]] = agent_card

    async def coordinate_via_mcp(self, task):
        """Most coordination happens through shared state."""
        # Write task to shared state; agents pick up tasks from the
        # shared queue, so no direct communication is needed.
        await self.mcp_client.call_tool("create_task", task)

    async def negotiate_via_a2a(self, agent_a, agent_b, topic):
        """Use A2A when agents need to discuss/negotiate."""
        conversation = await self.a2a_client.start_conversation(
            participants=[agent_a, agent_b],
            topic=topic
        )
        # Agents exchange messages until consensus
        while not conversation.resolved:
            response = await conversation.next_message()
            if response.type == "agreement":
                return response.outcome
        return conversation.final_state

    async def run_pipeline(self, ticket):
        """Hybrid approach: MCP for coordination, A2A for negotiation."""
        # Step 1: Triage (MCP only)
        await self.coordinate_via_mcp({
            "type": "triage",
            "ticket": ticket
        })

        # Step 2: If refund exceeds limit, agents negotiate (A2A)
        ticket_state = await self.mcp_client.call_tool("get_ticket", {"ticket_id": ticket["id"]})
        if ticket_state["refund_amount"] > 200:
            approval = await self.negotiate_via_a2a(
                "BillingAgent",
                "EscalationAgent",
                "Should we approve this exception?"
            )
            ticket_state["approval"] = approval

        # Step 3: Resolution (MCP only)
        await self.coordinate_via_mcp({
            "type": "resolve",
            "ticket": ticket_state
        })

        return await self.mcp_client.call_tool("get_resolution", {"ticket_id": ticket["id"]})
The key insight is that most coordination happens through MCP. You only bring in A2A when you need actual agent dialogue. This keeps the system simple while preserving flexibility for complex scenarios.
Deciding between MCP and A2A: When to use each
When should you use which protocol? The decision comes down to whether agents need to have a conversation or just share information. Here is a practical breakdown.
When to use MCP alone
- When agents perform sequential tasks where one agent's output becomes another agent's input
- When you need parallel execution where multiple agents work independently on shared resources
- For status tracking where agents report progress to a central state
- When the workflow is predictable and agents do not need to adapt based on what other agents say
When to add A2A
- When agents need to negotiate and reach agreement on terms, prices, or approaches.
- When you want agents to debate and argue different positions before deciding.
- For dynamic delegation where one agent discovers another's capabilities at runtime and assigns tasks accordingly.
- When the conversation matters and the back-and-forth dialogue is part of the value.
The customer support system we have been building is a good example of MCP-only coordination. The triage agent does not need to debate with the billing agent; it just needs to categorize the ticket. The billing agent does not need to negotiate with the sentiment agent; it just needs to know if the customer is frustrated. Shared state handles everything.
But imagine a complex escalation where the billing agent needs approval for an exception refund. The escalation agent might ask clarifying questions: "What is the customer's history? Have we made exceptions before? What is the revenue impact?" That back-and-forth requires A2A because the negotiation dialogue itself creates value.
Or consider a dispute resolution system where agents genuinely disagree about whether a chargeback is valid. One agent argues for the customer, another for fraud prevention. They need to debate, present evidence, and reach consensus. MCP cannot model this because it requires back-and-forth reasoning between agents.
AgentMaster, released in July 2025, was the first framework to use both protocols together. It uses A2A for inter-agent messaging and MCP for resource access. This hybrid pattern is likely the future for production multi-agent systems that need both coordination and communication.
The supervisor pattern
The most common pattern for multi-agent MCP systems is the supervisor pattern. One orchestrator agent delegates tasks to specialized worker agents. All of them access shared MCP tools.
This mirrors what distributed systems researchers call the coordinator pattern. A central component manages workflow while workers handle specialized tasks independently. The coordinator does not do the work itself; it manages state and delegates appropriately.
Here is what this looks like in practice:
from mcp import Server

server = Server("support-pipeline")

# Shared state that every agent reads and writes. In production this
# would live in a database, keyed by ticket_id.
ticket_state = {
    "category": None,
    "priority": None,
    "resolution": None,
    "escalated": False,
    "status": "pending"
}

@server.tool()
async def save_ticket(ticket_id: str, category: str, priority: str) -> str:
    """Save ticket categorization for other agents to access."""
    ticket_state["category"] = category
    ticket_state["priority"] = priority
    ticket_state["status"] = "triaged"
    return "Ticket categorized successfully"

@server.tool()
async def get_ticket(ticket_id: str) -> dict:
    """Retrieve ticket state."""
    return ticket_state

@server.tool()
async def save_resolution(ticket_id: str, resolution: str) -> str:
    """Save ticket resolution."""
    ticket_state["resolution"] = resolution
    ticket_state["status"] = "resolved"
    return "Resolution saved successfully"

@server.tool()
async def escalate(ticket_id: str) -> str:
    """Escalate ticket to supervisor."""
    ticket_state["escalated"] = True
    ticket_state["status"] = "escalated"
    return "Ticket escalated successfully"

@server.tool()
async def get_status() -> dict:
    """Check pipeline status."""
    return ticket_state
Each agent connects to this same MCP server. The triage agent calls save_ticket(). The billing agent calls get_ticket() and save_resolution(). The escalation agent calls get_ticket() and escalate(). They never communicate directly. They coordinate through the shared state.
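To see that flow end to end without standing up a server, here is a runnable sketch that mimics the three agents' tool calls in-process. The dispatch table is a hypothetical stand-in for a real MCP client and transport; the tool names match the server definitions above.

```python
import asyncio

# In-process stand-in for the MCP server above (the real transport is
# stdio or HTTP; tool names mirror the @server.tool() definitions).
ticket_state = {"category": None, "priority": None,
                "resolution": None, "escalated": False, "status": "pending"}

async def call_tool(name: str, args: dict):
    # Dispatch table playing the role of a real MCP client.
    if name == "save_ticket":
        ticket_state.update(category=args["category"],
                            priority=args["priority"], status="triaged")
        return "Ticket categorized successfully"
    if name == "get_ticket":
        return ticket_state
    if name == "escalate":
        ticket_state.update(escalated=True, status="escalated")
        return "Ticket escalated successfully"
    raise KeyError(name)

async def run() -> None:
    # Triage agent writes; escalation agent later reads the same state.
    await call_tool("save_ticket", {"ticket_id": "t-1",
                                    "category": "billing", "priority": "high"})
    ticket = await call_tool("get_ticket", {"ticket_id": "t-1"})
    if ticket["priority"] == "high":
        await call_tool("escalate", {"ticket_id": "t-1"})

asyncio.run(run())
print(ticket_state["status"])  # -> escalated
```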
If you want to go deeper on implementing these orchestration patterns in production, including error handling, retries, and scaling strategies, we cover it in the Model Context Protocol learning path on Pluralsight.
Parallel Execution
The supervisor pattern is sequential, but MCP also enables parallel execution. The November 2025 MCP specification explicitly added support for parallel tool calls and server-side agent loops (MCP Specification, 2025). Concurrent execution is now first-class in the protocol.
Multiple agents can work simultaneously, each accessing the shared infrastructure without stepping on each other's toes. Consider a support system where multiple analysis agents examine a ticket at the same time:
import asyncio

async def parallel_analysis(ticket_id: str):
    """Run multiple analysis agents in parallel."""
    tasks = [
        sentiment_agent.run("Analyze customer sentiment"),
        category_agent.run("Determine ticket category"),
        priority_agent.run("Assess ticket priority"),
        history_agent.run("Check customer history")
    ]
    # All agents write to shared MCP resources
    results = await asyncio.gather(*tasks)

    # Router reads all findings
    all_analysis = await mcp_client.call_tool("get_all_analysis")
    return route_to_specialist(all_analysis)
Each agent writes to its own namespace in the shared MCP server. A final aggregator reads all of them. No agent needs to know about any other agent. They just know about the shared tools. This decoupling is what makes the pattern scale.
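A minimal sketch of that namespacing idea (agent and key names are hypothetical): each agent writes only its own key, and the aggregator is the sole reader of the combined result.

```python
import asyncio

# Each analysis agent owns exactly one key in the shared result; the
# aggregator reads all keys after the gather (names are hypothetical).
analysis: dict = {}

async def sentiment_agent(ticket: str) -> None:
    analysis["sentiment"] = "frustrated"

async def priority_agent(ticket: str) -> None:
    analysis["priority"] = "high"

async def aggregate(ticket: str) -> dict:
    # Agents run concurrently and never read each other's namespace.
    await asyncio.gather(sentiment_agent(ticket), priority_agent(ticket))
    return dict(analysis)

result = asyncio.run(aggregate("angry about surprise charges"))
print(sorted(result))  # -> ['priority', 'sentiment']
```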
The security reality with MCP
Here is something many MCP tutorials gloss over: security remains a significant challenge. The protocol initially prioritized interoperability over security, and production deployments need to address this gap explicitly.
Real vulnerabilities exist. For example:
- Tool poisoning attacks can hide malicious instructions in tool descriptions that are invisible to humans but understood by AI agents.
- Authentication gaps mean the protocol is still catching up on enterprise-grade identity management.
- Context injection attacks can manipulate agents processing untrusted input into misusing their tool access.
OWASP released its "Enterprise-Grade Security for the Model Context Protocol" guidelines in 2025, outlining essential practices for production deployments. At the tool level, those practices look something like this:
server = Server(
    "secure-pipeline",
    auth=OAuth21Provider(
        issuer="https://your-idp.com",
        audience="mcp-agents"
    )
)

@server.tool()
async def sensitive_operation(data: str) -> str:
    """Tool with explicit authorization checks."""
    # Verify agent has permission
    if not await verify_agent_permissions(context.agent_id):
        raise PermissionError("Agent not authorized")

    # Sanitize input to prevent injection
    sanitized = sanitize_input(data)

    # Log for audit trail
    await audit_log.record(
        agent=context.agent_id,
        tool="sensitive_operation",
        input_hash=hash(sanitized)
    )

    return await execute_safely(sanitized)
For production systems, you need things like:
- Tool-level authorization where users approve each client-tool pair
- OAuth 2.1 for authentication
- Audit logging for traceability
- Input validation to prevent prompt injection
- Sandboxed environments for tool execution
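The snippet above leans on verify_agent_permissions and sanitize_input without defining them. Here is one hedged sketch of minimal versions: an allow-list and control-character stripping are starting points for your own policy, not a complete defense.

```python
import asyncio
import hashlib
import re

# Hedged sketches of the helpers referenced earlier (an allow-list and
# control-character stripping are starting points, not a full defense).
ALLOWED_AGENTS = {"BillingAgent", "EscalationAgent"}

async def verify_agent_permissions(agent_id: str) -> bool:
    # In production, check against your identity provider instead.
    return agent_id in ALLOWED_AGENTS

def sanitize_input(data: str) -> str:
    # Drop control characters and cap length to blunt injection payloads.
    cleaned = re.sub(r"[\x00-\x1f\x7f]", "", data)
    return cleaned[:4096]

def stable_input_hash(data: str) -> str:
    # Audit logs need a stable digest; builtin hash() changes per process.
    return hashlib.sha256(data.encode()).hexdigest()

print(asyncio.run(verify_agent_permissions("BillingAgent")))  # -> True
```

A stable digest matters here because Python's builtin hash() is salted per process, which would make audit entries impossible to correlate across restarts.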
Real-world validation of MCP architecture
This is not theoretical. Block, the company behind Square and Cash App, built their internal AI system Goose entirely on MCP architecture. Over a thousand engineers, designers, and support staff use MCP-powered agents daily. Their approach is instructive: they built all MCP servers in-house for security control. They kept agents short-lived and focused on single tasks rather than long-running processes. And they ensured coordination happens through shared MCP resources rather than agent-to-agent messaging. The result is a system that scales across the organization without the maintenance burden of point-to-point connections.
Quantium deployed MCP-enabled agents across 1,200+ team members for analytics workflows. Their agents handle data preparation, analysis, and visualization as a coordinated pipeline. Each agent specializes in one task and coordinates through shared state. Microsoft incorporated MCP into Azure AI Agent Service, making it available to enterprise customers building agent systems at scale. Wiley Publishing uses MCP for integrating peer-reviewed content with AI tools, letting agents access their entire catalog through standardized interfaces.
The adoption numbers tell the story. MCP has grown to 97 million monthly SDK downloads and adoption by OpenAI, Google, Microsoft, and AWS, all within its first year. In December 2025, Anthropic donated MCP to the Linux Foundation's Agentic AI Foundation, signaling its transition from experimental protocol to industry standard. Jensen Huang of NVIDIA called it work that "has completely revolutionized the AI landscape."
A2A adoption is earlier but growing. Google launched it with support from over 50 partners including Atlassian, Salesforce, SAP, ServiceNow, and MongoDB. The agent development kit (ADK) for A2A reached general availability in late 2025. Companies building systems that require negotiation or dynamic agent collaboration are adopting it alongside MCP.
The key to effective multi-agent systems is designing appropriate coordination mechanisms that allow agents to work together without requiring each agent to model every other agent's behavior. MCP provides exactly this: a coordination mechanism through shared infrastructure rather than explicit agent-to-agent communication. A2A adds the explicit communication layer when implicit coordination is not enough.
What you can do now
Multi-agent systems do not require agents to constantly message each other. Like specialists in a hospital ER, they can coordinate through shared infrastructure. This is the fundamental insight that separates scalable multi-agent architectures from unmaintainable tangles of point-to-point connections.
Here is what to take away. MCP provides the shared layer. Think of it as the database that multiple microservices share, not the messaging queue they communicate through. The supervisor pattern keeps coordination logic centralized while execution remains distributed. Parallel execution is first-class in the November 2025 spec with server-side agent loops and parallel tool calls. Security requires explicit attention since tool poisoning and injection attacks are real. And when you genuinely need peer-to-peer agent dialogue, A2A complements MCP as the horizontal integration layer.
Most organizations building multi-agent systems are still at what practitioners call Level 1-2 maturity: basic tool wrapping and simple context management. The patterns in this article point toward Level 4: multi-agent coordination with dedicated context servers and sophisticated orchestration.
The next time you are building a multi-agent system, resist the urge to create point-to-point connections between agents, and build shared infrastructure instead.
Want to learn more about Model Context Protocol (MCP)? Check out Pluralsight's MCP learning path, which covers everything from the fundamentals of MCP terminology and architecture, to hands-on practice building integrations, to advanced features like custom workflow servers and transport optimizations.