If you’ve built anything with AI agents recently, you’ve heard two acronyms thrown around constantly, often with wildly conflicting takes: “MCP is the USB-C of AI.” “A2A replaces MCP.” “You need both.” “Neither is production-ready.”
Most of that noise comes from a single confusion. MCP and A2A solve completely different problems, and treating them as competitors — or worse, as interchangeable — is one of the most common and most expensive architecture mistakes in the space right now. Get the distinction wrong and your system fights you at every layer: you’ll find yourself wrapping databases as “agents,” building bespoke RPC glue between services that should just speak a standard, or reaching for the heavier protocol when the lighter one was the right call.
Here’s the entire thesis of this series in two sentences:
MCP handles how an agent talks to tools. A2A handles how agents talk to each other.
Everything else follows from that. Let’s make it precise.
Two problems that look similar and aren’t
Before either protocol existed, two distinct pain points kept showing up in agentic systems.
Problem 1 — tool integration. A single agent needs to do things in the world: query a database, hit an API, read a file, send an invoice. Every integration was a bespoke one-off. If you had M agent applications and N tools, you were maintaining M×N custom integrations, each inventing its own auth, schema, and error handling. This is a vertical problem — it’s about one agent reaching down to capabilities.
Problem 2 — agent coordination. As teams built multi-agent systems, there was no standard way for agents to discover each other, advertise what they could do, or hand off work. If you had a research agent and a writing agent, wiring them together meant gluing orchestration logic into application code. Swap one agent for another and you rewrote the integration. This is a horizontal problem — it’s about agents reaching across to peers.
MCP is the purpose-built answer to Problem 1. A2A is the purpose-built answer to Problem 2. Understanding that they sit on different axes is the key to everything else.

MCP: the vertical protocol
The Model Context Protocol (MCP), created by Anthropic and released in November 2024, standardizes how an AI agent connects to external tools, data, and services. Think of it as a universal connector — the analogy people reach for is USB-C: one standard plug instead of a drawer full of proprietary cables.
Concretely, an MCP server exposes capabilities through three primitives — Tools (actions the model can invoke), Resources (read-only data it can fetch), and Prompts (reusable templates) — and any MCP-compatible host can discover and use them without bespoke integration code. The M×N integration explosion collapses into M+N: build a server once, and every MCP-aware host can use it.
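To put numbers on that collapse, here is a small arithmetic sketch. The counts (5 hosts, 20 tools) are made up for illustration and the function names are hypothetical.

```python
def bespoke_integrations(hosts: int, tools: int) -> int:
    """Every host wires up every tool by hand: M x N connectors."""
    return hosts * tools


def mcp_integrations(hosts: int, tools: int) -> int:
    """Each host implements a client once, each tool a server once: M + N."""
    return hosts + tools


hosts, tools = 5, 20  # illustrative numbers, not measurements
print(bespoke_integrations(hosts, tools))  # 100
print(mcp_integrations(hosts, tools))      # 25
```

The gap widens as either side grows, which is why the standard pays off most in large tool ecosystems.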
MCP is, at its core, synchronous and request-response: the agent asks the tool to do something and gets an answer back, usually in the same breath. (An experimental Tasks primitive for longer-running work arrived in the late-2025 spec, but the model’s center of gravity is still fast, synchronous tool calls.)
A2A: the horizontal protocol
The Agent2Agent protocol (A2A), created by Google and released in April 2025 with 50+ launch partners, standardizes how independent agents — built by different teams, on different frameworks, hosted on different networks — discover each other, delegate tasks, and exchange results.
Its defining trait is the opposite of MCP’s. A2A is intentionally stateful and asynchronous: it assumes tasks can be long-running, multi-step, and human-in-the-loop from the start. A client agent delegates work to a remote agent, and — crucially — the remote agent does that work without access to the client’s internal context, memory, or tools. That opacity is a feature: it respects the autonomy and privacy boundaries you need when one company’s agent calls another’s. The analogy here is HTTP: it doesn’t care whether the other side runs Rails, Django, or Go; it just defines the shape of the conversation.
They are designed to compose, not compete
The single most important takeaway: these protocols are complementary. Most real production multi-agent systems use both — A2A to coordinate between agents, and MCP so each agent can reach its own tools and data. They live on perpendicular axes, so they never actually overlap:
| | MCP | A2A |
|---|---|---|
| Question it answers | How does my agent use a tool? | How does my agent talk to another agent? |
| Axis | Vertical (agent → tools/data) | Horizontal (agent ↔ agents) |
| Relationship | Agent → capability | Peer ↔ peer |
| Interaction style | Synchronous request/response | Stateful, async, long-running |
| Context sharing | Tool runs inside the agent’s context | Remote agent has no access to caller’s context |
| Created by | Anthropic (Nov 2024) | Google (Apr 2025) |
| On the wire | JSON-RPC 2.0 over stdio / Streamable HTTP | JSON-RPC 2.0 over HTTP(S) + SSE |
| Unit of work | A tool call | A Task (with a lifecycle) |
Both are now neutral, governed standards
A fair worry in 2025 was vendor lock-in: MCP was “an Anthropic thing,” A2A “a Google thing.” That’s no longer the situation, and it matters for adoption decisions.
Both protocols now live under the Linux Foundation. Google contributed A2A to the Linux Foundation in mid-2025, and Anthropic donated MCP in December 2025 to the foundation’s Agentic AI Foundation (AAIF). Both have crossed the production-maturity threshold: MCP’s current spec revision is the November 2025 release, with 10,000+ public servers and native support across every major model provider; A2A reached a stable v1.0 in early 2026, with signed Agent Cards, SDKs in five languages, and 150+ production organizations. Neither is a moving target anymore, and neither is owned by a single vendor.
When you’re staring at a design decision, ask one question:
Is the thing on the other end a capability, or is it a peer?
- A database, an API, a file system, a payment processor, a search index — those are capabilities. They reach down. Use MCP.
- Another autonomous agent — owned by a different team or vendor, running its own logic, that you want to delegate a goal to and get a result back from — that’s a peer. It reaches across. Use A2A.
And when you have a fleet of agents that each need their own tools? You use both, on their respective axes.
MCP in Practice — How an Agent Talks to Tools
MCP gives a model a standardized contract for reaching the outside world. Instead of every host hand-coding every integration, a server declares what it can do, describes it with schemas, and any MCP-compatible host can discover and use it.
The architecture: host, client, server
MCP has three roles, and keeping them straight prevents most confusion:
- Host — the AI application the user interacts with (Claude Desktop, Claude Code, Cursor, your own app). The host owns the model and the conversation.
- Client — a connector the host instantiates, one per server, that manages a single stateful session. A host talking to three servers runs three clients.
- Server — a program that exposes capabilities. It can run locally (a subprocess on your machine) or remotely (a hosted HTTPS service).
Underneath, MCP is JSON-RPC 2.0 split into two layers: a data layer (the message schema and primitives) and a transport layer (how those messages travel). That separation is why the same primitives work identically whether the server is a local subprocess or a remote service.
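A rough sketch of that split, using only the standard library: the data layer is the JSON-RPC message itself, and the transport layer is just how the bytes are framed. The helper names are hypothetical, and the tool name and arguments are example values.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Data layer: build a JSON-RPC 2.0 request object."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def frame_for_stdio(msg: dict) -> str:
    """Transport layer (stdio): one JSON message per line, newline-delimited."""
    return json.dumps(msg) + "\n"


list_req = jsonrpc_request("tools/list")
call_req = jsonrpc_request(
    "tools/call",
    {"name": "create_lead", "arguments": {"name": "Ada", "email": "ada@example.com"}},
)
print(frame_for_stdio(call_req), end="")
```

Over Streamable HTTP the same `call_req` dictionary would travel as the body of an HTTP POST instead; nothing in the data layer changes.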

The three server primitives
Everything a server offers falls into exactly three buckets. Getting the mapping right is most of good MCP design.
Tools are functions the model can invoke to perform actions — they have side effects, like POST endpoints. Discovered via tools/list, executed via tools/call.
Resources are read-only data the model can fetch for context — no side effects, like GET endpoints. Discovered via resources/list, read via resources/read.
Prompts are reusable templates that structure a recurring interaction. Discovered via prompts/list, retrieved via prompts/get. (In practice this is the least-used primitive — most teams keep prompts in host-side code — but it’s the right tool for domain-specific servers that want to enforce a house workflow.)
A clean way to remember it: Tools do, Resources show, Prompts guide.
Here’s a complete small server using FastMCP (the standard Python library, 3.x as of 2026). It models a tiny CRM:
import json
from types import SimpleNamespace
from fastmcp import FastMCP

mcp = FastMCP("crm-server")

# Placeholder data layer so the example is self-contained; swap in your real store.
class InMemoryDB:
    def __init__(self):
        self.tables = {"leads": {}, "customers": {}}

    def insert(self, table, row):
        row_id = str(len(self.tables[table]) + 1)
        self.tables[table][row_id] = row
        return SimpleNamespace(id=row_id, **row)

    def get(self, table, row_id):
        return self.tables[table].get(row_id)

db = InMemoryDB()

# TOOL: an action with side effects (the model can invoke it)
@mcp.tool()
def create_lead(name: str, email: str, source: str = "web") -> dict:
    """Create a new sales lead and return its record."""
    lead = db.insert("leads", {"name": name, "email": email, "source": source})
    return {"id": lead.id, "status": "created"}

# RESOURCE: read-only context (the model can fetch it, no side effects)
@mcp.resource("crm://customers/{customer_id}")
def customer_record(customer_id: str) -> str:
    """Return a customer's record as JSON."""
    return json.dumps(db.get("customers", customer_id))

# PROMPT: a reusable template that guides a workflow
@mcp.prompt()
def qualify_lead(lead_id: str) -> str:
    """Guide the model through qualifying a lead."""
    return (f"Review lead {lead_id}. Assess fit on budget, authority, "
            f"need, and timeline. Recommend next action.")

if __name__ == "__main__":
    mcp.run(transport="stdio")  # local; swap to "streamable-http" to host remotely
Notice what you didn’t write: no JSON-RPC plumbing, no schema by hand. The SDK infers each tool’s input schema from the function signature and advertises name, description, and schema to any client that connects. That’s the whole point — the contract is generated from ordinary typed Python.
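To demystify that inference step, here is a deliberately simplified sketch of how a schema can be derived from a typed signature with the standard library. It is not FastMCP's actual implementation (real SDKs handle nested and optional types, docstrings, and validation); `infer_input_schema` and the type map are assumptions made for illustration.

```python
import inspect
from typing import get_type_hints

# Minimal Python-type to JSON-Schema-type map (illustrative, far from complete)
_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}


def infer_input_schema(func) -> dict:
    """Build a JSON-Schema-like input description from a function signature."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        prop = {}
        if hints.get(name) in _JSON_TYPES:
            prop["type"] = _JSON_TYPES[hints[name]]
        if param.default is inspect.Parameter.empty:
            required.append(name)          # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def create_lead(name: str, email: str, source: str = "web") -> dict:
    """Create a new sales lead and return its record."""
    return {}


schema = infer_input_schema(create_lead)
print(schema["required"])  # ['name', 'email']
```

The takeaway is that type hints and defaults already carry everything a schema needs, so a decorator can publish the contract without a second source of truth.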
The two transports
MCP deliberately supports only two transports, and the choice is about where the server runs, not what it does.
stdio — the host launches the server as a subprocess and exchanges JSON-RPC frames over standard in/out. No network, no ports, OS-level identity for security. This is the default for local tools — it’s how Claude Desktop and Claude Code run filesystem and git servers.
Streamable HTTP — the server is a network endpoint: a single HTTPS URL that accepts HTTP POST for requests, with optional Server-Sent Events for streaming. This is the transport for remote, multi-user, or cloud-hosted servers. It was introduced in the 2025-03-26 spec and replaced the older HTTP+SSE transport, which is being sunset across major providers through 2026 — so if you see two-endpoint HTTP+SSE examples online, they’re stale.
A practical rule: local dev tool → stdio; anything multi-tenant or hosted → Streamable HTTP behind OAuth 2.1, which is the recommended auth for remote servers. If you’re deploying serverless, prefer a stateless server (omit the Mcp-Session-Id header) so it scales horizontally without sticky sessions — stateless operation is the headline item on MCP’s 2026 roadmap.
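As a sketch of what a Streamable HTTP request looks like on the wire, the snippet below builds (but does not send) a POST with the standard library. The endpoint URL and token are placeholders, and `build_mcp_post` is a hypothetical helper; the headers follow the transport description above, with the session header omitted for stateless use.

```python
import json
import urllib.request
from typing import Optional


def build_mcp_post(endpoint: str, message: dict, access_token: str,
                   session_id: Optional[str] = None) -> urllib.request.Request:
    """Build a Streamable HTTP request carrying one JSON-RPC message."""
    headers = {
        "Content-Type": "application/json",
        # The client accepts either a plain JSON reply or an SSE stream
        "Accept": "application/json, text/event-stream",
        "Authorization": f"Bearer {access_token}",  # e.g. an OAuth 2.1 access token
    }
    if session_id is not None:  # leave unset for a stateless server
        headers["Mcp-Session-Id"] = session_id
    body = json.dumps(message).encode("utf-8")
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


message = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
req = build_mcp_post("https://mcp.example.com/mcp", message, access_token="example-token")
sticky = build_mcp_post("https://mcp.example.com/mcp", message,
                        access_token="example-token", session_id="abc123")
print(req.get_method(), req.full_url)
```

Note that `urllib` normalizes header capitalization internally (for example `Mcp-session-id`); HTTP header names are case-insensitive, so this does not affect the server.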
The reverse primitives: what makes MCP interactive
Most people think of MCP as one-directional — host calls server. The feature that surprises people is that servers can call back to the host. These client-side primitives are what make MCP interactive rather than a static function catalog:
- Sampling (sampling/createMessage): the server asks the host to run an LLM completion on its behalf. Why? So an agentic server can reason without bundling its own model SDK or API key. A GitHub server that needs to classify an issue’s severity can ask the host to run that classification with whatever model the user is already paying for. The server stays model-independent; the user owns the billing and the approval.
- Elicitation: the server asks the user, through the host’s UI, for missing input or confirmation. A payments server can surface “Is the user OK paying $4.99 for this?” right in the chat thread.
- Roots: the host advertises trusted filesystem boundaries to the server.
These reverse calls always route through the host, which must show the user what’s being asked and let them edit or reject it. That consent boundary is core to MCP’s security model.
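That consent boundary can be sketched as a host-side handler for sampling/createMessage. This is an illustration, not SDK code: `handle_sampling_request`, the `approve` and `run_model` callbacks, and the model name are all assumptions, and the result shape is trimmed to the essentials.

```python
from typing import Callable, Optional


def handle_sampling_request(
    params: dict,
    approve: Callable[[dict], Optional[dict]],
    run_model: Callable[[dict], str],
) -> dict:
    """Host-side gate: nothing reaches the model until the user has seen it."""
    approved = approve(params)  # the user may edit the request, or return None to reject
    if approved is None:
        return {"error": {"code": -1, "message": "User rejected sampling request"}}
    text = run_model(approved)  # runs on whatever model the host already uses
    return {"result": {"role": "assistant",
                       "content": {"type": "text", "text": text},
                       "model": "host-chosen-model"}}  # placeholder model name


request = {
    "messages": [{"role": "user",
                  "content": {"type": "text", "text": "Classify severity: app crashes on login"}}],
    "maxTokens": 50,
}

rejected = handle_sampling_request(request, approve=lambda p: None, run_model=lambda p: "high")
accepted = handle_sampling_request(request, approve=lambda p: p, run_model=lambda p: "high")
print(accepted["result"]["content"]["text"])  # high
```

The server never sees an API key or picks a model; it only receives the text the user allowed the host to produce.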
Connecting from a host
On the host side, the lifecycle is: connect → discover → invoke. Conceptually:
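# Host/client side, sketched with the official MCP Python SDK (the mcp package)
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    server = StdioServerParameters(command="python", args=["crm_server.py"])
    async with stdio_client(server) as (read, write), ClientSession(read, write) as session:
        await session.initialize()           # capability handshake
        tools = await session.list_tools()   # tools/list: dynamic discovery
        result = await session.call_tool("create_lead", {"name": "Ada", "email": "ada@example.com"})  # tools/call
        record = await session.read_resource("crm://customers/42")  # resources/read

asyncio.run(main())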
Discovery is dynamic: the host calls tools/list at runtime, so a server can add or remove tools and notify the host with a notifications/tools/list_changed message rather than requiring a redeploy. From there, when the model decides to act, the host issues tools/call with arguments matching the advertised schema, gets structured output back, and feeds it to the model. That loop — discover, decide, call, observe — is the entirety of an agent using a tool over MCP.
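One way a host might act on that notification is to cache the tool list and invalidate it when the server signals a change. This is a minimal sketch with the server stubbed out; `ToolCatalog` and the `delete_lead` tool are hypothetical names.

```python
class ToolCatalog:
    """Caches tools/list results and refreshes when the server says they changed."""

    def __init__(self, fetch_tools):
        self._fetch_tools = fetch_tools  # callable that performs tools/list
        self._cache = None

    def tools(self) -> dict:
        if self._cache is None:  # first use, or invalidated by a notification
            self._cache = {tool["name"]: tool for tool in self._fetch_tools()}
        return self._cache

    def on_notification(self, method: str) -> None:
        if method == "notifications/tools/list_changed":
            self._cache = None   # next tools() call re-runs discovery


# Stand-in for a live server whose tool set changes at runtime
server_tools = [{"name": "create_lead"}]
catalog = ToolCatalog(lambda: list(server_tools))

first = sorted(catalog.tools())
server_tools.append({"name": "delete_lead"})
stale = sorted(catalog.tools())   # cache has not been told about the change yet
catalog.on_notification("notifications/tools/list_changed")
fresh = sorted(catalog.tools())
print(first, stale, fresh)
```

The host avoids re-listing tools on every turn, yet still picks up new capabilities without a restart.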
Where MCP stops
Notice what we still didn’t see: we never had one agent ask another agent to accomplish a goal. Every interaction was an agent reaching down to a capability it controls, inside its own context. The CRM server isn’t an autonomous peer — it’s a typed set of actions and data, and the agent is fully in charge.
The moment you want to hand a goal to an independent agent — one you don’t control, possibly run by another team or vendor, that will work asynchronously and report back without sharing your context — MCP is the wrong tool. That’s a horizontal, peer-to-peer interaction, and it’s exactly what A2A is built for.
A2A in Practice & When to Use Which
The four core concepts
A2A is built on four objects. Learn these and the protocol falls into place:
- Agent Card: a public JSON document, served at /.well-known/agent-card.json (agent.json in releases before 0.3), that describes an agent, including its name, version, endpoint URL, the skills it offers, the input/output formats it accepts, and how to authenticate. It’s a machine-readable business card. A client fetches it (directly or via a registry) to learn who can do what and how to connect.
- Task: the fundamental unit of work, identified by a unique taskId. Unlike an MCP tool call, a Task has an explicit, trackable lifecycle and can run for seconds or days.
- Message: the unit of exchange inside a Task. Each message has a role (user or agent) and a body that’s an array of Parts, which can mix text, files, binary, and structured data, making A2A multimodal by design.
- Artifact: the output of a Task, such as a PDF invoice, a JSON analysis, or an image. Delivered to the client when the work completes.
The interaction model is client agent → remote agent. The client delegates; the remote agent executes autonomously and returns results without ever seeing the client’s internal context, memory, or tools. That isolation is the whole point of the horizontal axis — it’s what lets agents from different vendors collaborate without trusting each other with their internals.
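The four objects can be sketched as plain dataclasses to show how they relate. These are simplified stand-ins, not the types shipped by any A2A SDK; the field names are a reduced, illustrative subset.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    kind: str        # e.g. "text", "file", or "data"
    content: object


@dataclass
class Message:
    role: str        # "user" or "agent"
    parts: list[Part]


@dataclass
class Artifact:
    name: str
    parts: list[Part]


@dataclass
class Task:
    id: str
    state: str = "submitted"
    history: list[Message] = field(default_factory=list)
    artifacts: list[Artifact] = field(default_factory=list)


@dataclass
class AgentCard:
    name: str
    url: str
    skills: list[str]


card = AgentCard("research-agent", "https://research.example.com/a2a", ["literature-review"])
task = Task(id="task-1")
task.history.append(Message("user", [Part("text", "Review recent work on RAG.")]))
task.state = "completed"
task.artifacts.append(Artifact("review.md", [Part("text", "# Review\n...")]))
print(task.state, len(task.history), task.artifacts[0].name)
```

Messages accumulate inside a Task while it runs, and Artifacts hang off the Task as its deliverable; the Agent Card sits outside all of that as the entry point.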
Discovery: the Agent Card
An agent advertises itself with a card like this, served at a well-known URL:
{
  "name": "research-agent",
  "description": "Performs literature reviews and synthesizes findings.",
  "url": "https://research.example.com/a2a",
  "version": "1.0.0",
  "capabilities": { "streaming": true, "pushNotifications": true },
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["application/json", "text/markdown"],
  "securitySchemes": {
    "oauth2": { "type": "oauth2", "flows": { "clientCredentials": { "...": "..." } } }
  },
  "skills": [
    {
      "id": "literature-review",
      "name": "Literature Review",
      "description": "Given a topic, returns a synthesized review with citations.",
      "inputModes": ["text/plain"],
      "outputModes": ["text/markdown"]
    }
  ]
}
A client fetches this card to decide whether this agent can help and how to call it. A2A v1.0 (early 2026) added Signed Agent Cards — a cryptographic signature proving the card was issued by the domain owner. Without it, an attacker could stand up a forged card and redirect callers to a malicious endpoint (a “card forgery” attack). For any cross-organization deployment, verify the signature.
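The "can this agent help?" step can be sketched as a lookup over the card's skills. The card below is a trimmed copy of the example above, and `find_skill` is a hypothetical helper; signature verification is deliberately left out because it depends on the SDK and key setup you use.

```python
from typing import Optional

card = {
    "name": "research-agent",
    "url": "https://research.example.com/a2a",
    "defaultOutputModes": ["application/json", "text/markdown"],
    "skills": [
        {
            "id": "literature-review",
            "name": "Literature Review",
            "inputModes": ["text/plain"],
            "outputModes": ["text/markdown"],
        }
    ],
}


def find_skill(agent_card: dict, output_mode: str) -> Optional[dict]:
    """Return the first skill that can produce output_mode, or None."""
    for skill in agent_card.get("skills", []):
        # A skill without its own modes falls back to the card-level defaults
        modes = skill.get("outputModes") or agent_card.get("defaultOutputModes", [])
        if output_mode in modes:
            return skill
    return None


match = find_skill(card, "text/markdown")
print(match["id"] if match else "no suitable skill")  # literature-review
```

In a cross-organization deployment this check would run only after the card's signature has been verified.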
Delegating a task
With the card in hand, the client sends a message via JSON-RPC over HTTPS. The current method is message/send (for synchronous or pollable work) or message/stream (for streaming updates over SSE):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [
        { "kind": "text", "text": "Review recent work on retrieval-augmented generation." }
      ],
      "messageId": "9c1f...e7"
    }
  }
}
The remote agent responds with a Task object carrying a status. Because the work may be long-running, the client tracks progress through the task lifecycle rather than blocking on a single response. Using the Python a2a-sdk, the same exchange is roughly:
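import asyncio

from a2a.client import A2AClient

# Illustrative only: the helper names used here (from_agent_card_url, the string
# form of send_message, get_task) are simplified and differ across a2a-sdk
# releases. Check the API of the version you install.
TERMINAL = ("completed", "failed", "canceled", "rejected")
PAUSED = ("input-required", "auth-required")  # the remote agent is waiting on you

async def main():
    # 1. Discover: fetch and (in production) verify the signed Agent Card
    client = await A2AClient.from_agent_card_url(
        "https://research.example.com/.well-known/agent-card.json",
        auth=oauth_credentials,  # placeholder: your OAuth credentials object
    )
    # 2. Delegate: hand over a goal, not a function call
    task = await client.send_message(
        "Review recent work on retrieval-augmented generation."
    )
    # 3. Track: poll or stream until the task finishes or pauses for input
    while task.status.state not in TERMINAL + PAUSED:
        await asyncio.sleep(2)
        task = await client.get_task(task.id)  # tasks/get
    # 4. Collect the Artifact
    if task.status.state == "completed":
        return task.artifacts[0]  # e.g. markdown review

asyncio.run(main())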
For real-time progress you’d use message/stream and consume SSE events instead of polling; for fire-and-forget work you’d register a push-notification webhook so the remote agent calls you back when it’s done.
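For the streaming path, the client-side work is mostly parsing Server-Sent Events. The sketch below parses `data:` lines into JSON events using only the standard library. The event payload shape (`taskId`, `state`) is a simplified assumption rather than the exact A2A event schema, and `iter_sse_json` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per SSE event (events end at a blank line)."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):  # SSE strips a single leading space
                value = value[1:]
            data_lines.append(value)
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []
        # comment lines (starting with ":") and other fields are ignored here


sample_stream = [
    'data: {"taskId": "t1", "state": "working"}\n',
    "\n",
    ": keep-alive comment\n",
    'data: {"taskId": "t1", "state": "completed"}\n',
    "\n",
]
events = list(iter_sse_json(sample_stream))
print([event["state"] for event in events])  # ['working', 'completed']
```

In a real client, `lines` would come from the open HTTP response of a message/stream call instead of a list.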
The task lifecycle
The thing that most distinguishes A2A from MCP is that work is stateful. Every Task moves through an explicit lifecycle, and long-running, human-in-the-loop, and async patterns are first-class:

A few consequences worth internalizing. A task can pause at input-required (the remote agent needs more from you) or auth-required (it needs credentials) and resume — this is how human-in-the-loop works natively. The four terminal states (completed, failed, canceled, rejected) are final: a terminal task can’t be restarted; you start a new one. And retry and circuit-breaking are deliberately left to the client — the protocol governs state and messaging, not your resilience policy.
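Those rules can be sketched as a small client-side guard. The state names come from the text above; the grouping into active, paused, and terminal sets, and the helpers `advance` and `next_client_step`, are assumptions for illustration rather than anything the protocol mandates.

```python
ACTIVE = frozenset({"submitted", "working"})
PAUSED = frozenset({"input-required", "auth-required"})
TERMINAL = frozenset({"completed", "failed", "canceled", "rejected"})


def advance(current: str, new: str) -> str:
    """Accept a status update unless the task has already reached a terminal state."""
    if current in TERMINAL:
        raise ValueError(f"task is already {current}; start a new task instead")
    if new not in ACTIVE | PAUSED | TERMINAL:
        raise ValueError(f"unknown state: {new}")
    return new


def next_client_step(state: str) -> str:
    """What the client should do when it observes a given state."""
    if state in TERMINAL:
        return "collect artifacts or start a new task"
    if state in PAUSED:
        return "supply input or credentials, then resume"
    return "keep polling or streaming"


state = "submitted"
for update in ("working", "input-required", "working", "completed"):
    state = advance(state, update)
    print(state, "->", next_client_step(state))
```

Retry policy would sit on top of this: after a `failed` task, the client decides whether to create a fresh task, since the old one cannot be revived.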
How the two protocols compose
Here’s where the “they’re complementary” claim becomes concrete. A2A and MCP operate on perpendicular axes, so a single agent uses both at once: it speaks A2A horizontally to its peers and MCP vertically to its own tools.
Picture an orchestrator that delegates a literature review to the research agent above. That research agent, internally, is itself an MCP host: to actually do the review, it calls an arXiv search tool, a PDF-reader tool, and a vector-store resource — all over MCP. The orchestrator never sees any of that. It handed over a goal via A2A; the tools the remote agent used to satisfy it are private, on the vertical axis, exactly where MCP belongs.
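Here is a toy, in-process sketch of that picture. Both protocols are replaced with plain function calls so the shape stays visible: the call into `handle_task` stands in for an A2A delegation, and the private `_tools` dictionary stands in for MCP tool calls. All class, function, and tool names are hypothetical.

```python
class ResearchAgent:
    """Remote agent: receives goals across (A2A), uses its own tools down (MCP)."""

    def __init__(self):
        # Private capabilities; in a real system these would be MCP tool calls
        self._tools = {
            "arxiv_search": lambda query: [f"paper about {query}"],
            "read_pdf": lambda title: f"summary of {title}",
        }

    def handle_task(self, goal: str) -> dict:
        papers = self._tools["arxiv_search"](goal)
        summaries = [self._tools["read_pdf"](paper) for paper in papers]
        return {"state": "completed", "artifact": "\n".join(summaries)}


def orchestrate(goal: str, remote_agent: ResearchAgent) -> str:
    """Client agent: hands over a goal and sees only the resulting artifact."""
    task = remote_agent.handle_task(goal)  # stands in for an A2A message/send
    if task["state"] != "completed":
        raise RuntimeError(f"delegation ended in state {task['state']}")
    return task["artifact"]


print(orchestrate("RAG", ResearchAgent()))  # summary of paper about RAG
```

The orchestrator's code never names a tool. Swapping the research agent's search backend, or the whole agent, leaves `orchestrate` untouched.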

That picture is the answer to “do I need both?” In any non-trivial multi-agent system: yes, and they never collide, because one runs across and the other runs down.
When to use which: the decision framework
Carry one question into every design decision: is the thing on the other end a capability, or a peer?
Reach for MCP when:
- Your agent needs to call an API, query a database, read files, or trigger an action you control.
- The interaction is synchronous — ask, get an answer, continue.
- The “other side” has no agency of its own; it’s a typed set of actions and data.
- You want the tool to operate inside your agent’s context.
Reach for A2A when:
- You’re delegating a goal to an autonomous agent, not calling a function.
- That agent is owned by a different team, vendor, or codebase, and you shouldn’t share your internal context with it.
- The work is long-running, async, or needs human-in-the-loop checkpoints.
- You want to swap one agent for another without rewriting orchestration — discovery via Agent Cards gives you that.
Use both whenever you have multiple agents that each need their own tools — which is most production systems. Coordinate across with A2A; equip each agent with MCP.
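The framework reduces to one predicate, sketched below. The `Counterpart` type and the example entries are illustrative; real systems will have more nuance, but the first cut is this simple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counterpart:
    name: str
    autonomous: bool  # does it run its own logic toward a goal you delegate?


def choose_protocol(counterpart: Counterpart) -> str:
    """Capability (no agency of its own) means MCP; autonomous peer means A2A."""
    return "A2A" if counterpart.autonomous else "MCP"


examples = [
    Counterpart("orders database", autonomous=False),
    Counterpart("payment processor API", autonomous=False),
    Counterpart("vendor research agent", autonomous=True),
]
for item in examples:
    print(f"{item.name}: {choose_protocol(item)}")
```

A fleet of agents simply applies the predicate per connection, which is how a single system ends up using both protocols.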
The mistakes this series exists to prevent
Each of these is a real, common error, and each is just the capability/peer question answered wrong:
- Exposing a database or API as an “agent” over A2A. It’s a capability, not a peer. That’s an MCP server. You’ve added a coordination protocol, an Agent Card, and a task lifecycle to something that should be a synchronous tool call.
- Hand-rolling bespoke RPC between heterogeneous agents instead of using A2A. You’ll reinvent discovery, auth, and task state — badly — and re-glue it every time an agent changes.
- Wrapping every tool as an agent because A2A feels newer or more powerful. Over-engineering: you pay statefulness and discovery overhead for a function call.
- Trying to use MCP to coordinate independent agents across a trust boundary. MCP has no concept of an autonomous peer working without your context; you’ll end up leaking context or faking agency.
The protocols are not rivals and the choice is rarely ambiguous once you ask the right question. MCP is how your agent talks to tools. A2A is how your agents talk to each other. Build on the correct axis and the architecture stops fighting you.