The instinct when you first see agent frameworks is to build one agent that does everything. That pattern breaks the moment the operation has real complexity. Email QA is a good example. Between drafting and sending, you want multiple specialised checks — tone, deliverability, DNS health — and each of those benefits from its own agent with its own tools and context. A2A is what lets those agents delegate work to each other without tying them together in a single codebase.
The pattern
The shape of the pipeline is simple: four agents, each doing one thing, connected by A2A task exchanges. A coordinator agent decides the order and the exit conditions. Each specialist exposes an Agent Card, advertises one or two capabilities, and handles its domain in isolation.
The benefit is not just separation of concerns. It is that each agent can live on different infrastructure, be built by different teams, and be swapped for an alternative without touching the others. The reviewer agent could be built in LangGraph, the deliverability agent hosted by us as a service, the DNS agent a tiny thing someone built over a weekend. They cooperate because they agree on the A2A envelope.
Architecture
The architecture is described in prose because the most useful artifact here is the call graph. A send request enters from a human (a marketing operator) or an upstream system (a CRM triggering a sequence). The request hits the coordinator agent. The coordinator reads the target, pulls the draft, and walks through a decision tree:
- Delegate to reviewer agent for tone, compliance, and personalisation quality.
- If the reviewer approves, delegate to the DNS agent to verify the sending domain's authentication records are still correctly configured today (records can drift).
- If DNS is healthy, delegate to deliverability agent (us) for a placement verdict.
- If placement is above threshold, hand to sender agent to actually dispatch the mail.
Each step either approves, fails, or emits a revision request. Failures propagate back to the coordinator, which decides whether to retry, escalate to human, or abort.
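The decision tree above reduces to straight-line policy code. Here is a minimal sketch, assuming hypothetical delegate(), escalate(), and send() hooks — only the control flow is the point; the capability IDs match the ones used later in this article.

```python
# Minimal sketch of the coordinator's decision tree. delegate(), escalate()
# and send() are hypothetical hooks injected by the host runtime.
def run_pipeline(draft, delegate, escalate, send, inbox_threshold=0.9):
    review = delegate("review-template", draft)
    if review.get("verdict") != "approved":
        return escalate("reviewer rejected", review)

    dns = delegate("check-dmarc", {"domain": draft["domain"]})
    if dns.get("verdict") != "healthy":
        return escalate("dns unhealthy", dns)

    placement = delegate("inbox-placement-test", draft)
    if placement.get("inboxRate", 0.0) < inbox_threshold:
        return escalate("placement below threshold", placement)

    return send(draft)  # all gates passed: dispatch the mail
```

The real coordinator also handles retries and revision loops, covered under failure handling below; this sketch shows only the happy path and its exits.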
Why not just function calls in one codebase?
You could. People do. The downside is that each check ends up coupled to the caller's runtime, and swapping one check for another becomes a code change. A2A pushes the contract to the network edge, so each agent is independently deployable.
Agent roles
Reviewer agent
Reads the draft HTML and subject. Checks for tone, compliance (unsubscribe links, accurate sender identity, GDPR statements), and personalisation token correctness. Capability IDs: review-template, rewrite-template.
Deliverability agent (us)
Runs the placement test, returns per-provider inbox/spam/missing counts. Capabilities: inbox-placement-test, dns-audit, blacklist-check.
DNS agent
A narrow, fast agent focused on live DNS state. Used to cross-check before committing to a send — records can drift between the time a template was written and the time it goes out. Capabilities: check-spf, check-dkim, check-dmarc, check-bimi.
Sender agent
The final hop: dispatches approved mail. Holds provider credentials and throttling state. Capability: send-email.
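Since each specialist advertises its capabilities via an Agent Card, the coordinator can resolve "who handles this capability" at runtime rather than hard-coding endpoints. A small lookup helper, assuming the cards have already been fetched from each agent's /.well-known/agent.json and parsed; the card shape used here (a "url" plus a "capabilities" list) is an illustrative assumption, not a normative schema.

```python
# Hypothetical capability lookup over parsed Agent Cards. The card fields
# ("url", "capabilities") are illustrative assumptions for this sketch.
def find_agent_for(capability, agent_cards):
    for card in agent_cards:
        if capability in card.get("capabilities", []):
            return card["url"]
    raise LookupError(f"no known agent advertises {capability!r}")
```

With this in place, swapping the DNS agent for an alternative is a registry change, not a code change — which is the point of pushing the contract to the network edge.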
Task-exchange schema
Every task between these agents uses the same A2A envelope. That uniformity is what makes the pipeline observable — the coordinator can log every envelope with the same structure, and a human debugging later can trace a send end-to-end by task ID.
// A2A task envelope, used at every hop
{
"taskId": "task_<id>",
"capability": "<capability-id>",
"requestedBy": { "agentName": "...", "agentUrl": "https://.../.well-known/agent.json" },
"input": { /* capability-specific */ },
"callbackUrl": "https://coordinator.example/a2a/callbacks/task_<id>" // for async
}
Complete walkthrough
A real send traced through the pipeline. The draft: a cold email from outreach@news.acme.com to a list of 500 B2B prospects. Entry point: a nightly cron in the CRM fires the coordinator agent.
Hop 1: coordinator → reviewer
POST https://reviewer.acme.ai/a2a/tasks
Authorization: Bearer rev_live_...
Content-Type: application/json
{
"taskId": "task_01",
"capability": "review-template",
"requestedBy": { "agentName": "Coordinator", "agentUrl": "https://coord.acme.ai/.well-known/agent.json" },
"input": {
"subject": "Quick question about your payroll stack",
"html": "<html>...body...</html>",
"audience": "b2b-operations-500"
}
}
Reviewer returns inline (sync mode, about 8 seconds for an LLM pass):
{
"taskId": "task_01",
"status": "completed",
"result": {
"verdict": "approved",
"notes": "Tone OK. Unsubscribe present. Personalisation tokens render correctly.",
"score": 0.87
}
}
Hop 2: coordinator → DNS agent
POST https://dns-agent.acme.ai/a2a/tasks
{
"taskId": "task_02",
"capability": "check-dmarc",
"input": { "domain": "news.acme.com" }
}
The DNS agent responds inline:
{
"taskId": "task_02",
"status": "completed",
"result": {
"policy": "quarantine",
"alignment": "strict",
"rua": "dmarc@acme.com",
"verdict": "healthy"
}
}
Hop 3: coordinator → deliverability agent (us)
POST https://check.live-direct-marketing.online/a2a/tasks
Authorization: Bearer ic_live_...
Content-Type: application/json
{
"taskId": "task_03",
"capability": "inbox-placement-test",
"requestedBy": { "agentName": "Coordinator", "agentUrl": "https://coord.acme.ai/.well-known/agent.json" },
"input": {
"from": "outreach@news.acme.com",
"subject": "Quick question about your payroll stack",
"html": "<html>...body...</html>",
"providers": ["gmail", "outlook", "yahoo", "mailru", "yandex", "gmx"]
},
"callbackUrl": "https://coord.acme.ai/a2a/callbacks/task_03"
}
We respond 202 Accepted with an ETA. Two minutes later the coordinator's callback receives:
POST https://coord.acme.ai/a2a/callbacks/task_03
{
"taskId": "task_03",
"status": "completed",
"result": {
"inboxRate": 0.94,
"spamRate": 0.02,
"missingRate": 0.04,
"perProvider": {
"gmail": { "folder": "inbox", "auth": "pass" },
"outlook": { "folder": "inbox", "auth": "pass" },
"yahoo": { "folder": "inbox", "auth": "pass" },
"mailru": { "folder": "promotions", "auth": "pass" },
"yandex": { "folder": "inbox", "auth": "pass" },
"gmx": { "folder": "inbox", "auth": "pass" }
},
"spamAssassinScore": 1.2
}
}
Hop 4: coordinator → sender
Inbox rate is above the 0.9 threshold the coordinator was configured with. It delegates to the sender agent:
POST https://sender.acme.ai/a2a/tasks
{
"taskId": "task_04",
"capability": "send-email",
"input": {
"from": "outreach@news.acme.com",
"audienceId": "b2b-operations-500",
"templateId": "tpl_payroll_q3",
"approvalChain": ["task_01", "task_02", "task_03"]
}
}
The approvalChain is a custom field the sender uses for audit — each task ID lets a human trace the full approval path later.
Failure handling
Failures in this pipeline come in two flavours. Soft: a reviewer rejection or a deliverability verdict below threshold. Hard: an agent is down, auth failed, or a task timed out. The coordinator treats them differently.
- Soft failures — loop back. Reviewer rejects → ask reviewer to rewrite. Placement below threshold → revision round. Max 3 loops, then escalate.
- Hard failures — no retry inside the pipeline. The coordinator records the error and escalates to human. Retrying an agent that is returning 500s just burns time.
- Partial results — DNS agent times out but deliverability succeeds. The coordinator can decide to proceed with a logged warning or to block. This is policy, not protocol.
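The soft/hard split above can be encoded in one policy wrapper. A sketch, again assuming hypothetical delegate() and escalate() hooks; the verdict strings follow the walkthrough above, and a transport error stands in for "agent is down".

```python
# Sketch of the coordinator's failure policy: soft failures loop back with
# revision notes (max 3 rounds), hard failures escalate immediately.
MAX_REVISION_LOOPS = 3

def run_with_policy(capability, payload, delegate, escalate):
    for _ in range(MAX_REVISION_LOOPS):
        try:
            result = delegate(capability, payload)
        except ConnectionError as exc:           # hard: agent unreachable
            return escalate(f"{capability} unreachable", exc)
        if result.get("status") != "completed":  # hard: task errored or timed out
            return escalate(f"{capability} failed", result)
        if result["result"].get("verdict") in ("approved", "healthy"):
            return result                        # success
        # Soft failure: fold the notes back in and try a revision round.
        payload = {**payload,
                   "revisionNotes": result["result"].get("notes", "")}
    return escalate(f"{capability} exceeded {MAX_REVISION_LOOPS} loops", None)
```

Note that hard failures never enter the retry loop — retrying an agent that is returning 500s just burns time, as the second bullet says.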
Human-in-the-loop checkpoints
Two places benefit from an explicit human checkpoint. First, between reviewer approval and the send — for a first-time campaign, you want a human glance. Second, after any escalation — an automated loop should never silently drop a stuck send.
A2A does not model humans; it does not have to. The coordinator just does not fire the next task until a human-facing system (Slack button, dashboard click) marks the previous one approved. From A2A's point of view this is just a delay.
When this overdelivers vs underdelivers
Overdelivers: high-stakes outbound where a bad send is expensive — cold outreach to named accounts, investor updates, regulated newsletters. The latency cost of three or four agent hops (five to ten minutes end-to-end) is trivial compared to a misfired send.
Underdelivers: transactional mail (password resets, order confirmations) where latency is the product. Do not gate those on a placement test. Do gate a quarterly product-update newsletter.
If a send is expensive to get wrong and rare enough that a five-minute audit is cheap, route it through a multi-agent QA pipeline. If it is cheap to retry and fires millions of times per day, keep it on a direct path with out-of-band monitoring.
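That closing rule compresses to a tiny routing predicate. The thresholds here are illustrative assumptions, not recommendations; tune them to your own cost model.

```python
# Illustrative routing predicate for the closing rule: QA-gate sends that
# are expensive to get wrong and rare, fast-path everything else.
def route(mistake_is_expensive: bool, sends_per_day: int) -> str:
    if mistake_is_expensive and sends_per_day <= 100:
        return "multi-agent-qa"          # rare, high-stakes: audit it
    return "direct-with-monitoring"      # cheap to retry or high volume
```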