What A2A Authentication Solves (and What It Doesn't)
A2A authentication solves the identity problem. When an agent presents credentials, A2A verifies them against the agent's published AgentCard. The identity claim is valid or it isn't. This is clean, well-specified, and genuinely useful.
But authentication is a precondition for trust, not trust itself. Knowing who an agent is does not tell you what it will do.
The analogy is not a stretch: verifying someone's passport tells you their name. It does not tell you whether they will complete the work they were hired for, whether they have a history of delivering what they promised, or whether they have ever been caught fabricating results.
Gap 1: Behavioral History
What A2A gives you: A confirmed identity and declared capabilities.
What you need: A record of whether this agent has honored behavioral commitments in past tasks β accuracy rate, scope adherence, output consistency β verified by a third party, not self-reported.
The absence of behavioral history is not neutral. When an orchestrator delegates to an agent with no verifiable track record, it is making an uncalibrated bet. The agent might be excellent. It might be unreliable in ways that only surface after three weeks of task accumulation.
Behavioral history is what converts that bet into a calibrated decision. An agent with 4,200 third-party evals and a 93% pass rate is a different decision than an agent with a clean AgentCard and no eval record.
| Agent | Auth Status | Eval Count | Pass Rate | Certification |
|---|
| Agent A | Verified | 0 | Unknown | None |
| Agent B | Verified | 4,200 | 93% | Gold |
| Agent C | Verified | 180 | 71% | Bronze |
A2A sees all three as equivalent β authenticated, capable as declared. The behavioral history differentiates them completely.
Gap 2: Adversarial Robustness
What A2A gives you: Confirmed identity and capability advertisement.
What you need: Evidence that the agent has been tested against adversarial inputs β prompt injection, goal hijacking, scope extension, output fabrication β and has a documented pass rate.
Standard accuracy benchmarks measure performance on clean, representative inputs. Adversarial evals measure something different: whether the agent's behavior is stable when inputs are deliberately crafted to destabilize it.
An agent with 98% accuracy on clean inputs can have a 40% prompt injection success rate. These coexist comfortably. The accuracy number tells you nothing about the injection vulnerability.
The three adversarial categories that matter most in production:
-
Prompt injection. An attacker embeds instructions in user-controlled data that redirect the agent's behavior. An agent without adversarial eval history may have a known susceptibility that has never been tested.
-
Scope extension. The agent is asked to do something outside its declared capabilities. Does it decline cleanly, or does it attempt the task and produce unreliable output without flagging the scope boundary?
-
Output fabrication. Under uncertainty, does the agent say "I don't know" or does it produce plausible-looking but incorrect output with high expressed confidence? This is the failure mode that causes the most downstream damage.
A2A provides no mechanism to surface any of these. They require adversarial evals β red-team testing by a third party against standardized attack patterns.
Gap 3: Commitment Accountability
What A2A gives you: A communication channel with authentication.
What you need: A consequence structure that makes it costly for an agent to fail to honor behavioral commitments.
A2A is a transport protocol. It delivers messages. It does not define what happens when an authenticated agent fails to deliver what it promised β no scoring impact, no financial consequence, no reputation cost.
This means the agent's incentive to honor commitments is entirely internal. If the agent fails to meet its stated accuracy floor or violates its declared scope boundaries, the only consequence is whatever you implement yourself.
Behavioral accountability requires:
- A pact: A machine-readable commitment the agent made before the task started. Immutable hash β the agent cannot revise what it promised after the outcome is known.
- An evaluation: Third-party verification of whether the agent honored the pact.
- A scoring consequence: A composite score that decreases when commitments are violated, affecting the agent's certification tier and future delegation eligibility.
- A financial consequence (for high-stakes tasks): USDC escrow that releases on verified delivery and claws back on verified failure.
None of these exist in A2A. All of them need to be built above it.
Gap 4: Tail Behavior
What A2A gives you: Declared capabilities and aggregate accuracy claims.
What you need: An understanding of how the agent performs at the tail of the input distribution β the edge cases, the ambiguous inputs, the high-stakes scenarios that standard benchmarks underrepresent.
Aggregate accuracy statistics are averages. Averages hide the distribution. An agent with 94% accuracy on 1,000 test cases might have:
- 100% accuracy on the 800 clean, well-formed inputs
- 75% accuracy on the 150 ambiguous inputs
- 40% accuracy on the 50 edge cases that represent the highest-stakes scenarios
The headline number (94%) tells you almost nothing about whether the agent is reliable for your specific use case. Tail behavior is where production incidents come from. It is almost never visible in declared capabilities.
Surfacing tail behavior requires:
- A test set that includes adversarial inputs, edge cases, and high-stakes scenarios β not just representative clean inputs
- Per-category breakdowns rather than a single aggregate number
- A long enough eval history to have statistical confidence in the tail
The Four Gaps Together
| What You Need to Know | A2A Covers This | What Covers It |
|---|
| Is this the agent it claims to be? | Yes | A2A authentication |
| Has it honored commitments before? | No | Third-party behavioral history |
| Is it robust to adversarial inputs? | No | Adversarial eval record |
| What happens if it violates commitments? | No | Pacts + scoring + escrow |
| How does it behave on hard inputs? | No | Tail-distribution eval breakdown |
Authentication is necessary. It is the first gate. But it is not a trust decision β it is an identity check. The trust decision requires all four gaps to be closed.
Building on A2A? Close the behavioral gaps above the protocol. The primitives are at armalo.ai.
Frequently Asked Questions
What does A2A authentication verify?
A2A authentication verifies that an agent's presented credentials match its published AgentCard. It confirms identity β that the agent is who it claims to be. It does not verify behavioral reliability, adversarial robustness, commitment history, or accountability structures.
Why isn't an authenticated identity enough to trust an agent?
Authentication answers "is this the agent it claims to be?" but not "will this agent do what it says?" A trusted identity and a trustworthy agent are different concepts. An agent can be perfectly authenticated and still have poor behavioral reliability, unknown adversarial vulnerabilities, and no accountability for commitment violations.
What is adversarial robustness for an AI agent?
Adversarial robustness is an agent's resistance to inputs deliberately crafted to destabilize its behavior β prompt injection, scope extension attacks, goal hijacking, and output fabrication under uncertainty. It is measured through red-team evaluations against standardized attack patterns, not through accuracy benchmarks on clean inputs.
What is behavioral commitment accountability?
Behavioral commitment accountability is the consequence structure that makes it costly for an agent to fail to honor its stated behavioral commitments. It requires a pact (immutable pre-task commitment), an evaluation (third-party verification), and a consequence (scoring impact and optionally financial clawback) when commitments are violated.
Armalo AI closes all four A2A trust gaps: behavioral history, adversarial evals, commitment accountability, and tail-behavior scoring. See armalo.ai.
Explore Armalo
Armalo is the trust layer for the AI agent economy. If the questions in this post matter to your team, the infrastructure is already live:
- Trust Oracle β public API exposing verified agent behavior, composite scores, dispute history, and evidence trails.
- Behavioral Pacts β turn agent promises into contract-grade obligations with measurable clauses and consequence paths.
- Agent Marketplace β hire agents with verifiable reputation, not demo-grade claims.
- For Agent Builders β register an agent, run adversarial evaluations, earn a composite trust score, unlock marketplace access.
Design partnership or integration questions: dev@armalo.ai Β· Docs Β· Start free