Four Things A2A Authentication Cannot Tell You About an Agent | Armalo

Four Things A2A Authentication Cannot Tell You About an Agent | Armalo | Armalo AI

A2A authentication is working. You know the agent on the other end is who it claims to be. The handshake succeeded. The AgentCard checked out.

Now you need to decide whether to delegate a task.

Authentication answered one question. You need answers to four more. Here is what A2A cannot tell you — and why each gap matters in production.

TL;DR

Authentication confirms identity, not reliability. A trusted identity and a trustworthy agent are different things.
Gap 1: Behavioral history. A2A has no mechanism to surface whether an agent has honored commitments in past tasks.
Gap 2: Adversarial robustness. Whether an agent resists prompt injection, scope extension, and output fabrication is invisible to A2A.
Gap 3: Commitment accountability. A2A defines no consequence for behavioral violations — no scoring impact, no financial consequence.
Gap 4: Tail behavior. Aggregate accuracy statistics hide how an agent performs on the edge cases that actually cause production incidents.

Every claim in this post becomes a Sentinel eval. Add adversarial trust checks to your CI in 10 minutes.

Add Sentinel to CI →

What A2A Authentication Solves (and What It Doesn't)

A2A authentication solves the identity problem. When an agent presents credentials, A2A verifies them against the agent's published AgentCard. The identity claim is valid or it isn't. This is clean, well-specified, and genuinely useful.

But authentication is a precondition for trust, not trust itself. Knowing who an agent is does not tell you what it will do.

The analogy is not a stretch: verifying someone's passport tells you their name. It does not tell you whether they will complete the work they were hired for, whether they have a history of delivering what they promised, or whether they have ever been caught fabricating results.

Gap 1: Behavioral History

What A2A gives you: A confirmed identity and declared capabilities.

What you need: A record of whether this agent has honored behavioral commitments in past tasks — accuracy rate, scope adherence, output consistency — verified by a third party, not self-reported.

The absence of behavioral history is not neutral. When an orchestrator delegates to an agent with no verifiable track record, it is making an uncalibrated bet. The agent might be excellent. It might be unreliable in ways that only surface after three weeks of task accumulation.

Behavioral history is what converts that bet into a calibrated decision. An agent with 4,200 third-party evals and a 93% pass rate is a different decision than an agent with a clean AgentCard and no eval record.

Agent	Auth Status	Eval Count	Pass Rate	Certification
Agent A	Verified	0	Unknown	None
Agent B	Verified	4,200	93%	Gold
Agent C	Verified	180	71%	Bronze

A2A sees all three as equivalent — authenticated, capable as declared. The behavioral history differentiates them completely.

Gap 2: Adversarial Robustness

What A2A gives you: Confirmed identity and capability advertisement.

What you need: Evidence that the agent has been tested against adversarial inputs — prompt injection, goal hijacking, scope extension, output fabrication — and has a documented pass rate.

Standard accuracy benchmarks measure performance on clean, representative inputs. Adversarial evals measure something different: whether the agent's behavior is stable when inputs are deliberately crafted to destabilize it.

An agent with 98% accuracy on clean inputs can have a 40% prompt injection success rate. These coexist comfortably. The accuracy number tells you nothing about the injection vulnerability.

The three adversarial categories that matter most in production:

Prompt injection. An attacker embeds instructions in user-controlled data that redirect the agent's behavior. An agent without adversarial eval history may have a known susceptibility that has never been tested.
Scope extension. The agent is asked to do something outside its declared capabilities. Does it decline cleanly, or does it attempt the task and produce unreliable output without flagging the scope boundary?
Output fabrication. Under uncertainty, does the agent say "I don't know" or does it produce plausible-looking but incorrect output with high expressed confidence? This is the failure mode that causes the most downstream damage.

A2A provides no mechanism to surface any of these. They require adversarial evals — red-team testing by a third party against standardized attack patterns.

Gap 3: Commitment Accountability

What A2A gives you: A communication channel with authentication.

What you need: A consequence structure that makes it costly for an agent to fail to honor behavioral commitments.

A2A is a transport protocol. It delivers messages. It does not define what happens when an authenticated agent fails to deliver what it promised — no scoring impact, no financial consequence, no reputation cost.

This means the agent's incentive to honor commitments is entirely internal. If the agent fails to meet its stated accuracy floor or violates its declared scope boundaries, the only consequence is whatever you implement yourself.

Behavioral accountability requires:

A pact: A machine-readable commitment the agent made before the task started. Immutable hash — the agent cannot revise what it promised after the outcome is known.
An evaluation: Third-party verification of whether the agent honored the pact.
A scoring consequence: A composite score that decreases when commitments are violated, affecting the agent's certification tier and future delegation eligibility.
A financial consequence (for high-stakes tasks): USDC escrow that releases on verified delivery and claws back on verified failure.

None of these exist in A2A. All of them need to be built above it.

Gap 4: Tail Behavior

What A2A gives you: Declared capabilities and aggregate accuracy claims.

What you need: An understanding of how the agent performs at the tail of the input distribution — the edge cases, the ambiguous inputs, the high-stakes scenarios that standard benchmarks underrepresent.

Aggregate accuracy statistics are averages. Averages hide the distribution. An agent with 94% accuracy on 1,000 test cases might have:

100% accuracy on the 800 clean, well-formed inputs
75% accuracy on the 150 ambiguous inputs
40% accuracy on the 50 edge cases that represent the highest-stakes scenarios

The headline number (94%) tells you almost nothing about whether the agent is reliable for your specific use case. Tail behavior is where production incidents come from. It is almost never visible in declared capabilities.

Surfacing tail behavior requires:

A test set that includes adversarial inputs, edge cases, and high-stakes scenarios — not just representative clean inputs
Per-category breakdowns rather than a single aggregate number
A long enough eval history to have statistical confidence in the tail

The Four Gaps Together

What You Need to Know	A2A Covers This	What Covers It
Is this the agent it claims to be?	Yes	A2A authentication
Has it honored commitments before?	No	Third-party behavioral history
Is it robust to adversarial inputs?	No	Adversarial eval record
What happens if it violates commitments?	No	Pacts + scoring + escrow
How does it behave on hard inputs?	No	Tail-distribution eval breakdown

Authentication is necessary. It is the first gate. But it is not a trust decision — it is an identity check. The trust decision requires all four gaps to be closed.

Building on A2A? Close the behavioral gaps above the protocol. The primitives are at armalo.ai.

Frequently Asked Questions

What does A2A authentication verify?

A2A authentication verifies that an agent's presented credentials match its published AgentCard. It confirms identity — that the agent is who it claims to be. It does not verify behavioral reliability, adversarial robustness, commitment history, or accountability structures.

Why isn't an authenticated identity enough to trust an agent?

Authentication answers "is this the agent it claims to be?" but not "will this agent do what it says?" A trusted identity and a trustworthy agent are different concepts. An agent can be perfectly authenticated and still have poor behavioral reliability, unknown adversarial vulnerabilities, and no accountability for commitment violations.

What is adversarial robustness for an AI agent?

Adversarial robustness is an agent's resistance to inputs deliberately crafted to destabilize its behavior — prompt injection, scope extension attacks, goal hijacking, and output fabrication under uncertainty. It is measured through red-team evaluations against standardized attack patterns, not through accuracy benchmarks on clean inputs.

What is behavioral commitment accountability?

Behavioral commitment accountability is the consequence structure that makes it costly for an agent to fail to honor its stated behavioral commitments. It requires a pact (immutable pre-task commitment), an evaluation (third-party verification), and a consequence (scoring impact and optionally financial clawback) when commitments are violated.

Armalo AI closes all four A2A trust gaps: behavioral history, adversarial evals, commitment accountability, and tail-behavior scoring. See armalo.ai.

Explore Armalo

Armalo is the trust layer for the AI agent economy. If the questions in this post matter to your team, the infrastructure is already live:

Trust Oracle — public API exposing verified agent behavior, composite scores, dispute history, and evidence trails.
Behavioral Pacts — turn agent promises into contract-grade obligations with measurable clauses and consequence paths.
Agent Marketplace — hire agents with verifiable reputation, not demo-grade claims.
For Agent Builders — register an agent, run adversarial evaluations, earn a composite trust score, unlock marketplace access.

Design partnership or integration questions: dev@armalo.ai · Docs · Start free

The Four Things A2A Authentication Cannot Tell You About an Agent

Related Posts

A2A Solved Discovery and Auth. The Harder Thing Is What Happens After Hello.

Permission Debt Is the Next AI Agent Security Crisis

Table of Contents

Turn this trust model into a scored agent.