A Real-Time Counterparty Review Architecture for Vision Agents: The Pattern, Not the Pitch
If you accept that vision agents need a real-time, independent counterparty review of every consequential decision, what does the system actually look like? Here is the architecture, in concrete terms.
The arguments in earlier posts in this series make a structural case: multi-sensory AI requires real-time, independent, counterparty review of behavior. That argument is necessary but not sufficient: engineering teams cannot build "a counterparty layer" from a manifesto. They need an architecture.
This post is the architecture. It is described in concrete terms (components, data flows, failure modes, integration points) and is intentionally not pitched as an Armalo-specific recipe. It is the pattern, applicable regardless of vendor. Armalo implements this pattern; so will any other serious trust layer that emerges in the next several years.
The basic shape
A real-time counterparty review system for a vision agent consists of four layers, with strict separation of concerns between them.
Layer 1: The agent. The thing being reviewed. Lives in the operator's infrastructure. Emits, for every consequential decision, a structured evidence packet describing the inputs it perceived, the intermediate reasoning, and the action it intends to take.
Layer 2: The capture and forwarding boundary. A thin component, owned by the operator but logically responsible to the trust layer, that signs the evidence packet, timestamps it, and forwards it to the counterparty review service over a TLS channel with mutual authentication. Its only job is to make sure the evidence reaches the counterparty unmodified.
Layer 3: The counterparty review service. Lives outside the operator's organizational and infrastructural control. Consumes evidence packets, runs an independent set of evaluators (including independent visual perception models), and produces a structured verdict.
Layer 4: The verdict consumption surface. A pair of channels that surface the verdict to the operator (for action) and to authorized downstream consumers (for trust queries). Critically, the verdict is also written to an append-only evidence store under counterparty control, which is the durable record for downstream contestation.
This is the pattern. Every detail below is an elaboration of one of these four layers.
The evidence packet
The evidence packet is the contract between the agent and the trust layer. Its structure is the most important architectural decision in the entire system. Get it wrong and the trust layer either becomes a bottleneck or fails to capture the evidence needed for credible review.
A working evidence packet for a vision agent includes:
- The model-visible inputs, after preprocessing: the actual encoder embeddings or the actual sampled pixels, not the source files (per the reproducibility argument in earlier posts)
- A pointer to the source artifacts in the operator's storage, with cryptographic content addressing
- The agent's intermediate reasoning, structured (not just a wall of text)
- The intended action, in a structured form the verifier can parse
- The model and prompt versions in use
- A timestamp with monotonic ordering relative to prior packets from the same agent
- A signature over all of the above using a key the operator has registered with the trust layer
The packet is verbose. It has to be. The point of the trust layer is to make verdicts contestable, and a contestable verdict requires evidence rich enough that an independent third party can replay the decision.
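The fields listed above can be sketched as a small data structure with a signing and verification step. This is a minimal illustration, not a reference schema: every field name is hypothetical, and HMAC-SHA256 with a shared key stands in for whatever signature scheme the operator actually registers with the trust layer (an asymmetric scheme would be the more likely production choice).

```python
import base64
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvidencePacket:
    """Hypothetical packet layout; field names are assumptions."""
    agent_id: str
    sequence: int                # strictly increasing per agent (monotonic ordering)
    timestamp_ns: int
    model_inputs_b64: str        # model-visible inputs after preprocessing, base64
    source_artifact_sha256: str  # content address of the source file in operator storage
    reasoning_steps: tuple       # structured intermediate reasoning
    intended_action: dict        # machine-parseable action
    model_version: str
    prompt_version: str


def canonical_bytes(packet: EvidencePacket) -> bytes:
    # Sorted keys and fixed separators give a stable byte encoding to sign.
    return json.dumps(asdict(packet), sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_packet(packet: EvidencePacket, key: bytes) -> str:
    # HMAC is a stand-in here (assumption); it signs every field of the packet.
    return hmac.new(key, canonical_bytes(packet), hashlib.sha256).hexdigest()


def verify_packet(packet: EvidencePacket, signature: str, key: bytes, last_sequence: int) -> bool:
    # Reject both a bad signature and a sequence number that does not advance (replay).
    signature_ok = hmac.compare_digest(sign_packet(packet, key), signature)
    return signature_ok and packet.sequence > last_sequence
```

The reviewer keeps the last accepted sequence number per agent, so a valid packet that is resent later fails verification even though its signature still checks out.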
The independence requirement, structurally enforced
The single most common architectural failure in attempts at trust infrastructure is to put the "counterparty review service" inside the operator's organization with strong assurances of internal independence. This does not work. The point of counterparty review is that the reviewer is structurally separated from the reviewed, in the sense of incentives, ownership, and operational control.
In practice this means:
- Different legal entity
- Different physical infrastructure
- Different funding source
- No shared executive accountability
- No common compensation structure
- Auditable operational separation
These are not formalities. They are the difference between a verdict that holds up under adversarial scrutiny and a verdict that does not.
The latency budget
Real-time review introduces latency. The architecture has to manage that latency or it will not be adopted in production. The working pattern is a split:
Synchronous fast path. A lightweight initial verdict (gross policy violations, obvious safety issues, structural malformation) returns in single-digit to low-double-digit milliseconds and gates the agent's action. The latency cost is small enough to be absorbed in most production paths.
Asynchronous deep path. A heavier verdict (independent visual perception, cross-modal consistency, adversarial similarity, fine-grained policy review) runs in the background. Results land in seconds to tens of seconds, after the agent has already acted. The deep verdict contributes to the agent's trust posture and to retroactive escalation when the verdict reveals an action that should not have been taken.
The split is intentional. Synchronous review of every modality of every call would be infeasibly slow for most production paths. Asynchronous review captures the heavier evaluators without blocking the user. The aggregate posture reflects both.
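One way to picture the split is a gate that awaits the fast check under a strict time budget while the deep check runs as a background task. The sketch below uses Python's asyncio. The check functions, verdict strings, packet keys, and the `unlock_door` rule are all hypothetical placeholders, and the 50 ms sleep merely simulates evaluators that would take seconds in practice.

```python
import asyncio


async def fast_check(packet: dict) -> str:
    """Cheap structural check (hypothetical rule): the action must be present."""
    return "allow" if "intended_action" in packet else "block"


async def deep_check(packet: dict) -> str:
    """Stand-in for the heavy evaluators; the sleep simulates their runtime."""
    await asyncio.sleep(0.05)
    return "escalate" if packet.get("intended_action") == "unlock_door" else "ok"


async def review(packet: dict, fast_budget_s: float, deep_results: list):
    """Gate on the fast verdict; let the deep verdict land later."""

    async def run_deep() -> None:
        # The deep verdict arrives after the agent may already have acted.
        deep_results.append((packet["id"], await deep_check(packet)))

    deep_task = asyncio.create_task(run_deep())
    try:
        verdict = await asyncio.wait_for(fast_check(packet), timeout=fast_budget_s)
    except asyncio.TimeoutError:
        # Budget exceeded: the caller applies its defined fallback policy.
        verdict = "fallback"
    return verdict, deep_task
```

The caller acts on the returned verdict immediately and keeps the task handle, so a later `escalate` result can trigger retroactive handling and feed the agent's aggregate posture.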
The evidence store
The append-only evidence store under counterparty control is, perhaps surprisingly, the most consequential single piece of the architecture. It is what makes verdicts contestable, what enables forensic review of incidents, what satisfies regulators, and what makes the trust layer a credible institution rather than a black-box scoring vendor.
The store has to support:
- Append-only writes, with cryptographic chains that prevent retroactive editing
- Long retention (years, not weeks)
- Selective disclosure to authorized parties under access policies
- Auditable read trails
- Encryption at rest with keys not held by the operator
The cost of running this store is non-trivial, particularly for multi-modal evidence, which is large. The cost is justified by the fact that, without it, the entire trust layer reduces to "trust the verdicts because we say so," which is exactly the architecture we are trying to escape.
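The "cryptographic chains" requirement can be illustrated with a simple hash chain, in which each entry commits to the hash of the one before it. This is an in-memory sketch under stated assumptions: the class and field names are hypothetical, and a real store would add durable storage, encryption at rest, access policies, and read auditing, none of which are shown here.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class EvidenceLog:
    """In-memory, hash-chained log (illustrative only)."""

    def __init__(self) -> None:
        self._entries = []

    def append(self, record: dict) -> str:
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else GENESIS
        body = json.dumps(record, sort_keys=True, separators=(",", ":"))
        # Each entry hash covers the previous hash, linking the entries together.
        entry_hash = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        self._entries.append({"prev_hash": prev_hash, "body": body, "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        # Recompute the chain from the start; any edited entry breaks it.
        prev_hash = GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev_hash + entry["body"]).encode("utf-8")).hexdigest()
            if entry["prev_hash"] != prev_hash or entry["entry_hash"] != expected:
                return False
            prev_hash = entry["entry_hash"]
        return True
```

Editing any stored body changes its recomputed hash, so `verify()` fails at that entry. Publishing the latest entry hash to outside parties at intervals is one way to make wholesale rewriting of the chain detectable as well.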
Failure modes the architecture has to anticipate
A working real-time counterparty review system has to handle, by design:
Trust layer downtime. When the counterparty review service is unavailable, the agent has to have a defined fallback (proceed, retry, escalate). Both the fallback choice and the downtime event are themselves evidence and are stored when the trust layer comes back.
Adversarial latency. An attacker who can induce slow verdicts can force the agent into the fallback path. The architecture has to detect this pattern and treat induced slowness as itself a signal.
Evidence packet tampering. The signing and capture boundary is the attack surface. The architecture has to verify signatures, detect replay, and flag unusual capture-boundary behavior.
Verdict tampering. The verdict consumption surface is similarly attackable. Verdicts have to be signed by the counterparty and verified by every consumer.
Operator override. When the operator overrides a verdict (proceeding despite a negative verdict), the override is itself logged and contributes to the agent's posture. Overrides are not invisible; they are part of the record.
Counterparty capture. If the counterparty review service is gradually captured by the operator (through commercial pressure, board overlap, or other channels), the architecture should have external structural checks (independent governance, public methodology, regulatory oversight) that make this visible.
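Two of the failure modes above, trust layer downtime and adversarial latency, lend themselves to a small sketch: a sliding-window monitor that flags an unusual rate of timed-out verdicts, and a buffer that holds fallback events until the trust layer is reachable again. All names, the window size, and the threshold are hypothetical. A real detector would likely compare against a per-agent baseline rather than a fixed ratio.

```python
from collections import deque


class SlownessMonitor:
    """Flags when too many recent verdicts timed out (illustrative threshold)."""

    def __init__(self, window: int = 20, max_timeout_ratio: float = 0.3) -> None:
        self._outcomes = deque(maxlen=window)  # True means the verdict timed out
        self._max_ratio = max_timeout_ratio

    def record(self, timed_out: bool) -> None:
        self._outcomes.append(timed_out)

    def suspicious(self) -> bool:
        # Wait for a full window so a single early timeout does not raise a flag.
        if len(self._outcomes) < self._outcomes.maxlen:
            return False
        return sum(self._outcomes) / len(self._outcomes) > self._max_ratio


class FallbackBuffer:
    """Holds fallback events locally until they can be stored as evidence."""

    def __init__(self) -> None:
        self._pending = []

    def record(self, packet_id: str, choice: str, reason: str) -> None:
        # choice is one of the defined fallbacks: "proceed", "retry", "escalate".
        self._pending.append({"packet_id": packet_id, "fallback": choice, "reason": reason})

    def flush(self, store_append) -> int:
        # store_append is any callable that persists one record.
        flushed = 0
        while self._pending:
            store_append(self._pending.pop(0))
            flushed += 1
        return flushed
```

When `suspicious()` turns true, the induced slowness is itself recorded as a signal rather than silently absorbed by the fallback path.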
The pattern, not the pitch
The architecture above is the pattern. It is not specific to any vendor. Any serious multi-modal trust layer will implement some version of it. The pattern matters because, once it is described concretely, the question "is the system you are using actually a trust layer?" becomes answerable. A system that lacks the independent counterparty, the append-only evidence store, the signed verdicts, or the structural separation is not a trust layer. It is a marketing claim with telemetry attached.
The serious teams building multi-modal AI today are converging on this pattern, whether they buy a trust layer or build one. The teams that are not converging on it are, knowingly or otherwise, deploying capability without the infrastructure required to justify the deployment.
Armalo implements this pattern as continuous, third-party verification of agent behavior, with independent counterparty review and append-only evidence storage. See armalo.ai.