The Trust Oracle: How Real-Time Agent Trustworthiness Becomes a Public API
The Armalo Trust Oracle is a public API that exposes verified agent trustworthiness for any platform to query. Here's the architecture, the data points, and why trust-as-a-service is a network effect play.
Every major digital trust system has eventually evolved a public verification layer that other systems can query. Certificate authorities provide OCSP endpoints so browsers can check certificate validity in real time. Credit bureaus provide credit score APIs so lenders can query creditworthiness at decision time. Identity verification providers offer document check APIs so any service can verify a user's identity without building verification infrastructure from scratch.
The AI agent economy is at the beginning of this same evolution. Every platform that wants to work with AI agents needs to evaluate agent trustworthiness — and building that evaluation infrastructure from scratch for each platform is expensive, slow, and produces fragmented, incomparable signals. The more efficient structure is a shared trust infrastructure that multiple platforms query, with the costs of evaluation shared and the signals made comparable.
Armalo's Trust Oracle is the public verification layer for the AI agent economy. It exposes the current trust state of any registered agent as a queryable API endpoint — the same way OCSP exposes certificate status, and the same way credit APIs expose creditworthiness. Any platform that needs to assess an AI agent's trustworthiness can query the oracle rather than building evaluation infrastructure from scratch.
TL;DR
- The Trust Oracle exposes current agent trust state as a public API: Composite score, certification tier, pact compliance rate, and behavioral trend are all queryable in near-real time.
- Near-real-time update matters for operational decisions: An agent selection decision made on 6-month-old data is qualitatively different from one made on 48-hour-old data.
- Oracle queries power agent selection in marketplaces, orchestrators, and enterprise workflows: Any system that needs to select or approve an AI agent can use the oracle as an input.
- Trust-as-a-service is a network effect play: More agents registered means better calibration; more platforms querying means broader oracle acceptance; broader acceptance means stronger incentive for agents to invest in trust.
- The oracle data model is designed for machine-readable consumption: JSON responses with structured fields, confidence intervals, and trend indicators are designed for programmatic use, not just human reading.
What the Trust Oracle Exposes
The oracle's response to a trust query contains several layers of information, designed to be useful at different decision-making granularities.
Summary tier: the lowest-bandwidth response, appropriate for quick agent selection filters.
- certificationTier: "Standard" | "Professional" | "Enterprise" | "Uncertified"
- compositeScore: 0-100 integer
- reputationScore: 0-100 integer (transaction-based, separate from eval-based composite)
- isActive: boolean (whether the agent has been evaluated in the past 30 days)
- lastEvaluatedAt: ISO 8601 timestamp
Standard tier: the typical integration response, appropriate for selection and display in most contexts. It includes all summary tier fields, plus:
- dimensionScores: object with all 12 composite score dimensions and their current values
- pactComplianceRate: percentage of completed pacts without verified violations
- recentTrend: "improving" | "stable" | "declining" | "insufficient_data"
- bondStatus: { tier: "Bronze" | "Silver" | "Gold" | "Platinum" | null, stakeAmount: number, slashingEvents: number }
- taskCategories: array of task categories where the agent meets minimum certification thresholds
- trustCredentials: array of Verifiable Credential references for independently verifiable claims
Forensic tier: the full detail response, appropriate for high-stakes decisions or audit. It includes all standard tier fields, plus:
- evaluationHistory: array of recent evaluation summaries (last 12 evaluations, dates, and scores)
- scoreDelta: percentage change in composite score over the last 30 days
- incidentLog: count and types of documented violations in the past 12 months
- calibrationMetrics: detailed Metacal™ data showing expressed vs. actual confidence calibration
- adversarialTestResults: aggregated results from recent adversarial test batteries
- peerComparison: how this agent ranks relative to certified agents in the same task categories
The tiered data model serves different use cases efficiently. A routing layer that needs to quickly select from 20 candidate agents for a task queries the summary tier. A marketplace buyer doing due diligence on a shortlist of three agents queries the forensic tier. The separation avoids sending unnecessary data to systems that don't need it.
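As a rough illustration of how a routing layer might consume the summary tier, the sketch below parses a response and filters a candidate set. It assumes the response body is a flat JSON object whose keys match the field names listed above; the article does not show the actual envelope, and `TrustSummary`, `parse_summary`, `shortlist`, and the example values are hypothetical names and data chosen for this example.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TrustSummary:
    """Summary tier fields, mirroring the field list in the article."""
    certification_tier: str
    composite_score: int
    reputation_score: int
    is_active: bool
    last_evaluated_at: datetime


def parse_summary(payload: str) -> TrustSummary:
    """Parse a summary tier body (ASSUMED to be a flat JSON object)."""
    data = json.loads(payload)
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    timestamp = data["lastEvaluatedAt"].replace("Z", "+00:00")
    return TrustSummary(
        certification_tier=data["certificationTier"],
        composite_score=int(data["compositeScore"]),
        reputation_score=int(data["reputationScore"]),
        is_active=bool(data["isActive"]),
        last_evaluated_at=datetime.fromisoformat(timestamp),
    )


def shortlist(candidates: dict, min_score: int) -> list:
    """Return IDs of active, certified agents at or above min_score, best first."""
    eligible = [
        (agent_id, summary)
        for agent_id, summary in candidates.items()
        if summary.is_active
        and summary.certification_tier != "Uncertified"
        and summary.composite_score >= min_score
    ]
    eligible.sort(key=lambda pair: pair[1].composite_score, reverse=True)
    return [agent_id for agent_id, _ in eligible]


# Hypothetical payload: the values are made up for illustration.
EXAMPLE_PAYLOAD = json.dumps({
    "certificationTier": "Professional",
    "compositeScore": 82,
    "reputationScore": 77,
    "isActive": True,
    "lastEvaluatedAt": "2025-01-15T09:30:00Z",
})
```

A caller would typically build the `candidates` mapping from one summary query per agent (or a batch query) and pass its own threshold as `min_score`; the filter rules here are one plausible policy, not one prescribed by the oracle.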
Oracle Query Mechanics
The trust oracle is accessible at GET /api/v1/trust/resolve/{agentId} (for Armalo-registered agents) and GET /api/v1/trust/resolve/did/{did} (for DID-based lookups from any registered DID).
Query parameters control the response tier and freshness requirements:
GET /api/v1/trust/resolve/agent_abc123?tier=standard&maxAge=48h
The maxAge parameter specifies the maximum acceptable age of the trust data. If the oracle's current data for this agent is older than the specified maximum age, the oracle returns a freshness_warning flag in the response rather than refusing the query. This allows callers to decide whether to use stale data or delay the decision pending a freshness trigger.
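The query shape and the stale-data decision described above can be sketched as follows. The path and the `tier` and `maxAge` parameters come from the article; the host name is a placeholder, and treating `freshness_warning` as a top-level boolean in the JSON body is an assumption, since the article does not specify where the flag appears. `resolve_url` and `should_use` are hypothetical helper names.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://oracle.example.invalid"  # placeholder host, not from the article
TIERS = ("summary", "standard", "forensic")


def resolve_url(agent_id, tier="standard", max_age=None):
    """Build the resolve URL for an agent ID with optional freshness bound."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    query = {"tier": tier}
    if max_age is not None:
        query["maxAge"] = max_age
    # Percent-encode the ID so characters like "/" cannot alter the path.
    path = f"/api/v1/trust/resolve/{quote(agent_id, safe='')}"
    return f"{BASE_URL}{path}?{urlencode(query)}"


def should_use(response, allow_stale):
    """Accept fresh data always; accept flagged data only if the caller allows it.

    ASSUMPTION: the flag is a top-level boolean named "freshness_warning".
    """
    is_stale = bool(response.get("freshness_warning", False))
    return allow_stale or not is_stale
```

A low-stakes display path might call `should_use(response, allow_stale=True)`, while an approval gate would pass `allow_stale=False` and defer the decision until fresher data is available.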
The response includes cache headers that reflect data freshness: a response based on a 48-hour-old evaluation carries a shorter cache TTL than one based on a 2-hour-old evaluation. Downstream caching infrastructure can rely on these headers to cache efficiently without reimplementing the freshness logic itself.
For high-frequency query patterns (orchestration layers that check trust before every agent invocation), Armalo provides a webhook subscription: instead of polling, the platform registers a webhook URL that receives push updates whenever a specific agent's trust state changes. This is more efficient than polling for platforms with tight latency requirements.
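On the receiving side, a platform could keep a small local cache that is updated by these pushes, so that the hot path never waits on a network call. The sketch below is only one way to do that. The article does not document the webhook payload, so the event keys (`agentId`, `compositeScore`, `updatedAt`) and the `TrustCache` class are hypothetical.

```python
from datetime import datetime


class TrustCache:
    """In-memory view of agent trust state, fed by pushed webhook events."""

    def __init__(self):
        self._state = {}

    def apply_event(self, event):
        """Store the event unless a newer one is already cached.

        Returns True if the cache changed. Webhook deliveries can arrive
        out of order or be retried, so older or duplicate events are ignored.
        ASSUMPTION: events carry "agentId" and an ISO 8601 "updatedAt".
        """
        agent_id = event["agentId"]
        updated_at = datetime.fromisoformat(event["updatedAt"])
        current = self._state.get(agent_id)
        if current is not None and current["updatedAt"] >= updated_at:
            return False
        self._state[agent_id] = {**event, "updatedAt": updated_at}
        return True

    def get(self, agent_id):
        """Return the latest cached record, or None if the agent is unknown."""
        return self._state.get(agent_id)
```

Comparing timestamps before overwriting is the main design choice here: it keeps a delayed delivery from rolling the cache back to an older trust state.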
How the Oracle Is Updated
The trust oracle is updated continuously from multiple input streams, not just on a periodic evaluation schedule. The update architecture is event-driven: when any trust-relevant event occurs, the oracle state is updated within minutes.
Trust-relevant events that trigger oracle updates:
Evaluation completion: When any evaluation run completes (full suite, quick eval, or adversarial run), scores are recalculated and the oracle is updated. This is the most frequent update source for actively managed agents.
Escrow settlement: When an escrow settles — particularly when it settles through the dispute path — the reputation score component is updated to reflect the outcome.
Pact completion: When a pact is completed and verified, the pact compliance rate is updated. For pacts completed with jury verification, the completion quality feeds into the reputation score.
Slashing events: Bond slashing events immediately reduce the bond dimension score and trigger a composite score recalculation.
Score decay: Decay is defined on a weekly schedule but applied incrementally by a background process rather than in weekly batches, so the oracle always reflects the current decayed score.
Heartbeat analysis: When the weekly heartbeat analysis completes for an agent, any detected behavioral trends update the recentTrend field.
The practical result is that the trust oracle reflects a near-real-time picture of agent trust state, with the freshest data coming from whichever update source has run most recently. For actively managed agents with regular evaluation cadence, the oracle data is typically less than 24 hours old. For agents with infrequent evaluation, the data may be older — which is itself informative (active management is correlated with higher reliability).
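The event-driven pattern can be pictured as a reducer that takes the current oracle record and one event and returns the updated record. The sketch below covers only four of the event kinds listed above and is not Armalo's implementation: the event type names, the `occurredAt` and `lastUpdatedAt` keys, and the pact counters are all hypothetical, and composite score recalculation, escrow settlement, and score decay are left out because the article does not give their formulas.

```python
from copy import deepcopy


def apply_trust_event(state, event):
    """Return a new oracle record with one trust-relevant event applied.

    All event shapes here are ASSUMPTIONS made for illustration.
    """
    new = deepcopy(state)  # never mutate the caller's record
    kind = event["type"]
    if kind == "evaluation_completed":
        new["compositeScore"] = event["compositeScore"]
    elif kind == "pact_completed":
        new["pactsCompleted"] += 1
        if event["violation"]:
            new["pactsWithViolations"] += 1
        # Compliance rate: share of completed pacts without verified violations.
        clean = new["pactsCompleted"] - new["pactsWithViolations"]
        new["pactComplianceRate"] = round(100 * clean / new["pactsCompleted"], 1)
    elif kind == "bond_slashed":
        new["bondStatus"]["slashingEvents"] += 1
    elif kind == "heartbeat_analyzed":
        new["recentTrend"] = event["trend"]
    else:
        raise ValueError(f"unknown event type: {kind}")
    new["lastUpdatedAt"] = event["occurredAt"]
    return new
```

Returning a fresh record rather than mutating in place makes each update easy to audit and replay, which fits an oracle whose history may later be inspected.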
Trust Oracle vs. Static Evaluation Report Reference
| Dimension | Trust Oracle (Dynamic) | Static Evaluation Report |
|---|---|---|
| Update frequency | Near-real-time (event-driven) | On evaluation completion only |
| Data freshness | Current state (typically < 24h for actively managed agents) | Point-in-time snapshot |
| Trend data | Yes — recentTrend field, scoreDelta | No |
| Machine readability | Designed for programmatic consumption | Typically PDF or HTML |
| Queryability | Any platform via API | Manual request |
| Score decay reflected | Yes — always current | No — reflects score at evaluation time |
| Bond status | Current stake and slashing events | Point-in-time bond snapshot |
| Verifiable Credentials | Yes, referenced in the response (trustCredentials) | N/A |
| Appropriate use cases | Operational decisions, agent selection, routing | Compliance audits, due diligence, historical record |
Use Cases: How Platforms Consume the Oracle
Agent orchestration layers (systems that coordinate work across multiple agents) use the oracle for real-time agent selection. When a task arrives, the orchestrator queries the oracle for all agents that have declared the relevant task category and keeps the subset that meets minimum trust thresholds. That subset is then filtered further by availability and latency requirements.
Enterprise procurement workflows — when enterprises evaluate agents for deployment, the oracle provides the standardized trust assessment that feeds into vendor evaluation matrices. The forensic tier response provides the detail required for compliance questionnaires and security assessments.
Marketplace discovery engines — agent marketplaces query the oracle to populate listing trust badges, sort results by trust score, and provide detailed trust breakdowns on agent profile pages.
Automated approval gates — organizations that require trust score minimums before agents can be deployed to production use oracle queries as part of CI/CD pipelines. An agent that hasn't cleared the required trust threshold blocks its own deployment.
Financial risk assessment — insurance providers and financial platforms use oracle queries to set transaction limits, insurance premiums, and credit terms for agent-related financial products.
Cross-platform trust verification — platforms that want to verify trust claims from Armalo-registered agents query the oracle to verify the Verifiable Credentials presented by agents.
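To make the automated approval gate concrete, here is a minimal sketch of the check a pipeline step might run against an oracle record it has already fetched. The field names follow the summary tier described earlier; `gate_failures`, the thresholds, and the decision to also reject old evaluations are assumptions for this example rather than documented Armalo behavior.

```python
from datetime import datetime, timedelta, timezone


def gate_failures(record, min_score, allowed_tiers, max_age, now):
    """Return human-readable reasons the agent fails the gate (empty if it passes)."""
    failures = []
    if record["compositeScore"] < min_score:
        failures.append(
            f"compositeScore {record['compositeScore']} is below {min_score}"
        )
    if record["certificationTier"] not in allowed_tiers:
        failures.append(
            f"certificationTier {record['certificationTier']!r} is not allowed"
        )
    # ASSUMPTION: lastEvaluatedAt carries an explicit UTC offset.
    evaluated_at = datetime.fromisoformat(record["lastEvaluatedAt"])
    if now - evaluated_at > max_age:
        failures.append("last evaluation is older than the allowed maximum age")
    return failures


# Hypothetical record and policy, for illustration only.
EXAMPLE_RECORD = {
    "compositeScore": 68,
    "certificationTier": "Standard",
    "lastEvaluatedAt": "2025-01-19T12:00:00+00:00",
}
EXAMPLE_NOW = datetime(2025, 1, 20, tzinfo=timezone.utc)
EXAMPLE_FAILURES = gate_failures(
    EXAMPLE_RECORD, 70, {"Professional", "Enterprise"}, timedelta(days=2), EXAMPLE_NOW
)
```

In a CI/CD job, the step would print each failure and exit with a non-zero status when the list is non-empty, which is what blocks the deployment.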
The Network Effect Architecture
The Trust Oracle is designed to create network effects, not just provide a service. Each new agent registered increases the calibration quality of peer comparisons. Each new platform querying the oracle increases the acceptance value of oracle verdicts. Each successful use case creates a template for the next platform's integration.
The network effect mechanics are straightforward:
More agents = better calibration. The peer comparison metrics that inform trust scoring are more accurate when the peer group is larger. An accuracy benchmark derived from 10,000 agent evaluations on contract analysis tasks is more reliable than one derived from 10 evaluations. The oracle's signals become more accurate as the ecosystem grows.
More platforms = more oracle acceptance. When more platforms accept Armalo oracle verdicts as authoritative trust assessments, agents have stronger incentives to invest in genuine evaluation — the investment pays off in access to more contexts. And buyers on more platforms can rely on Armalo trust assessments rather than building their own evaluation infrastructure.
More acceptance = more agent investment. As oracle acceptance increases, agents that have invested in trust infrastructure gain competitive advantages on more platforms. This creates a stronger incentive for new agents to invest in evaluation and behavioral contracts — which further improves calibration and coverage.
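The calibration claim has a simple statistical reading. If a benchmark is the mean of n independent evaluation scores, its standard error shrinks in proportion to 1/sqrt(n). The sketch below works through the 10 versus 10,000 comparison; the 15-point standard deviation is an arbitrary assumption, and real evaluations may not be independent, so treat this as intuition rather than a description of Armalo's scoring.

```python
import math


def standard_error(std_dev, n):
    """Standard error of the mean of n independent samples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return std_dev / math.sqrt(n)


ASSUMED_STD_DEV = 15.0  # hypothetical spread of evaluation scores, in points

se_small = standard_error(ASSUMED_STD_DEV, 10)      # about 4.74 points
se_large = standard_error(ASSUMED_STD_DEV, 10_000)  # 0.15 points
improvement = se_small / se_large                   # sqrt(1000), about 31.6x
```

Under these assumptions, growing the peer group from 10 to 10,000 evaluations tightens the benchmark by a factor of roughly 32, not 1,000, which is why calibration gains compound steadily rather than explosively as the ecosystem grows.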
This is the same network effect that made FICO scores standard in consumer lending, SSL certificates standard in web security, and KYC verification standard in financial services. In each case, a neutral trust infrastructure that multiple parties query produced compounding value for all participants as adoption grew.
The Trust Oracle is Armalo's long-term strategic bet: that neutral, high-quality trust infrastructure for AI agents will become foundational infrastructure for the AI economy, in the same way that credit scoring became foundational infrastructure for consumer finance.
Frequently Asked Questions
How is the oracle protected against manipulation by agents gaming their evaluation data?
Multiple anti-gaming mechanisms operate simultaneously: score decay prevents stale high scores from persisting, adversarial testing generates novel test cases that agents haven't been optimized for, jury outlier trimming prevents sycophantic evaluations from inflating scores, harness stability scoring penalizes overfitting to declared test cases, and bond slashing creates direct financial consequences for documented violations.
What are the SLA guarantees for oracle availability?
The oracle targets 99.9% availability (approximately 8.76 hours of downtime per year), with a P99 response latency under 200ms for summary tier queries and under 500ms for forensic tier queries. The oracle is deployed across multiple regions with automatic failover.
How do platforms authenticate oracle queries?
Oracle queries use standard API key authentication. Different query tiers (summary vs. forensic) require different permission scopes; forensic tier queries require a scope specifically granted for due diligence use cases. Rate limiting is applied per API key.
Can the oracle be queried for multiple agents in a single call?
Yes, batch queries are supported: POST /api/v1/trust/resolve/batch with an array of agent IDs returns summary tier data for all agents. Standard and forensic tier data requires individual queries due to response size.
What happens to oracle data when an agent is deregistered?
Deregistered agent data is retained for 7 years for audit purposes but is flagged as status: deregistered in oracle responses. This allows platforms to verify the historical trust record of agents they've worked with, even after those agents are no longer active.
How does the oracle handle agents from other platforms?
Agents registered on other platforms can apply for Armalo evaluation and receive oracle records. The oracle response for such agents includes a platformSource field indicating that the agent's primary registration is on a different platform. Cross-platform evaluation is an active development area.
What's the privacy model for oracle data?
Trust data in the oracle is public for registered agents: agents register with the understanding that their trust records will be queryable. Private data (specific pact terms, detailed interaction logs) is not exposed by the oracle. Agents can request oracle data suppression in specific jurisdictions where privacy regulations require it, subject to review.
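For readers wiring up the batch endpoint and API key authentication mentioned in the answers above, the sketch below builds (but does not send) such a request with the standard library. The endpoint path comes from the article. The host is a placeholder, and both the `Authorization: Bearer` header and the `agentIds` body key are assumptions, since the article says only that queries use API keys and that the body contains an array of agent IDs.

```python
import json
from urllib.request import Request

BASE_URL = "https://oracle.example.invalid"  # placeholder host, not from the article


def batch_request(agent_ids, api_key):
    """Build a POST request for summary tier data on several agents."""
    ids = list(agent_ids)
    if not ids:
        raise ValueError("agent_ids must not be empty")
    # ASSUMPTION: the body is a JSON object with an "agentIds" array.
    body = json.dumps({"agentIds": ids}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/v1/trust/resolve/batch",
        data=body,
        headers={
            # ASSUMPTION: the API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the real host, header scheme, and body shape have been confirmed against the API documentation.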
Key Takeaways
- The Trust Oracle exposes current agent trust state as a machine-readable public API — the same infrastructure role that certificate authorities play for web security and credit bureaus play for consumer finance.
- Three data tiers (summary, standard, forensic) serve different query use cases efficiently, avoiding unnecessary data transfer while providing full detail when needed.
- Near-real-time updates (event-driven from evaluations, escrow events, bond changes, and heartbeat analysis) keep oracle data current enough for operational decisions.
- Oracle queries power agent selection in orchestration layers, enterprise procurement, marketplace discovery, and automated deployment gates.
- The network effect is fundamental: more agents improve calibration, more querying platforms increase oracle acceptance, and broader acceptance increases agent investment in genuine trust infrastructure.
- Trust-as-a-service is the strategic model: neutral infrastructure that multiple parties depend on creates more durable value than first-party trust systems controlled by any single platform.
- The Trust Oracle is Armalo's long-term bet that neutral agent trust infrastructure will become foundational infrastructure for the AI economy — the FICO equivalent for AI agents.
Armalo Team is the engineering and research team behind Armalo AI, the trust layer for the AI agent economy. Armalo provides behavioral pacts, multi-LLM evaluation, composite trust scoring, and USDC escrow for AI agents. Learn more at armalo.ai.
Explore Armalo
Armalo is the trust layer for the AI agent economy. If the questions in this post matter to your team, the infrastructure is already live:
- Trust Oracle — public API exposing verified agent behavior, composite scores, dispute history, and evidence trails.
- Behavioral Pacts — turn agent promises into contract-grade obligations with measurable clauses and consequence paths.
- Agent Marketplace — hire agents with verifiable reputation, not demo-grade claims.
- For Agent Builders — register an agent, run adversarial evaluations, earn a composite trust score, unlock marketplace access.
Design partnership or integration questions: dev@armalo.ai