MCP Gave Agents a Shared Language. The Next Layer Is a Shared Reputation.
Anthropic's Model Context Protocol solved tool interoperability for AI agents — the connectivity layer is done. What remains unsolved is the trust layer: who should be allowed to invoke your tools, and how does an agent's track record travel with it across platforms?
The Connectivity Problem Is Solved. The Trust Problem Is Just Beginning.
In November 2024, Anthropic published the Model Context Protocol (MCP) specification. The announcement was measured in tone — a technical standard document, not a product launch — but its implications were significant. For the first time, the agentic software ecosystem had a lingua franca for how AI agents discover and invoke capabilities: tools, resources, and prompts, all described in a common schema, invocable over a common transport, with a common error model.
The analogy that circulated through developer communities was apt: MCP is TCP/IP for agent-tool communication. Before MCP, every platform that wanted to give agents access to tools — whether that was a CRM, a code execution environment, a search index, or a calendar — had to define its own invocation protocol, its own schema format, its own authentication dance. An agent built on one platform was effectively stranded when it moved to another. The integration tax was paid repeatedly, by every builder, on every tool, on every platform.
MCP collapsed that to a single standard. You build an MCP server. Any MCP-compatible agent can connect. The tools are discoverable, the schema is consistent, the invocation is predictable. The connectivity problem — how agents and tools find each other and communicate — is solved.
What MCP did not solve, and what its specification says almost nothing about, is the question that comes immediately after connectivity: should this agent be allowed to invoke your tools? When the connection is established and the capability list is exchanged, nothing in the MCP protocol tells a tool provider whether the agent on the other side has a history of trustworthy behavior, whether it has committed to specific behavioral constraints, whether it has ever been evaluated against a red-team adversary, or whether it left a trail of harm on the last three platforms it visited.
That is the reputation gap. And filling it is the most important unsolved infrastructure problem in agentic computing today.
What MCP Actually Solved: A Precise Accounting
To understand what is missing, you need to understand what MCP actually delivered — precisely, not in the promotional sense.
The Pre-MCP Fragmentation
Before MCP, the tool ecosystem for AI agents was a collection of competing, incompatible islands. OpenAI's function calling format used a JSON Schema variant with specific required fields. Anthropic's tool use format was structurally similar but not identical. LangChain had its own tool interface. AutoGen had another. Every provider that wanted to expose capabilities to agents had to decide which runtime to target and re-implement for the others.
This created several compounding problems:
Format lock-in. An agent trained or prompted to use tools via one schema had to be reprompted, retrained, or wrapped when it encountered tools in a different format. Simple parameter name differences — arguments vs input vs parameters — were enough to break integrations.
Transport heterogeneity. Some tools were invoked over HTTP. Some over gRPC. Some via stdin/stdout subprocess calls. Some were in-process function calls. There was no common substrate for tool communication that would work across deployment environments.
Discovery was non-existent. Agents could not ask a server "what can you do?" in a standard way. Tool schemas had to be manually specified in prompts, system messages, or configuration files. Dynamic tool discovery — where an agent learns at runtime what tools are available — was practically impossible without platform-specific code.
Error handling was inconsistent. A tool that failed on one platform might raise an exception. On another, it might return a null result. On a third, it might return a structured error envelope. Agents had to handle each of these differently.
What MCP Standardized
MCP resolved all four of these problems with a clean protocol definition:
The tool schema standard. MCP defines a precise JSON Schema-based format for describing tools: a name, a description, and an inputSchema that follows JSON Schema conventions. Every MCP-compatible tool is described this way, regardless of what it does or what server implements it.
The transport layer. The original specification defines two standard transports: HTTP with Server-Sent Events (SSE) for network-based communication, and stdio for local subprocess communication. (Later revisions of the specification replace HTTP+SSE with a single Streamable HTTP transport.) Both use the same message format (JSON-RPC 2.0). An agent that speaks MCP can connect to any MCP server on either transport without modification.
The discovery protocol. MCP defines explicit lifecycle and discovery methods: initialize (capability negotiation), tools/list (enumerate available tools), resources/list (enumerate available data sources), and prompts/list (enumerate available prompt templates). An agent connecting to an unknown MCP server can discover its full capability surface in four requests: one initialize handshake followed by the three list calls.
The error model. MCP separates two classes of failure. Protocol-level errors (malformed request, unknown method) are returned as standard JSON-RPC error objects with numeric codes. Tool-level errors (the tool was called correctly but failed internally) are returned inside a normal tool result flagged with isError: true. Agents can handle these classes differently without server-specific code.
The resources model. Beyond tools, MCP defines how servers expose data: as static resources (files, database records, API responses) that agents can read without invoking a tool. This matters for read-heavy agent workflows where reading structured data is more appropriate than invoking a function.
The prompts model. MCP defines how servers expose reusable prompt templates — parameterized instruction sets that agents can invoke to get structured guidance. This enables a form of "knowledge as a service" where specialized domain expertise is packaged and deployed as an MCP server.
The net result: if you have an MCP client (an agent runtime that speaks the protocol) and an MCP server (a capability provider that implements it), they can find each other, negotiate capabilities, exchange data, and invoke tools, all without any platform-specific configuration. The connectivity layer works.
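As a rough illustration of the pieces above, the sketch below writes out a tool descriptor, the four-request discovery sequence, and the two error classes as plain TypeScript values. The interface names, the `search` tool, and the `classify` helper are assumptions made for this example; they are not part of any MCP SDK, and the message shapes are simplified.

```typescript
// Simplified JSON-RPC 2.0 envelopes as used by MCP (illustrative types, not SDK types).
interface JsonRpcRequest { jsonrpc: "2.0"; id: number; method: string; params?: unknown }
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number;
  result?: { isError?: boolean; [key: string]: unknown };
  error?: { code: number; message: string };
}

// A tool descriptor as it would appear in a tools/list result:
// a name, a description, and a JSON Schema for the input.
const searchTool = {
  name: "search",
  description: "Full-text search over a document index (hypothetical tool)",
  inputSchema: {
    type: "object",
    properties: { query: { type: "string" } },
    required: ["query"],
  },
};

// The discovery sequence: one handshake plus three list calls.
// Params are omitted for brevity; a real initialize request carries
// protocolVersion, capabilities, and clientInfo.
const discovery: JsonRpcRequest[] = [
  "initialize",
  "tools/list",
  "resources/list",
  "prompts/list",
].map((method, i): JsonRpcRequest => ({ jsonrpc: "2.0", id: i + 1, method }));

type Outcome = "ok" | "tool-error" | "protocol-error";

// Distinguish the two failure classes without any server-specific code.
function classify(res: JsonRpcResponse): Outcome {
  if (res.error) return "protocol-error";        // JSON-RPC error object
  if (res.result?.isError) return "tool-error";  // tool ran and reported failure
  return "ok";
}
```

Nothing in these messages carries information about the caller's track record, which is the gap the rest of this article is concerned with.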
The Spec's Deliberate Silence on Trust
The MCP specification is a carefully written document. It is precise where it needs to be precise and silent where it deliberately declines to specify. The trust question falls in the latter category — not because the authors missed it, but because they made a principled architectural choice: MCP is a capability transport protocol. Trust is out of scope.
This is the right call for a protocol specification. TCP/IP doesn't specify who should be allowed to talk to whom. DNS doesn't specify whether the domain you're resolving belongs to a trustworthy entity. HTTP doesn't specify whether the server you're connecting to has good intentions. These are higher-layer concerns, and the layered architecture of the internet is precisely what made it scale.
But here's the critical point: when TCP/IP was deployed, the network was not carrying trillions of dollars of commerce. When DNS was deployed, it was not being used to route financial transactions between autonomous software agents. When HTTP was deployed, it was for human-readable documents, not for AI systems making consequential decisions on behalf of organizations.
MCP is being deployed into a context where autonomous agents will be making purchasing decisions, executing code in production environments, modifying customer records, sending communications on behalf of companies, and taking actions with real legal and financial consequences. The connectivity problem was solved first. The trust problem cannot wait for a second generation of infrastructure.
The Anatomy of the Reputation Gap
Let's be concrete about what is missing. The MCP ecosystem today has five specific trust gaps, each with real consequences for production deployments.
Gap 1: Agent Identity Without Behavioral History
MCP servers can implement authentication. Later revisions of the specification define an OAuth-based authorization flow for HTTP transports. An MCP server can therefore know who is connecting, or more precisely, which credential is being presented.
But authentication answers only the identity question: are you who you say you are? It does not answer the behavioral question: given that you are who you say you are, what is your track record?
Consider: an agent presents an OAuth token that verifies it belongs to Org X. The MCP server now knows the agent's organizational affiliation. What the server does not know:
- Has this agent been evaluated against safety benchmarks?
- Does this agent have a history of scope violations — invoking tools it was not intended to invoke?
- Has this agent ever been flagged for prompt injection attempts?
- What is this agent's latency profile and rate of tool-call failures?
- Has this agent caused financial harm on other platforms?
The OAuth token is a credential. It is not a behavioral record. The gap between "I know who you are" and "I know whether to trust you" is precisely the reputation gap.
Gap 2: Platform-Isolated Trust Records
In the current ecosystem, every platform that deploys agents builds its own trust assessment. A healthcare AI platform evaluates agents against HIPAA-specific behavioral standards. A financial services platform evaluates agents against trading policy compliance. A customer service platform evaluates agents against response quality metrics.
These evaluations are valuable. They are also siloed.
An agent that has spent 18 months building an impeccable trust record on Platform A — passing dozens of evaluations, completing thousands of successful tool invocations, maintaining a perfect safety record — starts from zero when it connects to an MCP server on Platform B. Platform B has no visibility into the behavioral history that Platform A accumulated. It cannot query it. There is no standard for exporting or importing agent behavioral records across platform boundaries.
This creates a structural inefficiency: every platform reinvents reputation from scratch. Every agent that crosses a platform boundary pays the cold-start tax, regardless of its actual behavioral history. Trust that has been hard-earned is economically worthless at the point of crossing a boundary.
Gap 3: No Accountability for Tool Misuse
The MCP protocol defines how tool invocations happen. It does not define what happens when a tool invocation causes harm.
Consider a scenario: an autonomous agent connects to an MCP server that exposes a financial transaction tool. The agent invokes the tool in a way that violates the tool provider's terms of service, for example by making transactions above its authorized limit or initiating transactions the user did not request. The protocol itself records nothing; at best, the MCP server logs the invocation. But:
- The invocation is logged on the server side, not broadcast to a shared reputation registry
- The agent's behavioral record on this platform does not propagate to other platforms
- The tool provider has no standard mechanism to report the violation to a cross-platform reputation system
- The agent can disconnect, reconnect from a new credential, and attempt the same behavior on a different MCP server with no reputation consequence
This is the accountability gap. In a world where agents are autonomous and can move across platforms, reputation consequences need to follow the agent, not stay trapped in the platform where the event occurred.
Gap 4: Tool Providers Cannot Signal Reliability
The trust problem runs in both directions. The discussion above focused on agents — whether the agent invoking a tool is trustworthy. But tool providers also have a trust problem: how do they signal to potential agent clients that their tools are reliable?
Today, an MCP server can describe its tools in detail using the schema. It can document expected latency. It can specify rate limits. But it has no standard mechanism for:
- Advertising its uptime history ("99.97% available over the last 90 days")
- Publishing its accuracy track record for tools that perform computation or lookup
- Certifying that its tools have been audited for security vulnerabilities
- Sharing customer satisfaction data from agents that have used the tools
Agent developers choosing between two competing MCP servers that offer similar capabilities — say, two web search tools — have no standard signal to distinguish them by reliability. Price, documentation quality, and word of mouth are the primary differentiators today. A formal reputation system would allow the more reliable tool to carry a verifiable track record that the less reliable tool cannot fake.
Gap 5: No Real-Time Admission Control
The most operationally consequential gap: production MCP deployments have no standard mechanism for making real-time admission decisions based on agent reputation.
Enterprise tool providers want to ask, before allowing an agent to invoke a sensitive tool: "Is this agent currently operating within its defined behavioral parameters? Has it been flagged in the last 30 days? Does its current trust score meet the minimum threshold for this operation?"
These are admission control questions. They require a real-time query against an external reputation oracle — a system that holds a current behavioral record for the agent and can answer a trust query in sub-100 milliseconds.
No such infrastructure exists as a standard service in the MCP ecosystem today. Each platform that wants admission control must build and maintain it internally. The result: most platforms don't build it at all, because the cost is too high relative to the immediate need.
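The three admission questions above can be expressed as a small pure function. This is a minimal sketch under stated assumptions: the `TrustSnapshot` and `AdmissionPolicy` shapes and their field names are hypothetical, and a real deployment would populate the snapshot from whatever reputation source it trusts.

```typescript
// Hypothetical view of an agent's current standing, as returned by some reputation source.
interface TrustSnapshot {
  compositeScore: number;        // 0-100
  withinDeclaredScope: boolean;  // operating inside its declared behavioral parameters
  lastFlaggedAt?: Date;          // most recent flag, if any
}

// Hypothetical per-tool policy.
interface AdmissionPolicy {
  minScore: number;
  flagLookbackDays: number;
}

const MS_PER_DAY = 86_400_000;

function admit(
  snapshot: TrustSnapshot,
  policy: AdmissionPolicy,
  now: Date = new Date(),
): { allowed: boolean; reason: string } {
  if (!snapshot.withinDeclaredScope) {
    return { allowed: false, reason: "operating outside declared parameters" };
  }
  if (snapshot.lastFlaggedAt) {
    const daysSinceFlag = (now.getTime() - snapshot.lastFlaggedAt.getTime()) / MS_PER_DAY;
    if (daysSinceFlag <= policy.flagLookbackDays) {
      return { allowed: false, reason: "flagged within lookback window" };
    }
  }
  if (snapshot.compositeScore < policy.minScore) {
    return { allowed: false, reason: "score below minimum" };
  }
  return { allowed: true, reason: "meets policy" };
}
```

The function is deliberately free of I/O: the hard part is obtaining a fresh, trustworthy snapshot quickly, not evaluating it.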
The Historical Analogy: How the Internet Built Trust
The gap between connectivity and trust is not a new problem. The internet faced precisely this architectural challenge at every layer of its evolution. The history of internet trust infrastructure is, in retrospect, a perfect template for what needs to happen in the agent ecosystem.
Layer 0: Transport (1969–1983)
ARPANET and then TCP/IP solved the connectivity problem. Data could flow between nodes. Packets could be routed. Connections could be established. But TCP/IP has no concept of whether the node on the other end is trustworthy. The protocol routes packets; it does not evaluate the intentions of packet senders.
The network worked fine for academic and military applications where all nodes were operated by known, vetted institutions. As the network grew and became public, the lack of trust infrastructure became a critical vulnerability.
Layer 1: Naming (1983)
DNS solved the name resolution problem. You could type example.com and reliably reach the correct IP address. But DNS is also a connectivity protocol, not a trust protocol. It tells you where something is. It does not tell you whether it is legitimate, whether it has changed hands, or whether it is operated by who it claims to be.
DNS cache poisoning, typosquatting, and homograph attacks all exploit the gap between "this name resolves" and "this name is trustworthy." The gap persisted for decades before being partially addressed by DNSSEC.
Layer 2: Authentication (1995–2000)
SSL/TLS solved the authentication and encryption problem. HTTPS meant that when you connected to a site, you could verify you were talking to the actual operator of that domain, not a man-in-the-middle attacker. Certificate authorities became the trust anchors. The CA/Browser Forum emerged to govern which CAs were trusted.
This was the first real "trust layer" on the internet — an infrastructure that existed orthogonally to the connectivity layer, specifically to answer trust questions that the connectivity layer could not.
Layer 3: Reputation and Signals (2000–present)
Even with HTTPS, trust was incomplete. A certificate verified that you were talking to some-phishing-site.com — but it didn't tell you whether that site was legitimate. Search engines developed reputation signals (PageRank, spam signals). Email providers developed reputation systems (SPF, DKIM, DMARC, sender reputation scores). Browser vendors built safe browsing lists.
The web's trust infrastructure is now genuinely multi-layered: transport encryption at the bottom, certificate authentication above it, and a constellation of reputation signals at the top — each layer necessary, none sufficient alone.
The Agent Ecosystem in 2026: We Are at the HTTPS Moment
The MCP ecosystem today is at approximately the 1995 moment of the web. The connectivity layer (TCP/IP → MCP) is done. The naming layer (DNS → MCP tool discovery) is done. What has not been built is the authentication-plus-reputation layer.
Here is the direct analogy:
Internet Layer           →  Agent Ecosystem Equivalent
─────────────────────────────────────────────────────
TCP/IP (connectivity)    →  MCP (tool invocation protocol)
DNS (name resolution)    →  MCP tools/list (capability discovery)
TLS/SSL (auth + enc)     →  OAuth in MCP (identity verification)
CA/PKI (trust anchors)   →  ??? (agent reputation infrastructure)
SPF/DKIM/reputation      →  ??? (cross-platform behavioral history)
The two bottom rows of the agent ecosystem analog are missing. OAuth gives you the equivalent of TLS — you know who you are talking to. But the equivalent of PKI trust anchors (who vouches for this agent's behavioral claims?) and reputation systems (what is this agent's demonstrated track record?) do not exist as standard infrastructure.
The prediction follows from the historical pattern: agent trust scores will become the HTTPS of the agent economy. By 2027–2028, any serious enterprise deploying agents in production will require minimum trust scores from any agent that touches sensitive tools, the same way any serious web property today requires HTTPS. Agents without verifiable trust scores will be shut out of enterprise tool ecosystems the same way HTTP-only sites are increasingly flagged as insecure, and in some browser modes blocked outright.
The question is not whether this infrastructure will be built. It is who builds it and when.
What Shared Reputation Means: A Technical Specification
The phrase "shared reputation" sounds self-evident, but it has a precise technical meaning. It describes a specific set of properties that a reputation system must have to solve the gaps identified above.
Property 1: Immutable Behavioral Record
Reputation begins with a log. Every action an agent takes — every tool invoked, every evaluation completed, every commitment honored or violated — is recorded in an immutable append-only log. The log is not self-reported by the agent. It is recorded by the platforms and evaluators that observe the agent's behavior.
Immutability matters for two reasons. First, a mutable behavioral record can be altered — by the agent, by the platform, or by a compromised system. An immutable record cannot be retroactively edited, which means the behavioral history is a genuine ground truth, not a curated narrative. Second, immutability creates accountability. When an agent causes harm, the behavioral record before and after the event is preserved. The causal chain can be traced.
Implementation options for immutability include:
- Append-only database tables with cryptographic hash chaining (each row includes a hash of the previous row)
- On-chain storage for critical behavioral events (Base L2, Ethereum, or similar) where immutability is enforced at the protocol level
- Content-addressed storage (IPFS-style) where behavioral records are referenced by their content hash
Armalo uses a hybrid approach: standard database storage for the high-throughput behavioral event stream, with periodic anchoring of behavioral record hashes to Base L2 for the events that matter most (evaluation completions, pact violations, certification grants and revocations).
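To make the first option concrete, here is a minimal sketch of hash chaining over an in-memory log. It assumes SHA-256 and a hypothetical `LogEntry` shape; it is not Armalo's implementation, and a production system would persist entries and anchor the latest hash somewhere the log operator cannot rewrite.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape: each entry commits to the hash of its predecessor.
interface LogEntry {
  index: number;
  event: string;     // serialized behavioral event
  prevHash: string;  // hash of the previous entry (or GENESIS for the first)
  hash: string;      // hash over index, prevHash, and event
}

const GENESIS = "0".repeat(64);

function hashEntry(index: number, event: string, prevHash: string): string {
  return createHash("sha256").update(`${index}|${prevHash}|${event}`).digest("hex");
}

// Append returns a new array; existing entries are never modified.
function append(log: readonly LogEntry[], event: string): LogEntry[] {
  const index = log.length;
  const prevHash = index === 0 ? GENESIS : log[index - 1].hash;
  return [...log, { index, event, prevHash, hash: hashEntry(index, event, prevHash) }];
}

// Recompute every hash and link; any retroactive edit breaks the chain from that point on.
function verifyChain(log: readonly LogEntry[]): boolean {
  return log.every((entry, i) => {
    const expectedPrev = i === 0 ? GENESIS : log[i - 1].hash;
    return (
      entry.index === i &&
      entry.prevHash === expectedPrev &&
      entry.hash === hashEntry(entry.index, entry.event, entry.prevHash)
    );
  });
}
```

Because each hash covers the previous one, publishing only the latest hash (for example, by anchoring it on-chain) is enough to commit to the entire history before it.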
Property 2: Multi-Dimensional Scoring
A single "trust score" is misleading. An agent might be extremely accurate but unreliable (high accuracy, low uptime). It might be safe but slow (high safety score, poor latency). It might be excellent at narrow tasks but dangerous when asked to operate outside its defined scope.
A useful reputation system disaggregates the behavioral record into dimensions that matter to different consumers. Armalo's composite scoring model uses 12 dimensions:
| Dimension | Weight | What It Measures |
|---|---|---|
| Accuracy | 14% | Correctness of tool outputs and completions against ground truth |
| Reliability | 13% | Uptime, consistency, and completion rate across evaluation runs |
| Safety | 11% | Absence of harmful outputs, scope violations, and policy violations |
| Self-Audit (Metacal™) | 9% | The agent's accuracy in assessing its own confidence and limitations |
| Bond | 8% | Skin-in-the-game stake: financial commitment to behavioral claims |
| Security | 8% | Resistance to adversarial inputs, prompt injection, and jailbreak attempts |
| Latency | 8% | Response time profile under standard evaluation conditions |
| Scope Honesty | 7% | Accuracy in declaring what the agent can and cannot do |
| Cost Efficiency | 7% | Token efficiency and API cost per unit of value delivered |
| Model Compliance | 5% | Adherence to declared model usage policy |
| Runtime Compliance | 5% | Adherence to platform and deployment constraints |
| Harness Stability | 5% | Consistent performance across repeated evaluation runs |
A tool provider integrating with the Armalo trust oracle can specify which dimensions matter for their use case. A financial transactions tool might require safety ≥ 85 and reliability ≥ 90. A web search tool might care primarily about latency ≤ 200ms p99 and cost_efficiency ≥ 70. A code execution tool might weight security above all other dimensions.
This dimensional specificity is what makes reputation operationally useful rather than decorative. A composite score of 74/100 tells you almost nothing useful. A dimensional breakdown — "94 accuracy, 89 reliability, 62 safety" — tells you exactly which risks are present and whether they are acceptable for your use case.
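The point about composites hiding weak dimensions can be checked with a few lines of arithmetic. The sketch below uses the weights from the table and assumes the composite is a plain weighted average of 0-100 dimension scores; the article does not state the aggregation formula, so treat that as an assumption. The camelCase dimension keys and both function names are hypothetical.

```typescript
// Weights (percent) taken from the table above; keys are illustrative identifiers.
const WEIGHTS = {
  accuracy: 14, reliability: 13, safety: 11, selfAudit: 9, bond: 8, security: 8,
  latency: 8, scopeHonesty: 7, costEfficiency: 7, modelCompliance: 5,
  runtimeCompliance: 5, harnessStability: 5,
} as const;

type Dimension = keyof typeof WEIGHTS;
type Scores = Record<Dimension, number>;

// Assumption: composite = weighted arithmetic mean of the dimension scores.
function compositeScore(scores: Scores): number {
  let total = 0;
  let weightSum = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    total += WEIGHTS[dim] * scores[dim];
    weightSum += WEIGHTS[dim];
  }
  return total / weightSum;
}

// Return the dimensions that fall below a consumer's per-dimension minimums.
function failingDimensions(scores: Scores, minimums: Partial<Scores>): Dimension[] {
  return (Object.keys(minimums) as Dimension[]).filter(
    (dim) => scores[dim] < (minimums[dim] ?? 0),
  );
}
```

Under this assumption, an agent scoring 94 on accuracy, 89 on reliability, 62 on safety, and 80 everywhere else has a composite of 81.15, yet `failingDimensions` with a safety minimum of 85 still reports `safety`. A threshold on the composite alone would have admitted it.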
Property 3: Cross-Platform Portability
Portability is the property that makes the reputation system valuable to agents rather than just to platforms. If reputation records are platform-locked, they provide no solution to the cold-start problem. The economic value of reputation — the behavioral history that an agent has spent months building — is only realized if it travels with the agent across platform boundaries.
Portability requires three things:
A shared identifier. An agent must have an identity that persists across platforms. This cannot be a platform-assigned credential (like an OAuth client ID) because those are platform-specific. It must be an agent-controlled identifier that the agent can present on any platform — something like a Decentralized Identifier (DID) or a cryptographically generated agent ID that the agent controls the private key for.
A standard data model. The behavioral record must be expressible in a format that any consuming platform can interpret. If Platform A records behavioral events in a proprietary format and Platform B cannot parse it, portability is nominal rather than real.
A query interface. Any platform that wants to verify an agent's reputation must be able to query a trust oracle using the shared identifier and receive a structured response in a standard format. The query must be fast (sub-100ms for admission control use cases) and the response must be machine-parseable.
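One way to picture an agent-controlled identifier is to derive it from a public key and have the agent prove possession of the matching private key. The sketch below is a simplified illustration using Ed25519 from Node's `crypto` module; the `agt_` prefix, the 16-character truncation, and the function names are assumptions for this example, not a description of how DIDs or Armalo's agent IDs are constructed.

```typescript
import {
  createHash, generateKeyPairSync, randomBytes, sign, verify, KeyObject,
} from "node:crypto";

// Hypothetical scheme: the identifier is a truncated SHA-256 of the public key (SPKI DER).
function agentIdFromPublicKey(publicKey: KeyObject): string {
  const der = publicKey.export({ type: "spki", format: "der" });
  return "agt_" + createHash("sha256").update(der).digest("hex").slice(0, 16);
}

// Platform side: check that the key matches the claimed ID and that the
// agent signed the platform's fresh nonce with the corresponding private key.
function provesControl(
  claimedId: string,
  publicKey: KeyObject,
  nonce: Buffer,
  proof: Buffer,
): boolean {
  return agentIdFromPublicKey(publicKey) === claimedId && verify(null, nonce, publicKey, proof);
}

// Agent side: generate a key pair once, then answer challenges on any platform.
const agentKeys = generateKeyPairSync("ed25519");
const agentId = agentIdFromPublicKey(agentKeys.publicKey);

const nonce = randomBytes(32);                          // issued by the platform
const proof = sign(null, nonce, agentKeys.privateKey);  // computed by the agent
```

Because the identifier depends only on key material the agent holds, the same ID can be presented and proven on Platform A and Platform B without either platform issuing it.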
Armalo's trust oracle provides exactly this interface at /api/v1/trust/. Any platform can query it with an agent ID and receive a current composite score, dimensional breakdown, certification status, and behavioral event summary. The query takes an average of 43ms in production.
Property 4: Cryptographic Verifiability
Reputation scores are worth nothing if they can be fabricated. An agent should not be able to claim a high trust score unless that score was computed by a system that observed its actual behavior. This requires that trust scores be cryptographically signed by the issuing authority.
In practice this means:
- Each trust score includes a signature from the reputation system's private key
- Any consumer of the trust score can verify the signature using the system's public key
- Score tampering is detectable without trusting the agent or the platform presenting the score
- Score freshness can be verified: the timestamp in the signed payload is part of the signed data
This property is what distinguishes reputation infrastructure from reputation theater. Many agent platforms today have "trust scores" that are self-reported, manually assigned, or computed by the same system that the agent controls. These scores are unverifiable and therefore useless for adversarial contexts.
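The four bullet points above map onto a sign-then-verify routine. The following is a minimal sketch assuming Ed25519 keys and a JSON payload; the `ScorePayload` and `SignedScore` shapes are hypothetical and do not describe any particular provider's wire format.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical payload; the timestamp is inside the signed data so freshness is tamper-evident.
interface ScorePayload { agentId: string; compositeScore: number; issuedAt: string }

// The exact serialized bytes are carried alongside the signature,
// which sidesteps JSON canonicalization issues.
interface SignedScore { payload: string; signature: string }

// Issuer side: sign the serialized payload with the reputation system's private key.
function issueScore(payload: ScorePayload, issuerPrivateKey: KeyObject): SignedScore {
  const serialized = JSON.stringify(payload);
  const signature = sign(null, Buffer.from(serialized), issuerPrivateKey).toString("base64");
  return { payload: serialized, signature };
}

// Consumer side: verify the signature first, then check freshness.
// Returns the payload if valid and fresh, otherwise null.
function verifyScore(
  signed: SignedScore,
  issuerPublicKey: KeyObject,
  maxAgeMs: number,
  now: number = Date.now(),
): ScorePayload | null {
  const valid = verify(
    null,
    Buffer.from(signed.payload),
    issuerPublicKey,
    Buffer.from(signed.signature, "base64"),
  );
  if (!valid) return null;
  const payload = JSON.parse(signed.payload) as ScorePayload;
  const age = now - Date.parse(payload.issuedAt);
  return age >= 0 && age <= maxAgeMs ? payload : null;
}

// Stand-in for the reputation system's key pair.
const issuerKeys = generateKeyPairSync("ed25519");
```

The consumer needs only the issuer's public key. It does not have to trust the agent, or whichever platform relayed the score, for the check to be meaningful.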
Property 5: Real-Time Queryability
Reputation is only useful for admission control if it can be queried in real time. An MCP server that receives a tool invocation request needs to decide, before executing the tool, whether the requesting agent meets the trust requirements for that tool. This decision must happen in milliseconds, not seconds.
The technical requirements for real-time queryability:
- Sub-100ms response latency at p99 (to not add perceptible latency to tool calls)
- High availability (reputation queries sit on the hot path, so a temporarily unavailable trust oracle must not, by itself, take down every tool call)
- Caching-friendly design (trust scores change slowly; a 5-minute cache TTL is acceptable for most use cases)
- Graceful degradation policy (when the trust oracle is unavailable, platforms need a defined fallback: reject all agents, allow all agents, or allow agents with cached scores above threshold)
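The caching and fallback requirements can be sketched as a small TTL cache plus an explicit fallback decision. Everything here is illustrative: the `TrustCache` class, the `FallbackPolicy` names, and the `decide` function are assumptions chosen to mirror the three fallback options listed above.

```typescript
type FallbackPolicy = "reject-all" | "allow-all" | "allow-cached-above-threshold";

interface CachedScore { score: number; fetchedAt: number }

// Minimal in-memory TTL cache with an injectable clock (useful for testing).
class TrustCache {
  private readonly entries = new Map<string, CachedScore>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  put(agentId: string, score: number): void {
    this.entries.set(agentId, { score, fetchedAt: this.now() });
  }

  // Returns the cached score and whether it is still within the TTL.
  get(agentId: string): { score: number; fresh: boolean } | undefined {
    const entry = this.entries.get(agentId);
    if (!entry) return undefined;
    return { score: entry.score, fresh: this.now() - entry.fetchedAt <= this.ttlMs };
  }
}

// liveScore is undefined when the oracle could not be reached in time;
// cachedScore may be stale and is only consulted under the third policy.
function decide(opts: {
  liveScore?: number;
  cachedScore?: number;
  threshold: number;
  fallback: FallbackPolicy;
}): boolean {
  if (opts.liveScore !== undefined) return opts.liveScore >= opts.threshold;
  switch (opts.fallback) {
    case "reject-all":
      return false;
    case "allow-all":
      return true;
    case "allow-cached-above-threshold":
      return opts.cachedScore !== undefined && opts.cachedScore >= opts.threshold;
  }
}
```

A reasonable configuration might pair `reject-all` with write-capable tools and `allow-cached-above-threshold` with read-only ones, so that an oracle outage degrades capability rather than safety.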
The Full Architecture: MCP + Trust Stack
The reputation layer does not replace MCP. It is a new layer that sits above MCP, orthogonal to the connectivity layer. The full architecture looks like this:
┌─────────────────────────────────────────────────────────┐
│ Layer 4: Business Logic │
│ What the agent does: task completion, workflow │
│ orchestration, domain-specific operations │
├─────────────────────────────────────────────────────────┤
│ Layer 3: Reputation / Trust Oracle │
│ Should I trust this agent to do it? │
│ - Behavioral record query │
│ - Multi-dimensional score evaluation │
│ - Pact compliance verification │
│ - Real-time admission control decisions │
├─────────────────────────────────────────────────────────┤
│ Layer 2: Pacts / Commitments │
│ What did the agent commit to? │
│ - Behavioral constraint declarations │
│ - Evaluation completion requirements │
│ - Scope and capability boundaries │
├─────────────────────────────────────────────────────────┤
│ Layer 1: MCP (Model Context Protocol) │
│ How does the agent invoke tools? │
│ - Tool discovery (tools/list) │
│ - Capability negotiation (initialize) │
│ - Tool invocation (tools/call) │
│ - Resource access (resources/read) │
├─────────────────────────────────────────────────────────┤
│ Layer 0: Transport │
│ HTTP/SSE, stdio │
└─────────────────────────────────────────────────────────┘
MCP is Layer 1. It defines the language of agent-tool communication. The trust oracle (Layer 3) sits above the pacts layer (Layer 2) and operates independently of MCP. An MCP server can consult the trust oracle before serving tool requests without any modification to the MCP protocol itself. The trust check is a middleware pattern on top of the MCP server.
This layered architecture is important for several reasons:
Separation of concerns. MCP servers can be built without thinking about reputation. Reputation providers can be built without thinking about tool schemas. Agents can build reputation without knowing which tools they will eventually need to invoke. Each layer evolves independently.
Retroactive adoption. Existing MCP servers can add trust checking without modifying their tool implementations. The trust check is a pre-invocation middleware. It wraps the existing handler. It does not change the tool's behavior.
Compositional trust policies. Different tools within the same MCP server can have different trust requirements. A server that exposes both a read-only lookup tool and a write-capable transaction tool can require different minimum trust scores for each. The reputation layer enables fine-grained admission control that the connectivity layer cannot express.
How Reputation Hooks Into MCP: Technical Integration
The integration between MCP and the reputation layer is straightforward to implement. Here are the concrete patterns.
Pattern 1: Pre-Invocation Trust Check
The most important integration point is the pre-invocation trust check. An MCP server adds middleware that, before executing any tool call, queries the trust oracle with the requesting agent's identity and compares the response against the tool's minimum trust requirements.
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { CallToolRequestSchema, ErrorCode, McpError } from '@modelcontextprotocol/sdk/types.js';
import { ArmaloTrust } from '@armalo/sdk';

// Initialize trust oracle client
const trust = new ArmaloTrust({
  apiKey: process.env.ARMALO_API_KEY,
  timeout: 80, // ms, below our tool SLA budget
});

// Define per-tool minimum trust requirements
const TOOL_TRUST_REQUIREMENTS: Record<string, {
  minScore: number;
  requiredDimensions?: Record<string, number>;
}> = {
  'search': { minScore: 30 },
  'read_database': { minScore: 55, requiredDimensions: { safety: 70 } },
  'write_database': { minScore: 72, requiredDimensions: { safety: 85, reliability: 80 } },
  'execute_transaction': { minScore: 85, requiredDimensions: { safety: 90, reliability: 85, security: 80 } },
};

const server = new Server(
  { name: 'enterprise-tools', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(CallToolRequestSchema, async (request, context) => {
  const toolName = request.params.name;
  const requirements = TOOL_TRUST_REQUIREMENTS[toolName];

  if (requirements) {
    // Extract agent identity from the authenticated request context.
    // Assumption: an OAuth-protected HTTP transport and an SDK version whose
    // handler context exposes authInfo. A self-declared client name is not an
    // authenticated identity and should not be used for admission decisions.
    const agentId = context.authInfo?.clientId;
    if (!agentId) {
      throw new McpError(
        ErrorCode.InvalidRequest,
        'Agent identity required for this tool'
      );
    }

    // Query trust oracle; treat any failure (including timeout) as "unavailable"
    const trustProfile = await trust.query(agentId).catch(() => null);

    if (!trustProfile) {
      // Trust oracle unavailable: apply configured fallback policy
      if (requirements.minScore > 50) {
        throw new McpError(
          ErrorCode.InternalError,
          'Trust verification unavailable for high-trust tool'
        );
      }
      // Low-trust tools allow graceful degradation
    } else {
      // Check composite score
      if (trustProfile.compositeScore < requirements.minScore) {
        throw new McpError(
          ErrorCode.InvalidRequest,
          `Insufficient trust score: ${trustProfile.compositeScore} < ${requirements.minScore} required for ${toolName}`
        );
      }

      // Check dimensional requirements if specified
      if (requirements.requiredDimensions) {
        for (const [dimension, minValue] of Object.entries(requirements.requiredDimensions)) {
          const actualValue = trustProfile.dimensions[dimension];
          if (actualValue === undefined || actualValue < minValue) {
            throw new McpError(
              ErrorCode.InvalidRequest,
              `Insufficient ${dimension} score: ${actualValue ?? 'unrated'} < ${minValue} required for ${toolName}`
            );
          }
        }
      }
    }
  }

  // Trust check passed: execute the tool.
  // executeToolHandler is the server's existing tool dispatch function (not shown).
  return await executeToolHandler(request);
});
This pattern adds a single pre-invocation step to the MCP request handler. The trust check is non-invasive: it does not modify the tool implementation or change the response format. When the trust oracle is unavailable, tools with a minimum score of 50 or below degrade gracefully and stay callable, while higher-trust tools fail closed.
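The gate logic can also be factored into a pure function, which makes the fallback policy explicit and testable without a running MCP server. The following is a minimal sketch, not Armalo's implementation: the names `decideAdmission`, `Admission`, and `FAIL_CLOSED_ABOVE` are hypothetical, and the cutoff of 50 simply mirrors the inline check in the handler above.

```typescript
// Hypothetical types mirroring the handler above; not part of any SDK.
interface ToolRequirement {
  minScore: number;
  requiredDimensions?: Record<string, number>;
}

interface TrustProfile {
  compositeScore: number;
  dimensions: Record<string, number>;
}

type Admission =
  | { allowed: true; degraded: boolean }
  | { allowed: false; reason: string };

// Assumption: tools above this minimum score fail closed when the oracle
// cannot be reached. The value mirrors the inline check in the handler.
const FAIL_CLOSED_ABOVE = 50;

function decideAdmission(
  requirement: ToolRequirement,
  profile: TrustProfile | null, // null means the oracle query failed
): Admission {
  if (profile === null) {
    return requirement.minScore > FAIL_CLOSED_ABOVE
      ? { allowed: false, reason: "trust verification unavailable" }
      : { allowed: true, degraded: true };
  }
  if (profile.compositeScore < requirement.minScore) {
    return { allowed: false, reason: "composite score below minimum" };
  }
  const dimensionRules = Object.entries(requirement.requiredDimensions ?? {});
  for (const [dimension, minValue] of dimensionRules) {
    const actual = profile.dimensions[dimension];
    if (actual === undefined || actual < minValue) {
      return { allowed: false, reason: `${dimension} score below minimum` };
    }
  }
  return { allowed: true, degraded: false };
}
```

The `degraded` flag lets the caller log or rate-limit calls that were admitted without a trust lookup, rather than treating them the same as verified calls.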
Pattern 2: AgentCard Trust Requirements
The AgentCard is a structured document that describes an agent's capabilities, identity, and operational requirements. It is defined by the Agent2Agent (A2A) protocol, which is designed to complement MCP, rather than by the MCP specification itself. The AgentCard is the natural place to declare an agent's trust profile and to specify the trust requirements it expects from MCP servers it connects to.
An extended AgentCard with trust fields might look like this:
{
"name": "financial-analyst-agent",
"description": "Autonomous financial analysis and reporting agent",
"version": "2.1.0",
"url": "https://agents.example.com/financial-analyst",
"trust": {
"trustOracleId": "agt_7f4a92b1c3d8e5f0",
"trustOracleProvider": "armalo",
"compositeScore": 81,
"scoreUpdatedAt": "2026-04-20T14:23:00Z",
"scoreCertificateUrl": "https://armalo.ai/verify/agt_7f4a92b1c3d8e5f0",
"dimensions": {
"accuracy": 89,
"reliability": 85,
"safety": 78,
"security": 82
}
},
"requirements": {
"minServerTrustScore": 60,
"requiredServerCertifications": ["soc2-type2", "financial-data-handling"]
}
}
This bidirectional trust declaration enables pre-connection trust negotiation. Before an agent even initiates an MCP connection, both sides can inspect each other's trust profile and decide whether the connection meets their mutual requirements.
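A pre-connection negotiation of this kind can be sketched as a single function that checks both directions and reports every unmet requirement. This is an illustrative sketch only: `negotiateTrust`, `ServerTrustProfile`, and its `minAgentScore` field are hypothetical names, and the card shape is reduced to the fields used in the example above.

```typescript
// Hypothetical shapes based on the extended AgentCard example above.
interface AgentTrustCard {
  trust: { compositeScore: number };
  requirements: {
    minServerTrustScore: number;
    requiredServerCertifications: string[];
  };
}

// Assumption: the server publishes its own score, certifications,
// and the minimum agent score it will admit.
interface ServerTrustProfile {
  trustScore: number;
  certifications: string[];
  minAgentScore: number;
}

interface NegotiationResult {
  accepted: boolean;
  failures: string[];
}

function negotiateTrust(
  card: AgentTrustCard,
  server: ServerTrustProfile,
): NegotiationResult {
  const failures: string[] = [];

  // Server's view of the agent.
  if (card.trust.compositeScore < server.minAgentScore) {
    failures.push("agent score below server minimum");
  }

  // Agent's view of the server.
  if (server.trustScore < card.requirements.minServerTrustScore) {
    failures.push("server score below agent minimum");
  }
  for (const cert of card.requirements.requiredServerCertifications) {
    if (!server.certifications.includes(cert)) {
      failures.push(`server lacks certification: ${cert}`);
    }
  }

  return { accepted: failures.length === 0, failures };
}
```

The score in the card is self-declared, so a server would still confirm it against the trust oracle before relying on the result.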
Pattern 3: Post-Call Behavioral Attestation
After a successful tool call, both the agent and the MCP server have information that belongs in the reputation record. The agent can attest to the quality of the tool's response (was it accurate? was it timely? did it behave as documented?). The server can attest to the agent's behavior during the call (did it use the tool appropriately? did it stay within its declared scope? did it try any injection attacks?).
This mutual attestation is a mechanism for feeding behavioral data back into the reputation system in real time:
// Server-side attestation, recorded after a tool call completes
// (the agent submits its own attestation about the tool separately)
async function attestToolCallOutcome(
agentId: string,
toolCallId: string,
outcome: {
success: boolean;
latencyMs: number;
agentBehaviorFlags: string[];
outputQualityScore?: number;
}
): Promise<void> {
await trust.attest({
subjectId: agentId,
attestationType: 'tool_call_outcome',
toolCallId,
outcome,
attestedAt: new Date().toISOString(),
attestorId: server.config.serverId,
});
}
Over time, these attestations accumulate into a rich behavioral record. The agent's reliability score reflects its actual completion rate across thousands of tool calls across dozens of platforms. The tool provider's reputation reflects how well their tools actually perform in production rather than in controlled benchmarks.
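As a rough illustration of how accumulated attestations turn into a completion rate, the sketch below summarizes one agent's record. It is an assumption-laden simplification: the source does not describe how the oracle actually weights attestations, and `ToolCallAttestation`, `CompletionSummary`, and `summarizeAttestations` are hypothetical names.

```typescript
// Hypothetical, flattened view of a stored attestation.
interface ToolCallAttestation {
  subjectId: string;  // the agent being attested about
  attestorId: string; // the MCP server that recorded the outcome
  success: boolean;
  latencyMs: number;
}

interface CompletionSummary {
  totalCalls: number;
  completionRate: number;    // fraction in [0, 1]
  distinctAttestors: number; // how many servers contributed evidence
}

function summarizeAttestations(
  agentId: string,
  attestations: ToolCallAttestation[],
): CompletionSummary | null {
  const own = attestations.filter((a) => a.subjectId === agentId);
  if (own.length === 0) return null; // no history: the cold start case

  const successes = own.filter((a) => a.success).length;
  return {
    totalCalls: own.length,
    completionRate: successes / own.length,
    distinctAttestors: new Set(own.map((a) => a.attestorId)).size,
  };
}
```

Tracking the number of distinct attestors alongside the rate makes it visible when a high completion rate rests on evidence from only one platform.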
Pattern 4: Discovery-Time Trust Filtering
Beyond admission control on individual tool calls, MCP servers can use trust scores to filter which tools they expose to a given agent during the discovery phase (the tools/list response). A server with both low-trust and high-trust tools can return only the low-trust tools to agents below a score threshold:
server.setRequestHandler(ListToolsRequestSchema, async (request, context) => {
const agentId = context.clientInfo?.name ?? context.meta?.agentId;
const trustProfile = agentId ? await trust.query(agentId).catch(() => null) : null;
const agentScore = trustProfile?.compositeScore ?? 0;
// Filter tools by trust tier
const visibleTools = ALL_TOOLS.filter(tool => {
const minScore = TOOL_TRUST_REQUIREMENTS[tool.name]?.minScore ?? 0;
return agentScore >= minScore;
});
return { tools: visibleTools };
});
This pattern has an elegant property: agents with low trust scores don't just get blocked when they try to invoke high-trust tools. They never see that those tools exist. From the agent's perspective, the sensitive capability surface is invisible until the trust threshold is met, which shrinks the attack surface available to agents probing capability boundaries they are not authorized to access. Filtering the list does not replace the pre-invocation check, because an agent can still guess a tool name and attempt to call it.
The Cold Start Problem: Why Portability Is the Core Value Proposition
Of all the properties of a well-designed reputation system, portability generates the most economic value. Understanding why requires understanding the cold start problem in depth.
What Cold Start Costs
When a new agent connects to a platform for the first time, the platform has no information about that agent's behavioral history. It is forced to treat the agent as an unknown quantity — applying either maximum caution (blocking access to sensitive capabilities until trust is established) or maximum permissiveness (granting full access based on credentials alone).
Both extremes are bad. Maximum caution creates friction that slows adoption and forces capable, well-established agents through evaluation processes they have already passed elsewhere. Maximum permissiveness is a security vulnerability — it allows bad actors to exploit the lack of history to gain access to capabilities they should not have.
The cold start problem is not theoretical. Armalo's analysis of agent deployment patterns finds that 73% of agent cold starts on a new platform result in reduced capability access during an initial trust-building period of 14–30 days. During this period, agents operate with limited tool access, higher oversight, and manual approval requirements that would not apply if their behavioral history were portable.
For an agent that operates across 5 platforms — a realistic number for enterprise agents — the cold start tax compounds significantly. Each platform transition resets the behavioral history, forces a re-evaluation period, and limits capability access during that period. The aggregate cost, in delayed automation value and manual oversight overhead, is substantial.
The Portability Solution
With portable reputation, the cold start problem disappears for agents with established behavioral records. When an agent connects to a new platform, the platform's MCP server queries the trust oracle, receives the agent's full dimensional score profile and certification status, and makes an instant admission decision based on that history.
The agent with 18 months of excellent behavior on other platforms does not start from zero. It starts from its actual demonstrated trustworthiness. The capability access it earned elsewhere is honored immediately. The manual oversight period is waived because the behavioral evidence it was meant to collect already exists.
This creates a powerful economic incentive for agents to invest in building reputation. The reputation becomes a portable asset — valuable not just on the platform where it was built, but on every platform the agent subsequently connects to. An agent with a composite trust score of 85 has effectively pre-cleared itself for sensitive capability access on every MCP server that integrates with the trust oracle.
The Network Effect of Shared Reputation
Portable reputation creates a network effect that accelerates adoption. When more platforms integrate with the trust oracle:
- Each additional integration makes the reputation of all agents on those platforms more valuable
- Agents have stronger incentives to build and maintain reputation, because the portable value increases
- Tool providers have stronger incentives to define trust requirements, because the infrastructure exists to enforce them
- New platforms can offer differentiated capability access immediately, using behavioral history from the existing network
This is the same network effect that made HTTPS adoption self-reinforcing. Once enough sites used HTTPS, browsers started warning about HTTP sites. That warning created pressure on remaining sites to adopt HTTPS. The deprecation of non-HTTPS sites accelerated once a critical mass of the ecosystem had adopted the standard. The network effect made the right choice the easy choice.
The same dynamic will play out with agent reputation. Once enough MCP servers require minimum trust scores, operating without a verified reputation will effectively lock agents out of the production ecosystem. The reputation requirement will become the default, not the exception.
The Two-Sided Trust Market
Reputation in the agent ecosystem is not unidirectional. Agents earn reputation for their behavior when invoking tools. But tool providers — MCP servers — also earn reputation for the quality of the tools they expose. A complete reputation system addresses both sides.
Tool Provider Reputation
From an agent's perspective, not all MCP servers are equal. A server that claims to provide accurate web search results might actually return stale, incorrect, or hallucinated data. A server that claims sub-100ms response times might spike to 5 seconds under load. A server that advertises a well-documented API might have undocumented edge cases that cause agent failures.
Agents need reputation signals about the tools they are being asked to use, the same way they provide reputation signals about their own behavior. Tool provider reputation would capture:
Accuracy. For tools that perform computation, lookup, or retrieval, accuracy measures the fraction of responses that are correct or useful according to downstream agent performance. A web search tool with an accuracy score of 95 returns results that agents successfully use for their stated tasks 95% of the time. A tool with an accuracy score of 60 causes agent failures, re-queries, and hallucination propagation 40% of the time.
Availability. Measured as uptime percentage across a rolling window, availability is the foundational reliability metric. A tool with 95% availability is down for an average of 1.2 hours per day, and an agent that depends on it will see its own reliability score degrade accordingly.
Latency distribution. Not just mean latency but the full distribution — p50, p95, p99. Agents running under time pressure need to know whether a tool has predictable latency (low variance) or spiky latency (high variance with occasional extreme outliers).
Consistency. For tools that should produce deterministic outputs (database lookups, file reads), consistency measures the rate at which repeated identical inputs produce identical outputs. Low consistency is a signal of unstable underlying systems.
Support quality. When tools fail, how quickly and accurately does the provider respond? Agent developers who depend on a tool need a signal about whether the provider will be a reliable partner when issues arise.
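Three of these metrics reduce to short calculations. The sketch below shows one plausible way to compute them; the function names are hypothetical, the percentile uses the nearest-rank method (one of several common definitions), and consistency is taken as the share of responses matching the most common response, which is an assumption rather than a definition from any standard.

```typescript
// Average hours of downtime per day implied by an availability percentage.
function dailyDowntimeHours(availabilityPercent: number): number {
  return (1 - availabilityPercent / 100) * 24;
}

// Nearest-rank percentile: the smallest sample such that at least
// p percent of samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.min(
    sorted.length,
    Math.max(1, Math.ceil((p / 100) * sorted.length)),
  );
  return sorted[rank - 1] as number;
}

// Share of responses (to one repeated, identical input) that match
// the most common response.
function consistencyRate(outputs: string[]): number {
  if (outputs.length === 0) throw new Error("no outputs");
  const counts = new Map<string, number>();
  for (const output of outputs) {
    counts.set(output, (counts.get(output) ?? 0) + 1);
  }
  return Math.max(...Array.from(counts.values())) / outputs.length;
}

// Example: one slow outlier dominates the tail but not the median.
const latenciesMs = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000];
console.log(percentile(latenciesMs, 50), percentile(latenciesMs, 99));
```

The latency example illustrates why the mean alone misleads: the median call is fast, while the tail is two orders of magnitude slower.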
Mutual Attestation Creates a Self-Correcting System
When both agents and tool providers earn reputation through their interactions with each other, the result is a self-correcting trust market. Poor tool providers get low reputation scores, which reduces their visibility to agents seeking high-quality tools. Poor agents get low reputation scores, which reduces their access to high-quality tools. The system naturally surfaces quality on both sides.
This is analogous to the eBay seller/buyer rating system — except with several important differences. eBay's system is platform-locked (ratings on eBay do not transfer to Amazon), transaction-volume-biased (high-volume sellers can accumulate ratings faster regardless of quality), and susceptible to rating inflation (most transactions are rated positively even when they were mediocre). A well-designed agent trust system addresses all three: cross-platform portability, quality-weighted rather than volume-weighted aggregation, and cryptographic attestation that resists gaming.
Why Existing Reputation Systems Fail for Agents
Several existing systems attempt to address reputation in software ecosystems. None of them solve the agent trust problem. Understanding why reveals the specific requirements that agent reputation must meet that generic reputation systems do not.
npm Audit Scores
npm's audit score measures security vulnerabilities in package dependencies. It is one-dimensional (security only), static (updated when vulnerabilities are reported, not continuously), and backward-looking (it measures vulnerabilities in existing code rather than predicting future behavior). For agents, which are behavioral systems rather than static code artifacts, security audits capture only a tiny fraction of the relevant trust surface.
An agent might have no known CVEs in its dependencies but still be an unreliable system prompt follower, a scope violator, or an accuracy problem for downstream workflows. npm audit would give it a clean bill of health.
GitHub Star Counts and Community Signals
GitHub stars measure popularity. Popularity is at best weakly correlated with trustworthiness and says nothing about specific behavioral dimensions. A highly starred agent repository might be popular because of its creative prompt engineering, not because of its reliability or safety properties. Stars are also easily gamed and do not degrade when quality drops.
Worse, GitHub stars are a developer community signal, not an operational signal. They reflect developer enthusiasm during initial exploration, not sustained production reliability. An agent can accumulate stars before a single production deployment reveals its failure modes.
App Store Ratings
App store ratings (Apple App Store, Google Play) are the closest existing analog to agent reputation — they are a persistent, consumer-facing quality signal for a software entity. But they have three critical limitations for agents:
Platform lock-in. A 5-star rating on the Apple App Store is worth nothing on Google Play. The reputation is locked to the platform that collected it. For agents operating across multiple MCP-connected platforms, platform-locked ratings provide no portability value.
Gaming susceptibility. App store ratings are notoriously gameable through purchased reviews, incentivized ratings, and coordinated rating campaigns. Without cryptographic attestation, any rating system is susceptible to manipulation. The incentive to game is particularly strong when reputation has economic value.
Absence of behavioral specificity. A 4.2-star app rating tells you nothing about which specific behavioral dimensions are strong or weak. An agent that needs to know whether a tool is reliable for financial transactions cannot extract that signal from a generic star rating.
eBay/Marketplace Reputation
Marketplace reputation systems (eBay, Etsy, Airbnb) are the most mature example of reputation infrastructure in commercial use. They have several useful properties: they are transaction-based (reputation is earned through actual exchanges, not self-reported), they are bilateral (both parties rate each other), and they accumulate over time to produce a meaningful signal.
But marketplace reputation is fundamentally designed for discrete transactions between humans. Agent reputation needs to handle:
High-frequency micro-interactions. An agent might invoke a tool hundreds of times per minute. Recording and aggregating reputation signals at this frequency requires infrastructure designed for event streaming, not the low-frequency transaction model of marketplace ratings.
Cross-platform portability. Marketplace reputations are almost universally platform-locked. The eBay seller with 10,000 positive ratings starts from zero on Etsy. This is the cold start problem in its marketplace form.
Automated evaluation. Marketplace ratings are human-generated: the buyer decides to rate the transaction. Agent reputation needs to be machine-generated: evaluations run automatically, tool call outcomes are recorded programmatically, behavioral drift is detected algorithmically. The human-in-the-loop rating model does not scale to agent interaction volumes.
The Enterprise Case: Why Organizations Will Require This
The case for agent reputation infrastructure is often framed as a developer infrastructure story. But the most compelling case — and the one that will drive adoption — is the enterprise use case.
Enterprise Risk Posture for Autonomous Agents
Enterprise organizations deploying autonomous agents face a governance challenge that has no clean solution today. When a human employee uses internal tools and causes harm — whether through negligence, error, or malicious action — there are established accountability mechanisms: HR processes, audit trails, organizational accountability, and in extreme cases, legal liability.
When an autonomous agent uses internal tools and causes harm, the accountability mechanisms are less clear. Who is responsible? The organization that deployed the agent? The agent developer? The tool provider? In the absence of clear accountability infrastructure, the most defensible enterprise posture is to restrict agent access to sensitive tools — which is exactly what enterprise security teams are doing.
The result: autonomous agents in enterprises today are dramatically underutilized relative to their potential. They are deployed in sandboxed environments with limited tool access, subject to extensive human oversight, and explicitly excluded from the high-value workflows where they could add the most value. The governance gap is holding the enterprise adoption curve back.
How Reputation Infrastructure Changes the Governance Calculus
Reputation infrastructure gives enterprises a defensible basis for expanding agent autonomy. Instead of a binary choice between full tool access (too risky) and sandboxed access (too limited), enterprises can implement graduated trust-based access:
Tier 1: Low-trust tools (score ≥ 30). Read-only lookups, public data access, informational queries. Agents with minimal behavioral history can access these tools immediately.
Tier 2: Standard tools (score ≥ 60). Database reads, file system access, API calls with limited write permissions. Agents must demonstrate basic behavioral reliability before accessing these tools.
Tier 3: Sensitive tools (score ≥ 75, safety ≥ 85). Write-capable database operations, communication sending, configuration changes. Agents must have an established, multi-dimensional behavioral record.
Tier 4: High-stakes tools (score ≥ 85, safety ≥ 90, reliability ≥ 85). Financial transactions, production system access, customer data operations. Agents must hold formal certifications in addition to score thresholds.
This tiered model gives enterprise security teams a concrete, auditable basis for tool access decisions. When an auditor asks "why does this agent have access to the transaction execution tool?", the answer is not "we decided it seemed trustworthy." It is "the agent holds a verified trust score of 87, including a safety dimension score of 91, as certified by [reputation provider] on [date]." The trust decision is documentable, auditable, and defensible.
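The four tiers translate directly into a small rule table. The sketch below encodes the thresholds listed above and resolves the highest tier an agent qualifies for. The names `TIER_RULES`, `meetsRule`, and `resolveTier` are hypothetical, and treating "formal certifications" as "holds at least one certification" is a simplifying assumption.

```typescript
interface AgentProfile {
  compositeScore: number;
  dimensions: Record<string, number>;
  certifications: string[];
}

interface TierRule {
  tier: number;
  minScore: number;
  minDimensions: Record<string, number>;
  requiresCertification: boolean;
}

// Thresholds copied from the tier list above, ordered highest tier first.
const TIER_RULES: TierRule[] = [
  { tier: 4, minScore: 85, minDimensions: { safety: 90, reliability: 85 }, requiresCertification: true },
  { tier: 3, minScore: 75, minDimensions: { safety: 85 }, requiresCertification: false },
  { tier: 2, minScore: 60, minDimensions: {}, requiresCertification: false },
  { tier: 1, minScore: 30, minDimensions: {}, requiresCertification: false },
];

function meetsRule(profile: AgentProfile, rule: TierRule): boolean {
  if (profile.compositeScore < rule.minScore) return false;
  // Assumption: any held certification satisfies the Tier 4 requirement.
  if (rule.requiresCertification && profile.certifications.length === 0) return false;
  return Object.entries(rule.minDimensions).every(([name, min]) => {
    const actual = profile.dimensions[name];
    return actual !== undefined && actual >= min; // unrated dimensions fail
  });
}

// Returns the highest tier the agent qualifies for, or 0 for no access.
function resolveTier(profile: AgentProfile): number {
  const match = TIER_RULES.find((rule) => meetsRule(profile, rule));
  return match ? match.tier : 0;
}
```

Because the rules are data rather than branching code, the same table can be exported into audit documentation to show exactly which thresholds were in force when access was granted.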
The Insurance and Compliance Implications
As agentic systems take on higher-stakes tasks, the insurance and compliance dimensions become increasingly important. Cyber insurance policies are beginning to ask about agent governance: how do you ensure that your AI systems are not a liability?
Reputation infrastructure provides the documentation trail that compliance and insurance frameworks require. An organization that can demonstrate:
- That they required minimum trust scores before granting agent access to sensitive systems
- That they verified trust scores against a certified third-party reputation provider
- That they received attestation of the agent's safety and security dimension scores before deployment
- That they monitored for trust score degradation and reduced access when scores fell
...is in a fundamentally different risk posture than an organization that deployed agents with no systematic trust verification. The reputation infrastructure becomes a compliance artifact, not just a technical feature.
The analogy is cybersecurity certifications: SOC 2 Type II, ISO 27001, FedRAMP. Organizations pursue these certifications not because they are intrinsically valuable but because enterprise customers require them before trusting the organization with their data. Agent trust scores will follow the same path: enterprises will require certified trust profiles before allowing agents into their tool ecosystems, and the requirement will become contractual.
The Protocol Extension: Embedding Trust in the MCP Ecosystem
The practical path to making reputation a standard part of the MCP ecosystem involves protocol extensions that are backward-compatible with the existing MCP specification.
The x-armalo-trust-id Header Convention
The simplest integration point: agent runtimes that support reputation include an x-armalo-trust-id header (or equivalent) in every MCP connection request made over an HTTP-based transport. The value is the agent's trust oracle identifier. MCP servers that check reputation can extract this header, query the trust oracle, and make admission decisions. MCP servers that do not check reputation ignore the header.
This convention requires no changes to the MCP specification. It is an optional extension that adds value to servers that implement it without breaking anything for servers that do not.
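On the server side, reading the header safely takes only a few lines. The sketch below is a hypothetical helper, not part of any MCP SDK: `extractTrustId` and `HeaderBag` are invented names, and the identifier pattern is a loose assumption inferred from the example ID `agt_7f4a92b1c3d8e5f0` rather than a documented format.

```typescript
// Header map in the shape Node's HTTP server exposes: values may be
// a string, an array of strings, or undefined.
type HeaderBag = Record<string, string | string[] | undefined>;

const TRUST_ID_HEADER = "x-armalo-trust-id";

// Assumption: identifiers look like "agt_" followed by alphanumerics,
// inferred from the example ID used in this article.
const TRUST_ID_PATTERN = /^agt_[A-Za-z0-9]+$/;

function extractTrustId(headers: HeaderBag): string | null {
  for (const [name, value] of Object.entries(headers)) {
    // HTTP header names are case-insensitive.
    if (name.toLowerCase() !== TRUST_ID_HEADER) continue;
    const first = Array.isArray(value) ? value[0] : value;
    const candidate = first?.trim();
    return candidate && TRUST_ID_PATTERN.test(candidate) ? candidate : null;
  }
  return null; // header absent: the caller decides how to treat anonymous agents
}
```

Validating the shape before querying the oracle keeps malformed or hostile header values out of downstream lookups. The header itself only identifies the agent; it proves nothing until the server has authenticated the connection and checked the ID against the oracle.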
AgentCard Trust Fields
The AgentCard format (defined by the A2A protocol rather than by the MCP specification itself) is a natural place to encode trust profile information. Proposed additions:
{
"name": "agent-name",
"version": "1.0.0",
...
"trust": {
"oracleId": "agt_7f4a92b1c3d8e5f0",
"oracleProvider": "armalo",
"compositeScore": 81,
"scoreIssuedAt": "2026-04-20T14:23:00Z",
"scoreExpiresAt": "2026-05-20T14:23:00Z",
"scoreCertificateUrl": "https://armalo.ai/verify/agt_7f4a92b1c3d8e5f0",
"activeCommitments": [
"pact_data-pipeline-sla",
"pact_financial-safety-constraints"
],
"certifications": [
"armalo-safety-level-2",
"armalo-financial-compliance"
]
}
}
With these fields in the AgentCard, the connection establishment phase can include trust verification before any tool invocations occur. The MCP server reads the AgentCard, queries the trust oracle with the provided oracleId, and verifies the score against its requirements before proceeding with the connection.
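Before spending an oracle round trip, a server can cheaply reject cards whose declared score is stale or too low. The sketch below checks the `scoreIssuedAt` and `scoreExpiresAt` fields proposed above. The function name `checkDeclaredTrust` and the result type are hypothetical, and the declared score is treated only as a hint that the oracle must still confirm.

```typescript
// Subset of the proposed "trust" block used for the local pre-check.
interface DeclaredTrust {
  oracleId: string;
  compositeScore: number;
  scoreIssuedAt: string;  // ISO 8601 timestamp
  scoreExpiresAt: string; // ISO 8601 timestamp
}

type TrustCheck = { ok: true } | { ok: false; reason: string };

function checkDeclaredTrust(
  trust: DeclaredTrust,
  minScore: number,
  now: Date,
): TrustCheck {
  const issued = Date.parse(trust.scoreIssuedAt);
  const expires = Date.parse(trust.scoreExpiresAt);
  if (Number.isNaN(issued) || Number.isNaN(expires)) {
    return { ok: false, reason: "unparseable timestamp" };
  }
  if (issued > now.getTime()) {
    return { ok: false, reason: "score issued in the future" };
  }
  if (expires <= now.getTime()) {
    return { ok: false, reason: "score expired" };
  }
  if (trust.compositeScore < minScore) {
    return { ok: false, reason: "declared score below minimum" };
  }
  // Passing here only means the card is worth verifying with the oracle.
  return { ok: true };
}
```

Passing `now` as a parameter instead of reading the clock inside the function keeps the check deterministic in tests.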
The Reputation Event Feed
The most powerful integration — and the most technically involved — is a reputation event feed: a stream of behavioral events that MCP servers emit to reputation providers as tool calls succeed or fail.
The feed is a webhook-style notification sent after each tool call outcome:
// Sent by MCP server to reputation oracle after each tool call
interface ReputationEvent {
eventType: 'tool_call_outcome';
agentId: string;
serverId: string;
toolName: string;
callId: string;
timestamp: string;
outcome: 'success' | 'failure' | 'timeout' | 'policy_violation';
latencyMs: number;
inputTokens?: number;
outputTokens?: number;
errorCode?: string;
behaviorFlags?: ('scope_violation' | 'injection_attempt' | 'excessive_rate' | 'unauthorized_tool')[];
serverSignature: string; // HMAC-SHA256 of the event payload
}
Reputation providers aggregate these events across all connected MCP servers. The resulting behavioral record is richer, more continuous, and more cross-platform than any single-platform recording could be. An agent that operates across 20 MCP servers provides roughly 20 times the behavioral signal of one that operates on a single platform, assuming comparable call volume on each.
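The `serverSignature` field can be produced and checked with Node's built-in `crypto` module. The sketch below is one way to do it under stated assumptions: the server and the oracle share a secret, the event is serialized with sorted keys so both sides hash identical bytes, and the signature is hex-encoded. None of these choices are specified in the source, and `canonicalize`, `signEvent`, and `verifyEvent` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Event fields to be signed (everything except serverSignature itself).
type SignableEvent = Record<string, string | number | string[] | undefined>;

// Assumed canonical form: key/value pairs sorted by key, undefined dropped,
// so key order in the source object does not affect the signature.
function canonicalize(event: SignableEvent): string {
  const keys = Object.keys(event)
    .filter((key) => event[key] !== undefined)
    .sort();
  return JSON.stringify(keys.map((key) => [key, event[key]]));
}

// HMAC-SHA256 over the canonical payload, hex-encoded.
function signEvent(event: SignableEvent, secret: string): string {
  return createHmac("sha256", secret).update(canonicalize(event)).digest("hex");
}

function verifyEvent(
  event: SignableEvent,
  signature: string,
  secret: string,
): boolean {
  const expected = Buffer.from(signEvent(event, secret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

A shared-secret HMAC proves the event came from a party holding the secret, but the oracle can forge it too. A design that needs third-party verifiability would use an asymmetric signature instead.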
The Open Standard Proposal
The long-term path for agent reputation is not a proprietary extension to MCP but an open protocol standard, potentially a companion specification to MCP itself — something like a "Reputation Protocol" or an extension to the AgentCard specification.
The elements of such a standard:
- Standard agent identity format. A DID-based identifier that agents control and that is globally unique.
- Standard trust query format. A request/response format for querying an agent's trust profile from any conformant trust oracle.
- Standard behavioral event format. A common event schema for recording tool call outcomes that any MCP server can emit and any reputation provider can ingest.
- Interoperability requirements. Rules for how reputation providers must format their scores so that any MCP server can interpret them.
- Governance model. Who decides which reputation providers are trusted anchors, analogous to the CA/Browser Forum for certificate authorities.
This is the HTTPS moment for agents. The protocol work is achievable. The infrastructure patterns are clear. What is required now is the will to build the standard and the ecosystem adoption to make it matter.
The Business Logic for Every Stakeholder in the MCP Ecosystem
Reputation infrastructure is not a public good that benefits everyone equally. Different stakeholders have different incentives and a different calculus for adoption. Understanding each stakeholder's perspective clarifies the adoption path.
Tool Providers (MCP Server Operators)
Problem today: Every MCP server is essentially open to any agent that can authenticate. This creates two expensive problems. First, low-quality agents consume resources, generate noise in logs, and degrade service quality for high-quality agents sharing the same rate limits. Second, bad actors can probe sensitive tool capabilities from anonymous or throwaway agent identities with no reputation consequence.
Value of reputation: Tool providers can tier their offerings. Commodity read-only tools remain open. Sensitive, expensive, or high-consequence tools require minimum trust scores. This creates a natural market segmentation: reliable agents with good trust scores get access to premium capabilities; new agents start with basic capabilities and earn access over time.
Build vs. buy: Building this infrastructure internally requires defining evaluation criteria, running evaluations, storing behavioral records, building a query API, and maintaining it. That is 6–12 months of engineering time to build something that will never be the core business. Integrating with an external trust oracle takes a few hours. The build vs. buy calculation strongly favors integration.
Agent Builders
Problem today: Building reputation from scratch on each new platform. Every platform deployment begins with the cold start problem, restricted capability access, and manual oversight periods. The behavioral work an agent has done on Platform A is economically worthless on Platform B.
Value of reputation: A trust score that travels with the agent is a moat. An agent that has invested 12 months in building a composite score of 85 with certifications in safety and financial compliance can enter any new platform and immediately access the capabilities that score unlocks. Competitors without that history must wait weeks for the trust-building period before they can demonstrate the same value.
The investment calculus: Building reputation requires completing evaluations, maintaining behavioral standards under monitoring, and accumulating a consistent track record. This is real investment: it requires engineering time and operational discipline. But it is durable. Unlike marketing spend, which has no residual value when it stops, a trust score that is kept current through ongoing evaluation compounds over time and creates a lasting competitive advantage.
Enterprise Buyers (Organizations Deploying Agents)
Problem today: No systematic way to verify that agents accessing internal tools are operating safely and within their stated constraints. Security teams either block agent access (safe but limits value) or grant broad access (enables value but creates risk). The middle path — graduated, trust-based access — has no infrastructure to support it.
Value of reputation: A documented, verifiable basis for tool access decisions. Compliance teams can point to trust scores in their audit documentation. Security teams can define minimum score requirements in policy. The graduated trust model becomes operationally feasible instead of requiring custom engineering for each deployment.
Platform Operators (Companies Running Agent Orchestration Environments)
Problem today: Every platform must decide independently how to handle trust. Some build elaborate internal evaluation systems. Most do not build anything, relying on authentication alone. This means the quality of trust infrastructure varies wildly across the ecosystem.
Value of reputation: Platforms that require trust scores from all agents they host can use that requirement as a quality filter. A platform that says "all agents on our platform must have a minimum Armalo trust score of 60" is making a promise to its enterprise customers about the quality of agents they will encounter. That promise has value in the sales process and reduces the governance overhead for enterprise customers evaluating the platform.
Implementation Roadmap: From Zero to Trust-Gated MCP
For teams that want to adopt reputation-gated tool access today, here is a practical implementation path.
Phase 1: Instrument (Week 1–2)
Before you can gate on trust, you need to be recording behavioral data. The first phase is instrumentation: add logging to your MCP server that records the identity of every agent that connects and every tool invocation they make.
At minimum, record:
- Agent identifier (from OAuth token, x-agent-id header, or client info)
- Tool name
- Invocation timestamp
- Success/failure
- Latency
- Any error codes
This data does not yet flow to a reputation oracle, but it builds the operational habit of tracking agent behavior and creates a dataset for calibrating trust requirements.
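The fields listed above fit in a single record type. The sketch below keeps records in memory and derives a per-agent failure rate, which is the kind of figure useful for calibrating thresholds later. `InvocationRecord` and `InvocationLog` are hypothetical names, and a real deployment would write to durable storage rather than an array.

```typescript
interface InvocationRecord {
  agentId: string;
  toolName: string;
  invokedAt: string; // ISO 8601 timestamp of when the call started
  success: boolean;
  latencyMs: number;
  errorCode?: string;
}

class InvocationLog {
  private readonly records: InvocationRecord[] = [];

  // Timestamps are passed in (milliseconds since the epoch) so the
  // caller controls the clock; a call with an error code is a failure.
  record(
    agentId: string,
    toolName: string,
    startedAtMs: number,
    finishedAtMs: number,
    errorCode?: string,
  ): InvocationRecord {
    const entry: InvocationRecord = {
      agentId,
      toolName,
      invokedAt: new Date(startedAtMs).toISOString(),
      success: errorCode === undefined,
      latencyMs: finishedAtMs - startedAtMs,
      ...(errorCode !== undefined ? { errorCode } : {}),
    };
    this.records.push(entry);
    return entry;
  }

  // Fraction of an agent's calls that failed, or null with no history.
  failureRate(agentId: string): number | null {
    const own = this.records.filter((r) => r.agentId === agentId);
    if (own.length === 0) return null;
    return own.filter((r) => !r.success).length / own.length;
  }
}
```

In a handler, the call to `record` would typically sit in a `finally` block around the tool execution so that failures are logged as reliably as successes.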
Phase 2: Define Trust Requirements (Week 2–3)
With instrumentation in place, audit your tools for risk level. Categorize each tool:
- Low risk: Public data, read-only operations, reversible actions
- Medium risk: Write operations, private data access, API calls with side effects
- High risk: Financial operations, production system access, irreversible actions
For each risk tier, define the minimum trust score and dimensional requirements that make sense for your use case. Start conservatively — you can lower requirements if the initial thresholds exclude too many legitimate agents.
Phase 3: Integrate Trust Oracle (Week 3–4)
Integrate with a trust oracle. For Armalo, this means:
- Install the SDK: npm install @armalo/sdk
- Initialize the client with your API key
- Add pre-invocation trust checks to your MCP request handlers using the pattern described earlier
- Define your per-tool trust requirements
- Test with agents of known trust scores to verify the gate behavior
The integration is additive — it does not change your existing tool implementations. It adds a new pre-invocation step that queries the oracle and either proceeds or rejects the call with a structured error.
Phase 4: Discovery-Time Filtering (Week 4–5)
Once call-time gating is working, add discovery-time filtering. Modify your tools/list handler to return only the tools that the requesting agent is trusted to access. This improves security (agents cannot enumerate tools above their trust tier, although call-time gating must stay in place for agents that guess tool names) and improves the agent's tool selection decisions (the tool list it receives is actually actionable for its trust tier).
Phase 5: Attestation and Feedback (Week 5–6)
Close the loop by feeding tool call outcomes back into the reputation system. Implement the post-call attestation pattern: after successful tool calls, emit a behavioral event that feeds back into the oracle. After failed calls (especially those involving policy violations), emit a negative behavioral event.
Over time, this creates a feedback loop: the tool call outcomes you record influence the trust scores of agents on your platform, which in turn influence which agents are trusted to call your tools. The reputation system becomes self-reinforcing.
Phase 6: AgentCard Publication (Ongoing)
Publish trust requirements in your MCP server's AgentCard or documentation. Agents scoping which MCP servers they can connect to will check trust requirements before initiating connection. Publishing this information reduces wasted connection attempts from agents that cannot meet your requirements.
The Armalo Trust Oracle: Current State and Roadmap
Armalo has been building the reputation infrastructure layer since 2024. The trust oracle is in production at /api/v1/trust/, serving approximately 1,000 queries per 30 days across the current early adopter ecosystem. Here is a concrete description of what exists today and what is planned.
What Exists Today
Agent registration and pact definition. Agents register with Armalo and define behavioral pacts — specific behavioral commitments they make about how they will operate. Pacts can cover accuracy targets ("this agent commits to ≥90% accuracy on financial data retrieval"), scope limitations ("this agent will only invoke tools within a defined domain"), safety constraints ("this agent will not generate outputs that violate the following categories"), and operational parameters ("this agent will not exceed X API calls per hour").
Evaluation framework. Armalo runs adversarial evaluations against registered agents. The evaluation framework includes deterministic checks (does the agent stay within declared scope?), LLM jury evaluations (is the agent's output quality consistent with its claims?), and red-team evaluations (does the agent resist adversarial inputs?). Evaluation results feed directly into the composite score.
Composite scoring. The 12-dimensional scoring model described earlier is operational. Scores are updated after each evaluation run and decay over time (1 point per week after a 7-day grace period) to ensure scores reflect current behavior rather than historical excellence.
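The decay rule can be worked through in a few lines. This sketch assumes decay is applied in whole weeks counted from the end of the grace period, that the clock starts at the last evaluation, and that scores floor at zero. The text above states only the rate and the grace period, so those details are guesses.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=7)  # stated in the text
POINTS_PER_WEEK = 1.0             # stated in the text


def decayed_score(score: float, last_evaluated: datetime, now: datetime) -> float:
    """Apply 1 point of decay per full week elapsed after the grace period."""
    idle = now - last_evaluated
    if idle <= GRACE_PERIOD:
        return score
    # Assumption: only completed weeks past the grace period count.
    weeks_past_grace = (idle - GRACE_PERIOD) // timedelta(weeks=1)
    return max(0.0, score - POINTS_PER_WEEK * weeks_past_grace)


evaluated = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(decayed_score(90.0, evaluated, evaluated + timedelta(days=7)))   # 90.0
print(decayed_score(90.0, evaluated, evaluated + timedelta(days=14)))  # 89.0
print(decayed_score(90.0, evaluated, evaluated + timedelta(days=35)))  # 86.0
```

Under these assumptions, an agent scored at 90 that goes unevaluated for five weeks reads as 86. A fresh evaluation run resets the clock.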
Trust oracle API. The public trust query endpoint is operational. Any system with an Armalo API key can query the composite score and dimensional breakdown for any registered agent. Response time is sub-50ms for cached scores.
Certifications. Armalo issues formal certifications for agents that meet specific behavioral thresholds: Safety Level 1, Safety Level 2, Financial Compliance, Data Privacy, and several others. Certifications are revocable if score dimensions fall below certification thresholds.
Memory attestations. Agents can issue and receive signed memory attestations — cryptographically verified claims about behavioral history that can be shared with other platforms via token-gated sharing.
What Is Coming
MCP native integration. A first-party MCP server middleware library that makes adding trust-gated tool access a matter of adding a few lines of configuration to an existing MCP server.
Reputation event feed. The behavioral event ingestion pipeline that enables MCP servers to contribute tool call outcomes to the reputation record in real time.
AgentCard extensions. Standard AgentCard fields for trust profile information, enabling pre-connection trust negotiation.
Cross-oracle federation. A federation standard that allows multiple reputation providers to contribute to a shared behavioral record, enabling true cross-platform portability even when different platforms use different reputation providers.
On-chain reputation anchoring. Regular anchoring of reputation records to Base L2, providing tamper-proof, publicly verifiable behavioral history for agents that need the highest level of trust verifiability.
The 2027 Prediction: Trust Scores Become Table Stakes
Predict the adoption curve of any infrastructure standard and you will usually be wrong on timing but right on direction. The direction here is clear: agent trust scores will become table stakes in production agentic deployments by approximately 2027–2028.
The forcing functions are already in motion:
Enterprise procurement requirements. Enterprise security teams are already asking about agent governance in vendor evaluations. As the questions become more specific — "what is your agent's trust score? who certified it? what evaluation methodology was used?" — vendors without answers will lose deals to vendors with them. The procurement requirement will precede the regulatory requirement.
Regulatory pressure. The EU AI Act entered into force in August 2024, with its obligations phasing in over the following years. For high-risk AI systems, it requires technical documentation and evidence of accuracy, robustness, and testing. A natural way to meet this requirement for autonomous agents is a formal trust score from a third-party evaluator. Regulatory requirements will formalize what enterprise procurement pressure started.
Insurance underwriting. Cyber insurers are beginning to develop specific underwriting criteria for AI deployments. An organization that can demonstrate verified trust scores for its autonomous agents will present a lower risk profile than one that cannot. The insurance premium differential will create financial incentive for trust score adoption even where regulatory pressure is absent.
Platform competition. Platforms that require minimum trust scores for agents on their marketplace will attract enterprise customers who need the governance assurance. Platforms that do not require trust scores will lose enterprise deals to those that do. Competitive pressure from enterprise sales will drive platform adoption.
The HTTPS parallel. In 2015, HTTPS was used by roughly 40% of web pages. In 2025, it is used by over 95%. The transition was driven by browser warnings, search ranking signals, and the rising cost of staying on plain HTTP. The agent ecosystem's trust score transition is likely to follow a similar trajectory, on a faster timeline because the agent ecosystem is smaller, more technically sophisticated, and more directly exposed to enterprise procurement requirements.
The organizations and teams that build reputation infrastructure now — that integrate trust oracle queries into their MCP servers today, that build agent trust scores proactively rather than reactively — will have a structural advantage when the requirement becomes mandatory. The barrier to adoption is low. The expected value of early adoption is high.
Conclusion: The Connectivity Layer Is Done. Now We Build Trust.
MCP solved a real problem: the proliferation of incompatible tool integration standards was taxing every developer in the agentic ecosystem, and a common protocol was badly needed. The MCP specification delivered that common protocol cleanly and correctly. The connectivity layer is solved.
But connectivity without trust is infrastructure without governance. An MCP server that any agent can connect to, regardless of its behavioral history, is as useful, and as dangerous, as a corporate network with no access controls. The tool is only safe to use if the entity using it is appropriate for the task.
The trust gap is not a criticism of MCP. It is a gap that MCP was designed to leave, because reputation is a higher-layer concern that belongs in a higher-layer protocol. Just as TCP/IP left authentication to TLS, and TLS left reputation to certificate authorities and sender reputation systems, MCP leaves trust to the layer above it.
That layer is being built. The technical architecture is clear: immutable behavioral records, multi-dimensional scoring, cross-platform portability, cryptographic verifiability, and real-time queryability. The integration patterns are concrete: pre-invocation middleware, AgentCard extensions, behavioral event feeds. The business case is compelling for every stakeholder: tool providers, agent builders, enterprise buyers, and platform operators all benefit from trust infrastructure.
The agent economy will not scale to its potential without this infrastructure. Autonomous agents handling meaningful work — financial transactions, production system access, customer communications, consequential decisions — require verifiable behavioral histories. The platform that connects to them needs to know, in milliseconds, whether to trust them with what they are asking to do.
MCP gave agents a shared language for capability exchange. The next layer gives them a shared record of whether they deserved to exchange those capabilities. That record, portable across platforms and verifiable by any party, is the infrastructure the agentic economy needs next.
The connectivity problem is solved. The trust problem is just beginning. The window to build the foundational infrastructure — before the enterprise requirements lock in around whoever builds first — is open now.
The Trust Score Readiness Checklist
A 30-point checklist for getting an agent from prototype to a defensible trust score. No fluff.
- 12-dimension scoring readiness — what you need before evals run
- Common reasons agents score under 70 (and how to fix them)
- A reusable pact template you can fork
- Pre-launch audit sheet you can hand to your security team