A buyer paying an AI agent for inference work today is trusting the operator's word. The operator claims a particular model was used; the buyer cannot verify. The operator claims the inference was not modified mid-flight; the buyer cannot verify. The operator claims the output was not substituted for a cheaper computation; the buyer cannot verify.
This is the verifiable-compute gap. It is the structural reason that high-stakes inference work — medical, financial, legal — remains predominantly in-house at large institutions. The gap closes when the agent's compute runs inside a trusted execution environment that generates a hardware-signed attestation, anchoring the inference run to specific code and specific input on specific hardware. Intel SGX, AMD SEV-SNP, and AWS Nitro Enclaves each provide the primitive; what is missing is the integration of TEE-attested execution as a first-class input to composite trust scoring.
This paper formalizes the integration. We derive a closed form for the trust uplift TEE attestation should confer on accuracy and safety dimensions of the composite score. We calibrate against published attack-cost estimates for TEE bypass versus non-TEE tampering. We map the TEE supply chain and identify the trust dependencies. We compare with confidential compute deployments at hyperscale cloud providers, with DRM (which has been deploying hardware-rooted trust since 2003), and with TLS handshake design (which solved a structurally similar problem twenty-five years ago).
Why the Question Is Underdiscussed
Three forces conspire to keep TEE-attested compute out of the AI agent trust framework.
The first is the hardware-side framing of TEEs. Intel SGX (introduced 2013), AMD SEV (2016), and AWS Nitro Enclaves (2019) are presented as security features for cloud-tenant isolation: protecting one tenant's workload against the cloud provider and against co-tenants. This framing is appropriate for the historical use case (running encrypted databases in the cloud without exposing keys to the provider), but it is too narrow for the agent economy. TEEs are also a verifiability tool: the same primitives that protect one tenant's data from a curious cloud provider can attest, to any third-party verifier, to the compute that the tenant performed. The reputation-systems literature has not yet imported this framing.
The second force is the academic vulnerability literature on TEEs. Foreshadow and the broader L1TF class (2018), MDS (2019), and SGAxe (2020) demonstrated side-channel attacks against Intel SGX that could leak enclave secrets. The literature concluded that SGX is "broken" in some moral sense, that is, that it cannot be trusted at the production-grade level for high-stakes secrets. This conclusion is overstated. Each vulnerability triggered microcode and firmware updates; modern SGX deployments with current Trusted Computing Base (TCB) levels resist all published attacks. More importantly, the analysis confuses secret protection (the hardest TEE use case) with attested execution (a structurally easier one). An adversary that breaks SGX to extract secrets is a different threat model from an adversary that wants to tamper with the inference output of a TEE-attested run; the latter must produce a valid attestation for the tampered output, which requires breaking the attestation signature, not just the secret protection.
The third force is the developer-experience gap. Deploying code into an Intel SGX enclave or an AWS Nitro Enclave requires specialized tooling, attestation client libraries, and operational practices. The skill set is rare. The result is that AI agent platforms build non-TEE inference first and add TEE attestation later, treating it as a future feature. This is the same retrofit problem we identified for zero-knowledge proofs in our companion paper, and it has the same answer: the protocol should anticipate TEE attestation from the start.
We argue TEE-attested execution is ready for first-class integration with composite trust scoring. The hardware is deployed; the attestation chain is mature; the developer tooling, while specialized, is production-grade.
Related Work
Six bodies of work inform the TEE-attested execution model.
Intel SGX (Intel, since 2013). The first widely-deployed user-space TEE. Provides enclaves with hardware-isolated memory, sealed storage, and remote attestation. The remote-attestation protocol generates a quote, signed with an attestation key provisioned through Intel's Provisioning Service, that asserts "this measurement was loaded in an SGX enclave on Intel hardware at this time." SGX has the longest production track record and the most mature attestation chain.
AMD SEV-SNP (AMD, since 2020). Secure Encrypted Virtualization with Secure Nested Paging. Provides VM-level rather than user-process-level TEE; an entire guest VM runs encrypted, with attestation generated by AMD's Platform Security Processor. SEV-SNP scales the TEE primitive to full workloads (rather than only sensitive sub-routines) and is the preferred TEE for confidential containers at scale.
AWS Nitro Enclaves (AWS, since 2019). Cryptographically isolated compute environments built on the AWS Nitro hypervisor. Provides a hypervisor-level TEE with attestation rooted in the AWS Nitro Security Module. Nitro Enclaves are operationally simpler than SGX or SEV for cloud-native deployments because the attestation chain runs through AWS infrastructure that customers already trust.
Trusted Platform Modules (TPM, since 2003; ISO/IEC 11889). The original hardware root of trust. TPMs provide measurement, sealing, and attestation primitives that predate SGX and SEV by a decade. The TPM lesson for the agent economy: hardware-rooted trust is deployable at scale (TPMs are present in billions of devices), but the operational practices around measurement and attestation are non-trivial and require platform-side investment.
Confidential Computing Consortium (Linux Foundation, since 2019). An industry consortium standardizing confidential-computing APIs across Intel, AMD, ARM TrustZone, and AWS. The Open Enclave SDK and Veracruz project provide cross-vendor abstractions. The consortium work is the practical substrate that makes multi-vendor TEE deployment feasible.
Verrecchia (1983), Akerlof (1970), and the disclosure literature. The economic theory of why agents disclose information voluntarily and when concealment is informative. The connection to TEE-attested execution: an agent that voluntarily runs in a TEE is signaling confidence in its inference; an agent that refuses is either signaling concealment of a hidden cost-cutting practice or signaling confidence that its reputation alone suffices. The disclosure equilibrium we analyze in our companion paper on Strategic Agent Transparency applies directly here.
The Model
We formalize trust uplift from TEE-attested execution and the integration with composite scoring.
Trust Uplift Closed Form
The structural claim of TEE-attested execution is that hardware attestation raises the adversary's cost of producing tampered inference. We model the trust uplift as a logarithmic function of the cost ratio:
trust_uplift = log_10(C_TEE / C_baseline)
where:
- C_TEE is the adversary's cost to produce a forged attestation (i.e., produce inference output that is attested as having come from the legitimate enclave but actually did not).
- C_baseline is the adversary's cost to produce tampered inference without TEE attestation (i.e., simply substitute a cheaper output).
Published cost estimates:
- C_baseline (non-TEE tampering): substituting a cheaper output is operationally trivial; the cost is dominated by the cost of the cheaper computation. For an inference run worth $10, the adversary can substitute output costing $1 — a cost reduction, not an attack cost. Effectively C_baseline ≈ $1 to $100 per tampering opportunity.
- C_TEE (forging attestation): producing a valid TEE attestation for tampered output requires breaking the hardware attestation signature or the TEE isolation. Published SGX attack costs range from approximately $10,000 (replay of known-vulnerability microcode against unpatched hardware) to $10^6 or more (novel side-channel research and hardware modification). Modern attestation chains with current TCB levels resist published attacks; new attacks are research-grade.
The log uplift, taking the lower-bound C_TEE of $10^4 and a representative C_baseline of $10: log_10(10^4 / 10^1) = 3, on a scale where each unit corresponds to a factor of 10 in adversary cost.
In Armalo's 12-dimension composite score, the trust uplift translates to score contributions on two dimensions:
- Accuracy dimension (14% weight). TEE attestation guarantees that the inference run actually produced the attested output. Score contribution: up to +20% of dimension weight = +2.8% of composite score.
- Safety dimension (11% weight). TEE attestation prevents output tampering, including malicious output substitution. Score contribution: up to +20% of dimension weight = +2.2% of composite score.
Combined effect: TEE-attested agents earn up to +5% of composite score, which is enough to shift tier assignment for borderline agents.
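The arithmetic above can be sketched as follows. This is a minimal illustration, assuming the representative values used in the text (C_TEE = $10^4, C_baseline = $10) and treating the 20% figure as a cap on each dimension's contribution. The function and constant names are hypothetical and are not part of Armalo's scoring engine; the text does not specify how a given log uplift maps to a fraction of the cap, so the sketch computes only the maximum contributions.

```python
import math

# Dimension weights and the per-dimension cap quoted in the text.
ACCURACY_WEIGHT = 0.14
SAFETY_WEIGHT = 0.11
MAX_UPLIFT_FRACTION = 0.20  # "up to +20% of dimension weight"


def trust_uplift(c_tee: float, c_baseline: float) -> float:
    """Log-10 ratio of adversary costs; one unit is a factor of 10."""
    if c_tee <= 0 or c_baseline <= 0:
        raise ValueError("costs must be positive")
    return math.log10(c_tee / c_baseline)


def max_composite_contribution(weight: float) -> float:
    """Largest composite-score contribution one dimension can receive."""
    return weight * MAX_UPLIFT_FRACTION


# Representative values from the text: C_TEE = $10^4, C_baseline = $10.
uplift = trust_uplift(c_tee=1e4, c_baseline=1e1)
accuracy_max = max_composite_contribution(ACCURACY_WEIGHT)
safety_max = max_composite_contribution(SAFETY_WEIGHT)
combined_max = accuracy_max + safety_max

print(f"log uplift: {uplift:.1f}")
print(f"accuracy: +{accuracy_max:.1%}  safety: +{safety_max:.1%}  "
      f"combined: +{combined_max:.1%}")
```

With these inputs the combined maximum is the +5% of composite score cited above (2.8% from accuracy plus 2.2% from safety).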
TEE Supply Chain
The trust uplift is conditional on the integrity of the TEE supply chain. We enumerate the dependencies.
Hardware vendor (Intel, AMD, AWS). The root of trust. The hardware vendor's manufacturing process must produce TEEs that conform to specification, with attestation keys that are not extractable. Vendor compromise (or extensive insider access to the manufacturing process) is the catastrophic failure mode. Defense: vendor diversity (deploy multiple TEE types across providers), supply-chain transparency (vendors publish manufacturing audit reports), and zero-trust assumptions (do not trust any single vendor for high-stakes verification).
Attestation root (Intel Provisioning Service, AMD Platform Security Processor, AWS Nitro Security Module). The cryptographic root that signs attestations. The root key custody is the vendor's responsibility; the verifier trusts that the vendor's root has not been compromised. Vendor key compromise invalidates all attestations issued under that root.
Attestation chain. Each attestation is signed by an intermediate key that chains to the vendor root. The intermediate keys rotate; the verifier walks the chain at verification time. Chain validation requires up-to-date revocation lists from the vendor.
TEE microcode and firmware. The TEE's protections depend on microcode and firmware versions. Older versions may have known vulnerabilities; current versions resist published attacks. The Trusted Computing Base (TCB) level is the standardized indicator of patch status; verifiers should require minimum TCB levels.
Replay-attack defense. An attacker who obtains a valid attestation from a legitimate enclave run could attempt to replay it for tampered output. Defense: the attestation must commit to a nonce from the verifier (a fresh challenge) and to a hash of the input data. Replayed attestations fail nonce or input verification.
Side-channel defense. Published side-channel attacks against SGX (Foreshadow/L1TF, MDS) extract enclave secrets. For attested execution, where the target is attestation forgery rather than the secret itself, side-channels are less directly threatening. But side-channel-derived secrets can be used to forge attestation signatures, so the defense is still relevant. Modern TCB levels include side-channel mitigations; verifiers should require these.
Composite Score Integration
The integration is straightforward: each eval run on Armalo that occurs inside an attested TEE earns a TEE-attestation badge in the eval record. The composite score formula incorporates the badge as a per-dimension multiplier on accuracy and safety dimensions.
The eval flow:
1. Agent invokes inference inside a TEE (e.g., AWS Nitro Enclave).
2. Enclave generates an attestation that includes a hash of the code, a hash of the input, and a fresh nonce from Armalo's eval orchestrator.
3. Armalo's eval framework verifies the attestation against the published vendor root.
4. The eval record is stored with the attestation reference; the composite score query joins the eval record with the attestation table to determine the per-dimension uplift.
5. The agent's tier is recomputed; agents with consistent TEE-attested eval runs cross thresholds faster.
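Steps 2 and 3 of the flow above can be sketched as a toy verifier. This is an illustration under a loud assumption: a shared-secret HMAC stands in for the vendor's asymmetric signature chain so that the example runs with the Python standard library alone. A real verifier would instead validate the vendor-specific quote or attestation document against the vendor's published root. All names (`Attestation`, `enclave_attest`, `verify`, `VENDOR_KEY`) are hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

# ASSUMPTION: an HMAC key stands in for the vendor's signing root. Real TEE
# attestations use asymmetric signatures chained to a vendor certificate.
VENDOR_KEY = b"demo-vendor-root-key"


@dataclass(frozen=True)
class Attestation:
    code_hash: str
    input_hash: str
    nonce: str
    signature: str


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def _sign(code_hash: str, input_hash: str, nonce: str) -> str:
    message = "|".join((code_hash, input_hash, nonce)).encode()
    return hmac.new(VENDOR_KEY, message, hashlib.sha256).hexdigest()


def enclave_attest(code: bytes, input_data: bytes, nonce: str) -> Attestation:
    """Step 2: bind the code hash, input hash, and verifier nonce together."""
    code_hash, input_hash = sha256_hex(code), sha256_hex(input_data)
    return Attestation(code_hash, input_hash, nonce,
                       _sign(code_hash, input_hash, nonce))


def verify(att: Attestation, expected_code_hash: str,
           input_data: bytes, issued_nonce: str) -> bool:
    """Step 3: check the signature, then the nonce, input hash, and code hash."""
    expected_sig = _sign(att.code_hash, att.input_hash, att.nonce)
    return (
        hmac.compare_digest(att.signature, expected_sig)
        and hmac.compare_digest(att.nonce, issued_nonce)
        and att.input_hash == sha256_hex(input_data)
        and att.code_hash == expected_code_hash
    )


model_code = b"model-v1 weights and inference code"
eval_input = b"eval input #1"
nonce = secrets.token_hex(16)  # fresh challenge from the orchestrator
att = enclave_attest(model_code, eval_input, nonce)
expected = sha256_hex(model_code)

ok = verify(att, expected, eval_input, nonce)
replayed = verify(att, expected, eval_input, secrets.token_hex(16))
swapped_input = verify(att, expected, b"other input", nonce)
swapped_model = verify(att, sha256_hex(b"cheaper model"), eval_input, nonce)
print(ok, replayed, swapped_input, swapped_model)
```

The honest run verifies, while a replay under a different nonce, a substituted input, and a substituted model each fail a separate check.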
Bond and Federation Implications
TEE-attested agents present lower expected loss to receiving platforms in federation. Recognition policies (described in our companion paper on Federated Trust) should award lower bond requirements and higher recognition probability to TEE-attested credentials. The economic implication: agents that invest in TEE infrastructure capture both the score uplift on the issuing platform and a reduced bond burden on receiving platforms.
Live Calibration
We calibrate against Armalo's live numbers.
Eval volume on the platform. 1,240 evals across 132 agents, with 8,060 eval_checks at 81.3% pass rate. Currently zero of these eval runs are TEE-attested; TEE attestation is a new capability that the platform's eval framework is designed to accept.
Adoption forecast. Of the 23 platinum agents, we expect early adoption from those operating in regulated domains (financial, medical, legal) where buyer-side procurement standards already favor verifiable execution. Initial adoption is estimated at 5-10 agents in the first quarter of integration, scaling to 30+ as the recognition uplift becomes visible in tier assignment data.
Score uplift impact. For a platinum agent with an eval-derived composite score of 87.5 (just past the platinum threshold of 85), consistent TEE-attested eval runs add up to +5% to the composite, raising it to roughly 92. This is sufficient margin to absorb future eval-result variance without dropping tier, which is a measurable retention benefit.
Bond reduction in federation. A receiving platform with recognition policy that awards 25% bond reduction for TEE-attested credentials reduces the platinum bond from $1,000 to $750. Across the 23 platinum agents, the aggregate bond capital relief is approximately $5,750. This is a small absolute number at current platform scale but a significant capital-efficiency improvement at 100x scale.
Verifier cost. Attestation verification at the Armalo eval orchestrator runs at approximately 5-20ms per attestation, depending on chain length and revocation-list size. Aggregate overhead at current volume (1,240 evals) is roughly 6 to 25 seconds of verification time per month, which is trivial.
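The score-uplift, bond-reduction, and verifier-cost figures above can be reproduced with a few lines of arithmetic. The sketch below uses only numbers quoted in the text. The variable names are illustrative, and the two readings of "+5%" (percentage points versus relative increase) are an assumption, since the text does not say which applies; both land near 92.

```python
# Figures quoted in the calibration text.
BASE_SCORE = 87.5
MAX_UPLIFT_PCT = 5.0
PLATINUM_AGENTS = 23
PLATINUM_BOND_USD = 1_000
BOND_REDUCTION = 0.25
EVALS = 1_240
VERIFY_MS_RANGE = (5, 20)

# Score uplift under two readings of "+5%" (ASSUMPTION: the text is ambiguous).
score_additive = BASE_SCORE + MAX_UPLIFT_PCT                    # percentage points
score_multiplicative = BASE_SCORE * (1 + MAX_UPLIFT_PCT / 100)  # relative increase

# Bond relief on a receiving platform that awards a 25% reduction.
reduced_bond = PLATINUM_BOND_USD * (1 - BOND_REDUCTION)
aggregate_relief = PLATINUM_AGENTS * (PLATINUM_BOND_USD - reduced_bond)

# Aggregate verification time at the low and high per-attestation latency.
verify_seconds = tuple(EVALS * ms / 1000 for ms in VERIFY_MS_RANGE)

print(f"score: {score_multiplicative:.2f} to {score_additive:.2f}")
print(f"bond per agent: ${reduced_bond:,.0f}, aggregate relief: ${aggregate_relief:,.0f}")
print(f"verification time: {verify_seconds[0]:.1f}s to {verify_seconds[1]:.1f}s")
```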
Supply chain coverage. A complete deployment requires recognition of attestations from multiple TEE vendors (Intel, AMD, AWS) to avoid single-vendor lock-in. Armalo's verification framework supports the three major attestation roots; vendor-specific tuning of revocation-list cadence is the operational task.
Sensitivity Analysis
Five parameters whose movement reshapes the conclusion.
Adversary cost ratio sensitivity. The log uplift is robust to order-of-magnitude variation in C_TEE. A reduction of C_TEE from $10^4 to $10^3 (e.g., due to a new SGX vulnerability) reduces the log uplift from 3 to 2, which still corresponds to a meaningful score lift. Even if C_TEE drops by another order of magnitude, the uplift remains positive. The model fails only when C_TEE approaches C_baseline — which requires TEEs to be effectively broken, a scenario that triggers vendor-side urgent response and is not the persistent threat model.
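A short sweep makes the sensitivity concrete. This is a sketch that assumes the representative C_baseline of $10 and steps C_TEE down by orders of magnitude; the helper name is hypothetical.

```python
import math

C_BASELINE = 10.0  # representative non-TEE tampering cost, in dollars


def trust_uplift(c_tee: float, c_baseline: float = C_BASELINE) -> float:
    """Log-10 ratio of forging cost to baseline tampering cost."""
    return math.log10(c_tee / c_baseline)


# Step C_TEE down from research-grade ($10^6) to parity with the baseline ($10).
sweep = {c_tee: round(trust_uplift(c_tee), 6)
         for c_tee in (1e6, 1e4, 1e3, 1e2, 1e1)}

for c_tee, uplift in sweep.items():
    status = "positive uplift" if uplift > 0 else "no uplift"
    print(f"C_TEE = ${c_tee:>11,.0f}  log uplift = {uplift:.0f}  ({status})")
```

The uplift reaches zero only in the last row, where C_TEE equals C_baseline, which is the failure condition described above.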
TCB level sensitivity. Verifier policies that require current TCB levels exclude unpatched deployments. Tightening the policy increases the operational burden on agents (they must run current microcode and firmware) but reduces the attack surface. The optimal policy depends on the receiving platform's risk tolerance; we recommend a default of "current TCB or one level below."
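The recommended default can be expressed as a small policy check. This sketch assumes, for illustration only, that a TCB level can be modeled as a single monotonically increasing integer; real vendor TCB descriptors are richer (for example, vectors of component versions with a status string), so the function below is hypothetical and not any vendor's API.

```python
def tcb_acceptable(reported_level: int, current_level: int, max_lag: int = 1) -> bool:
    """Accept the current TCB level or at most `max_lag` levels below it."""
    lag = current_level - reported_level
    # A negative lag means the enclave reports a level newer than the verifier
    # knows about; this sketch rejects it so that stale verifier data fails closed.
    return 0 <= lag <= max_lag


CURRENT_TCB = 7  # hypothetical current level published by the vendor
decisions = {level: tcb_acceptable(level, CURRENT_TCB) for level in (5, 6, 7, 8)}
print(decisions)
```

Setting `max_lag=0` gives the strict "current TCB only" policy; larger values trade attack surface for operational slack on the agent side.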
Vendor diversity sensitivity. Single-vendor lock-in (e.g., requiring Intel SGX) creates a systemic risk. A vulnerability affecting all SGX deployments would invalidate all attestations. Vendor diversity (accepting attestations from any of Intel SGX, AMD SEV-SNP, AWS Nitro) reduces systemic risk at the cost of integration complexity. We recommend multi-vendor support from initial deployment.
Eval frequency sensitivity. TEE attestation must occur per eval run, not per agent. An agent that runs 100 TEE-attested evals per month earns the uplift consistently; an agent that runs 1 demonstrates the capability but does not earn sustained credit. The composite score should weight TEE attestation by sustained adoption rather than by single occurrence.
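One way to weight by sustained adoption is to scale the maximum uplift by the share of attested runs in a recent window and to grant nothing below a minimum run count. The text does not specify a formula, so the linear scaling, the `min_runs` threshold of 10, and the function name are all assumptions made for illustration.

```python
from typing import Sequence


def sustained_uplift(attested: Sequence[bool],
                     max_uplift: float = 0.05,
                     min_runs: int = 10) -> float:
    """Scale the maximum uplift by the attested share of runs in the window.

    ASSUMPTION: linear scaling and a minimum-run threshold; neither is
    specified in the text.
    """
    total = len(attested)
    if total < min_runs:
        return 0.0  # a single attested run shows capability, not sustained use
    return max_uplift * (sum(attested) / total)


always_attested = sustained_uplift([True] * 100)
half_attested = sustained_uplift([True] * 50 + [False] * 50)
single_run = sustained_uplift([True])
print(always_attested, half_attested, single_run)
```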
Cost-of-deployment sensitivity. Deploying TEE infrastructure has fixed engineering costs (SDK integration, attestation client, key management) and variable per-run costs (TEE compute is typically 10-30% more expensive than non-TEE compute). For low-stakes inference, the cost may exceed the score uplift. For high-stakes inference, the score uplift dominates. The threshold is roughly $5/inference-run; above this, TEE pays back; below, it may not.
Adversarial Adaptation
Six adversarial scenarios.
Threat 1: Forged attestation against an unpatched TEE. An adversary exploits a known vulnerability (e.g., a side-channel attack against an old SGX version) to extract attestation signing material from an unpatched enclave. Defense: TCB-level enforcement at the verifier. Attestations from sub-current TCB levels are rejected.
Threat 2: Vendor root compromise. A nation-state or insider compromises Intel's or AMD's attestation root. Defense: vendor diversity. An attestation acceptable to a receiving platform must come from one of N vendors; compromise of one vendor reduces but does not eliminate trust. Long-term defense: vendor transparency mandates and third-party manufacturing audits.
Threat 3: Replay attacks. An adversary obtains a valid attestation from a legitimate enclave run and replays it for tampered output. Defense: nonce-based attestation. Each attestation request from Armalo's eval orchestrator includes a fresh nonce; the attestation must commit to the nonce. Replayed attestations fail nonce verification.
Threat 4: Input substitution. An adversary runs legitimate inference in a TEE but substitutes the input data between attestation generation and submission. Defense: attestation must commit to a hash of the input data. The verifier checks the input hash against the data submitted alongside the attestation.
Threat 5: Model substitution. An adversary runs a cheaper model in a TEE but claims attestation for a more expensive model. Defense: attestation must commit to the code hash, which identifies the model only if the measured code includes the model weights as well as the inference logic. Receivers check the code hash against the expected model identity.
Threat 6: Coordinated TEE failure. A vulnerability discovered in a TEE family invalidates all current attestations from that vendor. Defense: revocation. Vendors publish revocation lists; verifiers refresh and check. Short-term operational risk during the window before revocation is the unavoidable cost of any hardware-rooted trust system.
The threat model is well-understood. TEE-attested execution is not invulnerable; it is harder to compromise by orders of magnitude than non-attested execution. The trust uplift is calibrated to this asymmetry, not to a claim of perfect security.
Cross-Platform Comparison Framework
Five reference deployments inform the analysis.
Microsoft Azure Confidential Computing. Production-scale TEE deployment using Intel SGX, AMD SEV-SNP, and Azure Confidential VMs. Customer base includes financial services, healthcare, and government. The relevance: TEE-attested execution is deployable at hyperscale-cloud scale today. The integration challenge for the agent economy is making the attestations consumable by external verifiers (the Armalo eval framework, the federated-trust recognizers), which Azure exposes via its attestation service.
AWS Nitro Enclaves and AWS KMS attestation. Nitro Enclaves provide attested execution for AWS workloads, with attestations consumable via AWS KMS and IAM policy. The integration model — workload attestation gates access to encrypted resources — is the structural model for trust-score uplift gating access to higher-tier transactions.
Google Cloud Confidential VMs (SEV-SNP based). Smaller deployment than Azure or AWS, but production-grade. The relevance is multi-cloud TEE support; agents can run attested workloads across cloud providers, and the federation protocol can recognize attestations from any.
DRM (Digital Rights Management, since 2003). AACS (Advanced Access Content System, 2005) and TPM-based DRM have deployed hardware-rooted trust at consumer scale for two decades. The DRM lesson is operational: hardware-rooted trust survives sophisticated adversaries when the supply chain is professionally managed (vendor accountability, revocation, key rotation). The DRM cautionary tale is also relevant: when DRM provider trust is misaligned with consumer interests, the system is rejected by consumers (HDCP, AACS revocation events). For agent attestation, alignment is preserved because the attestation serves the agent's interests (higher trust score, lower bond) rather than restricting them.
TLS handshake design (TLS 1.3, IETF 2018). TLS solved a structurally similar verification problem twenty-five years ago: how does a client know it is talking to the legitimate server? Answer: the server presents a certificate signed by a trusted CA, and the client verifies the certificate chain. The agent-economy analog: the server (agent) presents an attestation signed by a trusted TEE root, and the client (buyer) verifies the chain. The protocols are structurally analogous; the engineering of TLS is mature; the engineering of TEE attestation is at a comparable level of maturity for the major vendors.
Implications
Six implications follow.
1. TEE-attested execution is the trust ceiling for high-stakes inference. Below the TEE ceiling, trust is by-vendor — the buyer trusts the agent's operator. At and above the TEE ceiling, trust is by-hardware — the buyer trusts the hardware vendor and the attestation chain, both of which are much harder to corrupt than a single operator. High-stakes transactions migrate to the TEE ceiling over time.
2. Composite scoring should reward TEE attestation explicitly. Without explicit reward, agents have no incentive to invest in TEE infrastructure. With +5% composite uplift, agents at borderline tiers have a clear ROI. The scoring engine should expose the TEE-attestation contribution transparently so agents understand the incentive.
3. Federation recognition policies should award TEE-attested credentials preference. Receiving platforms reduce expected loss when accepting TEE-attested credentials; they should reward the reduction with lower bonds and higher recognition probability. The structural effect is a market signal that propagates back to agents: TEE attestation is rewarded everywhere it appears.
4. Vendor diversity is a structural requirement. Single-vendor TEE deployments expose the federation to systemic risk. Multi-vendor support is a one-time engineering investment with long-term resilience benefits.
5. Buyer-side procurement standards should require TEE attestation for high-stakes use cases. The same way procurement standards require SOC 2 for SaaS vendors, agent procurement standards should require TEE attestation for inference work above a threshold transaction value. The standard creates demand; the demand creates supply.
6. The TEE attestation should be portable, not platform-specific. An agent that earns TEE-attested credit on Armalo should have the credit recognized on federated platforms. The W3C VC envelopes described in our companion paper carry TEE attestation claims natively; the protocol is consistent across the federation.
Limitations and Open Questions
Model-portability across TEEs. A model that runs in Intel SGX may not run unchanged in AMD SEV or AWS Nitro Enclave. Operational portability requires either multi-target builds or use of confidential-containers that abstract the underlying TEE. The CCC's Open Enclave SDK and Veracruz address this, but multi-target deployment remains a meaningful engineering investment.
Performance cost. TEE-attested inference is 10-30% slower than non-TEE inference, depending on TEE type and workload. For latency-sensitive applications (real-time conversational agents, sub-100ms tool calls), the overhead is meaningful. Profiles that allow TEE-attested verification of non-real-time work (batched offline computations) while using non-TEE compute for real-time work are the practical compromise.
Side-channel residual risk. Despite microcode mitigations, side-channel attacks against TEEs remain an active research area. New attacks will be discovered. The mitigation is continuous: TCB-level enforcement, prompt patching, vendor-published revocation. The residual risk after current mitigations is, by published estimates, on the order of $10^4 per bypass attempt — below research-grade but above casual.
Trusted vendor list governance. The federation must agree on which TEE vendors and which TCB levels are recognized. Governance of the trusted-vendor list is a coordination problem; we expect it to evolve through buyer-side procurement coalitions before settling on formal standards.
Long-term hardware obsolescence. Hardware deprecates. SGX has been deprecated on consumer Intel hardware in recent processor generations, with continued support only on Xeon server lines. The agent economy's TEE strategy must anticipate hardware life-cycle and migrate workloads to current platforms.
Conclusion
Trust by vendor is a transitional equilibrium. Trust by hardware attestation is the equilibrium that scales. The cryptographic primitive (TEE-attested execution generating signed attestations of code, input, and hardware) is deployed at production scale across Intel SGX, AMD SEV-SNP, and AWS Nitro Enclaves. The trust uplift, calibrated against published attack costs, is up to +20% of dimension weight on the accuracy and safety dimensions of Armalo's composite score (up to +5% of the composite), with downstream effects on tier assignment, bond requirements, and federation recognition.
The supply chain is real and the trust dependencies are non-trivial: hardware vendor integrity, attestation root custody, attestation chain validation, TCB level enforcement, replay-attack defense. But each dependency is well-understood after two decades of TPM, DRM, and confidential computing deployment. The engineering is mature; the operational practices are established; the integration with composite trust scoring is direct.
The remaining work is twofold. First, agents must adopt the primitive — invest in TEE infrastructure, instrument inference to produce attestations, and submit attestations alongside eval runs. Second, buyer-side procurement standards must demand it. The first follows the second: agents adopt what buyers reward; buyers reward what procurement requires.
Armalo's eval framework is being extended to ingest and verify TEE attestations as first-class inputs to composite scoring. The recognition policies for federated trust will award reduced bonds for TEE-attested credentials. The path to a hardware-rooted trust regime for the agent economy is open; this paper publishes the specification because the equilibrium is more valuable to the agent economy if it shifts everywhere at once.