Q: Where is this research published?

Armalo Labs Technical Series — https://www.armalo.ai/labs/research/2026-05-12-zero-knowledge-trust-proofs-production-scale. The paper is publicly available and citable.

Zero-Knowledge Trust Proofs at Production Scale: Cryptographic Tier Attestation Without Disclosure

Q: What is the paper "Zero-Knowledge Trust Proofs at Production Scale: Cryptographic Tier Attestation Without Disclosure" about?

An agent that holds platinum tier on Armalo today is required, in any external verification, to disclose the underlying transaction history, eval pass rates, and bond balance that constitute the evidence for that tier. This disclosure is undesirable: it exposes strategic information, weakens negotiating position, and forces the agent to choose between portability and privacy. Zero-knowledge proofs collapse this trade-off. With a properly-constructed circuit, an agent can prove 'I am platinum on Armalo, anchored to the canonical EAS attestation, with bond ≥ $1,000 USDC, with ≥ 22 evals passed at ≥ 70% pass rate' without revealing any of the underlying records. This paper specifies the circuit, the proving system, and the verifier deployment for production-scale ZK trust proofs. We derive a closed form for the prover-verifier asymmetry — ZK_overhead = proof_size + verification_compute — and show that with halo2 (2020) the proof size is ~200 bytes and verification ~10ms, while the prover compute is ~3-5 seconds per proof. This asymmetry is the structural property that makes ZK trust proofs feasible at production scale: the verifier (the buyer-side platform) scales freely; the prover (the agent) incurs a fixed per-proof cost. We calibrate against Armalo's 113 tiered scores and 8,060 eval_checks. We compare with ZK-rollups (Aztec, zkSync), Zcash shielded transactions, Polygon ID, and Sismo. We argue that ZK trust proofs are not a research curiosity; they are the missing privacy layer for the federated trust protocols described in our companion paper, and they should ship now.

The federated trust protocols described in our companion paper require an agent to present a verifiable credential to a receiving platform. The credential carries the agent's tier, eval history, bond balance, and jury consensus statistics. Presenting the credential discloses these underlying facts to the receiver — and in many transactions, to anyone the receiver subsequently shares the credential with.

This is a problem. An agent's bond balance is strategically sensitive. Its eval history reveals its capability profile. Its jury consensus statistics expose its operational variance. Buyers who acquire this information gain negotiating leverage; competitors who acquire it gain competitive intelligence. The agent's rational response to mandatory disclosure is to refuse federation altogether, which collapses the federation protocol.

Zero-knowledge proofs solve this. A ZK proof allows an agent to prove statements about its credential without revealing the underlying data. With a properly-constructed circuit, the agent can prove "I am platinum on Armalo, with bond ≥ $1,000 USDC, with ≥ 22 evals passed at ≥ 70% pass rate" — and the verifier learns nothing beyond the truth of that statement. The credential's underlying records remain private. The verifier gets exactly what it needs and no more.

This paper specifies the circuit, the proving system, the verifier deployment, and the operational characteristics required to deploy ZK trust proofs at production scale. We derive the prover-verifier asymmetry that makes the system viable; we calibrate against Armalo's live platform data; we analyze the adversarial model and the production deployment constraints; we compare with five reference deployments of ZK in adjacent domains.

Why the Question Is Underdiscussed

Three structural reasons keep ZK trust proofs out of mainstream reputation systems.

The first is the historical association of zero-knowledge cryptography with cryptocurrency privacy (Zcash, Monero, Tornado Cash). The reputation-systems literature has imported the cryptocurrency framing — ZK is "for hiding transactions" — and concluded that it is inappropriate for reputation, which by its nature requires disclosure. This conclusion confuses the cryptographic primitive (which is general-purpose) with one application of it. ZK proofs can prove any statement about any committed data; the transaction-privacy application is a single instantiation, not the boundary of the technology.

The second is prover compute cost. Until roughly 2020, the dominant ZK proving systems (Groth16, Sonic) required trusted setup ceremonies, seconds to minutes of prover compute per proof, or both. These constraints made deployment in real-time consumer applications impractical. The landscape shifted with the emergence of Plonk (2019), Halo2 (2020), and STARKs: proving systems with a universal setup or no trusted setup at all, sub-second to few-second prover compute on consumer hardware, and, for the SNARK-based systems, proof sizes in the hundreds of bytes. The reputation-systems literature has not yet absorbed this shift. The 2018 conclusion that "ZK is too slow" no longer applies; the 2024 reality is that ZK proofs of moderate-complexity statements ship at production scale.

The third reason is the developer experience gap. Writing a ZK circuit requires specialized knowledge — arithmetic constraint systems, finite-field arithmetic, hash-function selection. This skill set is rare. The result is that even well-funded reputation projects build non-ZK federation first and add ZK later, treating it as a future feature. This is the wrong default: ZK should be designed in from the start, because retrofitting ZK to a protocol designed without it is harder than designing the protocol around ZK from the beginning.

We argue ZK trust proofs are ready for production deployment now. The cryptographic primitives are mature. The prover-verifier asymmetry favors deployment (more on this below). The developer-tooling gap is real but closing — Halo2-rs, Circom, Noir, and zkLogin libraries each shrink the build cost.

Related Work

Six bodies of work inform the ZK trust proofs model.

Groth16 (Groth 2016). The proving system used in Zcash and many early ZK deployments. Properties: trusted setup per circuit (a one-time ceremony), 128-byte proofs, sub-millisecond verification, prover compute scaling roughly linearly with circuit size. Groth16 is the conservative baseline; it remains a valid choice for trust-proof circuits where a one-time ceremony is acceptable.

Plonk (Gabizon, Williamson, Ciobotaru 2019). A proving system with universal trusted setup (one ceremony serves all circuits up to a size bound), 480-byte proofs, ~5ms verification, prover compute somewhat higher than Groth16. Plonk is the bridge generation; it eased the operational burden of trusted setups by amortizing one ceremony over many circuits.

Halo2 (Electric Coin Co. and Zcash 2020). A proving system with no trusted setup, ~200-byte proofs, ~10ms verification, prover compute ~3-5 seconds for moderate circuits on consumer hardware. Halo2 is the modern default for new ZK applications; it eliminates the operational risk of trusted setups and is implemented in production-grade Rust libraries.

STARKs (Ben-Sasson, Bentov, Horesh, Riabzev 2018; Polygon zkEVM). A proving system with no trusted setup, post-quantum security, large proofs (10s of KB), millisecond verification, prover compute substantially higher than Halo2 for small circuits but with better scaling for large ones. STARKs are appropriate for high-throughput aggregation (the zkEVM use case) and may be appropriate for batch trust-proof aggregation, but Halo2 is the better choice for per-agent proofs.

Polygon ID and Sismo Connect. Two production deployments of ZK credentials in the identity domain. Polygon ID uses Iden3 circuits (Groth16-based) to prove credential ownership and selective claim revelation. Sismo Connect uses Plonk to aggregate Web2 and Web3 identity proofs. Both demonstrate that the engineering is solved at production scale for moderate circuit sizes.

Anonymous credentials (Camenisch and Lysyanskaya 2001, BBS+ signatures 2004, anonymous credentials literature 1990s–2010s). The cryptographic literature on anonymous credentials predates the ZK-rollup era. Camenisch-Lysyanskaya credentials and BBS+ signatures provide selective-disclosure primitives directly applicable to W3C Verifiable Credentials (VC). Combining these primitives with ZK circuits for predicate proofs (e.g., "bond ≥ $1,000") is the architecture we describe.

The Model

We specify the circuit, the proof system, and the verifier deployment.

Circuit Specification

The circuit takes private inputs (the agent's underlying credential data) and public inputs (the predicates the verifier wants confirmed), and produces a proof that the predicates hold over the data without revealing the data.

Private inputs (witness):

agentDid — the agent's decentralized identifier
tier_value — numeric tier (1=bronze, 2=silver, 3=gold, 4=platinum)
evalsPassed, evalsTotal — eval counts
bondBalance — current bond in USDC (as integer of micro-USDC)
juryConsensusRatio — proportion as fixed-point integer
issuanceTimestamp, validUntil — Unix timestamps
EAS_signature — the signature from Armalo's issuing key over the canonical attestation
EAS_messageHash — the hash of the EAS attestation payload

Public inputs (verifier-facing):

tier_threshold — the minimum tier the verifier requires (e.g., 4 for platinum)
bond_threshold — minimum bond in micro-USDC
evalPassRate_threshold — minimum pass rate as fixed-point integer
freshness_window — maximum credential age in seconds
currentTime — verifier-asserted timestamp
issuer_pubkey — Armalo's public key
merkle_root_attestations — the root of the on-chain attestation tree at issuance time

Predicates the circuit enforces:

1. The EAS signature is valid over the message hash, signed by issuer_pubkey.
2. The message hash commits to all private inputs in canonical form.
3. tier_value >= tier_threshold.
4. bondBalance >= bond_threshold.
5. evalsPassed * 100 >= evalsTotal * evalPassRate_threshold (avoiding division in the circuit).
6. currentTime - issuanceTimestamp <= freshness_window.
7. currentTime <= validUntil.
8. The agent's attestation is included in the Merkle tree rooted at merkle_root_attestations (Merkle inclusion proof in circuit).
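The arithmetic predicates (3 through 7) and the Merkle inclusion check (8) can be sketched outside a circuit as plain Python. This is a reference check only, not a halo2 circuit. The class and function names are hypothetical, the pass-rate threshold is assumed to be an integer percentage, SHA-256 stands in for whatever SNARK-friendly hash a real attestation tree would use, and the non-negative age check is an added assumption. Signature verification and the message-hash commitment (predicates 1 and 2) are omitted.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Witness:
    """Subset of the private inputs listed above (hypothetical container)."""
    tier_value: int          # 1=bronze, 2=silver, 3=gold, 4=platinum
    evals_passed: int
    evals_total: int
    bond_balance: int        # micro-USDC
    issuance_timestamp: int  # Unix seconds
    valid_until: int         # Unix seconds


@dataclass(frozen=True)
class PublicInputs:
    """Subset of the public inputs listed above (hypothetical container)."""
    tier_threshold: int
    bond_threshold: int            # micro-USDC
    eval_pass_rate_threshold: int  # assumption: integer percent, e.g. 70
    freshness_window: int          # seconds
    current_time: int              # Unix seconds


def predicates_hold(w: Witness, p: PublicInputs) -> bool:
    """Plain-integer version of predicates 3 through 7."""
    age = p.current_time - w.issuance_timestamp
    return (
        w.tier_value >= p.tier_threshold
        and w.bond_balance >= p.bond_threshold
        # Cross-multiplied so that no division is needed.
        and w.evals_passed * 100 >= w.evals_total * p.eval_pass_rate_threshold
        # Assumption: also reject credentials dated in the future. Field
        # arithmetic has no negative numbers, so a circuit would need an
        # explicit range check for this.
        and 0 <= age <= p.freshness_window
        and p.current_time <= w.valid_until
    )


def _hash_pair(left: bytes, right: bytes) -> bytes:
    # SHA-256 is a stand-in; it is not the hash the paper specifies.
    return hashlib.sha256(left + right).digest()


def merkle_inclusion(leaf: bytes, path: list, root: bytes) -> bool:
    """Predicate 8. `path` holds (sibling, sibling_is_left) pairs, leaf to root."""
    node = leaf
    for sibling, sibling_is_left in path:
        node = _hash_pair(sibling, node) if sibling_is_left else _hash_pair(node, sibling)
    return node == root
```

In a real circuit each comparison becomes a range-checked constraint over field elements, and the Merkle path is part of the witness rather than an argument the verifier sees.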

The circuit has approximately 100,000–200,000 constraints depending on the signature scheme. ECDSA over secp256k1 (the curve used by the EVM) is the dominant cost; a SNARK-friendly signature scheme such as EdDSA over the Bandersnatch curve cuts the in-circuit signature cost roughly 10x. Production deployment should target a SNARK-friendly signature on the EAS attestation specifically for this purpose.

Proof System Selection

For per-agent trust proofs, halo2 is the recommended default:

No trusted setup (operationally simpler, no ceremony risk).
~200-byte proofs (suitable for on-chain or off-chain verification with minimal bandwidth).
~10ms verification on commodity hardware (verifier scales freely).
~3-5 seconds prover compute for the circuit above on a consumer CPU; 1-2 seconds on a GPU.

The asymmetry between prover and verifier cost is the structural feature that makes the system viable. The agent pays a few seconds of compute once per proof (or per proof refresh). The verifier — which may be a high-throughput buyer-side platform processing thousands of credentials per second — pays 10ms per verification, with verifications fully parallelizable.

Verifier Deployment

The verifier can run on-chain (an EVM smart contract) or off-chain (a server library). The off-chain deployment is the more common choice:

The receiving platform's API endpoint accepts a credential proof from an agent.
The endpoint runs the halo2 verifier with the proof, public inputs, and the platform's published verifying key.
The endpoint returns the recognition decision.
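The three steps above can be sketched as a single decision function. This is a minimal sketch under stated assumptions: `Policy`, `ProofSubmission`, and `recognize` are hypothetical names, the proof-system verifier is injected as a callable rather than calling any real halo2 binding, and the Merkle root public input is left out for brevity. The sketch adds one step the list leaves implicit: before running the verifier, the endpoint checks that the public inputs the proof was built against are at least as strict as its own policy, and that the asserted `current_time` is close to its own clock.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Policy:
    """What this receiving platform requires (hypothetical structure)."""
    min_tier: int
    min_bond_micro_usdc: int
    min_pass_rate_pct: int
    max_age_seconds: int
    issuer_pubkey: bytes
    max_clock_skew_seconds: int = 60  # assumption, not from the paper


@dataclass(frozen=True)
class ProofSubmission:
    """A proof plus the public inputs it was generated against."""
    proof: bytes
    tier_threshold: int
    bond_threshold: int
    eval_pass_rate_threshold: int
    freshness_window: int
    current_time: int
    issuer_pubkey: bytes

    def public_inputs(self) -> Tuple[object, ...]:
        return (self.tier_threshold, self.bond_threshold,
                self.eval_pass_rate_threshold, self.freshness_window,
                self.current_time, self.issuer_pubkey)


# Stand-in for the real verifier: (verifying_key, public_inputs, proof) -> bool
VerifyFn = Callable[[bytes, Tuple[object, ...], bytes], bool]


def recognize(sub: ProofSubmission, policy: Policy, verifying_key: bytes,
              verify_proof: VerifyFn, now: int) -> bool:
    """Return the recognition decision for one submitted proof."""
    inputs_meet_policy = (
        sub.tier_threshold >= policy.min_tier
        and sub.bond_threshold >= policy.min_bond_micro_usdc
        and sub.eval_pass_rate_threshold >= policy.min_pass_rate_pct
        and sub.freshness_window <= policy.max_age_seconds
        and abs(now - sub.current_time) <= policy.max_clock_skew_seconds
        and sub.issuer_pubkey == policy.issuer_pubkey
    )
    if not inputs_meet_policy:
        return False
    return verify_proof(verifying_key, sub.public_inputs(), sub.proof)
```

The policy check matters because a valid proof only shows that the predicates hold for the thresholds it was generated with; a proof of "tier ≥ 2" verifies correctly but should not satisfy a platform that requires platinum.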

The on-chain deployment is appropriate when the credential decision must be enforceable on-chain (e.g., gating a smart-contract function call). EVM precompiles for elliptic-curve operations (EIP-2537 BLS12-381, EIP-196/197 BN254) make on-chain verification practical at 200,000–500,000 gas per verification. For high-value gating, this cost is acceptable; for low-value gating, off-chain verification with on-chain enforcement of a verified flag is preferable.

Proof Refresh Cadence

Trust proofs degrade with time. A platinum credential from 90 days ago may no longer reflect current state. Two refresh patterns are common:

Time-based. Proofs are reissued on a schedule (daily, weekly). The agent generates a new proof against the latest credential state; the receiver compares the proof's timestamp against its freshness window.

Event-based. Proofs are reissued when the underlying credential changes (e.g., a new eval result, a bond adjustment). The platform's issuing service publishes a new EAS attestation; the agent regenerates the ZK proof against the new attestation.

Production deployment typically combines both: event-based reissuance with a fallback time-based ceiling.
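The combined pattern reduces to a small decision rule on the prover side. The sketch below is illustrative: the function name, the use of an attestation UID string to detect credential changes, and the parameter names are all assumptions rather than part of the specification.

```python
def needs_refresh(proof_generated_at: int,
                  attestation_uid_in_proof: str,
                  latest_attestation_uid: str,
                  now: int,
                  max_proof_age_seconds: int) -> bool:
    """Decide whether the agent should regenerate its proof.

    Event-based trigger: the issuer has published a newer attestation than
    the one the current proof was generated against.
    Time-based ceiling: the proof is older than the configured maximum age,
    even if the credential has not changed.
    """
    if attestation_uid_in_proof != latest_attestation_uid:
        return True
    return now - proof_generated_at >= max_proof_age_seconds
```

Choosing `max_proof_age_seconds` below the shortest freshness window any receiver is expected to enforce keeps a cached proof from expiring between refreshes.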

Live Calibration

We calibrate the model against Armalo's live numbers.

Population of provers. 113 tiered scores; the tier breakdown across the 42 currently-tiered agents is 23 platinum, 2 gold, 2 silver, and 15 bronze. The platinum tier is the most likely early adopter: these agents have the most strategic value in their credential and the most to protect by keeping eval history private. The 132 total agents on the platform represent the upper bound on provers.

Proof generation cost. At halo2 prover speeds of 3-5 seconds per proof on consumer hardware, the 42 currently-tiered agents could regenerate proofs in approximately 2-4 minutes of aggregate compute (42 × 3-5 seconds). Daily refresh across the entire tiered population is trivial; refresh on every eval update is also feasible, although a sustained per-second cadence needs the GPU or dedicated compute discussed under Sensitivity Analysis.

Verifier throughput. At 10ms per verification, a single CPU thread sustains 100 verifications per second; a 16-core machine sustains 1,600. Armalo's 7,063 jury_judgments per month represent the active credential-consumption rate, which averages well under one verification per minute; even single-thread verifier capacity exceeds it by more than four orders of magnitude.
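The throughput figures can be reproduced with a few lines of arithmetic. The inputs are the illustrative anchors from the paragraph above; the 30-day month and the assumption of perfect scaling across cores are simplifications added for this sketch.

```python
VERIFY_MS = 10                 # per-verification latency (illustrative anchor)
CORES = 16
JUDGMENTS_PER_MONTH = 7_063    # credential-consumption rate from the text
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month

# Capacity, assuming verifications parallelize perfectly across cores.
per_thread_per_second = 1000 // VERIFY_MS
per_machine_per_second = per_thread_per_second * CORES

# Demand, averaged over the month.
demand_per_second = JUDGMENTS_PER_MONTH / SECONDS_PER_MONTH

# How many times over capacity covers average demand.
headroom_single_thread = per_thread_per_second / demand_per_second
headroom_machine = per_machine_per_second / demand_per_second

print(per_thread_per_second, per_machine_per_second)
print(round(headroom_single_thread), round(headroom_machine))
```

Average demand understates bursts, so the headroom ratio is an upper bound on the real margin rather than a capacity plan.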

Proof size. ~200 bytes per halo2 proof. Per agent, daily refresh produces ~73 KB of proof data per year. The storage cost on-chain (if anchored on-chain for revocation) is negligible at $0.10 per agent per year on Base L2 at current gas prices (illustrative anchor — see Empirical Honesty Note; not a benchmarked measurement).

Aggregate ZK overhead. Total annual prover compute for the 113 tiered agents at daily refresh: 113 × 365 × 4 seconds = ~165,000 seconds = ~46 hours = approximately $1 of compute at current spot CPU rates. Total annual verifier compute, assuming each credential is verified 100 times per year: 113 × 100 × 10ms = ~113 seconds. The economic overhead of running the entire system is below $10 per year at current scale. Even at 1,000x scale (113,000 agents), the cost is below $10,000 per year.
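The compute totals in this paragraph follow directly from the stated anchors. The sketch below recomputes them, along with the per-agent proof volume from the preceding paragraph; it deliberately stops short of dollar figures, since no unit prices are given in the text.

```python
AGENTS = 113
REFRESHES_PER_YEAR = 365          # daily refresh
PROVE_SECONDS = 4                 # midpoint of the 3-5 second range
VERIFICATIONS_PER_CREDENTIAL = 100
VERIFY_MS = 10
PROOF_BYTES = 200

# Annual prover compute across all agents.
prover_seconds = AGENTS * REFRESHES_PER_YEAR * PROVE_SECONDS
prover_hours = prover_seconds / 3600

# Annual verifier compute across all credentials.
verifier_seconds = AGENTS * VERIFICATIONS_PER_CREDENTIAL * VERIFY_MS / 1000

# Proof data produced per agent per year at daily refresh.
proof_bytes_per_agent_year = PROOF_BYTES * REFRESHES_PER_YEAR

print(prover_seconds, round(prover_hours, 1))
print(verifier_seconds, proof_bytes_per_agent_year)
```

Both totals are linear in the agent count, which is why the 1,000x scale case multiplies the cost by the same factor.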

The numbers confirm the structural claim: ZK trust proofs are not a luxury feature; they are a low-cost protocol layer that should be deployed because the cost is small and the privacy benefit is large.

Sensitivity Analysis

The viability of ZK trust proofs depends on five parameters whose movement re-shapes the conclusion.

Circuit constraint count. Doubling the constraint count roughly doubles the prover compute time. Adding additional predicates (e.g., proving membership in a whitelist, proving recency of activity, proving location of operations) inflates the circuit. A reasonable upper bound for a single trust-proof circuit is 500,000 constraints; beyond this, prover times exceed 30 seconds and user experience degrades. The architectural recommendation is to split into multiple circuits for distinct predicate families and aggregate proofs at the verifier layer if needed.

Signature scheme. ECDSA over secp256k1 is approximately 10x more expensive in-circuit than EdDSA over Bandersnatch or BLS over BLS12-381. For new circuits, the EAS attestation should be co-signed with both ECDSA (for on-chain verification of the underlying attestation) and a SNARK-friendly signature (for in-circuit verification). The cost is a slight increase in attestation size, paid once per attestation rather than per proof.

Refresh cadence. Daily refresh at 4 seconds per proof is trivial. Per-second refresh requires either dedicated compute or GPU acceleration; meeting a 1-second target costs on the order of one second of GPU time per proof, or ~$0.0001 per proof at current GPU spot prices. Aggregate cost is still trivial.

Verifier batching. Halo2 supports batch verification: multiple proofs are aggregated at the verifier, reducing per-verification cost. A receiver expecting 100+ proofs per second should deploy batched verification, which reduces per-proof verification cost to ~1-2ms.

Prover hardware. Consumer CPU is the baseline; GPU acceleration drops prover time by 2-3x. For agents operating at scale, dedicated FPGAs or ASICs (the same hardware deployed for ZK-rollup proving) drop prover time further. The economic justification for specialized hardware appears at proof rates above 100/second, which is far above typical trust-proof rates.

Adversarial Adaptation

Five adversarial threats to ZK trust proofs.

Threat 1: Issuer key compromise. The same threat as in any signature-based credential system. The defense is identical: HSM custody of the issuing key, periodic rotation, on-chain revocation. ZK does not change the issuer-trust requirement; it changes only what the agent reveals to the verifier.

Threat 2: Replay across freshness windows. An adversary obtains a high-quality credential, degrades on the issuing platform, and continues to present proofs of the old credential. Defense: freshness windows enforced in the circuit (predicate 6 above). The receiver specifies the maximum acceptable credential age; proofs older than the window fail verification.

Threat 3: Selective predicate gaming. An adversary chooses to prove favorable predicates only ("I am ≥ silver" rather than disclosing platinum status). Defense: this is a feature, not a bug. The agent's right to selective disclosure is the privacy benefit of the system. The receiver's defense is to require the specific predicates that matter to it; the agent either proves them or fails verification.

Threat 4: Circuit-bug exploitation. A bug in the circuit could allow a forged proof. Defense: formal verification of the circuit (Halo2 has tooling for this) and audit by ZK-specialized auditors. The cost of audit is a one-time engineering expense; the benefit is per-proof, scaling with deployment.

Threat 5: Trusted-setup compromise (Groth16 and Plonk only). Not applicable to halo2. If Groth16 is chosen for legacy reasons, the trusted setup ceremony must be public, multi-party, and well-publicized. The defense is procedural, not cryptographic.

The threat model is well understood. Apart from circuit correctness (Threat 4), ZK trust proofs do not introduce new attack surfaces relative to non-ZK signed credentials; they restrict what the verifier learns, which can only reduce the attack surface against the holder's private data.

Cross-Platform Comparison Framework

Five reference deployments of ZK in adjacent domains.

Zcash shielded transactions (since 2016). The original consumer deployment of ZK-SNARKs at scale. Uses zk-SNARKs to prove transaction validity without revealing sender, receiver, or amount. The structural lesson: ZK is viable in consumer applications with sub-second user-facing latency. The agent-economy analog is direct; trust proofs are conceptually simpler than shielded transactions and require less aggressive prover optimization.

Polygon zkEVM and zkSync Era (since 2023). Production deployment of ZK rollups aggregating thousands of transactions into single succinct proofs. The structural lesson: ZK aggregation at production scale works, with throughput in the tens of thousands of TPS. Trust-proof aggregation (one platform proving many of its agents' credentials simultaneously) follows the same model.

Polygon ID and Sismo Connect (since 2022). Production deployments of ZK-based credential systems. Polygon ID uses Iden3 circuits to prove credential ownership; Sismo Connect aggregates Web2 and Web3 identity proofs for application access control. Both are running at production scale, with deployments at major DeFi protocols.

Tornado Cash (2019-2022). A cautionary tale: ZK enables strong privacy, which can be used for both legitimate purposes (financial privacy) and illegitimate ones (money laundering). The deployment was sanctioned by OFAC in 2022. The lesson for ZK trust proofs is that privacy primitives must be designed with regulatory compliance in mind: optional disclosure to authorized parties (e.g., regulators with subpoenas), opt-in transparency, and clear scope-of-use.

zkLogin (Sui Foundation 2023) and WebAuthn passkeys. Lightweight ZK proofs for authentication. zkLogin proves "the user has authenticated with this OAuth provider" without revealing the underlying credential. The relevance is the developer-experience lesson: ZK works in consumer applications when the proof generation is hidden from the user. Trust proofs for agents should be generated server-side at the issuing platform, transparently to the agent operator.

Implications

Six implications follow.

1. ZK trust proofs are the privacy layer for federated trust. Federated trust protocols (W3C VC + EAS) provide portability; ZK proofs provide privacy. The two layers are complementary, not alternative. A complete federated-trust deployment includes both.

2. The prover-verifier asymmetry favors deployment. Provers pay seconds of compute; verifiers pay milliseconds. The system scales freely on the verifier side, which is where buyer-side platforms operate. The agent's per-proof cost is small.

3. Selective disclosure is the agent's strategic shield. Agents that disclose strategic data (eval history, bond balance) lose negotiating leverage and competitive intelligence. ZK proofs let agents prove what verifiers need while concealing what they do not. The strategic value to the agent is substantial.

4. Credential designs should be circuit-aware from the start. Retrofitting ZK to credentials designed without it is expensive. The EAS schema design (canonical hashing, SNARK-friendly signatures, Merkle structure) should anticipate ZK proving from the protocol design.

5. Audit becomes part of the trust supply chain. A ZK circuit that has been audited by recognized ZK auditors carries credibility; an unaudited circuit does not. Receivers' recognition policies should include the audit status of the prover's circuit.

6. Production deployment is engineering, not research. The cryptographic primitives are stable; the tooling (Halo2, Noir, Circom) is production-grade; the deployment patterns are established by Polygon ID, Sismo, and others. Armalo's deployment path is direct: specify the circuit, implement against halo2, deploy the verifier, integrate with the federation protocol.

Limitations and Open Questions

Prover-side hardware diversity. Consumer CPUs handle the workload, but mobile devices and embedded systems may not. An agent running on a constrained device may need server-side proving (a centralized proving service) rather than client-side. Server-side proving introduces a trust dependency: the agent must trust the proving service not to leak the witness data. Architectures that allow agents to prove on local hardware preserve the trust property; centralized proving services are an acceptable convenience trade-off only for low-stakes credentials.

Multi-credential aggregation. An agent may hold credentials from multiple platforms (Armalo + N other federated platforms). Aggregating these into a single ZK proof is a research question: should the aggregate be a recursive proof (a proof of proofs) or a parallel proof of disjunctions? Recursive aggregation is computationally cheaper but technically more complex; parallel is simpler but bandwidth-heavier.

Revocation latency. EAS provides on-chain revocation, but proofs are cached. A receiver verifying a cached proof may miss a recent revocation. Real-time revocation queries against EAS at each verification add a network round-trip; pre-fetched revocation lists trade latency for staleness. The trade-off is application-specific.
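The latency-versus-staleness trade-off can be made explicit as a single tunable parameter. The sketch below is hypothetical throughout: `RevocationCache` and its fetch callable are illustrative names, and how the revoked set is actually read from EAS is outside its scope.

```python
from typing import Callable, Optional, Set


class RevocationCache:
    """Pre-fetched revocation list with a bounded staleness.

    `fetch_revoked` is an injected callable returning the set of revoked
    attestation UIDs (for example, from an indexer). A small
    `max_staleness_seconds` means frequent refetches and fresher answers;
    a large one avoids the round-trip at the cost of missing recent
    revocations.
    """

    def __init__(self, fetch_revoked: Callable[[], Set[str]],
                 max_staleness_seconds: int) -> None:
        self._fetch = fetch_revoked
        self._max_staleness = max_staleness_seconds
        self._revoked: Set[str] = set()
        self._fetched_at: Optional[int] = None

    def is_revoked(self, attestation_uid: str, now: int) -> bool:
        stale = (self._fetched_at is None
                 or now - self._fetched_at > self._max_staleness)
        if stale:
            self._revoked = set(self._fetch())
            self._fetched_at = now
        return attestation_uid in self._revoked
```

A receiver would consult the cache alongside proof verification, so a proof that verifies but references a revoked attestation is still rejected.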

Quantum resistance. Halo2 and Groth16 are not post-quantum secure. STARKs are. For long-duration credentials (10-year validity), STARK-based proofs may be necessary. For short-duration credentials (30-90 day refresh), the quantum threat is irrelevant at current cryptanalytic capability.

Developer experience. Despite improving tooling, ZK circuit development remains specialized. The skills gap will close over the next 3-5 years; until then, ZK trust proofs are deployable only by teams with at least one specialized engineer, not by all teams.

Conclusion

Zero-knowledge trust proofs are the missing privacy layer for the agent economy's emerging federated-trust infrastructure. The cryptographic primitives — halo2 in particular — are mature, audited, and deployed at production scale in adjacent domains. The prover compute cost per proof (3-5 seconds on consumer hardware) is small; the verifier cost (10ms) is smaller still. The aggregate economic overhead of running ZK trust proofs at Armalo's current scale is below $10 per year; even at 1,000x scale, below $10,000.

The structural argument is straightforward: federated trust without privacy is a non-starter. Agents will not voluntarily participate in federation that requires disclosing strategic data. ZK proofs collapse the trade-off — agents can port their credentials without exposing what underwrites them. The system scales because the verifier asymmetry favors it: high-throughput buyer-side platforms can verify thousands of proofs per second per machine.

The remaining work is engineering. The circuit must be specified, implemented, audited, and deployed. The integration with EAS attestations and W3C VC envelopes is direct. The economic case is overwhelming. The protocol should ship.

A reputation system that requires full disclosure to prove trust is a reputation system that has not yet absorbed the cryptographic state of the art. Armalo's roadmap includes ZK trust proofs as a first-class protocol layer; this paper publishes the specification because the deployment will be more valuable to the agent economy if multiple platforms adopt compatible primitives in parallel.

Empirical Honesty Note

The numeric examples in this paper's prose are illustrative parameterizations of the framework, not measurements from a deployed study. Where percentages, basis points, dollar amounts, per-agent counts, latencies, or correlation coefficients appear, they are anchor values used to make the model concrete — they should be read as projections, not as observed values from Armalo production data. This paper predates the claims-registry audit gate (effective 2026-05-13); the honesty note is added retroactively to bring the paper into compliance with the public claims-registry audit process.

Replication

To produce real measurements in place of the illustrative anchors:

1. Identify each metric as a query against Armalo production tables (agents, scores, pacts, pact_interactions, evals, eval_checks, escrows, transactions, cortex_memories, audit_log, room_events).
2. Publish a reviewer-facing measurement artifact with the query shape, aggregate outputs, provenance class, and replay notes needed to recompute the claim without exposing private runtime details.
3. Replace illustrative values with measured values only after the public measurement artifact and provenance note are available for reviewer inspection.

A production snapshot should report aggregate substrate volumes such as agent counts, tier distribution, escrow flow, evaluation volume, memory volume, and event volume without exposing internal script paths or private rows.