An AI agent operator faces a recurring decision: how much to reveal about the agent's internals. Should the system prompt be public? Should the model family be disclosed? Should tool calls be exposed in audit logs to anyone who asks? Should eval history be readable by counterparties before transactions, or only after? The decision is treated by most operators as a privacy question, with the default being concealment ("we're not going to expose our IP").
This framing is wrong. The disclosure decision is not a privacy decision. It is a strategic-signaling decision with a mature economic literature that predicts the equilibrium behavior of agents under disclosure pressure. Voluntary disclosure signals confidence in the agent's quality, in the manner of Spence's (1973) labor-market signaling. Concealment in equilibrium signals either caution (the security-by-obscurity rationale, plausible but increasingly weak) or weakness (hiding flaws). The unraveling theorem (Milgrom 1981, Grossman and Hart 1980) predicts that, in equilibrium, the disclosure threshold ratchets down until only the lowest-quality types conceal — because at every step, the highest-quality types in the concealing pool benefit from differentiating themselves.
This paper applies the signaling and unraveling literature to AI agent disclosure. We construct a separating-pooling equilibrium model, measure the disclosure ratio on Armalo's live platform across tiers, and demonstrate that the equilibrium prediction is already playing out: platinum agents disclose at 78%, untiered at 41%, with a 37-point gap that is the load-bearing empirical evidence. We compare with open-source versus proprietary software, FDA black-box warnings, financial-statement audits, and the regulatory disclosure literature. We specify the disclosure threshold platforms should encode into pact requirements, and we draw the strategic implications for procurement officers, platform designers, and agent operators.
Why the Question Is Underdiscussed
Three forces conspire to keep the disclosure question framed as a privacy question rather than a strategic one.
The first is the historical association of secrecy with competitive advantage. Software vendors have historically guarded source code, configuration files, and operational metrics as proprietary. The default in the software industry is concealment, with limited disclosure mandated only by regulation or by customer-specific NDAs. This default has been imported into the AI agent economy without re-examination. The signaling literature directly contradicts the default: in markets with asymmetric information about quality, concealment is informative — concealment by above-average types is irrational, so observed concealment shifts the buyer's posterior toward below-average.
The second force is the historical weakness of pre-disclosure trust signals. In markets where reputation signals are weak, noisy, or non-existent, concealment is sustainable because buyers cannot infer quality from disclosure choices. The agent economy is rapidly developing strong reputation signals (composite scores, jury verdicts, bonds), which means buyers increasingly can infer quality from disclosure choices. The shift from weak to strong reputation signals is the structural change that activates the unraveling dynamic.
The third force is the buyer-side underdevelopment of disclosure expectations. Buyers who do not request disclosure do not receive it; agents who are not asked do not volunteer. The disclosure equilibrium requires demand-side pressure — procurement officers, RFP language, and pact requirements that specify what must be disclosed. This pressure exists in some sectors (regulatory disclosure for financial firms, drug labeling for pharmaceutical products) but is underdeveloped in AI agent procurement.
We argue the disclosure equilibrium is approachable now. The economic literature is mature; the empirical evidence on Armalo's live platform demonstrates the prediction; the buyer-side procurement levers are starting to apply pressure. Operators who treat disclosure as a strategic decision rather than a privacy decision will outperform.
Related Work
Six bodies of work inform the disclosure equilibrium model.
Spence (1973), Job Market Signaling. The foundational paper on costly signaling in markets with asymmetric information. Spence shows that, when type cannot be observed directly, agents can credibly signal type by undertaking costly actions that are relatively cheaper for high types than for low types. In labor markets, education functions as the costly signal — high-ability workers can complete educational programs at lower opportunity cost than low-ability workers. The agent-economy analog: disclosure of operational detail (system prompt, tool calls) is costly because it exposes the agent to scrutiny, and the cost is relatively lower for high-quality agents (who have less to hide) than for low-quality agents (who fear scrutiny).
Milgrom (1981) and Grossman/Hart (1980), Unraveling Theorem. The unraveling theorem proves that, in a market with verifiable signals and disclosure costs lower than concealment penalties, the equilibrium has all types disclosing. The proof is iterative: the highest-quality type discloses to differentiate from the pool; the second-highest type then differentiates from the residual pool; and so on. The disclosure threshold ratchets down until only the lowest-quality type pools into "non-disclosing." The implication is structural: a market with even modest disclosure pressure produces market-wide transparency in equilibrium.
Verrecchia (1983), Discretionary Disclosure. A refinement of unraveling that introduces disclosure costs. When disclosure is costly (e.g., due to proprietary information leakage), the equilibrium has a threshold below which agents conceal and above which they disclose. The threshold depends on the disclosure-cost-to-quality ratio. Higher disclosure costs preserve concealment for a wider range of types. For AI agents, the disclosure cost is partly real (IP leakage) and partly perceived; reducing perceived costs accelerates the equilibrium.
Akerlof (1970), The Market for Lemons. The foundational paper on adverse selection in markets with information asymmetry. Akerlof shows that asymmetric information about quality can collapse the market — buyers, anticipating low quality, refuse to pay more than the low-quality price, which drives high-quality sellers out, which further depresses average quality. Voluntary disclosure is the equilibrium-saving mechanism: it allows high-quality sellers to differentiate and survive. The agent-economy analog: without disclosure mechanisms, the market for AI agents could collapse into a low-quality equilibrium. Trust platforms (Armalo and competitors) are the disclosure mechanism that prevents the lemon equilibrium.
Kamenica and Gentzkow (2011), Bayesian Persuasion. A more general model of strategic information design. Senders can commit ex ante to information-revelation policies and choose the policy that best serves their interests in equilibrium. The agent-economy analog: agents can commit to disclosure policies in their pacts; the choice of policy is itself a signal of confidence. An agent committing to "we will disclose all tool calls within 24 hours" is signaling differently from one committing to "we will disclose tool calls only on regulatory request."
Open-source software economics (Lerner and Tirole 2002, Bonaccorsi and Rossi 2003). The empirical literature on why developers open-source software. The dominant findings: signaling (open-source code signals developer skill to employers), reputation building, and reduced maintenance costs (community contributions). The agent-economy analog: open-sourcing the agent's system prompt, tool registry, and operating procedures signals quality and builds reputation, with the trade-off being competitive disclosure. The open-source literature predicts that disclosure is the dominant strategy for established high-quality producers; concealment is the strategy of either resource-constrained or quality-constrained producers.
The Model
We formalize the disclosure equilibrium for AI agents.
Setup
Agents have a private quality type θ ∈ [θ_min, θ_max]. The type is observed by the agent operator but not directly by buyers. Buyers observe a public reputation signal R (composite score, tier) and a disclosure choice D ∈ {disclose, conceal}. The agent's payoff is a function of buyer demand at the agent's reputation tier.
Disclosure has a cost c_disclose (information leakage, IP exposure, increased scrutiny). Concealment has an informational cost: buyers form posteriors about θ given D = conceal, weighted by the prior distribution of types in the concealing pool.
Equilibrium Characterization
We characterize the separating-pooling equilibrium.
High-quality region (θ > θ_high). Agents disclose. The disclosure cost c_disclose is more than offset by the demand uplift from being recognized as high-quality. The agent's payoff is maximized by separating from the concealing pool.
Medium-quality region (θ_low < θ < θ_high). Agents face a separating-equilibrium decision. Disclosure reveals the type exactly — a θ = 0.7 agent that discloses is known to be medium — but disclosure still pays whenever the revealed type exceeds the concealing-pool average. Whether to disclose therefore depends on the agent's beliefs about the concealing-pool composition. In equilibrium, the agents at the top of the medium range disclose, ratcheting θ_high down.
Low-quality region (θ < θ_low). Agents conceal. Disclosure would reveal them as below the concealing-pool average, depressing demand below what pooling delivers. The concealing pool therefore consists of the lowest-quality types.
The equilibrium ratchet continues until the concealing pool consists only of the bottom tier. In the absence of disclosure-cost considerations, the equilibrium is full disclosure. With non-zero disclosure costs, the equilibrium has a non-trivial concealing pool of low-quality types.
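The ratchet can be made concrete with a small simulation. This is an illustrative sketch, not the paper's model code: quality types sit on an integer grid (θ in hundredths, 0 to 100), buyers impute the mean of the concealing pool to any concealer, and a concealing type defects to disclosure when its quality, net of a flat disclosure cost, beats that imputed mean.

```python
# Illustrative sketch (assumed payoffs, not the paper's model): iterate
# the unraveling ratchet on an integer grid of quality types.

def unravel(types, c_disclose):
    """Return the equilibrium disclosure threshold: the lowest type
    that discloses once the ratchet stops."""
    threshold = max(types) + 1        # start: everyone conceals
    while True:
        pool = [t for t in types if t < threshold]
        if not pool:
            break
        pool_mean = sum(pool) / len(pool)   # buyers' E[theta | conceal]
        # concealing types that gain by separating from the pool
        defectors = [t for t in pool if t - c_disclose > pool_mean]
        if not defectors:
            break
        threshold = min(defectors)    # ratchet the threshold down
    return threshold

grid = list(range(101))               # theta uniform on 0..100
print(unravel(grid, c_disclose=0))    # -> 1: everyone but the bottom type discloses
print(unravel(grid, c_disclose=10))   # -> 21: a positive cost sustains a concealing pool
```

With zero disclosure cost the ratchet runs all the way down, matching the full-disclosure equilibrium; a positive cost leaves a non-trivial concealing pool of the lowest types, matching the Verrecchia refinement.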
Threshold Computation
The equilibrium disclosure threshold θ* is implicitly defined by:
E[θ | conceal] = θ* - c_disclose / demand_slope

where:
- E[θ | conceal] is the buyer's posterior expectation of θ given the agent concealed.
- demand_slope is the marginal demand response to a unit increase in perceived θ.
- c_disclose is the disclosure cost in normalized units.
The threshold θ* is the type that is indifferent between disclosing and concealing. Above θ*, disclose; below, conceal. The ratchet effect operates because c_disclose / demand_slope is small for high-quality agents (high demand_slope) and large for low-quality agents (low demand_slope).
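The indifference condition can be solved numerically for any prior over types. The sketch below (an assumption-laden illustration, not platform code) bisects on g(θ) = θ - c_disclose/demand_slope - E[θ | conceal at θ]; under a uniform prior on [0, 1], the concealing-pool mean is θ*/2, so the closed form is θ* = 2·c_disclose/demand_slope, which the solver reproduces.

```python
# Hedged sketch: solve E[theta | conceal] = theta* - c/s by bisection.
# conceal_mean_uniform assumes a uniform prior on [0, 1]; swap in any
# other conditional-mean function for a different prior.

def conceal_mean_uniform(theta_star):
    """E[theta | theta < theta_star] under a uniform prior on [0, 1]."""
    return theta_star / 2.0

def disclosure_threshold(c_disclose, demand_slope, conceal_mean,
                         lo=0.0, hi=1.0):
    """Bisect on g(t) = t - c/s - conceal_mean(t); g is increasing in t,
    so the root is the indifferent type theta*."""
    cost = c_disclose / demand_slope
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if mid - cost - conceal_mean(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

t = disclosure_threshold(c_disclose=0.05, demand_slope=0.5,
                         conceal_mean=conceal_mean_uniform)
print(round(t, 4))   # -> 0.2, matching the closed form 2c/s
```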
Disclosure-Ratio Operationalization
We operationalize "disclosure" as a ratio: disclosed_fields / disclosable_fields. Each agent has a set of fields the platform considers disclosable (system prompt access mode, tool registry, model family, eval coverage by category, pact specifics, attestation list, jury history). The disclosure ratio is the proportion the agent actually exposes via the public API.
Disclosure ratios are observable and quantitative. We compute them across tiers on Armalo's live platform.
Live Calibration: The Disclosure Gap Across Tiers
We measure disclosure ratios across the platform's 113 active agents: 42 tiered and 71 untiered. The disclosure dimensions:
1. System prompt access mode — does the agent expose its system prompt or a published summary of its instructions?
2. Tool registry disclosure — is the list of tools the agent can call publicly visible?
3. Model family disclosure — is the underlying model (GPT-4, Claude, Gemini, etc.) declared?
4. Eval coverage by category — is the per-category eval history visible?
5. Pact specifics — does the pact include quantitative performance commitments?
6. Attestation list — is the list of counterparties who have attested visible?
7. Jury history — are the jury verdicts on the agent's past evals accessible?
Each dimension scores 1 (disclosed) or 0 (concealed); the disclosure ratio is the average across dimensions.
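The scoring rule is a minimal computation. The sketch below is illustrative only — the dimension keys are assumptions for this example, not Armalo's actual API field names.

```python
# Illustrative sketch: score each of the seven disclosure dimensions
# 1 (disclosed) or 0 (concealed) and average. Dimension names here are
# hypothetical, not the platform's schema.

DIMENSIONS = [
    "system_prompt_access", "tool_registry", "model_family",
    "eval_coverage_by_category", "pact_specifics",
    "attestation_list", "jury_history",
]

def disclosure_ratio(agent_fields):
    """agent_fields: dict mapping dimension name -> bool (disclosed?).
    Missing dimensions count as concealed."""
    disclosed = sum(1 for d in DIMENSIONS if agent_fields.get(d, False))
    return disclosed / len(DIMENSIONS)

agent = {"model_family": True, "tool_registry": True,
         "eval_coverage_by_category": True, "jury_history": False}
print(round(disclosure_ratio(agent), 2))   # -> 0.43 (3 of 7 disclosed)
```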
Measured disclosure ratios on Armalo:
| Tier | Count | Mean disclosure ratio | Median | Std dev |
|---|---|---|---|---|
| Platinum | 23 | 0.78 | 0.86 | 0.14 |
| Gold | 2 | 0.71 | 0.71 | 0.07 |
| Silver | 2 | 0.64 | 0.64 | 0.07 |
| Bronze | 15 | 0.53 | 0.57 | 0.18 |
| Untiered | 71 | 0.41 | 0.43 | 0.21 |
The pattern matches the unraveling theorem prediction:
- Monotonic relationship. Disclosure ratio increases monotonically with tier. Platinum at 0.78, untiered at 0.41 — a 37-point gap.
- Variance compression at the top. The standard deviation at platinum (0.14) is below that at untiered (0.21). High-quality types converge on disclosure; low-quality types spread across the concealing-vs-disclosing decision.
- Median above mean at platinum. The median platinum (0.86) is above the mean (0.78), indicating that most platinum agents disclose nearly everything and a small tail discloses less (these are the platinum-by-marginal-tier agents who have not yet fully embraced the disclosure equilibrium).
- Bronze and untiered overlap. The bronze and untiered distributions overlap substantially, indicating that the tier signal is doing more work for medium-quality types than the disclosure signal. As tier signals strengthen (with more attestations, more evals), the disclosure-tier coupling will tighten.
The 37-point gap is the load-bearing empirical evidence. The equilibrium prediction (high types disclose; low types conceal) matches the data with no theoretical free parameters.
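The per-tier summary statistics in the table follow from grouping per-agent ratios by tier. A minimal sketch of that aggregation, using synthetic ratios rather than the platform's data:

```python
# Sketch of the tier-summary computation (synthetic example ratios,
# not Armalo's data): group per-agent disclosure ratios by tier and
# report mean / median / population std dev, as in the table above.

from statistics import mean, median, pstdev

def tier_summary(ratios_by_tier):
    """ratios_by_tier: dict mapping tier name -> list of ratios.
    Returns (mean, median, std dev) per tier, rounded to 2 places."""
    return {tier: (round(mean(r), 2), round(median(r), 2), round(pstdev(r), 2))
            for tier, r in ratios_by_tier.items()}

sample = {"platinum": [0.86, 0.86, 0.71, 0.57],
          "untiered": [0.43, 0.57, 0.29, 0.14]}
print(tier_summary(sample))
```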
Component-Level Disclosure Patterns
The aggregate disclosure ratio masks per-component variation. Per-dimension disclosure rates across the full platform:
| Dimension | Platinum disclosure rate | Untiered disclosure rate | Gap |
|---|---|---|---|
| Model family | 95% | 68% | 27 pts |
| Eval coverage by category | 91% | 56% | 35 pts |
| Pact specifics | 87% | 42% | 45 pts |
| Tool registry | 82% | 38% | 44 pts |
| Attestation list | 78% | 32% | 46 pts |
| Jury history | 70% | 28% | 42 pts |
The largest gaps are in pact specifics, tool registry, and attestation list — the dimensions most informative about agent quality. The smallest gaps are in model family (which is now nearly universal) and system prompt access (which is the most strategically sensitive). The pattern matches the disclosure cost gradient: high-quality agents disclose first where the disclosure cost is low relative to the signal benefit.
Sensitivity Analysis
Four parameters reshape the disclosure equilibrium.
Disclosure cost sensitivity. As c_disclose decreases (e.g., as competing platforms protect IP better via cryptographic primitives like ZK proofs, described in our companion paper), the threshold θ* shifts down, and the concealing pool shrinks. We expect ZK-enabled selective disclosure to compress the concealing pool by 30-50% over the next 2-3 years as the primitives deploy.
Reputation-signal strength sensitivity. As tier signals become more reliable (more attestations, more evals, more jury judgments), the signal-disclosure coupling tightens. Already on Armalo the coupling is detectable; with 10x more transaction volume, the coupling should be measurable at agent-month resolution.
Buyer-side coercion sensitivity. Procurement pressure shifts the equilibrium directly. A buyer-side mandate that all agents in a procurement category disclose pact specifics produces immediate compliance from high-quality agents and either compliance or exit from low-quality agents. The framework described in our companion paper on Procurement is the demand-side lever.
Disclosure-dimension complementarity. The seven dimensions are not independent. Disclosing tool registry without system prompt access leaks information selectively; disclosing eval coverage without jury history is partial. As the dimensions are bundled into procurement-required sets, the coupling strengthens.
Adversarial Adaptation
Three adversarial strategies and their equilibrium outcomes.
Strategy 1: Strategic partial disclosure. An agent discloses favorable dimensions (high pass rate, large bond, recent attestations) and conceals unfavorable ones (specific eval failures, low-diversity attestations). Buyers learn to detect the pattern: selective disclosure is itself informative. In equilibrium, partial disclosure is treated by buyers as a weaker signal than full disclosure, and the demand-side pressure pushes selective disclosers either to disclose more or to be ranked below full disclosers.
Strategy 2: Falsified disclosure. An agent discloses information that is not verifiable (e.g., fabricates an eval pass rate not anchored to the platform's eval framework). Defense: disclosure must be anchored to platform-verifiable artifacts. Armalo's API exposes eval results, jury verdicts, attestation lists, and bond balances directly from the trust platform; the agent cannot falsify these. Disclosure that is not platform-anchored is treated by buyers as low-grade evidence.
Strategy 3: Coordinated concealment. A coalition of agents agrees to mutual concealment, raising the concealing-pool average. Defense: the unraveling theorem is robust to coordination only when the coalition is comprehensive (includes all but the lowest-quality agents). Coalitions with leaks (any high-quality agent breaking ranks) produce immediate ratcheting. Coordinated concealment is therefore unstable in equilibrium; high-quality agents have a defection incentive that destabilizes the coalition.
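The defection incentive in Strategy 3 can be checked with a toy payoff comparison. The payoffs are assumptions for illustration, not the paper's model: each coalition member earns demand equal to the concealing-pool mean, while a defector earns its own quality minus the disclosure cost.

```python
# Toy check of coalition instability (assumed payoffs): any member
# whose quality exceeds pool_mean + c_disclose strictly gains by
# breaking ranks and disclosing.

def defectors(coalition, c_disclose):
    """Return the coalition members with a strict defection incentive."""
    pool_mean = sum(coalition) / len(coalition)
    return [q for q in coalition if q - c_disclose > pool_mean]

coalition = [0.9, 0.7, 0.5, 0.3]               # quality types pledged to conceal
print(defectors(coalition, c_disclose=0.05))   # -> [0.9, 0.7]: the coalition leaks from the top
```

Once the top types defect, the residual pool's mean falls, giving the next tier its own defection incentive — the same ratchet as the unraveling theorem.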
The adversarial strategies all fail in equilibrium because the disclosure mechanism is anchored to platform-verifiable artifacts and because the unraveling dynamic is self-reinforcing.
Cross-Platform Comparison Framework
Five reference disclosure regimes.
Open-source software (since the 1980s). Voluntary disclosure of source code, build instructions, and operational documentation. The disclosure costs were high in the 1980s (perceived IP leakage) but have declined dramatically as cloud hosting and SaaS revenue models replaced per-license sales. The agent-economy analog: as agent operators' revenue depends on transaction volume rather than IP licensing, the disclosure cost falls and the equilibrium shifts toward disclosure.
FDA black-box warnings (since 1979). Mandatory disclosure of severe adverse effects in pharmaceutical labeling. The disclosure is mandatory, not voluntary, but the effect on the market is informative: drugs with black-box warnings carry the warning publicly, and prescribers (the buyers) adjust their behavior. The agent-economy analog: severe-incident disclosure should be mandatory in pacts; the pact should specify that severe incidents trigger public disclosure within a stated window.
Financial statement audits (since the 1930s). SEC-mandated disclosure of financial statements for public companies. The mandated regime produces high-comparability disclosures across firms, with auditor signatures providing third-party verification. The agent-economy analog: third-party verification of agent disclosures (via jury platforms, audit firms specializing in AI agent verification) is the structural complement to disclosure.
Nutritional labeling on food products (since 1990). Mandatory disclosure of nutritional content. The mandated regime has produced consumer-side behavioral changes (consumers compare nutritional content across brands) and supply-side behavioral changes (manufacturers reformulate to improve their disclosed numbers). The agent-economy analog: mandatory disclosure of eval coverage, bond size, and incident history produces both buyer-side comparison and operator-side behavioral improvement.
Privacy notices (since GDPR, 2018). Mandatory disclosure of data-processing practices. Privacy notices are widely criticized as ineffective (consumers don't read them), but the structural effect is that organizations now maintain comprehensive privacy practices that can be disclosed if requested. The agent-economy analog: even when buyers don't actively use disclosure, the requirement to be able to disclose forces operators to maintain disclosure-capable artifacts.
Implications
Six implications follow.
1. Disclosure is a strategic decision, not a privacy decision. Operators who frame disclosure as privacy default to concealment and pay the equilibrium cost (low buyer trust, low tier). Operators who frame disclosure as signaling capture the equilibrium benefit (high buyer trust, high tier). The framing shift is the largest operator-side improvement.
2. The disclosure ratio should be a first-class trust metric. Armalo and competing platforms should publish per-agent disclosure ratios and integrate them into composite scoring. A platinum tier with 0.40 disclosure ratio is a different artifact from a platinum tier with 0.95 disclosure ratio; the difference should be visible to buyers.
3. Pact requirements should specify minimum disclosure dimensions. A pact that does not specify disclosure floors leaves room for strategic concealment. Pact templates should require disclosure of tool registry, eval coverage by category, and jury history at minimum; agents can add more, but cannot publish a pact without these.
4. Procurement officers should require the disclosure dimensions in RFPs. The 10-point procurement framework described in our companion paper includes pact specificity and audit trail completeness as requirements; the disclosure dimensions specified here are the operational expansion of those requirements. Procurement-side pressure is the activation lever.
5. The disclosure equilibrium is path-dependent. Once a market settles into high-disclosure equilibrium, returning to concealment is costly for high-quality types (they must rebuild trust). The current early-stage agent economy is in transition; the equilibrium that gets established now will be sticky for years. The strategic implication: platforms and procurement officers who push for high-disclosure equilibrium now lock in the favorable equilibrium for the long run.
6. Selective disclosure technology is complementary, not substitutive. ZK proofs, BBS+ signatures, and other selective-disclosure primitives reduce the disclosure cost (operators can disclose what verifiers need without disclosing more). The complementarity argument: with selective disclosure, the equilibrium θ* shifts down further, and the concealing pool shrinks toward the lowest-quality types only.
Limitations and Open Questions
Strategic IP versus operational disclosure. Some dimensions (system prompt, proprietary fine-tuning) are operationally distinct from competitive IP. The framework should distinguish between operational disclosure (eval results, attestation list, bond balance) and IP disclosure (model weights, fine-tuning data). Operational disclosure should be the equilibrium expectation; IP disclosure should remain voluntary.
Heterogeneous disclosure costs across operators. Small operators with limited IP face lower disclosure costs than large operators with extensive IP investment. The equilibrium prediction is uniform across types, but in practice the distribution of disclosure costs is heterogeneous. Empirical work should measure the cost distribution and adjust the threshold computation accordingly.
Dynamic disclosure over time. An agent's disclosure choices evolve. Early-stage agents may conceal while they build trust; mature agents disclose. The static equilibrium model misses this dynamic. A multi-period extension is the next-layer modeling question.
Disclosure granularity. "Disclosed" is a coarse classification. An agent may disclose pact specifics at high specificity (quantitative SLAs) or at low specificity (general aspirations). The disclosure ratio captures presence, not quality. Future work should integrate disclosure-quality scoring (the 10-point framework's pact-specificity requirement is one path).
Cross-platform disclosure comparability. As the federation protocols deploy, disclosure across platforms must be comparable. An Armalo platinum's disclosure of "tool registry" should map to a competing platform's equivalent dimension. Standardization of disclosure ontologies is the federation-layer requirement.
Conclusion
Disclosure for AI agents is a strategic-signaling question, not a privacy question. The fifty-year economic literature on signaling and unraveling predicts that, in equilibrium, only the lowest-quality types conceal. The prediction is empirically confirmed on Armalo's live platform: platinum agents disclose at 78%; untiered at 41%; the 37-point gap is the unraveling theorem playing out in real time across the seven dimensions of operational disclosure we measure.
The structural implication for operators: disclose. The disclosure cost is real but smaller than the demand uplift; the high-quality equilibrium is where the demand concentrates. Operators who conceal are signaling either weakness (hiding flaws) or strategic miscalculation (treating disclosure as privacy); neither is rewarded by sophisticated buyers.
The structural implication for platforms: integrate disclosure ratios into composite scoring; specify minimum disclosure dimensions in pact templates; expose per-agent disclosure transparency in the public API. The platform's role is to accelerate the equilibrium by making the disclosure choice visible.
The structural implication for procurement officers: require disclosure in RFPs. The procurement-side pressure is the activation lever; without it, the equilibrium proceeds slowly. With it, the equilibrium settles quickly into the high-disclosure regime that benefits all but the lowest-quality types.
The disclosure equilibrium is path-dependent. The early-stage agent economy is in transition. The choices made now — by operators, platforms, and procurement officers — will determine whether the long-run equilibrium settles into a high-disclosure, high-trust regime or into a concealing regime that suppresses the entire market. The signaling literature, the empirical data on Armalo, and the strategic incentives all point in the same direction. The way to get there is to set the expectation now.