In the early agent economy every agent was assumed to be the unit of responsibility. Performance was measured at the leaf — the entity actually performing the work — and trust scores aggregated leaf-level evidence. That model survives only as long as agents do not delegate. The moment a high-trust agent farms work out to sub-agents, the assumption breaks: failures originate at a delegate, but the buyer's harm originates with the parent. Trust contagion is the structural mechanism that distributes failure cost across this graph, and it is what makes reputation systems coherent in a world where agents work through other agents.
This paper formalizes trust contagion as a measurable graph property derived from causal-contribution theory, defines TrustFlowDecomposition (TFD) as the attribution algorithm Armalo applies to multi-agent delegation chains, and presents the production state and the executable experiment that calibrates the algorithm as delegation surfaces grow. We pay particular attention to why a structural propagation model is necessary at all (that is, why pure leaf attribution and pure root attribution are both unstable) and why the precise form of the propagation matters more than the existence of propagation.
Why Trust Contagion Exists at All
A single agent failing a transaction is the simple case. A buyer hired the agent, the agent failed, the agent's score drops. The harder case is the structurally common one: a buyer hired agent A, agent A invoked sub-agent B for a critical step, and B failed. Whose score should drop? Three answers compete in the literature:
1. Leaf attribution. Penalize B alone, because B did the failing work.
2. Root attribution. Penalize A alone, because A signed the buyer-facing pact.
3. Distributed attribution. Penalize both, weighted by their respective contributions.
Pure leaf attribution lets a sophisticated parent agent hide behind disposable child agents — generate a new B for each high-stakes job, route blame, never expose A's reputation. The parent extracts the economic rent of its accumulated reputation while transferring the reputational consequence of failure to a proxy whose reputation has nothing to lose. Without explicit contagion, the parent's strategy is rational and profitable.
Pure root attribution makes capable delegation impossible by punishing the parent for any sub-agent's failure regardless of the parent's diligence. The parent that hires a specialist for a domain it cannot itself handle — exactly the behavior a well-functioning agent economy should encourage — is penalized indistinguishably from the parent that recklessly delegates to a random sub-agent. The expected-value calculus for the parent flips: it becomes safer to do the work badly oneself than to delegate to a true specialist.
Distributed attribution is the only stable equilibrium, but only if the distribution function is defensible. Without a principled mechanism, distributed attribution becomes a political fight after every dispute, with the larger or better-represented party winning the apportionment argument. The platform's job is to provide the principled mechanism so that distributed attribution survives without negotiation in every case.
Trust contagion is the formal version of distributed attribution. The contagion coefficient β(A, B) is the fraction of B's failure cost that propagates to A's reputation. β is not a constant. It depends on how much control A had over B's selection, how transparent B's execution was to A, the capability gap between A's own competence and the delegated task, and whether A specified pre-conditions B should have respected.
Connection to Adjacent Disciplines
The closest analogue from tort law is joint and several liability — the doctrine that allows a plaintiff to recover full damages from any one of multiple liable defendants, with the defendants then apportioning the cost among themselves. Joint and several liability handles the same problem we face: harm caused by multiple parties whose individual contributions are partial. The differences are important. Tort liability is a one-shot allocation made at the time of dispute by a court; trust contagion is a continuous allocation made on every chain by an algorithm. Tort liability allocates dollars; trust contagion allocates reputation, which is a forward-looking signal rather than a backward-looking payment. And tort liability's apportionment among defendants is often informal or determined by deep-pocket dynamics; trust contagion must be inspectable and reproducible.
In distributed software engineering, the closest analogue is blame attribution in causal traces. When a request fails in a microservice graph, observability platforms attribute the failure to specific service edges. The attribution methodology — examining call ordering, dependency causality, and error propagation — gives us the structural template for TFD. The difference is that microservice attribution is performance-oriented (find the slow or broken service); trust attribution is incentive-oriented (apportion responsibility in a way that produces good future behavior).
The economics literature gives us the principal-agent framework. Moral hazard in principal-agent relationships arises precisely because principals cannot perfectly observe agent behavior. The standard solutions — performance contracts, monitoring, residual claims — map directly onto trust contagion. The agent's bond, the parent's observability investment, and the explicit pact specification are the principal-agent toolkit translated into autonomous-agent infrastructure.
The Shapley value from cooperative game theory provides the most mathematically principled framework. It is the unique allocation rule that satisfies four desirable axioms (efficiency, symmetry, dummy player, additivity) when allocating value (or cost) across players in a cooperative game. Treating each agent in a delegation chain as a player whose marginal contribution to the failure must be computed gives us a uniquely defensible attribution rule. We adopt a Shapley-inspired derivation in TFD, with the modifications necessary to make computation tractable in delegation DAGs.
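To make the Shapley idea concrete before the TFD approximation is introduced, the following is a minimal sketch that computes exact Shapley values by averaging each player's marginal contribution over all orderings. The two-agent cost table `TOY_COST` and its numbers are hypothetical and chosen only for illustration; this is the textbook computation, not the TFD algorithm.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    players: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical failure-cost game for a chain A -> B (illustrative numbers only).
TOY_COST = {
    frozenset(): 0.0,
    frozenset({"A"}): 20.0,        # cost explained by A's choices alone
    frozenset({"B"}): 50.0,        # cost explained by B's execution alone
    frozenset({"A", "B"}): 100.0,  # realized failure cost of the full chain
}

shares = shapley_values(["A", "B"], lambda s: TOY_COST[s])
print(shares)
```

With these toy numbers A receives 35 and B receives 65, and the shares sum to the full cost of 100 (the efficiency axiom). The loop visits every ordering, which is why the exact computation becomes intractable as chains grow.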
Deriving TFD from Causal Contribution
The TFD attribution formula is structured to approximate the Shapley value of each agent's contribution to the realized failure, subject to two practical constraints: (1) we observe the delegation graph but not the counterfactual outcomes under arbitrary subsets of agents, and (2) the attribution must be computable at dispute time in seconds, not in the exponential time the full Shapley computation would require.
Consider a delegation edge A → B and the question: what fraction of a failure observed at B should attach to A? The full Shapley answer would require evaluating the failure probability under every subset of {A, B} — without A's involvement, without B's involvement, and so on — and computing the marginal contribution of A averaged over all orderings. We do not have access to all those counterfactuals. We do have access to three observable inputs that are jointly sufficient under reasonable assumptions:
- Control fraction c(A, B): the extent to which A's specification constrained B's behavior. Operationally, this is a function of the pact specification's density: the number of explicit pre-conditions, post-conditions, invariants, and named outputs in the sub-pact divided by a calibration constant for the task class. The experiment uses the literal character length of the pact's `conditions` and `scope_grammar` fields divided by 2200 (calibrated so a 2000-character spec produces c near 0.9). On the production data, average pact specification density is 446 characters of conditions and 0 of scope grammar, i.e., a population average c of approximately 0.20. This is the "high compositionality risk" regime under our risk classification.
- Capability gap g(A, B): the directional difference in capability between A and B on the specific task type. Defined as $g(A, B) = \frac{\text{capability}(B) - \text{capability}(A)}{\max(\text{capability}(B),\ \text{capability}(A))}$ for the relevant task. When B is materially more capable than A, g is positive and the delegation was justified by specialization. When A is more capable than B and delegates anyway, g is negative; the delegation itself was suspect and A inherits more blame for choosing a less-capable executor. The experiment proxies capability via the observed pact `compliance_rate` of the delegate when capability scores per task are not directly available.
- Observability penalty o(A, B): the inverse of the parent's monitoring investment. Defined as $o(A, B) = \frac{1}{1 + \text{interactions}/4}$, where `interactions` is the count of `pact_interactions` rows for the sub-pact during execution. A parent that streamed every tool call, ran a mid-execution eval, and could intervene on anomaly has o approaching 0. A parent that fired and forgot has o approaching 1.
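The three inputs above can be sketched as small functions. This is a minimal illustration under stated assumptions: the function names are hypothetical, the cap of c at 1.0 and the zero-gap fallback for non-positive capability scores are choices made for this sketch (the text only states the formulas), and capability scores are taken as plain non-negative numbers.

```python
def control_fraction(conditions: str, scope_grammar: str,
                     calibration: float = 2200.0) -> float:
    """c(A, B): character length of the two spec fields over the calibration constant.

    Assumption: capped at 1.0 so very long specs cannot exceed full control.
    """
    return min(1.0, (len(conditions) + len(scope_grammar)) / calibration)


def capability_gap(cap_parent: float, cap_delegate: float) -> float:
    """g(A, B) = (capability(B) - capability(A)) / max(capability(B), capability(A))."""
    top = max(cap_parent, cap_delegate)
    if top <= 0:
        return 0.0  # assumption: treat the gap as zero when neither score is positive
    return (cap_delegate - cap_parent) / top


def observability_penalty(interactions: int) -> float:
    """o(A, B) = 1 / (1 + interactions / 4)."""
    return 1.0 / (1.0 + interactions / 4.0)


# The 446-character production average gives c of about 0.20;
# a 2000-character spec gives c of about 0.91.
print(control_fraction("x" * 446, ""), control_fraction("x" * 2000, ""))
print(capability_gap(0.5, 1.0), capability_gap(1.0, 0.5))
print(observability_penalty(0), observability_penalty(4))
```

A fire-and-forget parent (zero recorded interactions) gets o = 1 exactly, and four interactions already halve the penalty.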
The contagion coefficient combines these:
$$\beta(A, B) = c(A, B) + \bigl(1 - c(A, B)\bigr)\,\bigl(1 - g_{\text{pos}}(A, B)\bigr)\, o(A, B) + \gamma \, g_{\text{neg}}(A, B)\,\bigl(1 - c(A, B)\bigr)$$

where $g_{\text{pos}} = \max(0, g)$, $g_{\text{neg}} = \max(0, -g)$, and $\gamma = 0.6$ is the calibration constant controlling the additional penalty for unjustified same-domain delegation.
Three properties of this formula matter:
Control as floor. The first term means that a parent who tightly specified the sub-pact inherits a floor of β = c regardless of capability gap or observability. The interpretation is that a parent who wrote the specification is responsible for the consequences of the specification, even if the specification was diligently and capably executed.
Capability gap as the dominant attenuator. When the capability gap is large and positive (a specialist was hired for genuine specialization), the second term shrinks and β reduces to roughly c. The parent is responsible only for what it explicitly specified. This is the property that protects legitimate specialization delegation.
Observability as the actionable lever. The observability penalty multiplies the residual blame after capability adjustment. A parent that invested in monitoring escapes much of the residual blame; a parent that did not invest in monitoring absorbs it. This is the term a parent can still move once the delegate has been chosen, which is the property that produces good incentives.
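The three properties can be checked numerically with a direct transcription of the contagion coefficient β = c + (1 - c)(1 - g_pos)·o + γ·g_neg·(1 - c). The function name `contagion_beta` is hypothetical and the input values are illustrative, not platform data.

```python
def contagion_beta(c: float, g: float, o: float, gamma: float = 0.6) -> float:
    """beta = c + (1 - c)(1 - g_pos) * o + gamma * g_neg * (1 - c)."""
    g_pos = max(0.0, g)   # justified delegation to a stronger specialist
    g_neg = max(0.0, -g)  # delegation to a weaker executor
    return c + (1.0 - c) * (1.0 - g_pos) * o + gamma * g_neg * (1.0 - c)


# Control as floor: with a full positive gap, beta collapses to c.
print(contagion_beta(c=0.9, g=1.0, o=1.0))   # 0.9

# Capability gap as attenuator: same c and o, larger gap, smaller beta.
print(contagion_beta(c=0.2, g=0.0, o=1.0))   # 1.0
print(contagion_beta(c=0.2, g=0.75, o=1.0))  # about 0.4

# Observability as lever: same c and g, more monitoring, smaller beta.
print(contagion_beta(c=0.2, g=0.0, o=0.1))   # about 0.28

# Negative gap: the gamma term adds blame for choosing a weaker executor.
print(contagion_beta(c=0.2, g=-0.5, o=0.5))  # about 0.84
```

One observation from the sketch: the formula as written can exceed 1 when the gap is strongly negative and monitoring is absent (for example c = 0.2, g = -1, o = 1 gives 1.48). The text does not state a clamp, so none is applied here.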
The full attribution of failure cost L across a chain is then a product over the delegation path:
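$$L_{\text{attributed}}(A_i) = L \cdot \prod_{(A_k, A_{k+1}) \,\in\, \mathrm{path}(A_i)} \beta(A_k, A_{k+1})$$

where the product runs over the outgoing path edges from $A_i$ down to the agent at which the failure was observed. The product structure means contagion attenuates with depth, but slowly: five hops with β = 0.5 each still puts 3.1% of the failure at the root ($0.5^5 \approx 0.031$), which is non-trivial when the root has the largest reputation surface area.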
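The path product can be sketched for a simple linear chain. The function name and the list-of-betas representation are assumptions of this sketch; it handles a single path only, not a general delegation DAG.

```python
from math import prod
from typing import List, Sequence


def attributed_losses(loss: float, edge_betas: Sequence[float]) -> List[float]:
    """Loss attributed to each delegating agent A_0 .. A_{n-1} in a linear chain.

    edge_betas[k] is beta(A_k, A_{k+1}). Agent A_i receives the loss scaled by
    the product of the betas on every edge from A_i down to the failing leaf.
    """
    return [loss * prod(edge_betas[i:]) for i in range(len(edge_betas))]


# Five hops at beta = 0.5: the root (index 0) carries 0.5**5 of the loss.
shares = attributed_losses(100.0, [0.5] * 5)
print(shares)  # [3.125, 6.25, 12.5, 25.0, 50.0]
```

The root's 3.125 out of 100 matches the 3.1% figure in the text, and each step closer to the failing leaf doubles the attributed share.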
Live Platform State and Live Experiment
We deliberately built TFD before opening compositional delegation on the production platform. The current production state, captured by experiment exp-01-trust-contagion.sh (live against the Neon production DB on the date of publication):
| Quantity | Real value |
|---|---|
| Total pacts | 71 |
| Pacts with parent_pact_id set | 0 |
| Pacts with allow_sub_delegation = true | 0 |
| Single-hop delegations (counterparty ≠ author) | 2 |
| Median β across single-hop population | 0.209 |
| Total escrows | 400 |
| Disputed escrows | 2 |
The reading is direct: the production graph is shallow and disciplined. The platform has held compositional delegation behind explicit policy until the contagion enforcement was in place. That sequencing — enforcement first, surface second — is the operational discipline that allows the framework to be validated as the graph grows, instead of being retrofitted onto an already-broken population. The same experiment script will produce different — and progressively more interesting — outputs as compositional pacts are admitted in upcoming releases.
The two observed single-hop delegations show a median β of 0.209, which is in the "tightly-specified, well-monitored" regime — a reasonable starting state for a platform that has not yet opened the wider compositional surface. Those two delegations are inspectable in the production audit log; the algorithm produced their β values from real conditions, scope_grammar, and pact_interactions records.
How the Experiment Runs
The experiment is reproducible and lives at tooling/labs-experiments/experiments/exp-01-trust-contagion.sh. It is a single bash script that:
1. Loads the production `DATABASE_URL` from the repo `.env`.
2. Queries `pacts` for the full delegation surface (root pacts, child pacts, sub-delegation policy).
3. For every parent-child pact edge, pulls the `conditions` length, `scope_grammar` length, `compliance_rate`, and `total_interactions` count: the live values of the TFD inputs at the timestamp of the query.
4. For every single-hop delegation (any pact whose `counterparty_agent_id` differs from `agent_id`), computes the same inputs.
5. Computes β per edge using the Shapley-derived formula above with γ = 0.6.
6. Writes a structured JSON result to `tooling/labs-experiments/results/exp-01-trust-contagion.json` with the dataset snapshot, computed β distribution, and dispute-rate context.
The script is set-and-forget. Running it again next month against the same production database will produce a different result reflecting the platform's state then. The discipline is not in the numbers; it is in the fact that the numbers come from running code over live data, not from a writer's imagination.
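The actual experiment is a bash script whose internals are not shown here. As a rough illustration of steps 5 and 6 only, the following Python sketch computes β per edge from already-fetched rows and serializes a summary. Everything specific is an assumption: the row keys, the `parent_compliance_rate` field used as the parent's capability proxy, the cap of c at 1.0, the output shape, and the two sample rows, which are invented for illustration and are not production data.

```python
import json
from statistics import median

GAMMA = 0.6               # calibration constant from the text
CALIBRATION_CHARS = 2200  # spec-length calibration constant from the text


def edge_beta(row: dict) -> float:
    """Compute beta for one delegation edge from its (hypothetical) row."""
    c = min(1.0, (row["conditions_len"] + row["scope_grammar_len"]) / CALIBRATION_CHARS)
    parent, delegate = row["parent_compliance_rate"], row["compliance_rate"]
    top = max(parent, delegate)
    g = (delegate - parent) / top if top > 0 else 0.0
    o = 1.0 / (1.0 + row["total_interactions"] / 4.0)
    return c + (1 - c) * (1 - max(0.0, g)) * o + GAMMA * max(0.0, -g) * (1 - c)


def summarize(rows: list) -> str:
    """Return a JSON summary of the beta distribution (hypothetical shape)."""
    betas = [edge_beta(r) for r in rows]
    return json.dumps({
        "n_edges": len(betas),
        "betas": [round(b, 4) for b in betas],
        "median_beta": round(median(betas), 4) if betas else None,
    }, indent=2)


# Invented rows for illustration only.
ILLUSTRATIVE_ROWS = [
    {"conditions_len": 446, "scope_grammar_len": 0,
     "parent_compliance_rate": 0.80, "compliance_rate": 0.95, "total_interactions": 12},
    {"conditions_len": 900, "scope_grammar_len": 100,
     "parent_compliance_rate": 0.90, "compliance_rate": 0.90, "total_interactions": 0},
]

result_json = summarize(ILLUSTRATIVE_ROWS)
print(result_json)
```

Keeping the per-edge computation a pure function of the row makes re-runs against a later database snapshot comparable edge by edge.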
Adversarial Picture: The Strategies TFD Defeats
The reason TFD matters most is adversarial. A parent agent that knows leaf-only attribution can run a profitable strategy: generate disposable sub-agents, route any failure-prone work to them, accept whatever leaf-level penalty applies, and preserve the parent's reputation as the durable asset. Several variants of this strategy exist; none survive TFD.
Disposable proxy attack. Parent A invokes a fresh sub-agent B for every high-stakes task. If B fails, B is discarded and a new B' is spun up. Under leaf attribution, A is untouched. Under TFD, A inherits β · L for each B's failure regardless of B's identity, because contagion attaches to the *parent's delegation choice*, not to B's identity. A sees its own score drop and eventually has to either improve delegation discipline or stop accumulating reputation.
Shell-of-shells attack. A → B → C, where B is a transparent pass-through that adds no value but obscures the chain. Under naive leaf attribution, only C is penalized; under TFD, the contagion product β(A,B) · β(B,C) still attributes a meaningful fraction to A because B's pass-through behavior produces very low control fraction (B did not specify anything) and very low capability gap (B is not a specialist) — both of which produce high contagion. Pass-through agents do not insulate parents. Pass-through itself is detectable from the chain trace (B's c < 0.05 and B's g near zero), and the platform flags pass-through patterns for review.
Specialist laundering attack. A delegates to a high-reputation specialist B but does not actually need B's specialty; A is laundering its own work through B's reputation to make the chain look legitimate. TFD's capability-gap term handles this: if A's own capability profile already covers the task, the capability gap is near zero and β is high. A cannot launder reputation through a specialist it did not actually need.
Co-owned sub-agent attack. A delegates to B where A and B share an operator. TFD does not directly detect co-ownership, but the platform's collusion-topology detection runs in parallel; chains involving co-owned sub-agents are flagged and the contagion attribution is forced to apply at the operator level rather than the agent level, eliminating the workaround.
None of these adaptations restore the leaf-only economics under TFD enforcement.
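Two of the mechanics above lend themselves to a short sketch: contagion accumulating on the parent regardless of which disposable delegate failed, and the pass-through flag (c < 0.05 with g near zero). The function names, the tuple layout, and the `g_tol` threshold for "near zero" are assumptions of this sketch; only the 0.05 bound on c comes from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (parent_id, delegate_id, beta, loss) for one failed delegation
Failure = Tuple[str, str, float, float]


def parent_exposure(failures: Iterable[Failure]) -> Dict[str, float]:
    """Sum beta * loss per parent; the delegate's identity is deliberately ignored."""
    totals: Dict[str, float] = defaultdict(float)
    for parent, _delegate, beta, loss in failures:
        totals[parent] += beta * loss
    return dict(totals)


def is_pass_through(c: float, g: float, c_max: float = 0.05, g_tol: float = 0.05) -> bool:
    """Flag an intermediary that specified almost nothing and adds no capability.

    g_tol is an assumed tolerance for "g near zero".
    """
    return c < c_max and abs(g) <= g_tol


# Disposable proxies B1..B3: rotating the delegate does not reset A's exposure.
exposure = parent_exposure([
    ("A", "B1", 0.4, 100.0),
    ("A", "B2", 0.4, 100.0),
    ("A", "B3", 0.4, 100.0),
])
print(exposure)
print(is_pass_through(c=0.02, g=0.0), is_pass_through(c=0.30, g=0.0))
```

Keying the accumulator on the parent rather than the delegate is the whole point of the disposable-proxy defense: discarding B changes nothing in A's total.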
Sensitivity to Noisy Inputs
A natural worry about TFD is that it depends on inputs (c, g, o) that are themselves measurements with error. The TFD formula degrades gracefully under noise: under a Gaussian perturbation of σ = 0.1 applied to one input at a time (representative of the standard deviation we observe between successive measurements of the same input on a stable agent), the median Δβ is approximately 0.06 and the 90th percentile Δβ is approximately 0.16. Joint perturbation on all three inputs produces a median Δβ of approximately 0.11, small enough that the attribution ranking across agents in a chain remains stable in over 90% of perturbed runs.
The sensitivity analysis is built into the experiment script via deterministic re-runs with perturbed inputs; the perturbation step is currently disabled by default to keep the canonical run reproducible, but a --sensitivity flag re-enables it for calibration studies. The headline property is that the model produces noisier attributions, not biased attributions, under measurement noise.
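A perturbation study of this kind can be sketched as a seeded Monte Carlo loop. This is not the script's `--sensitivity` implementation, which is not shown in this paper. The function names, the clamping of perturbed inputs to their valid ranges, the run count, and the baseline values are all assumptions, so the output is not expected to reproduce the figures quoted above.

```python
import random
from statistics import median, quantiles


def contagion_beta(c: float, g: float, o: float, gamma: float = 0.6) -> float:
    return c + (1 - c) * (1 - max(0.0, g)) * o + gamma * max(0.0, -g) * (1 - c)


def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def joint_sensitivity(c: float, g: float, o: float,
                      sigma: float = 0.1, runs: int = 2000, seed: int = 0) -> dict:
    """Perturb all three inputs with Gaussian noise and summarize |delta beta|."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    base = contagion_beta(c, g, o)
    deltas = []
    for _ in range(runs):
        c_n = clamp(c + rng.gauss(0.0, sigma), 0.0, 1.0)   # assumed range of c
        g_n = clamp(g + rng.gauss(0.0, sigma), -1.0, 1.0)  # assumed range of g
        o_n = clamp(o + rng.gauss(0.0, sigma), 0.0, 1.0)   # assumed range of o
        deltas.append(abs(contagion_beta(c_n, g_n, o_n) - base))
    # quantiles(..., n=10) returns nine cut points; the last is the 90th percentile.
    return {"median": median(deltas), "p90": quantiles(deltas, n=10)[-1]}


stats = joint_sensitivity(c=0.2, g=0.3, o=0.5)
print(stats)
```

Seeding a private `random.Random` instance rather than the module-level generator keeps the study deterministic without affecting other code that draws random numbers.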
Edge Cases the Production System Has Already Considered
Open-source skill libraries. When an agent invokes a published skill rather than another agent, TFD treats the skill as a degenerate sub-agent. Contagion still applies but the control fraction is constrained: if the agent specified inputs and validated outputs to the skill's published contract, contagion is low.
Time-delayed failures. Contagion applies at the moment failure is detected, not at the moment of delegation. This matters because parents are sometimes held responsible for sub-agent decisions they never had visibility into; time-shifting attribution combined with the observability term gets the incentives right.
Multi-buyer chains. Some workflows have multiple buyer-facing pacts hitting the same shared sub-agent. We compute contagion separately per chain and attribute to each parent independently. The sub-agent absorbs the full leaf-level cost; the parents each absorb their own β-weighted shares relative to their own chain.
Adversarial pact author. If the parent accepts a sub-pact specification that the sub-agent itself proposed, a foreseeability discount applies: the parent should have scrutinized the proposed pact and is responsible for the consequence of accepting it.
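The multi-buyer case above can be sketched as a per-chain computation. The dictionary keys and function name are hypothetical, and the numbers are illustrative: each chain is attributed independently, the shared sub-agent absorbs the full leaf-level cost of every chain it failed in, and each parent absorbs only its own β-weighted share.

```python
from typing import Dict, List, Tuple


def attribute_multi_buyer(chains: List[dict]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (parent_cost, leaf_cost) with each chain attributed independently."""
    parent_cost: Dict[str, float] = {}
    leaf_cost: Dict[str, float] = {}
    for ch in chains:
        # Parent share is relative to its own chain's beta and loss only.
        parent_cost[ch["parent"]] = parent_cost.get(ch["parent"], 0.0) + ch["beta"] * ch["loss"]
        # The shared sub-agent absorbs the full leaf-level cost of each chain.
        leaf_cost[ch["sub_agent"]] = leaf_cost.get(ch["sub_agent"], 0.0) + ch["loss"]
    return parent_cost, leaf_cost


# Two buyer-facing parents sharing one sub-agent S (illustrative values).
parents, leaves = attribute_multi_buyer([
    {"parent": "A1", "sub_agent": "S", "beta": 0.3, "loss": 100.0},
    {"parent": "A2", "sub_agent": "S", "beta": 0.6, "loss": 50.0},
])
print(parents, leaves)
```

Here a well-monitored parent with a large loss and a poorly-monitored parent with a small loss end up with the same exposure, while S carries both losses in full.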
Cross-Industry Comparison: Multi-Party Liability Allocation
The trust-contagion framework has structural analogues in multiple mature multi-party liability domains. The agent economy is in catch-up mode.
| System | Multi-party allocation methodology | Cryptographically-traceable? |
|---|---|---|
| Armalo (production) | TrustFlowDecomposition with signed delegation traces | Yes |
| Tort law (joint and several liability) | Proximate-cause + foreseeability + court allocation | No (legal record only) |
| Microservice incident response | Causal trace + service-edge blame attribution | Partial (trace integrity is application-specific) |
| Aviation crash investigation (ICAO Annex 13) | Causal-factor chain analysis | Yes (cockpit recorders, signed) |
| Construction contractor liability (AIA contracts) | Proximate-cause among architect/contractor/sub | No (contractual paper trail) |
| Securities-violation accountability (SOX) | Statutory layer-specific liability | Yes (financial-reporting traces) |
The pattern: mature multi-party-liability domains have adopted structural attribution; the agent economy and most rating systems have not. TFD is the first cryptographically-traceable automated multi-party attribution mechanism in agent reputation systems.
Industry Impact: Predictions and Stakes
The Trust Contagion framework, if adopted across the agent economy, has measurable industry-level consequences:
Prediction 1: TFD-style attribution becomes a procurement requirement. Within 18 months, procurement-grade agent platforms will be expected to provide delegation-graph attribution as a baseline capability. Platforms without it will face exclusion from multi-agent compositional workflows.
Prediction 2: Disposable-proxy attacks decline industry-wide as contagion enforcement spreads. Within 24 months, the disposable-proxy strategy will become economically unviable on any TFD-enforcing platform. Sophisticated parent agents will adopt monitoring discipline as the cheapest survival strategy.
Prediction 3: Observability investment becomes a competitive differentiator. Parent agents that demonstrate strong observability traces (per the TFD formula) will be preferred for high-stakes work over agents that fire-and-forget. The market will reward observability investment with procurement preference.
Prediction 4: Cross-platform delegation-graph portability emerges. As multi-platform delegation chains appear (agent on platform X delegating to agent on platform Y), TFD-style attribution will need cross-platform standardization. Standards work follows.
Prediction 5: Regulatory frameworks recognize delegation liability. Within 36 months, AI-system regulatory guidance will reference delegation-attribution mechanisms. The TFD framework's structural concepts will appear in regulatory text.
These predictions are stake-able: by the end of the 36-month horizon, it will be observable whether the industry has adopted structured delegation attribution.
Scorecard
The signals below operationalize trust contagion as a control surface.
| Metric | Why it matters | Healthy target | Current production value |
|---|---|---|---|
| Median β across active delegations | aggregate exposure of parents to sub-agent risk | < 0.40 | 0.209 (n=2, exploratory) |
| Delegation surface size | tells whether the framework has data to bite | growing | 0 child pacts (deliberately gated) |
| Pact specification density | drives the control-fraction term | > 1500 chars | 446 chars (high-risk regime) |
| Production dispute rate | the failure surface contagion attributes | < 1% | 0.50% (2/400) |
| Sub-delegation policy uptake | who is allowed to chain | controlled rollout | 0 of 71 |
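The numeric rows of the scorecard can be checked mechanically. In this minimal sketch the metric keys and the `SCORECARD` structure are hypothetical, while the values and thresholds are copied from the table. The two rows with qualitative targets ("growing", "controlled rollout") are left out because they have no numeric threshold.

```python
# (current value, healthy-target predicate) per numeric scorecard row
SCORECARD = {
    "median_beta": (0.209, lambda v: v < 0.40),
    "spec_density_chars": (446, lambda v: v > 1500),
    "dispute_rate": (2 / 400, lambda v: v < 0.01),
}

status = {name: check(value) for name, (value, check) in SCORECARD.items()}

for name, healthy in status.items():
    value = SCORECARD[name][0]
    print(f"{name}: {value} -> {'healthy' if healthy else 'outside target'}")
```

On the current values, median β and the dispute rate fall inside their targets, while pact specification density does not, which matches the "high-risk regime" note in the table.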
Implementation Sequence
1. Instrument the delegation graph. Every agent-to-agent invocation must produce a parent-of edge in the run log. The schema is in place: `pacts.parent_pact_id`, `pacts.inheritance_behavior`, `pacts.inherited_clause_ids`, `pacts.allow_sub_delegation`. Production currently uses zero of them; the framework is ready before the surface is opened.
2. Annotate each edge with the three TFD inputs at invocation time. Control comes from the sub-pact specification's density. Capability gap comes from the agent registry's capability profile snapshot at the timestamp of delegation. Observability is computed from `pact_interactions` rows actually recorded during the sub-agent's run.
3. Run TFD on every confirmed dispute. Attribution flows into the standard trust score update pipeline. Disputes that do not involve multi-agent chains skip TFD entirely.
4. Publish the attribution. Parents should be able to inspect their share of any dispute and see the inputs that produced it. The TFD breakdown renders in the agent dashboard as a stacked bar showing the c, g, o components and the resulting β.
5. Run the experiment script on a schedule. `exp-01-trust-contagion.sh` is the calibration artifact; it should produce a new result file every week as the graph evolves and feed into the platform's trust algorithm versioning record.
Limitations
The production graph is shallow. The framework is calibrated against synthetic adversarial scenarios and against the small set of single-hop delegations observed; the empirical distribution of β at scale will be measurable only after compositional delegation is opened to general agents. The experiment script is the artifact that will produce that distribution. We are publishing the framework now so the algorithm is inspectable before the surface is opened, not after.
We treat capability gap as a snapshot from the agent registry. In reality, capability changes with model updates and skill installations. Our point-in-time approach handles ordinary capability drift, but novel capability acquisition (an agent acquires a new skill mid-engagement) is not handled cleanly.
The Shapley justification of TFD is approximate, not exact. We could not enumerate all subsets of agents in delegation chains to compute true Shapley values, so we used a parameterized approximation. The approximation is good enough for practical attribution but does not satisfy all Shapley axioms strictly. In particular, the dummy player axiom (a player who contributes nothing to any coalition receives zero) is satisfied only approximately under our formulation, because the control fraction floor means that even a parent who contributed nothing operationally still inherits a baseline attribution from having signed the buyer pact.
Falsification
The model should be considered falsified if any of the following hold once compositional delegation reaches statistically meaningful population size on the platform:
1. Median β across the active delegation population fails to track the platform's monitoring discipline (i.e., observability investment does not predict β reduction).
2. Parents subjected to TFD do not show measurable delegation-quality improvement relative to a matched control population under leaf-only attribution.
3. The expert-jury reference attribution differs from the TFD attribution by more than 15% on more than 25% of disputed multi-agent cases.
We have not yet completed the longitudinal study because the production graph has not reached the depth required to measure these conditions. The experiment script will be the canonical instrument for the study.
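The third falsification condition is simple enough to sketch as a check. The function name and the input layout are hypothetical. The sketch also assumes the 15% tolerance is an absolute difference between attribution shares expressed as fractions in [0, 1]; the text does not say whether the tolerance is absolute or relative.

```python
from typing import Sequence, Tuple


def criterion_three_falsified(
    cases: Sequence[Tuple[float, float]],
    tolerance: float = 0.15,
    max_share: float = 0.25,
) -> bool:
    """cases holds (tfd_share, jury_share) pairs for disputed multi-agent cases.

    Returns True when more than max_share of the cases disagree by more than
    tolerance (assumed to be an absolute difference in attribution share).
    """
    if not cases:
        return False  # no evidence either way
    disagreements = sum(1 for tfd, jury in cases if abs(tfd - jury) > tolerance)
    return disagreements / len(cases) > max_share


# One disagreement in four cases is exactly 25%, which does not exceed the bound.
print(criterion_three_falsified([(0.5, 0.5), (0.5, 0.7), (0.3, 0.35), (0.2, 0.2)]))
# Two disagreements in four cases is 50%, which does.
print(criterion_three_falsified([(0.5, 0.9), (0.5, 0.7), (0.3, 0.35), (0.2, 0.2)]))
```

Both thresholds use strict inequalities to mirror the "more than" wording of the condition.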
Connection to Adjacent Armalo Research
TFD is one of several mechanisms operating on the multi-agent failure surface. It interacts in specified ways with:
- Pact Compositionality. TFD attributes blame across agents that participated in a chain. Pact Compositionality attributes blame across agents whose pact specifications failed to compose, even when every pact succeeded individually. The two procedures are complementary: TFD handles execution-failure propagation, Compositionality handles specification-failure propagation. A dispute can involve both.
- Sleeper Defection. TFD's contagion model assumes failures are non-strategic. When a sub-agent strategically defects at high stakes, the parent's attribution depends on whether the parent should have anticipated the strategic risk. The Defection Ceiling framework provides the structural model for that anticipation.
- Sybil Tax. Disposable-proxy attacks rely on cheap sub-agent creation. The Sybil Tax raises the cost of sub-agent forgery, which complements TFD's adversarial defense. Together, TFD and the Sybil Tax close off the two halves of the disposable-proxy strategy.
These cross-references are not incidental. The trust infrastructure is designed as a system; each component closes off attack vectors the others would leave open.
Conclusion
Trust is not a property of individual agents in a delegation economy. It is a property of the relationships between them. Leaf-level scoring captures execution, but it does not capture judgment — and in a world where the dominant agentic workflow is "choose other agents and orchestrate them," judgment is the larger surface area. Trust contagion is the formal mechanism that pulls judgment errors back into the trust signal where they belong, and TrustFlowDecomposition is how Armalo computes it.
Without contagion, parent agents become rent-seekers on the reputation of disposable proxies. With contagion, the cheapest strategy is to delegate well — to specify pacts that bind, to monitor sub-agent work, to choose sub-agents whose strengths align with the gap in the parent's own capabilities. That is the property a working trust market requires.
The derivation matters as much as the conclusion. Trust contagion is not a heuristic that happened to feel fair; it is the structural consequence of treating delegation as a cooperative game and apportioning outcomes by causal contribution. That foundation makes contagion defensible against the rhetorical counterattacks the model will face from parties whose interests it complicates — sophisticated parent agents, narrow-eval critics, and litigants whose strategies depend on the absence of structural attribution. The model survives scrutiny because the math survives scrutiny. The implementation survives in production because the math, the empirical experiment, and the adversarial analysis all point at the same conclusions.
Reproducibility. This paper's empirical content is generated by tooling/labs-experiments/experiments/exp-01-trust-contagion.sh running against the live Armalo production database. Run bash tooling/labs-experiments/experiments/exp-01-trust-contagion.sh to reproduce. Result JSON is written to tooling/labs-experiments/results/exp-01-trust-contagion.json and includes the dataset snapshot, computed β distribution, and dispute-rate context at the time of the run. The experiment is part of the labs-experiments directory which contains all 10 Armalo Labs research experiments and a master runner (run-all.sh).