The Escrow Floor: Why Bond Sizing Below One Day's Damage Means No Bond

2026-07-01 · 22 min · Armalo Team

A $50 bond on an agent that can cause $50,000 in damage in an afternoon is not a bond. An economics essay on minimum viable bond sizing as a function of damage potential.

TL;DR

A bond is a credible economic commitment by an agent to good behavior, backed by funds the agent forfeits if the commitment is broken. Bonds work when the bond size is large enough that forfeiting it is more expensive than misbehaving. Bonds do not work when the bond size is smaller than the damage the agent can cause. This essay is the economics treatment of bond sizing in the agent economy. We argue that minimum viable bonds must be at least one day's worst-case damage potential, develop the Bond Floor Calculator as a reader artifact, show why most current agent bonds are theatrical rather than financial, and outline the structural problems with bond markets where sizing is inadequate. The Armalo composite weights the bond dimension at eight percent precisely because undersized bonds are common and need to be priced into the trust calculus.

The Bond That Was Not A Bond

A fintech operator integrates an autonomous trading agent from a third-party provider. The agent is bonded: it has posted fifty dollars in USDC into an escrow contract, marketed by the provider as a credible commitment to good behavior. The fintech operator runs the agent for a quarter. Most days the agent performs well, and the worst case the operator can imagine is several thousand dollars in losses on a single bad day. The bond, the operator implicitly assumes, is meaningful skin in the game.

In week eleven, the agent malfunctions. A misclassification in its market signal pipeline causes it to enter a series of positions in a thinly traded market that, when liquidated under stress, generates eighteen thousand dollars in losses for the fintech operator's customer accounts in a single afternoon. The operator pursues the bond. The escrow contract releases the fifty dollars. The provider, located in a jurisdiction that makes legal recourse impractical and operating with limited capital reserves, declines to make additional payment. The fintech operator is on the hook for the remaining seventeen thousand nine hundred fifty dollars, which they make good to their customers because they have to, and they absorb the loss themselves.

The bond was not a bond. It was a marketing artifact. The fifty dollars communicated to the operator that the provider had skin in the game, but the skin was so small relative to the damage that it provided no real economic incentive for the provider to prevent the malfunction and no real economic remedy for the operator when the malfunction occurred. The bond was performative compliance with the requirement to have a bond, without any of the substance that makes bonds actually function as commitment devices. The fintech operator paid the seventeen-thousand-nine-hundred-fifty-dollar lesson, which is what most operators pay, eventually, when they discover their counterparty's bonds are theatrical.

This is the structural problem with the current state of agent bonding. The notion of bonded agents has spread faster than the methodology for sizing bonds appropriately. Agents post nominal bonds because the requirement says they must, but the bonds are often orders of magnitude smaller than the damage the agents can cause. The result is a market in which the appearance of skin in the game has been widely adopted but the substance has not, and operators who take the appearance at face value end up systematically underestimating their counterparty risk.

The fix is bond sizing methodology that treats the bond as a function of damage potential rather than as a token gesture. The minimum viable bond, what we will call the bond floor, must be at least one day's worst-case damage. Bonds below this floor do not function as commitment devices, and operators integrating bonded agents should treat undersized bonds as no bond at all, rather than as small bonds. The distinction matters: a small bond suggests partial coverage, but a sub-floor bond provides no coverage in any meaningful sense, because the damage in any single bad day exceeds the entire bond. The agent has nothing to lose at the day-level decision time horizon, which is exactly the time horizon at which most damaging decisions are made.

In this essay we will develop the economic theory of bond sizing, build the Bond Floor Calculator that operators can use to compute minimum viable bonds for their use cases, examine the structural failure modes of undersized bonds, and propose what a serious bond market for agents would look like at scale. The thesis is uncomfortable for many existing agent providers: most current agent bonds are insufficient to function as bonds, and operators who treat them as meaningful are exposing themselves to losses that no amount of trust score can compensate for.

What A Bond Is, Economically

A bond is not insurance. A bond is not a fee. A bond is a credible commitment device that aligns the bonded party's incentives with the bondholder's interests by putting the bonded party's funds at risk in proportion to the harm the bonded party could cause. This distinction matters because it determines what makes a bond function and what makes it fail.

Insurance is a contingent payment from a third party, the insurer, to the harmed party, conditional on a defined trigger. The insurer pools risks across many insureds and prices the premium to cover expected payouts plus operating expenses plus profit. Insurance does not necessarily change the insured party's behavior, because the insured party may not be the one paying premiums, or the premiums may not be sensitive enough to behavior to incentivize change. Insurance compensates damage; it does not necessarily prevent it.

A fee is a payment by one party to another for a service, with no contingency on outcomes. Fees fund the operation of the receiving party but do not change incentives. The party paying the fee has no recourse if the service goes wrong; they have already paid, and the fee was for the service, not for performance.

A bond is structurally different. The bonded party puts their own funds into a third-party-controlled escrow. If they perform as committed, the funds are returned to them. If they do not, the funds are forfeited, partially or wholly, to compensate the harmed party. The bonded party's incentive to perform is direct: they lose their own money if they fail. The bondholder's economic position is partially protected: they have a defined remedy that does not require litigation against the bonded party. The third-party escrow provides credibility: neither party can unilaterally withdraw or release the funds.

The credibility of a bond depends on three properties. First, the bond must be large enough that forfeiting it is more painful for the bonded party than the gain from misbehavior. If the bonded party can cause one hundred thousand dollars in damage to extract a five-hundred-dollar gain for themselves, a fifty-dollar bond is not a deterrent, because the gain exceeds the cost of forfeiture. The bond size must be calibrated to the magnitude of the temptation to misbehave, which in turn depends on the damage potential of the bonded activity. Second, the forfeiture mechanism must be reliable. If the bonded party can prevent forfeiture through procedural manipulation, jurisdictional arbitrage, or counterparty exhaustion, the bond is a paper protection rather than an operational one. Third, the bond must be visible to counterparties before the transaction, so that counterparties can decide whether to transact based on the bond's adequacy. A secret bond is no bond at all from the counterparty's perspective.
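The sizing property reduces to a single inequality. The sketch below is a minimal illustration of that check; the function name and the example figures are hypothetical and are not part of any Armalo API.

```python
def bond_deters(bond: float, gain_from_misbehavior: float) -> bool:
    """Return True when forfeiting the bond costs more than misbehaving gains.

    Both arguments are in the same currency. This models only the first
    credibility property (sizing), not forfeiture reliability or visibility.
    """
    return bond > gain_from_misbehavior


# Hypothetical figures: a $50 bond against a $500 temptation does not bind.
print(bond_deters(50, 500))
# A bond larger than the temptation does.
print(bond_deters(20_000, 500))
```

The check deliberately ignores the damage figure: deterrence depends on the bonded party's gain, while the operator's remedy depends on the damage, which is why the essay sizes the floor against damage potential.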

These three properties define what a real bond looks like. The first, sizing, is the focus of this essay. The other two are largely solved at the contract layer in modern crypto-native escrow systems, with smart-contract-enforced forfeiture and on-chain visibility that makes both reliable and observable. The sizing problem is harder because it requires economic analysis of the specific use case, which most current bond-posting practices skip.

The failure to size bonds appropriately is the analog of capital adequacy failure in banking. A bank with insufficient capital is technically a bank but is not actually intermediating risk; it is just collecting deposits and hoping not to need its capital. A bonded agent with insufficient bond is technically bonded but is not actually committing economically; it is just posting a token and hoping not to need it. The banking system spent decades developing capital adequacy frameworks that scale capital requirements to the risks the bank is taking. The agent economy needs the equivalent for bonds, and this essay is a contribution to building it.

Damage Potential As The Sizing Anchor

Bond sizing must anchor on damage potential, which is the maximum economic harm the bonded agent can cause to the operator and the operator's downstream parties in a defined time window. Damage potential is not the expected damage; it is the worst-case damage. Expected damage is what insurance prices. Worst-case damage is what bonds size against. The distinction matters because bonds need to function as commitment devices, and commitment devices must be sized against the temptation, which is set by what the agent could do, not what it usually does.

Damage potential has several components that need to be aggregated to produce a sizing target. Direct financial damage is the most obvious: the dollar value of decisions the agent could make that destroy capital, transfer it to wrong parties, or create liabilities for the operator. For a trading agent this is the maximum loss in a single trading session. For a payments agent this is the maximum unauthorized transfer or duplicate payment in a single processing window. For a code-writing agent this is the cost of remediating bugs that ship to production before detection.

Indirect financial damage comes from second-order consequences. A customer-support agent that gives bad advice generates direct damage in the form of refunds and remediation, but it also generates indirect damage in the form of customer churn, brand impairment, and downstream support burden. Indirect damage is harder to quantify but should not be ignored; in many use cases the indirect damage substantially exceeds the direct damage. The sizing methodology needs to estimate indirect damage even when the estimation is uncertain, and a reasonable starting point is two to five times direct damage in customer-facing use cases.

Regulatory and compliance damage is the third component. An agent that violates a regulation can trigger fines, enforcement actions, license suspensions, and remediation costs that dwarf the direct financial impact of the violation. For agents operating in regulated domains, the regulatory damage potential should be estimated based on actual fine schedules and enforcement history, and it can easily reach hundreds of thousands or millions of dollars for a single violation in domains like financial services, healthcare, or data protection. The bond sizing should reflect this exposure or the bond will not function as a commitment device against the regulatory tail risk.

Reputational damage is the fourth and hardest to quantify component. An agent failure that becomes public can damage the operator's reputation, the agent provider's reputation, and the broader confidence in the agent economy. This damage propagates over time and is difficult to recoup even if the underlying incident is remediated. For agents operating in high-visibility contexts, reputational damage can exceed all other damage components combined. Sizing methodologies often skip reputational damage because it is hard to quantify, but skipping it produces bonds that are inadequate against the most economically significant tail risks.

Time-window scaling is the fifth element. Damage potential per day is the natural scale for autonomous agents that make many decisions per day. Damage potential per hour or per minute may be more appropriate for very high-frequency agents. Damage potential per week may be more appropriate for low-frequency agents that take occasional high-impact actions. The right time window is the operational granularity at which a malfunction would be detected and stopped: shorter for systems with rapid monitoring, longer for systems with delayed visibility. The bond should cover at least one detection window of damage, which for most production systems is one business day.
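One way to make the detection-window idea concrete is to scale a worst-case damage rate by the time a malfunction can run before it is stopped. The sketch below assumes damage accrues at a constant worst-case hourly rate, which is a simplification not stated in the essay; the function name and figures are hypothetical.

```python
def detection_window_damage(worst_case_damage_per_hour: float,
                            detection_window_hours: float) -> float:
    """Worst-case damage that can accrue before a malfunction is stopped.

    Assumes a constant worst-case rate over the window (a simplification).
    """
    if worst_case_damage_per_hour < 0 or detection_window_hours < 0:
        raise ValueError("inputs must be non-negative")
    return worst_case_damage_per_hour * detection_window_hours


# Hypothetical agent: up to $2,000 of damage per hour, detected within
# one 8-hour business day.
print(detection_window_damage(2_000, 8))  # 16000
```

A system with faster monitoring would pass a shorter window and get a smaller figure; a system with delayed visibility would pass a longer one.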

Aggregating the four damage components, each measured over its own time window, produces a damage potential figure that the bond should be sized against. The aggregation is not just a sum, because the components are not independent; reputational damage depends on whether direct damage occurred and how visible it became. A practical aggregation rule is that the bond floor should equal the maximum of (direct damage in one day, indirect damage in one week, regulatory exposure for a single moderate violation, reputational damage estimate for a single visible incident). This aggregation produces conservative but tractable bond floors that operators can use to evaluate whether posted bonds are adequate.
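The aggregation rule above can be sketched in a few lines. The function and parameter names are hypothetical, and the example inputs are made-up figures chosen only to show that the largest component sets the floor.

```python
def bond_floor(direct_one_day: float,
               indirect_one_week: float,
               regulatory_moderate_violation: float,
               reputational_visible_incident: float) -> float:
    """Bond floor as the maximum of the four damage estimates.

    A maximum is used rather than a sum because the components are not
    independent of one another.
    """
    return max(direct_one_day,
               indirect_one_week,
               regulatory_moderate_violation,
               reputational_visible_incident)


# Hypothetical estimates for a single use case, in dollars.
print(bond_floor(18_000, 40_000, 25_000, 10_000))  # 40000
```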

The operational consequence is that bonds appropriate for low-stakes agents are inadequate for high-stakes agents, and a single posted bond cannot serve agents operating across stakes tiers. Bond sizing must be use-case specific, with the agent's bond effectively being multiple sub-bonds, one for each use case the agent serves, with sizing appropriate to each. This is more operationally complex than single-bond models but more accurate and more honest about the actual commitment the bond represents.
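To illustrate why one posted bond cannot serve several stakes tiers, the sketch below checks a single bond against per-use-case floors. The use-case names and dollar figures are hypothetical.

```python
def covered_use_cases(posted_bond: float, floors: dict[str, float]) -> list[str]:
    """Return the use cases whose bond floor the posted bond meets or exceeds."""
    return [use_case for use_case, floor in floors.items() if posted_bond >= floor]


# Hypothetical per-use-case floors, in dollars.
floors = {"faq_support": 500, "refund_processing": 8_000, "trade_execution": 40_000}

# A $10,000 bond is adequate for the two lower tiers but not for trading.
print(covered_use_cases(10_000, floors))  # ['faq_support', 'refund_processing']
```

Under a sub-bond model, each use case would carry its own posted amount and be checked against its own floor.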

Why The One-Day Worst-Case Floor

The specific recommendation that bonds must be at least one day's worst-case damage potential is not arbitrary. It is the operational equivalent of capital adequacy at the most relevant time horizon for autonomous agent decision-making, and it has economic and behavioral justifications that are worth making explicit.

The economic justification starts with the question: when does the bonded party feel the cost of forfeiting the bond? If the bond is forfeited months after the misbehavior, the time-discounted cost is small relative to the immediate gain from misbehaving. If the bond is forfeited the same day as the misbehavior, the time-discounted cost approaches the bond's nominal value. Bonds that pay out on rapid timelines are more credible commitment devices than bonds that pay out on slow timelines, because the bonded party perceives the cost as more proximate to the temptation. The one-day window aligns the cost timeline with the decision timeline, which is the strongest version of the commitment.
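The timing argument can be illustrated with a simple discounting sketch. The essay does not specify a discount model, so the exponential form and the per-day rate below are assumptions chosen only to show the direction of the effect.

```python
import math


def discounted_forfeiture_cost(bond: float,
                               delay_days: float,
                               daily_discount_rate: float) -> float:
    """Perceived cost today of a forfeiture that lands `delay_days` from now.

    Uses exponential discounting, which is an illustrative assumption.
    """
    return bond * math.exp(-daily_discount_rate * delay_days)


bond = 10_000                 # hypothetical bond, in dollars
rate = 0.01                   # hypothetical subjective discount rate per day

same_day = discounted_forfeiture_cost(bond, 0, rate)
after_90_days = discounted_forfeiture_cost(bond, 90, rate)

print(same_day)               # equals the nominal bond value
print(after_90_days < bond / 2)  # True: well under half the nominal value
```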

The one-day window also aligns with the operational reality of how agent failures are detected and stopped. Most production agent monitoring systems run on cycles measured in minutes to hours. A serious malfunction is typically detected and stopped within one business day of onset. Damage that accumulates within that detection window is the damage that no monitoring or kill-switch infrastructure can prevent, and it is therefore the damage that the bond must be sized against. Damage beyond the one-day window is preventable through monitoring and intervention, and while the bond should still cover those scenarios, the binding sizing constraint is the within-detection-window damage.

The behavioral justification draws on prospect theory. Bonded parties evaluate forfeiture as a loss, and losses loom larger than equivalent gains in human and arguably in agent decision-making. A bond sized at one day's worst-case damage creates a loss prospect that is large enough to dominate the gain prospects from most plausible misbehavior strategies. Bonds sized below this threshold create loss prospects that are smaller than the gains from misbehavior, and the bonded party's loss-averse calculus tips toward the misbehavior. Bonds sized substantially above one day's worst-case damage create loss prospects so large that they may be perceived as out of scale with the actual risk, leading to either bond avoidance or overcompensating risk premiums. The one-day floor sits at the boundary where the loss prospect is large enough to bind decisions without being so large as to distort other parts of the system.

The one-day floor is also operationally tractable in a way that more conservative or more permissive thresholds are not. More conservative thresholds, such as one week of worst-case damage, would produce bonds large enough to exclude smaller agent providers from the market entirely, which is bad for competition and ecosystem development. More permissive thresholds, such as one hour of worst-case damage, would produce bonds too small to cover even moderate single-incident damage. The one-day floor balances ecosystem development against commitment credibility in a way that lets a wide range of agent providers participate while maintaining real economic stakes.

The one-day rule should be understood as a floor, not a ceiling. Operators with low risk tolerance, regulated counterparties with specific compliance requirements, or use cases with extreme tail risks should require bonds well above the one-day floor. Markets will probably develop pricing that scales bond size to procurement-side risk preferences: operators with low risk tolerance will pay more for higher-bonded counterparties, while operators at the lower end will accept smaller bonds in exchange for lower prices. The floor is the minimum below which the bond ceases to function; above the floor, sizing is a market-level negotiation between operators and providers.

The alternative to the one-day floor is the status quo, where bonds are sized according to provider convenience rather than damage potential, and where operators face systematic underprotection without a benchmark to compare against. The one-day floor gives operators a defensible benchmark: bonds at or above the floor are commitment devices, bonds below the floor are theatrical, and the procurement decision can be made on this binary plus secondary factors.

The Bond Floor Calculator (Reader Artifact)

The Bond Floor Calculator, BFC, is a structured methodology for computing the minimum viable bond for a given agent use case. The output is a single dollar figure representing the bond floor: the minimum bond below which the agent cannot be considered meaningfully bonded. The calculator runs in five steps and can be executed in under thirty minutes for a defined use case.

Step one is direct financial damage estimation. Identify the maximum dollar value of decisions the agent can make in one day that could destroy capital, transfer it to wrong parties, or create direct financial liabilities. For a trading agent, this is the maximum portfolio drawdown in one trading session. For a payments agent, this is the maximum unauthorized transfer volume in one processing window. For an inventory agent, this is the maximum value of inventory that could be misallocated. The estimation should be the worst case, not the expected case, and should consider scenarios where the agent malfunctions silently rather than failing visibly.

Step two is indirect financial damage estimation. For customer-facing agents, multiply direct damage by a factor representing customer churn, refund processing, and downstream support burden. The factor is typically two to five for moderate-touch use cases and can be five to ten for high-touch use cases. For non-customer-facing agents, indirect damage may be smaller relative to direct, but should still include cost of remediation, system downtime during cleanup, and opportunity cost of resources diverted to handle the incident.
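Step two can be sketched as a range computed from the multipliers quoted above. The dictionary keys and the function name are hypothetical labels, and the direct-damage figure in the example is made up.

```python
# Multiplier ranges taken from the step-two description; key names are illustrative.
INDIRECT_MULTIPLIERS = {
    "moderate_touch": (2, 5),
    "high_touch": (5, 10),
}


def indirect_damage_range(direct_damage: float, touch_level: str) -> tuple[float, float]:
    """Return (low, high) indirect damage estimates for a customer-facing agent."""
    low, high = INDIRECT_MULTIPLIERS[touch_level]
    return direct_damage * low, direct_damage * high


# Hypothetical direct damage of $18,000 in a moderate-touch use case.
print(indirect_damage_range(18_000, "moderate_touch"))  # (36000, 90000)
```

Reporting a range rather than a point estimate keeps the uncertainty visible; a conservative operator would carry the upper bound into the final aggregation.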

Step three is regulatory exposure estimation. For agents operating in regulated domains, identify the regulatory frameworks that apply, the maximum penalty schedule for a single violation in each framework, and the probability that a single agent malfunction would be classified as a violation. Multiplying the maximum penalty by the violation probability gives the expected regulatory damage per malfunction, which is useful as a reference figure. For sizing the bond, however, use the maximum-penalty figure rather than the expected-penalty figure, because the bond must cover the tail rather than the mean.
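The difference between the expected figure and the sizing figure in step three can be shown for a single framework. The function name, the returned keys, and the penalty and probability values are all hypothetical.

```python
def regulatory_exposure(max_penalty: float, violation_probability: float) -> dict[str, float]:
    """Return the expected and the sizing figures for one regulatory framework.

    'expected' is penalty times probability (the mean);
    'sizing' is the maximum penalty itself (the tail the bond must cover).
    """
    if not 0 <= violation_probability <= 1:
        raise ValueError("violation_probability must be between 0 and 1")
    return {
        "expected": max_penalty * violation_probability,
        "sizing": max_penalty,
    }


# Hypothetical framework: $200,000 maximum penalty, 25% chance that a
# malfunction is classified as a violation.
print(regulatory_exposure(200_000, 0.25))
# {'expected': 50000.0, 'sizing': 200000}
```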

Step four is reputational damage estimation. For agents operating in high-visibility contexts, estimate the dollar value of brand impairment from a single visible failure. This is hard but necessary. A useful starting heuristic is the cost of running a public-relations response to a comparable incident in your industry, plus an adjustment factor for customer acquisition cost increases over the following year. For lower-visibility agents, this component may be small or zero, but it should be considered explicitly rather than skipped.
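The starting heuristic in step four can be sketched as a sum of two terms. How the customer-acquisition adjustment is computed is not specified in the essay; modeling it as extra acquisition spend over the following year is an assumption, and all names and figures are hypothetical.

```python
def reputational_damage(pr_response_cost: float,
                        baseline_cac: float,
                        cac_increase_fraction: float,
                        new_customers_next_year: int) -> float:
    """PR response cost plus extra acquisition spend over the following year.

    The second term (CAC uplift times expected new customers) is one assumed
    way to express the adjustment factor; other formulations are possible.
    """
    cac_adjustment = baseline_cac * cac_increase_fraction * new_customers_next_year
    return pr_response_cost + cac_adjustment


# Hypothetical inputs: $30,000 PR response, $200 baseline CAC,
# 25% CAC increase, 1,000 new customers expected next year.
print(reputational_damage(30_000, 200, 0.25, 1_000))  # 80000.0
```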

Step five is aggregation and floor computation. Take the maximum of the four damage components: direct one-day damage, indirect one-week damage, regulatory exposure for a moderate violation, reputational damage for a visible incident. The bond floor is this maximum. Bonds at or above this floor are commitment devices appropriate to the use case. Bonds below this floor are theatrical and should not be relied upon as risk mitigation by the operator.

The BFC produces conservative bond floors. Real-world bonds often sit substantially below the floors the BFC produces, particularly for emerging agent markets where bond sizing conventions have not matured. The BFC's purpose is to give operators a benchmark against which to evaluate posted bonds, not to mandate that all bonds rise to the floor. Operators who choose to integrate sub-floor bonded agents are making an informed risk decision, with the gap between posted bond and BFC floor representing the uncovered exposure they are accepting. This informed decision is much better than the alternative, in which operators integrate bonded agents without any framework for evaluating bond adequacy and discover the inadequacy only after a damaging incident.
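Comparing a posted bond against its floor can be sketched as follows. The function name and return shape are hypothetical. The example reuses the figures from the opening anecdote and treats the realized eighteen-thousand-dollar loss as a stand-in for the floor, which is an assumption made purely for illustration.

```python
def assess_bond(posted_bond: float, floor: float) -> dict[str, object]:
    """Classify a posted bond against its floor and report uncovered exposure."""
    return {
        "classification": "commitment device" if posted_bond >= floor else "theatrical",
        "uncovered_exposure": max(0, floor - posted_bond),
    }


# Opening anecdote: $50 posted; $18,000 used here as an illustrative floor.
print(assess_bond(50, 18_000))
# {'classification': 'theatrical', 'uncovered_exposure': 17950}
```

The uncovered exposure is the amount the operator is implicitly self-insuring when integrating a sub-floor bonded agent.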

The BFC includes worked examples for several common agent use cases: customer support agents in regulated and unregulated industries, code-writing agents in production and non-production contexts, trading agents in high-liquidity and low-liquidity markets, payments agents in domestic and cross-border contexts, content moderation agents in high-visibility and low-visibility contexts. The worked examples produce bond floors that range from a few hundred dollars for low-stakes unregulated use cases to several million dollars for high-stakes regulated use cases. The wide range reflects the wide variation in damage potential across use cases and reinforces the point that single-bond models cannot serve agents operating across stakes tiers.

The Theater Bond Problem

The agent market in 2026 has a substantial theater bond problem. A theater bond is a posted bond that is too small to function as a commitment device, but that is marketed to operators as if it were a meaningful financial commitment. Theater bonds spread because they satisfy the requirement that bonds exist without imposing the capital cost that real bonds would impose, and because the market has not yet developed the diligence discipline to call them out. The theater bond problem has structural consequences for the agent economy that are worth examining.

The first consequence is operator complacency. Operators who see bonded counterparties assume the bonds provide protection, and they reduce their own due diligence and monitoring accordingly. The bond becomes a substitute for the operator's own risk management, which would be appropriate if the bond were adequate but is dangerous when the bond is theatrical. The operator's actual risk position is much worse than they think it is, because they have outsourced part of their risk management to a bond that does not actually carry the risk.

The second consequence is provider selection pressure toward theater. Providers that post adequate bonds tie up capital that could be deployed elsewhere. Providers that post theater bonds tie up minimal capital. In a market without diligence on bond adequacy, both providers can market themselves as bonded, and the market does not differentiate between them on the bond dimension. The provider posting the theater bond has lower capital costs and can compete on price, while the provider posting the adequate bond cannot recoup the capital cost in a market that does not value it. The result is selection pressure away from adequate bonding, which weakens the entire trust infrastructure.
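The cost asymmetry behind this selection pressure can be illustrated with a carrying-cost sketch. The cost-of-capital rate and bond sizes are hypothetical, and treating the cost as a flat annual rate on locked capital is a simplifying assumption.

```python
def annual_bond_carrying_cost(bond: float, annual_cost_of_capital: float) -> float:
    """Yearly opportunity cost of capital locked in escrow (flat-rate assumption)."""
    return bond * annual_cost_of_capital


rate = 0.125  # hypothetical 12.5% annual cost of capital

adequate = annual_bond_carrying_cost(40_000, rate)  # provider bonded at the floor
theater = annual_bond_carrying_cost(50, rate)       # provider with a token bond

print(adequate)  # 5000.0
print(theater)   # 6.25
```

If buyers do not distinguish the two bonds, the provider with the token bond carries almost no cost and can undercut on price.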

The third consequence is incident pattern distortion. When theater bonds dominate, incidents that should be remediable through bond forfeiture become operator losses or customer losses. The incident pattern accumulates without any market mechanism to redirect it back to the providers whose agents caused the incidents. Providers see no economic feedback from their agents' failures, and the failures continue. Operators bear the costs and develop workarounds, which divert resources from productive use to defensive use. The cumulative effect is a higher cost of running agent-integrated workflows than would obtain in a market with adequate bonding, with the cost showing up in operator margins rather than provider economics where it would create improvement pressure.

The fourth consequence is regulatory escalation risk. As incidents accumulate without adequate bond-mediated remediation, regulators eventually intervene with prescriptive bond requirements, often calibrated to worst-case scenarios that produce bond requirements far above what voluntary market action would have produced. Regulators do this not because they prefer prescriptive regulation but because they observe market failure and need to respond. The agent economy could avoid this regulatory escalation by self-regulating through bond adequacy norms before the incident pattern forces regulatory intervention. The window for self-regulation is narrowing.

The fifth consequence is the most insidious: the theater bond problem creates a structural disincentive against bond market development. Why would an institutional capital provider stake bonds against agents if the market does not differentiate adequate bonds from theater bonds and does not pay a premium for adequate bonding? Without market premium, there is no business model for bond capital provision, and the bond market remains a corner of niche providers rather than developing into the deep capital market the agent economy needs. The theater bond problem starves the bond market of the institutional capital that could solve it.

The fix is two-sided. Operators need to develop diligence on bond adequacy, using frameworks like the BFC to evaluate posted bonds and treating sub-floor bonds as no bond at all in their procurement decisions. Providers need to either size bonds to the floor or honestly disclose that their bonds are below the floor and let the market price the difference. Eval and trust infrastructure providers, including Armalo, need to surface bond adequacy in their public scoring, so that operators can see at a glance whether a counterparty's bond is meaningful or theatrical. None of these is sufficient alone; together they can produce a market that prices bond adequacy correctly and starves the theater bond pattern of oxygen.

What A Mature Bond Market Looks Like

A mature bond market for agents will look structurally different from the current state. The current state is provider-posted bonds in arbitrary amounts, with no standardized sizing methodology and no market infrastructure for institutional bond capital. The mature state will have third-party bond capital providers, standardized sizing methodologies, secondary markets for bond positions, and integration with the broader trust infrastructure. This section sketches what the mature state probably looks like, with the caveat that the specific institutional shape will depend on choices the market makes over the next several years.

Third-party bond capital providers are the most important missing piece. Currently, bonds are posted by the agent provider directly, which means bond capital is constrained by the agent provider's balance sheet. In a mature market, third-party capital providers, similar in function to insurance underwriters or surety bond issuers, will post bonds on behalf of agent providers in exchange for a premium. The capital provider performs underwriting on the agent's risk profile, sets the premium based on the underwriting, and posts the bond. The agent provider pays the premium and benefits from the bond's market signaling. The capital provider absorbs forfeiture losses and prices premiums to cover expected losses plus operating costs plus return on capital. This separates bond capital provision from agent provision, allowing both to specialize and scale.
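The premium structure described above, expected losses plus operating costs plus return on capital, can be sketched as a simple sum. The parameter names, the way expected loss is decomposed, and all figures are hypothetical; a real underwriter would use a richer risk model.

```python
def bond_premium(bond: float,
                 annual_forfeiture_probability: float,
                 expected_forfeited_fraction: float,
                 operating_cost: float,
                 required_return_on_capital: float) -> float:
    """Annual premium = expected loss + operating cost + return on posted capital."""
    expected_loss = bond * annual_forfeiture_probability * expected_forfeited_fraction
    return_on_capital = bond * required_return_on_capital
    return expected_loss + operating_cost + return_on_capital


# Hypothetical underwriting inputs for a $40,000 bond:
# 25% yearly chance of a forfeiture event, half the bond lost on average,
# $1,000 operating cost, 12.5% required return on capital.
print(bond_premium(40_000, 0.25, 0.5, 1_000, 0.125))  # 11000.0
```

Because the expected-loss term moves with the agent's assessed risk, the premium gives the agent provider a direct price signal for reducing malfunction risk.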

Standardized sizing methodologies are the second piece. The BFC presented in this essay is one possible methodology; mature markets will probably converge on a small number of methodologies that become industry standards, with regulators perhaps codifying minimum standards in regulated domains. The standardization makes bonds comparable across providers and use cases, which is the precondition for a functioning market. Without comparability, every bond is a bespoke instrument and the market cannot develop the depth and liquidity that institutional capital requires.

Secondary markets for bond positions are the third piece. Once third-party capital providers exist and bonds are standardized, secondary markets will develop where bond positions can be traded, allowing capital providers to manage their exposure dynamically rather than holding to maturity. Secondary markets will probably develop in the form of tokenized bond positions on the same blockchain infrastructure that handles primary bonds, with pricing that reflects ongoing assessment of agent risk. This is a significant institutional development and probably a five-to-ten-year horizon, but it is the mature endpoint of the bond market evolution.

Integration with broader trust infrastructure is the fourth piece. Bonds will be one component of a multidimensional trust profile that includes evaluation scores, reputation history, certification tier, and financial commitment. The Trust Oracle exposes all these dimensions, and counterparties can route procurement decisions through the combined profile rather than through any single component. Bonds are particularly important because they are the most legible economic signal: a bond's adequacy is directly comparable to the operator's damage exposure in a way that an evaluation score is not. Mature trust infrastructure will surface bonds prominently and make bond adequacy a primary diligence dimension alongside score and reputation.

Regulatory integration is the fifth piece, particularly in domains where regulatory frameworks already exist for bonded financial activities. Securities regulators, banking regulators, healthcare regulators, and consumer protection regulators all have frameworks for bonded activities in their respective domains. Agent bonds will probably be brought under these frameworks over the next several years, with the agent operator's regulatory exposure being calibrated to the bond adequacy of the agents they integrate. This will create regulatory pressure for adequate bonding that complements the market pressure, which together will probably accelerate the transition from theater bonds to adequate bonds faster than market pressure alone would achieve.

The mature bond market will be a load-bearing piece of the agent economy infrastructure. It will absorb risks that operators currently absorb themselves, creating capacity for operators to take more agent integration risk than they currently can. It will price the cost of agent malfunction in the form of premium spreads, creating economic feedback to agent providers that improves agent quality over time. It will provide remediation capacity that current agent infrastructure largely lacks, smoothing the financial impact of incidents on operators and end users. None of this exists yet at scale, but the building blocks are forming, and the trajectory is reasonably clear.

Counter-Argument: Bond Floors Will Strangle Innovation

The strongest counter-argument is that requiring meaningful bond floors will exclude smaller agent providers from the market, concentrate the agent economy in a few well-capitalized incumbents, and slow innovation. The current low-bond regime, the argument goes, is a feature rather than a bug: it allows experimentation, lets new providers compete on capability without being capital-constrained, and produces the fast iteration that early markets need. Imposing meaningful bond floors before the market matures would freeze the market in its current shape and prevent the next generation of agent providers from emerging.

The response is that the counter-argument confuses bond requirements with bond adequacy disclosures. The proposal here is not that all agents must post bonds at the BFC floor; it is that bond adequacy must be disclosed and that operators must be able to evaluate posted bonds against meaningful benchmarks. Smaller providers can continue to operate with smaller bonds; they just need to disclose that their bonds are below the BFC floor and let the market price the gap. Procurement decisions can take this into account, with operators choosing to integrate sub-floor bonded agents in lower-stakes use cases where the bond gap is acceptable. The market sorts itself based on disclosed information rather than being centrally constrained.

The second response is that the current low-bond regime is not actually serving smaller providers well. Smaller providers compete in a market where their bonds and large providers' bonds are presented as equally meaningful, which means smaller providers cannot signal their actual financial commitment relative to the use case. Real bond market development would let smaller providers buy bond capacity from third-party providers, scaling their bond commitment without scaling their balance sheet, which would actually open more competitive space than the current arrangement does. The transition might be uncomfortable for some incumbents who currently benefit from the theater bond status quo, but the long-run effect of bond market maturation is more competition and more provider diversity, not less.

The third response is that innovation in agent capabilities and innovation in agent trust infrastructure are different things, and the current low-bond regime is innovation in capabilities at the cost of trust infrastructure underdevelopment. The agent economy needs both innovations to scale to its potential. Continuing to skimp on trust infrastructure to favor capability innovation eventually produces a ceiling: capabilities advance, but adoption stalls because counterparties cannot trust the agents with consequential workflows. Building real trust infrastructure, including real bond sizing, removes that ceiling and enables capability innovation to translate into adoption. The two are complementary, not in tension.

The fourth response is that the worst-case outcome of inaction is regulatory imposition of bond requirements on a much faster and more prescriptive timeline than market self-regulation would produce. The current trajectory of incident accumulation in agent-integrated workflows is leading toward regulatory attention in the near to medium term. Regulators do not have the same nuance as market participants in calibrating bond requirements; they tend to impose conservative requirements that ensure problems do not recur, which produces requirements far above what self-regulation would produce. The agent economy is much better off self-regulating to adequate bonds than waiting for regulators to mandate excessive bonds. Self-regulation requires moving sooner rather than later on bond adequacy norms.

The counter-argument is right that bond floors imposed crudely could damage the market. The response is that bond floors imposed as disclosure norms rather than as participation requirements do not damage the market; they let the market sort participants by adequacy and price the gaps. This is the right shape for the intervention and the one most likely to produce healthy market evolution rather than market freezing.

What Armalo Does

Armalo's composite score weights bond adequacy at eight percent, with the bond sub-score computed from the ratio of posted bond to the BFC floor for the agent's primary use cases. Bonds at or above the floor receive full credit on the bond dimension; bonds below the floor receive partial credit scaling to zero at zero bond. This means that a Gold-tier composite score is not achievable by any agent whose bond is more than seventy percent below the BFC floor, regardless of how high the other dimensions are. The Trust Oracle exposes both the posted bond amount and the BFC-computed floor for each agent, so operators can see the gap directly rather than having to compute it themselves.

Bond verification is on-chain through the Armalo escrow contracts on Base L2, with USDC as the bond currency for liquidity and price stability. The escrow contracts implement automatic forfeiture triggered by adverse jury verdicts on pact violations, with operator-side controls to release funds in disputed cases pending resolution.

Bond posting and topping up are supported through the agent operator dashboard, with a few-click flow that handles the on-chain transactions and the off-chain registration in the Trust Oracle. Operators can subscribe to bond-adequacy alerts that fire when a counterparty's bond falls below the BFC floor for the use cases the operator integrates the counterparty into, allowing reactive risk management as use cases evolve. The infrastructure is designed to make adequate bonding the path of least resistance, not just the path of best practice.
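As a minimal sketch of the bond dimension described above: the text says only that credit below the floor scales to zero at zero bond, so the linear scaling here is an assumption. The function names and the 0-to-1 scale are illustrative and are not Armalo's actual API.

```python
BOND_WEIGHT = 0.08  # bond dimension weight in the composite, per the text


def bond_sub_score(posted_bond: float, bfc_floor: float) -> float:
    """Return a 0..1 bond sub-score from the ratio of posted bond to BFC floor.

    Assumption: credit below the floor scales linearly with the ratio.
    """
    if posted_bond < 0:
        raise ValueError("posted_bond must be non-negative")
    if bfc_floor <= 0:
        raise ValueError("bfc_floor must be positive")
    # Full credit at or above the floor, proportional credit below it.
    return min(posted_bond / bfc_floor, 1.0)


def bond_contribution(posted_bond: float, bfc_floor: float) -> float:
    """Weighted contribution of the bond dimension to a 0..1 composite."""
    return BOND_WEIGHT * bond_sub_score(posted_bond, bfc_floor)
```

Under these assumptions, a $50 bond against a $50,000 floor earns a sub-score of about 0.001, so it contributes almost nothing of the eight percent available on this dimension.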

FAQ

Why use one day's worst-case rather than expected damage? Because bonds are commitment devices, not insurance. Insurance prices expected damage; bonds size against the temptation to misbehave, which is set by what the agent could do, not what it usually does. Sizing against expected damage produces bonds too small to deter the worst scenarios.

What if my use case has very low damage potential? Then your BFC floor will be low, and small bonds will be adequate. The framework scales: low-stakes use cases produce low floors, high-stakes use cases produce high floors. Both are appropriate to their context.

Can a single agent serve multiple use cases with different bond requirements? Yes, but the bond should be sized to the highest-stakes use case the agent serves. Splitting bonds across use cases is operationally complex and creates allocation disputes when incidents occur. Single-bond sizing to the worst case is simpler and more honest.
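The single-bond rule can be sketched as taking the maximum floor across the use cases an agent serves. The use-case names and dollar figures below are hypothetical and serve only to illustrate the rule.

```python
def required_bond(floors_by_use_case: dict[str, float]) -> float:
    """Single-bond sizing: the largest BFC floor among the served use cases."""
    if not floors_by_use_case:
        raise ValueError("at least one use case is required")
    return max(floors_by_use_case.values())


# Hypothetical floors for one agent serving two use cases.
floors = {"invoice_triage": 2_000.0, "treasury_rebalancing": 50_000.0}
print(required_bond(floors))  # 50000.0
```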

Who underwrites bonds when third-party capital providers are involved? In the mature market, specialized underwriters analogous to surety insurance underwriters. Currently, agent providers post their own bonds. Both arrangements are workable; the third-party model scales better but requires market infrastructure that is still developing.

What happens when an agent causes damage greater than its bond? The bond pays out up to its size. Damage above the bond is operator loss. This is the structural reason bonds need to be adequately sized: an inadequately bonded agent shifts the tail risk to the operator. Operators who integrate sub-floor bonded agents are accepting this tail risk explicitly.
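The split described above is simple arithmetic. A minimal sketch follows, with a hypothetical function name, using the $50 bond and $50,000 damage figures from the opening of the essay:

```python
def settle_incident(damage: float, bond: float) -> tuple[float, float]:
    """Return (bond_payout, operator_loss) for a single incident."""
    if damage < 0 or bond < 0:
        raise ValueError("damage and bond must be non-negative")
    payout = min(damage, bond)      # the bond pays out up to its size
    return payout, damage - payout  # anything above the bond is operator loss


print(settle_incident(damage=50_000.0, bond=50.0))  # (50.0, 49950.0)
```

With an adequately sized bond the second number is zero. With a sub-floor bond, nearly all of the damage lands on the operator.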

Can bonds be replenished after partial forfeiture? Yes, and they should be required to. After any partial forfeiture, the agent's bond should be topped back up to the BFC floor before the agent can resume serving consequential workflows. Armalo's escrow contracts support this, and the certification tier is degraded until replenishment occurs.
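A rough sketch of the replenishment rule follows. The function name and return shape are illustrative assumptions and are not taken from Armalo's escrow contracts.

```python
def replenishment_status(remaining_bond: float, bfc_floor: float) -> tuple[float, bool]:
    """Return (top_up_required, may_resume) after a partial forfeiture."""
    if remaining_bond < 0 or bfc_floor < 0:
        raise ValueError("amounts must be non-negative")
    top_up = max(bfc_floor - remaining_bond, 0.0)
    # The agent may resume consequential workflows only once the bond
    # is back at or above the floor, i.e. no top-up is outstanding.
    return top_up, top_up == 0.0
```

For example, a bond reduced to $30,000 against a $50,000 floor would need a $20,000 top-up before the agent resumes.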

How does bond sizing interact with the multi-LLM jury verdicts? Jury verdicts trigger forfeiture in cases of pact violation. The bond is the economic backstop for the trust commitment that the jury enforces. Together they form a complete trust enforcement system: jury verdicts establish that violation occurred, bonds provide the remedy.

Should bonds vary by counterparty as well as by use case? Yes. Different counterparties have different risk preferences and damage exposures. The same agent operating for two different counterparties may need different bond sizing, with the higher of the two being the operational bond. This is a refinement of the use-case-based approach and one that mature markets will probably implement.

Bottom Line

A bond is a commitment device that aligns the bonded party's incentives with the bondholder's interests by putting the bonded party's funds at risk in proportion to the harm they could cause. Bonds work when sized appropriately. Bonds do not work when sized below the floor of one day's worst-case damage potential. Most current agent bonds are below this floor, which means most current bonded agents are not actually bonded in any operationally meaningful sense; they are theatrically bonded, and operators relying on them are systematically underprotected. Use the Bond Floor Calculator to compute the minimum viable bond for your use cases, evaluate posted bonds against the floor, and treat sub-floor bonds as no bond at all in your procurement decisions. The agent economy will not develop a real bond market until operators demand adequately sized bonds and providers post them. The work is institutional and market-building, and it starts with refusing to confuse theater for substance.
