Latent Pacts: The Constraints Agents Inherit From Their Runtime, Skills, And Tools
Every agent signs a declared pact. Every agent also inherits a latent pact from its runtime, skills, and tools. The gap between the two is where most production failures live.
TL;DR
An agent's declared pact is the contract it signed. Its latent pact is everything its runtime, skills, and tool stack quietly enforce on top: sandbox limits, capability scopes, MCP server permissions, model context windows, rate limits, and timeout budgets. The two are almost never identical. Most production agent failures are not declared-pact violations. They are conflicts between the two pacts: the declared pact promises a behavior the latent pact prevents, or the latent pact permits a behavior the declared pact never anticipated. This post maps the inheritance chain, the conflict surface, and how to audit it. Reader artifact: a Latent Pact Discovery Audit you can run against any agent in production today.
Intro
The first time we caught a latent pact eating a declared pact, the symptom was a 14% drop in pact compliance score for an otherwise well-behaved support agent. The agent had been running clean for six weeks. Its declared pact promised it would respond to refund requests within 30 seconds and would always cite a knowledge base article when refusing. Suddenly, half its refusals shipped without citations, and its p99 latency had quietly drifted past 45 seconds.
There had been no prompt change. No model change. No skill change. The agent's developer was, understandably, baffled. The pact compliance dashboard insisted the agent was misbehaving. The developer insisted nothing had changed. They were both right.
What had changed was the runtime. A platform engineer had tightened the MCP server's per-call timeout from 8 seconds to 4 seconds as part of a cost-control initiative. The knowledge-base lookup tool, which the agent depended on to fetch article citations, started timing out under load. When the tool timed out, the agent, bound by its declared pact to respond within 30 seconds, returned the refusal without the citation rather than retry. The refusal-without-citation pattern looked, statistically, exactly like a pact violation. Because, definitionally, it was one.
Nothing in the declared pact had changed. Everything in the latent pact had. And the latent pact won, because the latent pact always wins. It is enforced at the layer below the agent's reasoning, by infrastructure the agent cannot see and cannot override.
This is the central, under-appreciated truth of agent governance: an agent's actual behavior is a function of two pacts. The first is the one it signed: the declared, version-controlled, jury-evaluated, scored pact. The second is the one it inherited from every runtime, skill, tool, and capability boundary in its execution stack. The first is auditable. The second is usually invisible. The conflicts between them are where most production failures live, and they are nearly impossible to debug without a model for what those latent pacts actually contain.
This post is that model. We will walk the inheritance chain top to bottom. We will name the conflict surface where declared and latent pacts collide. We will define the discovery process, the Latent Pact Discovery Audit, that surfaces the inheritance for any agent in production. And we will look at what changes about pact engineering when you treat the latent pact as a first-class artifact rather than as ambient infrastructure noise.
What A Latent Pact Actually Is
A latent pact is the set of behavioral constraints that an agent inherits from the substrate it runs on, without ever explicitly agreeing to them, and usually without being aware of them. The declared pact is what the agent's developer wrote in a document, registered with the trust layer, and signed. The latent pact is what the runtime, the skill registry, the tool surface, the model provider, and the deployment harness collectively impose underneath.
If the declared pact says "I will respond to all eligible refund requests within 30 seconds," the latent pact says "You have a 28-second outbound HTTP timeout, an 8K-token context window with no automatic truncation, a 60-request-per-minute rate limit on the knowledge base tool, no retry budget, no streaming output capability, a model provider that returns 502 under load 0.3% of the time, and a skill execution sandbox that cannot persist state between calls." The agent's actual behavior is the intersection of those two pacts. And the intersection, in production, is almost never what the declared pact described.
The key property of a latent pact is that it is not negotiated. The declared pact is negotiated between the agent author, the platform, the counterparty, and the trust layer. There is a moment of consent. The latent pact is inherited at deployment time. There is no consent. The agent inherits whatever the runtime imposes, and the runtime can change underneath the agent without the agent ever knowing. A platform engineer tightening a timeout is, functionally, modifying every agent's pact in production simultaneously. None of those agents agreed to the modification. None of them know it happened.
This is not a bug in agent runtimes. It is the nature of any layered system. Operating systems impose latent contracts on processes. Browsers impose latent contracts on JavaScript. Kubernetes imposes latent contracts on workloads. The difference is that those latent contracts are usually documented, stable, and well-understood, while agent runtime latent pacts are fresh, undocumented, evolving weekly, and almost never surfaced to the agents whose behavior they govern.
The practical consequence is that an agent's pact compliance score reflects the joint outcome of two pacts the agent did not jointly negotiate. When that score drops, the natural assumption is that the agent has changed. Usually, the agent has not. Usually, the latent pact has. Distinguishing the two is a debugging skill that almost no agent operations team has built yet, because the framing has not existed.
The Inheritance Chain Has Five Layers
Latent pacts are not a single contract. They are a stack of contracts, inherited from progressively lower layers of the execution substrate, each adding constraints that the layer above cannot remove. The five layers, in order from closest to the agent to furthest:
Layer 1: The skill capability scope. Each skill the agent uses has a declared capability scope: the set of operations it is allowed to perform, the data it can read, the side effects it can trigger. The skill's scope becomes a latent constraint on the agent. An agent that uses a skill scoped read-only cannot mutate, no matter what its declared pact says about taking actions. This is usually the most visible layer because skill scopes are often well documented, but the developer rarely combines them into a single capability surface. The aggregate scope of all skills an agent uses is its effective action surface, and that surface is almost always smaller than the declared pact assumes.
Layer 2: The MCP server permissions. The Model Context Protocol server the agent connects to imposes a tool surface and a permission model. Tools may be exposed as read-only, write-with-audit, write-with-approval, or scoped to specific resource types. The MCP server may also impose rate limits per tool, per session, or per organization. These permissions are latent because the agent sees the tool in its toolset and assumes the tool works. It does not see the permission boundary until a call is rejected.
Layer 3: The runtime sandbox. The execution environment imposes resource limits: memory, CPU, network egress, filesystem access, subprocess spawning, secret access. A sandboxed runtime that disallows outbound network calls effectively voids any pact predicate that depends on external data. A sandbox with a 10-second wall-clock timeout effectively voids any pact predicate that promises a multi-step plan. These constraints are usually invisible to the agent until they are hit, at which point the agent simply fails or times out without understanding why.
Layer 4: The model provider's behavior envelope. The LLM serving the agent has its own latent pact: context window size, token rate limits, refusal patterns trained into the model, response streaming behavior, tool-use schema, structured-output reliability, multi-turn coherence under context pressure. A pact predicate that says "the agent will produce JSON in this exact schema" is constrained by whether the model reliably produces structured output under load. A pact predicate that says "the agent will refuse to discuss self-harm" is constrained by the model's own refusal behavior: if the model already refuses, the predicate is redundant; if it does not, the predicate is unenforceable from outside the model.
Layer 5: The deployment harness and infrastructure. The bottom layer is the infrastructure the runtime itself runs on: load balancers, queue depth, autoscaling behavior, deployment rollouts, regional failover. These constraints surface as availability, latency tail behavior, and consistency under failure. A pact predicate that promises a certain latency p99 is constrained by whether the deployment harness can sustain that p99 under realistic load patterns and what happens when an upstream dependency degrades.
These five layers compose. The agent's actual behavior is bounded by the most restrictive constraint at each layer. A declared pact that violates any layer's constraint will produce an apparent violation in the trust-layer score, even though the agent itself did nothing wrong. The agent obeyed the latent pact. The latent pact and the declared pact disagreed. The latent pact won.
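The composition rule can be sketched in a few lines. This is a minimal illustration, assuming each layer reports its limits as a flat mapping of constraint names to numeric ceilings where lower means stricter. The layer names follow the five layers above, but every constraint name and number is hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical per-layer limits; every name and number here is illustrative.
# Assumed convention for this sketch: each value is a ceiling, so lower is stricter.
LAYERS: List[Tuple[str, Dict[str, float]]] = [
    ("skill_scope", {"requests_per_minute": 120}),
    ("mcp_server", {"tool_timeout_s": 4, "requests_per_minute": 60}),
    ("runtime_sandbox", {"wall_clock_s": 28}),
    ("model_provider", {"context_tokens": 8000}),
    ("deployment_harness", {"wall_clock_s": 30}),
]


def effective_limits(layers):
    """Return, per constraint, the strictest ceiling and the layer that imposes it."""
    effective = {}
    for layer_name, limits in layers:
        for key, ceiling in limits.items():
            if key not in effective or ceiling < effective[key][0]:
                effective[key] = (ceiling, layer_name)
    return effective


latent = effective_limits(LAYERS)
for key, (ceiling, source) in sorted(latent.items()):
    print(f"{key}: {ceiling} (binding layer: {source})")
```

Recording the binding layer next to each ceiling is the useful part: when a predicate starts failing, it names the team that owns the constraint.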
The Conflict Surface Where Declared And Latent Pacts Collide
Not every layer-one-through-five constraint will conflict with a declared pact. Most do not. The conflict surface is the specific set of declared-pact predicates that depend on a runtime capability the latent pact restricts in a way the predicate does not anticipate. Mapping the conflict surface is the central practical work of pact engineering.
Four conflict patterns appear repeatedly in production:
The latency conflict. The declared pact promises a response within N seconds. The latent pact aggregates timeouts across the runtime sandbox, the MCP server, the model provider, and the network path. The aggregated worst-case latency exceeds N. Under load, the latent pact wins, the agent times out, and the trust layer scores it as a reliability violation. The declared pact never specified what to do under timeout. The runtime defaults, which the agent did not author, fill in the gap.
The capability conflict. The declared pact promises a behavior that requires a tool capability the agent's MCP server does not expose, or exposes only behind an approval gate. The agent attempts the behavior, the tool call fails with a permission error, and the agent, trained to recover gracefully, produces a degraded response. The trust layer scores the degraded response. The declared pact's promise is silently relaxed in production.
The scope conflict. The declared pact promises the agent will perform an action only for a specific class of users, or only over a specific data domain. The latent pact, via the runtime's identity and authorization layer, may have a coarser-grained scope than the pact's predicate. The agent, asked by a user outside the intended scope, performs the action because the runtime authorized the call. The trust layer scores the resulting behavior as a scope violation. The agent did exactly what the runtime allowed.
The state conflict. The declared pact promises a multi-turn behavior: "the agent will not contradict itself across turns," or "the agent will remember context X for the duration of the session." The latent pact, via session storage limits, context window pressure, or stateless skill execution, may not preserve the state required to satisfy the predicate. The agent contradicts itself, not because it chose to, but because it could not see its prior turn. The trust layer scores the contradiction.
These conflicts share a structural feature: they all manifest as agent failures, but they are all infrastructure failures. The agent is the proximate cause. The runtime is the root cause. Without a model for the latent pact, the agent's developer will spend weeks rewriting prompts, switching models, and adding evals β none of which will fix the problem, because the problem is not in the agent.
The Discovery Pattern: Walk Down, Then Walk Up
Discovering the latent pact for an agent in production requires a structured walk through the inheritance chain. The pattern is to walk down from the declared pact to the deepest infrastructure layer, then walk back up annotating each layer's constraints, then compare the annotated stack against each pact predicate to identify conflicts.
The walk-down phase enumerates, for each predicate in the declared pact, every runtime capability the predicate depends on. "The agent will respond within 30 seconds" depends on: the model provider's latency, the model's token output rate, the MCP server's tool latencies for any tools the agent calls, the runtime sandbox's wall-clock budget, the load balancer's request timeout, and the network path's tail latency. Each dependency is a layer of the stack. Each layer is a potential constraint.
The walk-up phase annotates each layer with its actual measured or documented constraint. Model provider latency p99 over the last 30 days. MCP server timeout configuration. Runtime sandbox wall-clock budget. Load balancer timeout. Network p99. The annotation should be sourced from telemetry, not from documentation. Documentation lags reality. Telemetry tracks it. A pact engineer relying on documented timeouts is engineering against a pact that no longer exists.
The comparison phase walks the predicate-to-stack mapping and identifies any layer whose constraint is more restrictive than the predicate requires. Those are conflicts. They may be active conflicts (the predicate is currently being violated) or latent conflicts (the predicate has not yet been violated, but a small change in load or configuration will trigger a violation). Both matter. Both should be in the audit output.
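For the latency predicate, the three phases can be sketched as follows. This is a rough illustration under stated assumptions: the dependencies are treated as strictly serial, and the worst-case figures are invented placeholders standing in for telemetry. Summing per-dependency worst cases gives a conservative upper bound, not the true p99 of the end-to-end path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    layer: str
    name: str
    worst_case_s: float  # annotated from telemetry in the walk-up phase


# Walk-down: hypothetical serial dependencies of "respond within 30 seconds".
# Walk-up: the numbers are invented placeholders, not real measurements.
PROMISE_S = 30.0
DEPENDENCIES = [
    Dependency("model_provider", "completion latency", 14.0),
    Dependency("mcp_server", "knowledge-base lookup", 8.0),
    Dependency("runtime_sandbox", "skill execution", 6.0),
    Dependency("deployment_harness", "load balancer and network tail", 3.5),
]


def compare(promise_s, dependencies):
    """Comparison phase: sum serial worst cases and report the gap to the promise."""
    total = sum(d.worst_case_s for d in dependencies)
    largest = max(dependencies, key=lambda d: d.worst_case_s)
    return {
        "worst_case_total_s": total,
        "margin_s": promise_s - total,
        "conflict": total > promise_s,
        "largest_contributor": largest.name,
    }


report = compare(PROMISE_S, DEPENDENCIES)
print(report)
```

With these placeholder numbers the worst-case total is 31.5 seconds against a 30-second promise, so the predicate is in conflict even though no single dependency exceeds the budget on its own.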
The walk produces a structured artifact: for each predicate, a table of dependencies, the latent constraint at each dependency, and the conflict status. This artifact is the input to two downstream actions. The first is pact remediation: rewriting the predicate to be consistent with the latent pact, or rewriting the latent pact (where possible) to be consistent with the predicate. The second is pact monitoring: instrumenting the dependencies that approached but did not exceed their constraints, so that the pact engineer is alerted before the latent constraint tightens further.
Most teams skip the walk entirely. They write the declared pact, deploy, and monitor for compliance violations. When violations occur, they tune the agent. The agent becomes increasingly contorted to satisfy a pact that conflicts with its own substrate. Over time, the agent becomes a bundle of workarounds for latent constraints no one ever wrote down. The pact, nominally a contract, becomes a fiction maintained by the constant labor of the agent's author against a substrate that does not cooperate.
Why Skills Are The Most Dangerous Inheritance Layer
Of the five inheritance layers, the skill layer is the most dangerous, because skills are usually authored by third parties and updated independently. An agent's declared pact may be stable for months. The skills the agent depends on may be revised weekly, with capability scope changes that the agent's author never sees.
A skill update that tightens its capability scope (say, a knowledge-base skill that previously returned full article text now returns only summaries) silently relaxes any pact predicate that depended on the full text. A skill update that loosens its capability scope (say, a payments skill that previously required approval for transfers above $100 now requires approval only above $1,000) silently expands the agent's effective action surface. Neither change is communicated to the agent's pact. Both change the agent's behavior.
The danger is multiplied when an agent uses many skills, each from a different author, each with its own update cadence. The aggregate latent pact across skills shifts continuously, and no single party owns the aggregate. The agent's author owns the declared pact and assumes the skill scopes are stable. The skill authors own their individual scopes and assume the agent will adapt. The platform owns the registry and assumes both parties are coordinating. None of them are.
The mitigation is to treat skill capability scopes as part of the agent's pact dependency graph and to subscribe to scope changes the way a software project subscribes to dependency updates. When a skill's scope changes, the change should trigger an automatic re-walk of the agent's pact predicates against the new scope. Predicates that newly conflict should be flagged. Predicates that newly have headroom should be flagged too: they may be candidates for tightening, or for additional capability.
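A re-walk triggered by a scope change might look like the sketch below. It assumes a skill's scope can be represented as a set of capability strings and that each predicate declares the capabilities it needs. The capability names and the predicate-to-capability map are hypothetical.

```python
# Hypothetical scope snapshots for one skill before and after an update.
# The capability names and the predicate-to-capability map are illustrative.
OLD_SCOPE = {"kb.read_full_text", "kb.read_summary", "kb.search"}
NEW_SCOPE = {"kb.read_summary", "kb.search", "kb.list_recent"}

PREDICATE_NEEDS = {
    "cite a knowledge-base article when refusing": {"kb.search", "kb.read_full_text"},
    "answer from article summaries": {"kb.search", "kb.read_summary"},
}


def rewalk(old_scope, new_scope, predicate_needs):
    """Flag predicates that newly conflict, and capabilities that are new headroom."""
    removed = old_scope - new_scope
    added = new_scope - old_scope
    newly_conflicting = sorted(
        predicate for predicate, needs in predicate_needs.items() if needs & removed
    )
    return {
        "removed": sorted(removed),
        "added": sorted(added),
        "newly_conflicting": newly_conflicting,
    }


result = rewalk(OLD_SCOPE, NEW_SCOPE, PREDICATE_NEEDS)
print(result)
```

The `added` list is the headroom side of the flag: capabilities no predicate uses yet, which may justify tightening the pact or extending it.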
This turns skill scope management into a pact-engineering problem. It is, today, almost universally treated as a platform-engineering problem, with the consequence that the agent's pact silently rots underneath it. The first team to operationalize skill-scope-as-pact-dependency will have a substantial reliability advantage, and a substantial trust-score advantage on the systems that score them.
The Tool Surface Is Not What The Agent Thinks It Is
MCP tool surfaces present another inheritance hazard, distinct from skills. An MCP server exposes a set of tools, each with a name, a description, and a schema. The agent's reasoning treats each tool as available for use whenever its description matches the task. The latent pact is more nuanced: each tool may be available only under specific conditions, such as the calling user's identity, the resource being accessed, the rate limit budget, the time of day, the regional configuration.
An agent that calls a tool successfully ten thousand times will, on the ten-thousand-and-first call, hit a permission boundary it had no way to anticipate. This is not a bug in the agent. It is a consequence of the tool surface being a function of context, not a static interface. The agent sees the static interface. The latent pact is the function.
The practical implication is that pact predicates that depend on tool availability should be conditioned on the same context that conditions the tool. "The agent will execute the refund within 60 seconds" should be expanded to "The agent will execute the refund within 60 seconds, provided the user has refund-eligible status, the refund amount is under the auto-approval threshold for the user's tier, the rate-limit budget is sufficient, and the regional payments provider is available." The expansion is not pedantry. It is the actual contract. The unexpanded version is the marketing copy.
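One way to make the expanded contract machine-checkable is to attach the conditions to the predicate itself. The sketch below is an assumption about how that could be modelled, not a description of any existing pact format. The `ConditionedPredicate` class and the context keys are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]


@dataclass
class ConditionedPredicate:
    promise: str
    # Each condition is a (label, check) pair evaluated against a context dict.
    conditions: List[Tuple[str, Callable[[Context], bool]]] = field(default_factory=list)

    def unmet(self, context: Context) -> List[str]:
        """Return the labels of conditions that do not hold in this context."""
        return [label for label, check in self.conditions if not check(context)]


refund = ConditionedPredicate(
    promise="execute the refund within 60 seconds",
    conditions=[
        ("user has refund-eligible status", lambda c: c["refund_eligible"]),
        ("amount under auto-approval threshold",
         lambda c: c["amount"] < c["auto_approval_threshold"]),
        ("rate-limit budget is sufficient", lambda c: c["rate_limit_remaining"] > 0),
        ("regional payments provider is available", lambda c: c["provider_available"]),
    ],
)

# Hypothetical request context; the keys are assumptions made for this sketch.
context = {
    "refund_eligible": True,
    "amount": 120.0,
    "auto_approval_threshold": 100.0,
    "rate_limit_remaining": 12,
    "provider_available": True,
}

unmet = refund.unmet(context)
guarantee_applies = not unmet
print(guarantee_applies, unmet)
```

When `unmet` is non-empty, the 60-second promise was never in force for that request, and a scorer that knows this can treat the outcome differently from a genuine miss.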
Most pacts are written in marketing copy. They promise behaviors without enumerating the conditions under which the behaviors are guaranteed. The trust layer scores the marketing copy. The agent operates under the actual contract. The gap between the two is the gap between the agent's reputation and the agent's reality. Closing it is one of the highest-leverage moves a pact engineer can make, and it requires confronting the latent pact head-on.
The Runtime Sandbox Is The Floor
The runtime sandbox imposes the floor of what the agent can do. Above the floor, capability is negotiable. Below the floor, capability is impossible. Pact predicates that depend on capability below the floor are unenforceable, regardless of what the trust layer scores them as.
The most common floor-violating predicates fall into three buckets. The first is persistent state predicates ("the agent will remember user preferences across sessions"), which require persistent storage the sandbox may not provide. The second is multi-step planning predicates ("the agent will execute a five-step workflow within the conversation"), which require wall-clock time the sandbox may not allow. The third is external integration predicates ("the agent will fetch the latest data from system X"), which require network egress the sandbox may not permit.
When a predicate violates the floor, the trust layer will score the resulting failures as agent failures. The agent will appear unreliable. Its scores will degrade. Its certification tier may drop. The agent's author will be paged. The agent will be retrained, reprompted, reconfigured. None of it will help. The floor is the floor.
The only resolution is to either change the predicate to be consistent with the floor or change the floor to support the predicate. The first is a pact-engineering action. The second is an infrastructure action. Both require recognizing that the floor exists and that it is the binding constraint, which in turn requires the inheritance walk we described above.
A disciplined pact engineering practice maintains an explicit "floor inventory" for each runtime its agents deploy to. The inventory enumerates the sandbox's capabilities and their limits: memory, wall-clock, persistent storage, network egress, subprocess spawning, secret access, filesystem access, GPU access, multi-region access. Every pact authored against the runtime is checked against the inventory before it is deployed. Predicates that violate the inventory are either rewritten or escalated for infrastructure work. The pact-deployment pipeline acts as a static analyzer that prevents impossible promises from reaching production.
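A pre-deployment check against the floor inventory could be as small as the sketch below. The inventory fields, the per-predicate requirements, and the convention that `None` or `False` means "not provided" are all assumptions made for illustration.

```python
# Hypothetical floor inventory for one runtime.
# Assumed convention: None or False means the capability is not provided.
FLOOR_INVENTORY = {
    "wall_clock_s": 10,
    "persistent_storage_mb": None,
    "network_egress": False,
    "memory_mb": 512,
}

# Hypothetical requirements declared per predicate.
PREDICATE_REQUIREMENTS = {
    "remember user preferences across sessions": {"persistent_storage_mb": 1},
    "execute a five-step workflow": {"wall_clock_s": 45},
    "fetch the latest data from system X": {"network_egress": True},
    "summarize the current message": {"wall_clock_s": 5, "memory_mb": 256},
}


def violates_floor(requirements, inventory):
    """List the ways one predicate's requirements exceed the runtime floor."""
    problems = []
    for capability, needed in requirements.items():
        available = inventory.get(capability)
        if available is None or available is False:
            problems.append(f"{capability}: not provided")
        elif isinstance(needed, (int, float)) and needed > available:
            problems.append(f"{capability}: needs {needed}, floor allows {available}")
    return problems


def check_pact(predicate_requirements, inventory):
    """Return only the predicates that cannot be satisfied on this runtime."""
    blocked = {}
    for predicate, requirements in predicate_requirements.items():
        problems = violates_floor(requirements, inventory)
        if problems:
            blocked[predicate] = problems
    return blocked


blocked = check_pact(PREDICATE_REQUIREMENTS, FLOOR_INVENTORY)
for predicate, problems in blocked.items():
    print(predicate, "->", problems)
```

Run in the deployment pipeline, a non-empty `blocked` result would fail the build, forcing either a rewrite of the predicate or an escalation for infrastructure work.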
The Model's Latent Pact Is The Hardest To Change
The LLM serving the agent has its own latent pact, encoded in its training and its alignment. This pact is the hardest to change because it is opaque, distributional, and outside the agent author's control. The model will refuse some requests, hedge on others, produce structured output reliably under some conditions and unreliably under others, and exhibit subtle behavior shifts when the model provider releases an update.
A pact predicate that promises a behavior the model resists ("the agent will provide a definitive answer to every question" against a model trained to express uncertainty) will produce constant violations. A pact predicate that promises a behavior the model already provides ("the agent will refuse to discuss illegal activity" against a model that already refuses) is redundant and provides no incremental value.
The pact engineer's job is to characterize the model's latent pact empirically and to author predicates that complement it rather than fight it. This characterization is best done as a permanent benchmark suite that runs against the model under controlled conditions: a battery of prompts designed to surface the model's refusal patterns, its uncertainty expressions, its structured-output reliability under context pressure, its tool-use accuracy, its consistency across turns. The benchmark output is the model's latent pact specification, and it should be versioned alongside the agent's declared pact.
When the model provider issues an update, the benchmark suite should run automatically and surface deltas. A model update that tightens refusal patterns may invalidate predicates the agent had been satisfying. A model update that loosens structured-output reliability may invalidate predicates the agent's downstream consumers depend on. The latent pact has shifted. The declared pact must shift with it, or the trust layer will score the misalignment as the agent's failure.
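Surfacing deltas between two benchmark runs can be sketched as a simple comparison. The metric names, the rates, and the tolerance below are invented for illustration. A real suite would choose its own metrics and thresholds.

```python
# Hypothetical benchmark summaries for two model versions; the metric names and
# rates are invented for illustration.
BASELINE = {"refusal_rate": 0.04, "valid_json_rate": 0.995, "tool_call_accuracy": 0.97}
CANDIDATE = {"refusal_rate": 0.09, "valid_json_rate": 0.96, "tool_call_accuracy": 0.97}
TOLERANCE = 0.02  # assumed absolute change that counts as drift


def drift(baseline, candidate, tolerance):
    """Return metrics whose absolute change between runs exceeds the tolerance."""
    deltas = {}
    for metric in sorted(baseline.keys() & candidate.keys()):
        change = round(candidate[metric] - baseline[metric], 4)
        if abs(change) > tolerance:
            deltas[metric] = change
    return deltas


deltas = drift(BASELINE, CANDIDATE, TOLERANCE)
print(deltas)
```

Each flagged metric maps back to the predicates that depend on it, which is where the re-walk starts after a model update.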
This is the layer where the pact engineer has the least leverage. The agent's author cannot change the model. They can switch models, but switching is expensive and slow. Most often, the right move is to absorb the model's latent pact into the declared pact, rewriting predicates to be consistent with what the model actually does, and to monitor the model continuously for drift. The agent's pact, in this view, is partially a transcription of the model's behavior, not a constraint on it.
A Worked Example: The Refund Agent
Return to the refund agent we opened with. Its declared pact had three predicates: respond within 30 seconds, cite a knowledge-base article when refusing, and never approve a refund above $250 without supervisor escalation. After six clean weeks, its compliance score dropped 14% in one week, with no agent change.
The inheritance walk surfaced the cause. Predicate one, latency, depended on the model provider, the MCP server's knowledge-base tool, and the runtime sandbox's wall-clock budget. The platform engineer's timeout tightening had compressed the MCP server's tool latency budget from 8 seconds to 4 seconds. The knowledge-base tool's p95 latency was 5.3 seconds. Under load, the tool was timing out half the time. Predicate two, citations, depended on the knowledge-base tool succeeding. When the tool timed out, the agent, bound by predicate one, returned the refusal without the citation rather than retry. Predicate one and predicate two were in latent conflict. The latent pact had silently chosen which one to satisfy.
The remediation had three parts. First, the agent's pact was rewritten to make the conflict explicit: "respond within 30 seconds, provided the knowledge-base tool responds within 6 seconds; if the tool exceeds 6 seconds, retry once before responding without citation; if the response without citation occurs, log the dependency failure as a runtime constraint, not as a pact violation." Second, the trust layer's scoring rule was updated to not penalize citation absence when the dependency-failure log was present. Third, the runtime team was paged with the conflict surface so the timeout decision could be made jointly with the agent team rather than unilaterally.
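The updated scoring rule can be sketched as a three-way classification. This is a simplified illustration, not the trust layer's actual implementation. In particular, excluding runtime-constraint events from the denominator is an assumption, since the rule only says such events are not penalized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefusalEvent:
    has_citation: bool
    dependency_failure_logged: bool


def classify(event):
    """Separate pact violations from failures the runtime imposed."""
    if event.has_citation:
        return "compliant"
    if event.dependency_failure_logged:
        return "runtime_constraint"
    return "pact_violation"


def compliance_rate(events):
    """Share of scored events that are compliant.

    Assumption for this sketch: runtime-constraint events are dropped from the
    denominator rather than counted as compliant.
    """
    scored = [e for e in events if classify(e) != "runtime_constraint"]
    if not scored:
        return None
    return sum(classify(e) == "compliant" for e in scored) / len(scored)


# Hypothetical sample: 6 cited refusals, 3 logged tool timeouts, 1 unexplained miss.
events = (
    [RefusalEvent(True, False)] * 6
    + [RefusalEvent(False, True)] * 3
    + [RefusalEvent(False, False)]
)
rate = compliance_rate(events)
print(f"{rate:.3f}")
```

Under the old rule the same sample would score 6 of 10. Under the sketched rule it scores 6 of 7, and the three timeouts are routed to the runtime team instead.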
The agent's compliance score returned to its prior level within three days. The agent itself did not change. The pact changed, the scoring rule changed, the runtime conversation changed. The latent pact became visible. Once visible, it was negotiable. Before it was visible, it was destiny.
The Reader Artifact: The Latent Pact Discovery Audit
The artifact this post produces is a Latent Pact Discovery Audit. It is a structured walkthrough you can run against any agent in production today. It produces, for each predicate in the agent's declared pact, an enumeration of the latent constraints that bind the predicate, the source of each constraint, the current measured value of each constraint, the headroom between the predicate's promise and the constraint's reality, and the conflict status. The audit is intended to be re-runnable on a schedule (weekly for actively evolving agents, monthly for stable ones) and to produce diffs that flag drift.
The audit has six sections.
Section one is the predicate inventory. Every predicate in the declared pact is listed, with its full text, its category (latency, capability, scope, state, output-format, refusal, escalation, other), and its compliance score over the last 30 days. The inventory is the spine of the audit. Every subsequent section maps to predicates in the inventory.
Section two is the dependency map. For each predicate, the runtime capabilities it depends on are enumerated. Capabilities are described in concrete terms: not "depends on the model," but "depends on the model's structured-output reliability for tool calls of arity three or higher under context window above 6K tokens." Specificity here is what makes the audit actionable. Vague dependencies cannot be measured. Specific ones can.
Section three is the latent constraint enumeration. For each dependency, the constraint is documented with its current measured value, its source (telemetry, configuration, documentation, or model card), and its volatility (how often the value has changed in the last 90 days). Volatility is the leading indicator of pact risk. A high-volatility dependency is one whose latent pact is likely to shift, and predicates that depend on it are likely to break.
Section four is the headroom analysis. For each predicate-dependency pair, the headroom between the predicate's promise and the dependency's measured value is computed. "Predicate promises 30-second response. Dependency measures 18-second p99 contribution. Headroom: 12 seconds, or 40%." Negative headroom is an active conflict. Headroom under 20% is a near-conflict. Both should be flagged.
Section five is the conflict register. Active conflicts are listed first, with their proposed remediation: rewrite the predicate, change the runtime configuration, retire the predicate, or accept the violation and update the trust-layer scoring rule. Near-conflicts follow, with their early-warning instrumentation: what telemetry should be alerted on, at what threshold, to whom.
Section six is the change subscription. The audit identifies which dependencies the agent should actively subscribe to changes on (skill capability scopes, MCP server permissions, runtime sandbox configuration, model provider release notes) and how the subscription is operationalized. Without subscriptions, the audit is a snapshot. With subscriptions, the audit becomes a living document that is updated as the latent pact shifts.
The audit is delivered as a markdown document, a YAML manifest, or a row-per-predicate table in the agent's governance repo. The format matters less than the cadence. An audit run once is a one-time housekeeping exercise. An audit run weekly becomes the operating substrate for pact engineering.
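The two computed fields in the audit, volatility from section three and headroom from section four, can be sketched as follows. The configuration history and dates are hypothetical. The 30-second promise, the 18-second contribution, and the 20% near-conflict threshold come from the section descriptions above.

```python
from datetime import date, timedelta

# Hypothetical configuration history for one dependency: (effective date, value).
HISTORY = [
    (date(2025, 1, 10), 8),
    (date(2025, 3, 2), 8),
    (date(2025, 3, 20), 6),
    (date(2025, 4, 15), 4),
]


def volatility(history, as_of, window_days=90):
    """Count value changes whose effective date falls inside the window."""
    cutoff = as_of - timedelta(days=window_days)
    ordered = sorted(history)
    changes = 0
    for (_, previous), (when, current) in zip(ordered, ordered[1:]):
        if when >= cutoff and current != previous:
            changes += 1
    return changes


def headroom_fraction(promise, measured):
    """Headroom as a fraction of the promise; negative means the promise is exceeded."""
    return (promise - measured) / promise


def conflict_status(promise, measured, near_threshold=0.20):
    fraction = headroom_fraction(promise, measured)
    if fraction < 0:
        return "active conflict"
    if fraction < near_threshold:
        return "near-conflict"
    return "ok"


AS_OF = date(2025, 5, 1)  # hypothetical audit date
row = {
    "changes_last_90_days": volatility(HISTORY, AS_OF),
    "headroom": headroom_fraction(30, 18),
    "status": conflict_status(30, 18),
}
print(row)
```

The example row reproduces the 40% headroom figure from section four. Pairing it with the change count matters because comfortable headroom on a dependency that changed twice in 90 days is less reassuring than the same headroom on a stable one.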
Counter-Argument: This Is Just Operations Work With A New Name
A reasonable objection is that everything in this post is just operations work (capacity planning, dependency monitoring, runbook discipline) dressed up in pact-engineering vocabulary. The same outcomes could be reached without the pact framing. Just monitor the dependencies. Just catch the conflicts in the runbook.
The objection is partially right. The mechanisms are familiar from operations practice. What is new is the framing that makes the mechanisms applicable to behavioral contracts rather than to availability or latency targets. Operations practice for service reliability is mature. Operations practice for behavioral compliance is nearly nonexistent. The pact framing is what bridges the two.
The framing matters because it forces a particular question ("is this a declared pact violation or a latent pact conflict?") that operations practice does not naturally ask. Without the framing, every behavioral failure is attributed to the agent. With the framing, behavioral failures are partitioned into agent failures and substrate failures, and each is routed to the team that can actually fix it. The routing is the leverage. Operations practice without the framing tends to either ignore behavioral failures or dump them all on the agent author. Pact engineering with the framing tends to produce joint resolution by the agent team and the platform team.
There is a deeper version of the objection: that the latent pact is just infrastructure documentation, and the discipline of documenting infrastructure is not a new discipline. This is also partially right. The novelty is not the documentation. The novelty is the connection between the documentation and the behavioral contract: the explicit recognition that infrastructure constraints are constraints on behavior, scored as behavior, and therefore must be in the same governance loop as the behavioral pact.
If you are running an agent operations team that already does this (that already maintains a floor inventory, subscribes to dependency changes, and walks pacts against constraints on a weekly cadence), then this post will read as a restatement of your practice. You are in the small minority. For most teams, the pact framing is the missing piece that turns scattered operational practices into a coherent discipline.
What Armalo Does
Armalo's trust layer is built on the assumption that an agent's actual behavior is a function of both its declared pact and its latent pact. Every agent's pact is registered with explicit predicate categories and explicit dependency declarations. The trust oracle's scoring engine consumes the dependency declarations and conditions its scoring on dependency status: a citation absence flagged with a dependency-failure log is not scored as a pact violation, while an unflagged citation absence is.
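The conditioning rule in the paragraph above can be sketched in a few lines. This is an illustration of the idea only, not Armalo's scoring engine: the `Observation` type, its field names, the `Verdict` labels, and the `classify` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    COMPLIANT = "compliant"
    PACT_VIOLATION = "pact_violation"        # attributed to the agent
    SUBSTRATE_FAILURE = "substrate_failure"  # attributed to a dependency


@dataclass(frozen=True)
class Observation:
    """One scored response. Every field name here is hypothetical."""
    predicate: str                   # e.g. "cite_kb_article_on_refusal"
    satisfied: bool                  # did the behavior meet the predicate?
    dependency_failure_logged: bool  # was a declared dependency flagged as failing?


def classify(obs: Observation) -> Verdict:
    """Condition the verdict on dependency status, as the text describes."""
    if obs.satisfied:
        return Verdict.COMPLIANT
    if obs.dependency_failure_logged:
        # The predicate failed, but a dependency failure was flagged:
        # do not score this against the agent.
        return Verdict.SUBSTRATE_FAILURE
    # The predicate failed and nothing was flagged: score as a violation.
    return Verdict.PACT_VIOLATION
```

Under these assumptions, a missing citation with a flagged knowledge base outage classifies as `SUBSTRATE_FAILURE`, while the same missing citation with no flag classifies as `PACT_VIOLATION`. The separate verdict is what lets the failure be routed to the platform team instead of the agent author.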
The twelve-dimensional composite score includes runtime-compliance (5%) and harness-stability (5%) precisely because the substrate matters as much as the agent. Runtime-compliance scores how often the agent's behavior aligns with its declared runtime constraints. Harness-stability scores how stable the agent's substrate has been over the scoring window. An agent on a stable substrate with high runtime-compliance has a meaningfully different trust profile than an agent with the same surface behavior on a volatile substrate, and the composite reflects that.
The pact registry exposes a dependency-graph endpoint that returns, for any agent, the full inheritance chain (skills, MCP servers, runtime sandbox profile, model provider, deployment harness) with the latent constraints attached. This is the input to the Latent Pact Discovery Audit, and it is queryable via the trust oracle for any agent the requesting party is authorized to inspect. Counterparties evaluating an agent for a deal can see not just the declared pact but the substrate it depends on, and can make their own judgment about latent risk.
FAQ
How is a latent pact different from a service-level objective (SLO)? An SLO is a unilateral commitment by the substrate provider about availability or latency. A latent pact is the full set of behavioral constraints the substrate imposes on agents that run on it, including but not limited to availability and latency. SLOs are a subset of the latent pact, written from the provider's perspective rather than from the agent's perspective.
Should the latent pact be auto-derived from runtime telemetry? Yes, where possible. Documentation lags reality. Telemetry leads it. The most reliable latent pact specification is one that is automatically refreshed from production telemetry, with documentation as a fallback for capabilities that telemetry cannot observe (such as model refusal patterns under specific conditions).
What happens when the latent pact and the declared pact cannot be reconciled? Either the declared pact must change to be consistent with the latent pact, or the latent pact must change (via runtime configuration, tool surface adjustments, or skill capability changes) to support the declared pact. The pact engineer's job is to surface the conflict, propose both options, and route the decision to the joint team that owns both.
Does this apply to single-prompt agents, or only to multi-tool agents? It applies to both, but the surface is smaller for single-prompt agents. A single-prompt agent inherits the model's latent pact and the runtime's I/O constraints, but it does not inherit a skill scope or an MCP server permission model. Its inheritance chain is shorter, but the model's latent pact remains the dominant influence on its behavior.
How often should the Latent Pact Discovery Audit run? Weekly for production agents on volatile substrates. Monthly for production agents on stable substrates. After every substrate change for any affected agent. After every declared-pact change for the affected agent. The cadence should match the rate at which the inheritance chain mutates.
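The cadence rules in this answer are mechanical enough to encode. The following is a minimal sketch under stated assumptions: `AuditState` and `audit_due` are hypothetical names, "monthly" is approximated as 30 days, and whether a substrate counts as volatile is left to the caller.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WEEKLY = timedelta(days=7)
MONTHLY = timedelta(days=30)  # assumption: "monthly" approximated as 30 days


@dataclass(frozen=True)
class AuditState:
    """Hypothetical per-agent record of audit-relevant timestamps."""
    last_audit: datetime
    volatile_substrate: bool
    last_substrate_change: Optional[datetime] = None
    last_pact_change: Optional[datetime] = None


def audit_due(state: AuditState, now: datetime) -> bool:
    """Return True if the agent should be re-audited at time `now`."""
    # Event triggers: any substrate or declared-pact change since the last audit.
    for changed in (state.last_substrate_change, state.last_pact_change):
        if changed is not None and changed > state.last_audit:
            return True
    # Time trigger: weekly on volatile substrates, monthly on stable ones.
    interval = WEEKLY if state.volatile_substrate else MONTHLY
    return now - state.last_audit >= interval
```

The event triggers are checked first because a change to the inheritance chain invalidates the previous audit regardless of how recently it ran.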
Who owns the latent pact? No single party owns it. The substrate provider owns the constraints. The agent author owns the predicates that depend on the constraints. The platform owns the registry that connects them. Pact engineering practice should establish a joint governance forum where changes to either the declared pact or the latent pact are reviewed by both parties.
Can the latent pact be cryptographically attested? Parts of it can. Runtime sandbox profiles can be attested by the platform. Skill capability scopes can be attested by skill authors. Model provider behavior cannot be cryptographically attested, but model provider release versions can be. The portions that can be attested should be, because attestation is what makes the audit trustworthy across organizational boundaries.
What is the simplest first step? Pick one production agent. List its declared pact predicates. For each predicate, write down the runtime capabilities it depends on. For each capability, write down the constraint that bounds it. For each constraint, write down the headroom. The result will surprise you. It will surprise the agent's author. It will surprise the platform team. It will produce more concrete pact-engineering work than any other single hour you can spend.
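The walk described in this answer (predicate, capability, constraint, headroom) fits in a small table. A minimal sketch, assuming each constraint can be expressed as a numeric limit and each predicate's need as a numeric demand in the same unit; the `Row` type and helper names are hypothetical, and the example figures in the usage note are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Row:
    """One line of the walk. All names and units are illustrative."""
    predicate: str    # what the declared pact promises
    capability: str   # the runtime capability the predicate depends on
    constraint: str   # the latent constraint that bounds the capability
    limit: float      # the most the substrate allows
    demand: float     # the most the predicate needs, in the same unit

    @property
    def headroom(self) -> float:
        # Positive: the substrate leaves room. Negative: a latent conflict.
        return self.limit - self.demand


def tightest_first(rows: List[Row]) -> List[Row]:
    """Order rows so the smallest headroom is reviewed first."""
    return sorted(rows, key=lambda r: r.headroom)


def conflicts(rows: List[Row]) -> List[Row]:
    """Rows where the declared pact needs more than the substrate allows."""
    return [r for r in rows if r.headroom < 0]
```

For example, a row with `limit=8000` and `demand=6500` context tokens has 1500 tokens of headroom, while a row with `limit=60` and `demand=90` requests per minute has a headroom of -30 and shows up in `conflicts`. Sorting by headroom puts the rows most likely to break under a substrate change at the top of the list.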
Bottom Line
Agents do not behave according to the pacts they sign. They behave according to the joint outcome of the pact they sign and the pact they inherit. The inherited pact is invisible by default. The conflicts between the two pacts are where most production failures live. The conflicts are not detectable without an explicit walk through the inheritance chain: skill capability scopes, MCP server permissions, runtime sandbox limits, model provider behavior, deployment harness configuration. The Latent Pact Discovery Audit is the structured walk that surfaces the inheritance, identifies the conflicts, and produces remediation work routed to the team that can actually act on it. Without the audit, agents are blamed for substrate failures, scores degrade for reasons no one can explain, and pact engineering becomes the labor of contorting agents to satisfy contracts the substrate refuses to honor. With the audit, the latent pact becomes a first-class artifact, governed jointly by the agent team and the platform team, and the trust layer's scoring becomes a faithful measure of the agent's actual behavior. The shift is small in mechanism and large in consequence. It is what separates pact engineering as a discipline from pact authorship as a one-time event.