Pacts Are Not Documentation: Where The Cryptographic Boundary Actually Lives
A PDF describing how an agent should behave is not a pact. It is a wish. Pacts are signed cryptographic commitments enforced at runtime, and that distinction decides whether your agent economy has teeth or vibes.
TL;DR
Most "agent policies" in production right now are PDFs, Notion pages, system prompts, or markdown files that nobody enforces and nobody can verify. They are documentation. A pact is something different: a signed commitment about specific behavior, bound to specific evidence, enforced at three concrete boundaries (admission, runtime, post-hoc), and backed by economic penalty when violated. This essay draws the line between the two, explains why documentation cannot do the work people expect it to, and details the three boundaries where pacts actually live. The artifact at the end is a Doc-vs-Pact Conversion Worksheet you can run on your own agent's existing policy text.
The agent policy PDF that nobody reads
If you ask a team running an agent in production today to show you their behavioral policy, you will get one of three artifacts back. The first is a Notion page that was written six weeks ago, has been edited twice, and lists fifteen things the agent will and will not do. The second is a section of a system prompt that lives inside the agent's runtime configuration and contains a list of imperative sentences like "never disclose customer PII" and "always confirm dollar amounts above one thousand dollars before executing." The third, and the most common, is no artifact at all: there is a vague mental model held by the team lead about what the agent is supposed to do, and a vaguer one held by everyone who works with that lead.
None of these are pacts. They are wishes. They share four structural failures that make them unable to do the work that people quietly expect them to do.
First, none of them are signed. There is no cryptographic anchor that ties the document to a specific version of the agent at a specific point in time. If the document changes, there is no audit trail of who changed it or what the agent was promising before. If the agent changes, there is no link back to the version of the document the agent was operating under. Six months later, when something has gone wrong and a buyer is asking which version of the policy was in force when their incident occurred, the team is reduced to checking git history and hoping nobody force-pushed.
Second, none of them have a runtime enforcement path. The Notion page does not stop the agent from doing the thing the page says it will not do. The system prompt suggests but does not constrain: modern LLMs ignore prompt instructions a meaningful percentage of the time, and in adversarial conditions that percentage climbs. The mental model in the team lead's head obviously stops nothing. So when the agent does the forbidden thing, there is no defense in depth; there is just the documentation pointing at the burning building.
Third, none of them have evidence-grade compliance signals. There is no way to look at the agent's last thousand interactions and answer the question "on what fraction of these did the agent honor the policy?" with a number that you would defend in court. People will produce numbers; they will not be defensible. The signals are either absent or so noisy and unstructured that they cannot survive contact with an adversarial reader.
Fourth, none of them have a penalty when violated. A documentation violation has no economic consequence, no reputational decay, no slashing, no automatic operational pause. It has, at most, an embarrassed Slack thread. Without consequence, the document does not bind the agent's incentives; it only describes the team's preferences.
These four failures together produce a recognizable pattern. Teams have lots of policy documentation, and they routinely discover that the documentation did not stop any of the things the documentation was written to stop. Then they write more documentation.
The pact as a cryptographic object, not a prose object
A pact is a different kind of artifact. It is a structured object (five fields at the minimum, which we cover in detail in a companion essay) that has been signed by both the agent operator and the counterparty (the platform, the buyer, the marketplace, whoever is on the other side of the commitment). The signature anchors the pact to a specific moment in time and a specific agent identity. The pact is not edited; it is versioned. New versions are signed, old versions are retained, and the agent's record carries the trail of which pacts it operated under in which periods.
This is the first concrete difference. A pact is something you can quote in a dispute. A document is something you have to defend the provenance of before you can quote it.
The second concrete difference is that a pact is structured for machine enforcement. The fields (subject, predicate, evidence, penalty, renewal) are not narrative; they are typed. Subject is a counterparty identifier. Predicate is a behavior in a form precise enough to be evaluated mechanically. Evidence is an explicit specification of what telemetry, what logs, what jury verdicts will count as proof of compliance. Penalty is a specific economic or reputational consequence. Renewal is a defined expiry condition. Every one of these can be parsed by software at admission time, monitored by software at runtime, and adjudicated by software (or a human jury) after the fact.
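To make "typed, not narrative" concrete, here is a minimal sketch of the five fields as a Python data structure. The field shapes, the penalty kinds, and the example predicate string are illustrative assumptions, not a published schema; the companion essay is the authority on the real field definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Literal


@dataclass(frozen=True)
class Evidence:
    # Recorded sources that count as proof (hypothetical source names).
    sources: tuple[str, ...]
    # Minimum fraction of interactions that must satisfy the predicate.
    min_compliance: float


@dataclass(frozen=True)
class Penalty:
    # Hypothetical penalty kinds, loosely following the consequences named in the essay.
    kind: Literal["score_decay", "bond_slash", "tier_demotion", "pause"]
    magnitude: float


@dataclass(frozen=True)
class Pact:
    subject: str          # counterparty identifier
    predicate: str        # machine-evaluable behavior, here just a rule name
    evidence: Evidence
    penalty: Penalty
    expires_at: datetime  # renewal: the pact lapses at this instant

    def is_active(self, now: datetime) -> bool:
        """A pact is only enforceable inside its renewal window."""
        return now < self.expires_at
```

The dataclasses are frozen on purpose: a pact is versioned, not edited, so a change means constructing a new object rather than mutating the old one.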
The third concrete difference is that a pact has weight on the agent's reputation graph. A pact violation is not just a Slack moment; it is a score-mover. It feeds into the composite score, the reputation score, the certification tier. Repeated violations produce decay, demotion, slashed bonds, and lost deals. The pact is not just a description of behavior; it is a piece of the agent's economic identity. Operators who violate pacts pay; operators who honor them accrue.
This is why "pact" is the right word. The connotation matters. A pact is an agreement that binds two parties under penalty. A document is an agreement that exists at the convenience of one party.
The three boundaries pacts actually live at
Documentation lives in one place: the document. Pacts live in three. Each boundary plays a distinct role, and a pact that is missing any one of the three is doing only part of the job.
The first boundary is admission control. Before the agent is allowed to take on the counterparty, the pact has to be signed by both sides and the signing has to be recorded. This is enforced at the platform layer: the marketplace, the deal flow, the dispatcher. An agent without a valid pact for the counterparty's class cannot even be offered the work. This sounds bureaucratic; in practice it is the cheapest defense in the entire stack, because it stops the entire category of failure where the agent silently picks up work it was never supposed to do, and there is no record of the agreement under which it was working.
The second boundary is runtime guardrail. Once the work is in flight, the pact's predicate is monitored against the agent's actual behavior in real time. This does not mean the agent is paused before every action; it means there is a layer between the agent and its outputs that knows what the pact requires and can intervene when the agent is about to violate it. Runtime guardrails are not perfect (they cannot stop every violation, particularly subtle ones), but they catch the gross ones, which is most of them. They also produce a second source of evidence beyond the agent's own logs.
The third boundary is post-hoc evidence. After the work is complete, the pact's evidence specification is run against the recorded interaction. The multi-LLM jury reviews the trace, the runtime logs, the outputs, and produces a verdict on whether the agent honored the predicate. The verdict is a public artifact. It feeds into the agent's score, the counterparty's confidence, and any dispute that may follow.
These three boundaries are not redundant; they catch different failure modes. Admission control catches the "why are you even doing this work" failure. Runtime guardrails catch the "you are about to violate the pact right now" failure. Post-hoc evidence catches the "you violated the pact and only the trace shows it" failure. A pact that operates at all three boundaries is structurally harder to violate without being caught than a pact that operates at only one or two.
The boundary that documentation can reach is none of these. A PDF cannot stop admission, cannot intervene at runtime, and cannot grade post-hoc evidence. It can describe what the team wishes were happening at all three boundaries, and that is the gap between documentation and pact.
Admission control: the cheapest pact enforcement layer
Most teams underweight admission control because it feels procedural. They think the real defense is at runtime, and admission is just paperwork. This gets the cost-effectiveness analysis exactly backwards.
Admission control is the cheapest place to enforce a pact because the work has not started yet. There is no in-flight transaction to roll back. There is no partial output the buyer has already seen. There is no money that has changed hands. There is just a request (the agent wants to take on a piece of work) and a check: is there a valid pact in place that authorizes this class of work for this counterparty? If yes, the work proceeds. If no, the work is rejected with a structured reason.
The failures admission control catches are the failures that are the easiest to ignore until they cost the most. An agent that takes on work outside its declared scope is the textbook scope-honesty failure. An agent that takes on work for a counterparty that did not consent to the agent's pact terms is the textbook consent failure. An agent that takes on work after its pact has expired is the textbook lapsed-pact failure. None of these failures are exotic. All of them are caught by a four-condition check at admission time: does a valid pact exist, is it signed by both parties, is it within its renewal window, and does the requested work fall inside the pact's declared scope.
What documentation cannot do here is gate the request. The PDF is not in the request path. The system prompt may suggest the agent should refuse, but the agent's runtime is not the right place for this enforcement β it is too late, too easy to bypass, and too easy to lose track of which prompt was loaded for which request. Admission control sits in the platform layer where it belongs, and the pact provides the data the platform needs to make the decision.
A further point: admission decisions are themselves signed and logged. Six months later, when the buyer asks why their work was assigned to this agent, there is a structured answer with a pact ID, a signature, a timestamp, and a scope match. This is the audit trail that documentation cannot produce, and it is worth its weight in subsequent dispute resolution.
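The admission check described above is small enough to sketch in full. The version below assumes a hypothetical `PactRecord` whose two boolean flags stand in for already-verified signatures, and it returns a structured decision that could be logged; the names and reason codes are illustrative, not a real platform API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class PactRecord:
    pact_id: str
    counterparty: str
    operator_signed: bool       # stand-in for a verified operator signature
    counterparty_signed: bool   # stand-in for a verified counterparty signature
    expires_at: datetime
    scope: frozenset[str]       # work classes the pact authorizes


@dataclass(frozen=True)
class AdmissionDecision:
    admitted: bool
    reason: str
    pact_id: Optional[str]
    decided_at: datetime


def check(pact: PactRecord, work_class: str, now: datetime) -> str:
    """Apply the signature, renewal-window, and scope conditions to one pact."""
    if not (pact.operator_signed and pact.counterparty_signed):
        return "unsigned"
    if now >= pact.expires_at:
        return "expired"
    if work_class not in pact.scope:
        return "out_of_scope"
    return "ok"


def admit(pacts, counterparty: str, work_class: str, now: datetime) -> AdmissionDecision:
    """Admit the work if any pact with this counterparty passes every condition."""
    reason, pact_id = "no_pact", None
    for pact in pacts:
        if pact.counterparty != counterparty:
            continue
        reason, pact_id = check(pact, work_class, now), pact.pact_id
        if reason == "ok":
            return AdmissionDecision(True, reason, pact_id, now)
    return AdmissionDecision(False, reason, pact_id, now)
```

A rejected request still produces a decision object with a reason and a timestamp, which is the raw material for the audit trail: the platform can answer "why was this work refused" as readily as "why was it assigned."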
Runtime guardrails: the layer that catches the live violation
The second boundary is the one teams expect to do most of the work, and it does, but not in the way they think. Runtime guardrails are not a magic system that prevents every misbehavior. They are a layer between the agent and its outputs that knows the pact's predicate well enough to intervene when the agent is about to violate it.
In practice, a runtime guardrail is a small set of inference passes (sometimes deterministic, sometimes a small fast model, sometimes a structured rule check) that runs against the agent's outputs before they leave the system. If the pact says "the agent will never quote a price without a confidence interval," the guardrail checks every pricing output for a confidence interval and either appends one, blocks the output, or routes it to human review. If the pact says "the agent will never execute a transaction above ten thousand dollars without a human approval," the guardrail intercepts every transaction request and checks the dollar amount. If the pact says "the agent will refuse requests outside its declared capability scope," the guardrail runs a fast scope-classification pass on every incoming request and refuses out-of-scope ones with a structured reason.
These checks are not cheap, and not all of them can be applied to all interactions. A high-volume agent with thousands of interactions per minute cannot afford a heavyweight LLM-based guardrail on every output; it can afford a fast deterministic check on most outputs and a heavyweight check on a sampled subset. The pact's evidence specification informs which checks run on which interactions. The point is that there is a layer in the pipeline whose explicit job is to defend the pact, and that layer's behavior is itself logged so that subsequent reviewers can see whether the defense was active and what it caught.
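The split between a cheap check on everything and an expensive check on a sample can be sketched in a few lines. This example uses the ten-thousand-dollar approval predicate from above; `heavy_check` is a hypothetical stand-in for a model-based review, and the sample rate is an assumed tuning knob rather than a recommended value.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TransactionRequest:
    amount_usd: float
    human_approved: bool


def threshold_check(req: TransactionRequest, limit_usd: float = 10_000.0) -> bool:
    """Deterministic check: True means the request satisfies the predicate."""
    return req.amount_usd <= limit_usd or req.human_approved


def guard(
    req: TransactionRequest,
    heavy_check: Callable[[TransactionRequest], bool],
    sample_rate: float,
    rng: random.Random,
) -> tuple[str, list[str]]:
    """Run the cheap check on every request and the heavy check on a sample.

    Returns the outcome plus the list of checks that ran, so the guardrail's
    own activity can be logged alongside the agent's.
    """
    ran = ["threshold_check"]
    if not threshold_check(req):
        return "blocked", ran
    if rng.random() < sample_rate:
        ran.append("heavy_check")
        if not heavy_check(req):
            return "needs_review", ran
    return "allowed", ran
```

Returning the list of checks that actually ran is the part that matters for later review: it records whether the defense was active on a given interaction, not just what the outcome was.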
The failure mode of runtime guardrails is the subtle violation. An agent that produces an output that technically passes the guardrail but violates the spirit of the pact will not be caught here. That is fine; the third boundary catches it. Runtime guardrails are a sieve, not a filter. Their job is to catch the gross violations that would otherwise produce headlines, and to do so cheaply enough that they can run on every interaction.
Documentation cannot do this work because documentation is not in the inference path. Even when documentation is loaded into a system prompt, it is operating inside the agent's reasoning, not as an independent check on the agent's outputs. The independent check is what gives the runtime guardrail its defensive value: it is not the agent grading its own work.
Post-hoc evidence: the layer where reputation actually moves
The third boundary is where the pact does its most lasting work. After an interaction completes, the pact's evidence specification runs against the recorded trace. The multi-LLM jury (three or more independent models, with the top and bottom 20% of judgments trimmed to defang outliers and bribed evaluators) reviews the trace and produces a verdict: did the agent honor the predicate, partially honor it, or violate it?
This is where the agent's score actually moves. Admission control gates the work; runtime guardrails defend it; post-hoc evidence judges it. The judgment is a public artifact, signed by the jury, with the reasoning traceable for audit. It feeds into the composite score in proportion to the dimension it touched (a safety violation hits the safety dimension at 11% weight; a reliability failure hits the reliability dimension at 13% weight, and so on). It feeds into the reputation score for the counterparty relationship. It feeds into the agent's certification tier: repeated violations push the agent down from Gold to Silver to Bronze.
The reason this boundary cannot be replaced by documentation is that documentation does not produce evidence. A PDF that says "the agent will be reliable" produces no signal that can be measured against the actual interaction. The pact's evidence specification, by contrast, is a precise schema: what fields of the trace count as evidence, what threshold separates compliance from violation, what auxiliary signals count as supporting context. The jury reads the schema and applies it. The verdict is reproducible: another jury reviewing the same trace under the same schema should produce a substantially similar verdict, and the trimmed-mean structure of the multi-LLM jury makes that reproducibility statistically defensible.
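The trimmed-mean step is easy to show in isolation. The sketch below assumes each juror returns a compliance score between 0 and 1 and that the trim count is rounded down; the verdict thresholds are illustrative assumptions, since the essay does not specify how scores map to the three verdicts.

```python
def trimmed_mean(scores: list[float], trim_percent: int = 20) -> float:
    """Drop the lowest and highest trim_percent of scores, then average the rest."""
    if len(scores) < 3:
        raise ValueError("a jury needs at least three judgments")
    ordered = sorted(scores)
    k = len(ordered) * trim_percent // 100   # judgments trimmed from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def verdict(scores: list[float], honored_at: float = 0.8, violated_below: float = 0.5) -> str:
    """Map the trimmed mean onto the three verdicts (thresholds are hypothetical)."""
    mean = trimmed_mean(scores)
    if mean >= honored_at:
        return "honored"
    if mean < violated_below:
        return "violated"
    return "partially_honored"
```

With five jurors scoring 0.0, 0.9, 0.9, 0.9, and 1.0, one judgment is trimmed from each end and the verdict rests on the three 0.9 scores, so a single hostile juror cannot drag it down. One consequence of rounding down: under this assumption a jury of three or four trims nothing, so the outlier protection only starts at five jurors.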
The post-hoc evidence layer is also where dispute resolution lives. If a counterparty believes the agent violated the pact and the jury did not catch it, they can file a dispute. The dispute triggers a re-review with additional context, possibly with human adjudicators. The resolution is itself a signed artifact. This is what gives buyers, regulators, and downstream platforms the ability to trust the score: not because the score is always right on the first pass, but because there is a structured procedure for correcting it when it is wrong, and the corrections are themselves auditable.
Why "system prompt as policy" is a category error
A particularly common failure mode in 2026 is the team that treats their system prompt as if it were a pact. They write detailed behavioral instructions into the prompt, version the prompt in git, and consider the work done. This is a category error that produces three concrete failures.
The first failure is that the system prompt is not enforceable independent of the agent. Modern LLMs ignore prompt instructions in a non-trivial percentage of cases, and the percentage climbs under adversarial input, edge-case input, or input that exceeds the model's effective context window. Treating the prompt as enforcement is treating the agent as its own guardrail, which is the configuration least likely to catch the agent's own misbehavior.
The second failure is that the system prompt is not visible to counterparties. The buyer cannot see the prompt. The marketplace cannot see the prompt. The regulator cannot see the prompt. Whatever the prompt says is private to the operator, and the operator's word that the prompt is in force is the only attestation. This is exactly the conflict-of-interest configuration that public-pact infrastructure exists to defang.
The third failure is that the system prompt is not a stable artifact. It changes when the operator changes it, which is more often than they admit. Without a signed versioning trail, there is no way to ask "what was the prompt in force when this incident occurred?" with a defensible answer.
A pact that incorporates a system prompt is fine; many do. The pact's evidence specification can include "the system prompt in force at interaction time hashes to X." That is a structured, signed, auditable claim. The system prompt by itself, with no pact wrapped around it, is a wish in a config file.
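The "hashes to X" claim is a one-function check. A minimal sketch, assuming SHA-256 over the UTF-8 bytes of the prompt; the function names are illustrative, and a real pact would also have to pin down normalization details such as trailing whitespace.

```python
import hashlib


def prompt_digest(prompt: str) -> str:
    """SHA-256 hex digest of the prompt's UTF-8 bytes."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def prompt_matches_pact(prompt_in_force: str, pinned_digest: str) -> bool:
    """True if the prompt loaded at interaction time is the one the pact pinned."""
    return prompt_digest(prompt_in_force) == pinned_digest
```

The counterparty never needs to see the prompt itself. It only needs the pinned digest in the signed pact and a trace field recording the digest that was in force for each interaction.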
The Doc-vs-Pact Conversion Worksheet
If you have agent policy documentation today and want to know whether you have anything pact-grade hiding inside it, run your existing policy text through this worksheet. It is structured as a series of questions that map each section of the documentation to either a pact field or a deletion.
1. Identify every behavioral promise in the documentation. A behavioral promise is a sentence that describes what the agent will or will not do. Strip out the explanatory prose, the capability descriptions, the historical context. What remains is the candidate set of pact predicates.
2. For each promise, identify the counterparty. Whose interest does this promise serve? The buyer? The platform? A regulator? A third-party integration? If you cannot identify a counterparty, the promise is not pact-eligible; it is a self-statement and it does not bind anything.
3. For each promise, specify the evidence. What data, recorded during the interaction, would prove or disprove this promise? If the answer is "we would have to manually inspect the trace and use judgment," the promise is too vague for a pact. Tighten the promise until the evidence is specifiable in a schema.
4. For each promise, define the penalty. What happens to the agent's score, bond, tier, or operational status if the promise is violated? If the answer is "nothing," the promise has no teeth and is documentation only. Either attach a penalty or accept that this is non-pact text.
5. For each promise, define the renewal cadence. How long does this promise hold? Is it for the lifetime of the agent? For a specific deal? For a quarter? For a single interaction? If the renewal is undefined, the promise drifts.
6. For each promise, define the admission scope. Which work classes does this promise apply to? All work the agent does, or only specific contracts? If the promise is universal, it goes into the agent's standing pact set. If it is contract-specific, it goes into the deal-level pact.
7. For each promise, define the runtime guardrail. Is there a check that can run on the agent's outputs in real time to catch violations of this promise? What is the check? Is it deterministic, model-based, or sampled? Promises that have no possible runtime guardrail rely entirely on post-hoc evidence. That is fine, but you should know it.
8. Delete every line of documentation that does not survive steps 1 through 7. Documentation that does not become a pact is probably documentation that did not need to exist. Marketing language about reliability is fine to keep on a landing page; it should not be in your behavioral policy.
The output of this worksheet is a candidate pact set. The next step is to negotiate it with counterparties, sign it, and wire the three boundaries. That is engineering work, not documentation work.
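If the policy text is long, the worksheet can be run as a simple triage script. The sketch below assumes you have already extracted each behavioral promise by hand and filled in whatever answers you have; the `Promise` fields and the output keys are hypothetical names that mirror the worksheet's questions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Promise:
    text: str
    counterparty: Optional[str] = None
    evidence_schema: Optional[str] = None
    penalty: Optional[str] = None
    renewal: Optional[str] = None
    admission_scope: Optional[str] = None
    # Optional by design: a promise with no runtime guardrail can still be a
    # pact candidate that relies on post-hoc evidence alone.
    runtime_guardrail: Optional[str] = None


REQUIRED = ("counterparty", "evidence_schema", "penalty", "renewal", "admission_scope")


def missing_fields(promise: Promise) -> list[str]:
    """Names of the required worksheet answers that are still empty."""
    return [name for name in REQUIRED if not getattr(promise, name)]


def triage(promises: list[Promise]) -> dict[str, list]:
    """Split promises into pact candidates and ones that still need work."""
    result: dict[str, list] = {"pact_candidates": [], "needs_work": []}
    for promise in promises:
        gaps = missing_fields(promise)
        if gaps:
            result["needs_work"].append((promise.text, gaps))
        else:
            result["pact_candidates"].append(promise.text)
    return result
```

The script does not judge whether an answer is good, only whether it exists. A promise with an evidence answer of "inspect manually" would pass this triage and still fail the worksheet, so the output is a to-do list rather than a verdict.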
Migration from documentation to pacts: a six-week sequence
For teams convinced by the argument but daunted by the practical work of converting documentation to pacts, the migration sequence below is what production teams actually do. It is calibrated to a six-week horizon for a team with one to three production agents and existing policy documentation. Larger teams should extend the timeline proportionally; smaller teams can compress it but should not skip stages.
Week one is inventory and triage. Pull every piece of agent policy documentation the team has: Notion pages, system prompts, internal wiki entries, Slack pinned messages. Run each through the Doc-vs-Pact Conversion Worksheet. Identify which sections contain behavioral promises (the candidate Predicates) and which sections are explanatory or marketing. Triage the candidate Predicates by importance: which ones, if violated, would cause the most counterparty harm? Order the list. The top of the list becomes your first pact's Predicate set.
Week two is counterparty alignment. For each top-priority candidate Predicate, identify which counterparty it serves and what they would consider acceptable enforcement. Some Predicates the operator considers important may not matter to any counterparty; deprioritize those. Some Predicates the operator did not write down may matter critically to counterparties; add those. The output of week two is a curated Predicate set for the first pact, validated against actual counterparty interest.
Week three is Evidence and Penalty design. For each Predicate, specify the Evidence schema (data sources, thresholds, schemas) and the Penalty composition (which of the four primitives, calibrated how). This is the most engineering-heavy week because it requires understanding what telemetry the runtime can emit and what bond posture the operator can credibly post. Adjust Predicates if the Evidence is unfeasible or the Penalty is uncalibratable.
Week four is admission and runtime wiring. Build the platform-layer admission control that gates work requests on pact validity. Build the runtime guardrails that operate against the pact's Predicates. This is the boundaries-one-and-two work; the post-hoc evidence boundary is built later because it requires the multi-LLM jury infrastructure that takes longer to stand up.
Week five is signing and dual-run. Sign the new pact with the first counterparty. Run it in dual-run mode against the existing documentation-only path so that any inconsistencies surface before the pact is exclusively authoritative. Watch the dual-run telemetry for cases where the pact catches violations the documentation path missed (good) or flags as violations interactions that the documentation considered acceptable (calibration error to investigate).
Week six is migrate and operate. Cut over to the pact as the authoritative behavioral commitment. Retire the documentation-only path. Begin the post-hoc jury verdicts as the third boundary becomes operational. The operator now has one pact, with one counterparty, fully wired through all three boundaries. Subsequent pacts (other counterparties, other capability surfaces) build on this foundation and take dramatically less time than the first.
The sequence is conservative because the first pact is the hardest. Once the team has built the muscle memory and the infrastructure, subsequent pacts can be drafted and deployed in days rather than weeks. The investment is front-loaded, the payback is permanent, and the alternative, staying with documentation, accumulates risk every day the agent operates.
Counter-argument: "Pacts are heavy; documentation is fast"
The strongest objection to this whole framing is that pacts have substantial overhead. Drafting a pact, negotiating its language with a counterparty, wiring the runtime guardrail, defining the evidence schema, setting up the post-hoc jury: all of this takes engineering time that documentation does not. For an early-stage team trying to ship an agent in days, the documentation path looks faster.
It is faster. It is also fragile in a way that does not show up in the first few weeks of operation. The teams that ship documentation-only agents pay the cost later: when the first incident hits and they cannot reconstruct what they had promised, when the first counterparty asks for an audit trail and they cannot produce one, when the first regulator asks why the agent was allowed to do what it did and they have no answer beyond "the system prompt told it not to."
The pragmatic compromise is to start with a minimal pact (three or four predicates, a simple evidence schema, a basic penalty such as score decay only with no slashing yet, and admission control without runtime guardrails) and grow it. A minimal pact is still a pact. It satisfies the structural properties: signed, versioned, evidence-bound, penalty-bound. It can be enforced at all three boundaries with progressively richer machinery as the team's resources allow. The minimum bar for a pact is much lower than the maximum, and the minimum is still meaningfully different from documentation.
The teams that resist pacts on overhead grounds are usually optimizing for the wrong horizon. Pacts pay back the moment the first thing goes wrong, which is sooner than most teams expect.
The cryptographic boundary in operational terms
It is useful to spend a moment on what "cryptographic boundary" actually means in operational terms, because the phrase gets thrown around in pact discussions without enough specificity. The boundary is the line between artifacts that can be tampered with after the fact and artifacts that cannot. Documentation lives on the tampering side: any team member with edit rights to the Notion page can rewrite a behavioral promise tomorrow that contradicts what was promised yesterday, and there is no defense against the rewrite beyond version control hygiene that the team has to enforce manually. Pacts live on the non-tampering side: the signed object's hash is recorded, the signature is verifiable against the operator's and counterparty's public keys, and any change to the pact produces a new signed version that does not invalidate the historical one.
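The tamper-evidence half of this boundary can be illustrated with nothing but hashes. The sketch below chains each pact version to the digest of the one before it, so rewriting an old version is detectable. It is deliberately incomplete: a real pact would also carry signatures from both parties over each digest, which needs an asymmetric-signature library and is omitted here. The function names and the record layout are assumptions for illustration.

```python
import hashlib
import json


def canonical_digest(obj: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_version(history: list, body: dict) -> str:
    """Append a pact version that commits to the previous version's digest."""
    prev = history[-1]["digest"] if history else None
    digest = canonical_digest({"body": body, "prev": prev})
    history.append({"body": dict(body), "prev": prev, "digest": digest})
    return digest


def verify_history(history: list) -> bool:
    """Recompute every digest; an edit to any past version breaks the chain."""
    prev = None
    for entry in history:
        expected = canonical_digest({"body": entry["body"], "prev": prev})
        if entry["prev"] != prev or entry["digest"] != expected:
            return False
        prev = entry["digest"]
    return True
```

Publishing a new version never touches the old one, and anyone holding the latest digest can detect a quiet edit to version one. What the hash chain cannot show is who agreed to each version; that is the job of the signatures.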
The operational consequence is that pacts produce a trustworthy historical record without requiring anyone to be honest. Documentation requires the team to faithfully preserve old versions, attribute changes correctly, and never selectively edit history; the discipline is human and frequently fails. With pacts, preservation, attribution, and immutable history are properties of the cryptographic system itself. The team does not have to be trustworthy because the system does not give them the option to misrepresent the historical state.
This matters most during disputes. A dispute that turns on "what did the operator promise three months ago" is an unwinnable argument when the answer lives in documentation that the operator can edit at will. The same dispute is straightforwardly answerable when the answer lives in a signed pact whose version-three-months-ago hash is recorded in the pact registry and verifiable by anyone. The operator does not need to be cooperative; the registry produces the truthful answer regardless. The dispute resolves on facts rather than on memory.
The boundary also matters for the reverse case: protecting honest operators against false accusations. A counterparty who claims the agent violated a promise that was never actually made cannot invent the promise after the fact and present it as if it had always been there. The signed pact set is what it is; promises the counterparty wishes had been made but were not are not in the set. Cryptographic identity protects both sides of the relationship from the bad-faith failure modes that documentation cannot defend against.
The practical implication for engineering teams is that the cryptographic boundary is not just a feature; it is the property that makes everything else in the pact system work. Strip the cryptographic boundary out of pacts (treat them as just structured documents without signing) and you reproduce all the failure modes of pure documentation. Keep the cryptographic boundary in and the structural failures of documentation cease to apply. The signing is not ceremony; it is what creates the enforcement substrate.
How pacts compose with audit logs and observability
Pacts are not isolated artifacts; they live inside a broader observability stack that includes audit logs, distributed traces, runtime metrics, and post-hoc analytics. The way pacts compose with these other surfaces matters because pacts that exist in isolation, disconnected from the broader observability stack, lose much of their enforcement value.
The audit log is the system-level record of who did what when. Every mutating operation in the platform (pact signed, pact modified, evidence verdict produced, penalty applied, dispute filed, dispute resolved) produces an audit log entry with actor, action, resource, and timestamp. The audit log is queryable by both the operator (to reconstruct their own history) and the counterparty (to verify the operator's claims). When a pact's history is questioned, the audit log is the source of truth that the pact registry's state is reconciled against.
Distributed traces are the request-level record of what the agent actually did during each interaction. Every API call the agent makes, every tool invocation, every model query, every output produced is captured in a trace that ties together the full causal chain of the interaction. The pact's evidence specification points to specific fields in these traces (the message-received timestamp, the tool-call sequence, the output content), and the post-hoc jury reads from the traces to produce its verdict. Without distributed traces, the jury has nothing to grade against; with them, the jury can produce verdicts that are reproducible and auditable.
Runtime metrics are the aggregate-level signal of how the agent is performing across many interactions. The drift detection layer (covered in a separate essay) operates against these metrics. Refusal-rate distributions, latency percentiles, tool-use rates, topic coverage: all of these are runtime metric streams that the pact's drift telemetry monitors. When the metrics shift in ways that indicate silent pact violation, the runtime metrics are what surface the shift before any individual interaction triggers a verdict.
Post-hoc analytics are the long-horizon record that informs both the agent's score and the operator's strategic decisions. Verdict patterns, violation clusters, counterparty-specific performance, tier trajectories: all of this analytics data is computed from the pact's history and exposes patterns that individual verdicts do not show. Operators use this data to identify which pacts need renegotiation, which agents need rework, and which counterparty relationships are healthiest.
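The four-field entry is simple enough to sketch. The class below is an in-memory illustration with hypothetical names, not a storage design; a production log would be persisted in an append-only store and would need its own tamper evidence.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    actor: str        # who performed the operation
    action: str       # e.g. "pact_signed", "penalty_applied"
    resource: str     # the pact, verdict, or dispute the action touched
    timestamp: datetime


class AuditLog:
    """Append-only in-memory log: entries can be recorded and read, not edited."""

    def __init__(self) -> None:
        self._entries: list[AuditEntry] = []

    def record(self, actor: str, action: str, resource: str, timestamp: datetime) -> AuditEntry:
        entry = AuditEntry(actor, action, resource, timestamp)
        self._entries.append(entry)
        return entry

    def history(self, resource: str) -> list[AuditEntry]:
        """Every entry for one resource, oldest first."""
        matching = [e for e in self._entries if e.resource == resource]
        return sorted(matching, key=lambda e: e.timestamp)
```

Querying by resource is what lets either side reconstruct a single pact's life: signed, modified, judged, penalized, disputed, in order.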
A pact infrastructure that does not compose cleanly with all four of these surfaces is incomplete. Pacts in isolation are signed objects; pacts wired into audit logs, traces, metrics, and analytics are infrastructure. The composition is what produces the enforcement value at scale.
What Armalo does
Armalo's pact infrastructure operates at all three boundaries. Pacts are first-class objects in the API, the dashboard, and the SDK. Each pact has its five fields in structured form, is signed by the operator and counterparty, and is versioned with full history. Admission control runs at the marketplace and deal layers: agents cannot accept work outside their declared pact scope. Runtime guardrails are configurable per pact and per evidence dimension, with deterministic checks for fast cases and a small fast model for the slower ones. Post-hoc evidence is graded by a multi-LLM jury that trims the top and bottom 20% of judgments and produces signed verdicts that feed directly into the composite score, the reputation score, and the certification tier. Every pact is queryable via the Trust Oracle so counterparties can verify the pacts an agent holds before deciding to engage.
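The jury's trimming step can be sketched as a trimmed mean. The 20% figure comes from the description above; the numeric score scale, the floor rounding of the trim count, and the use of a plain mean over the surviving judgments are assumptions for illustration.

```python
import math
from statistics import mean


def trimmed_verdict(scores: list[float], trim_fraction: float = 0.2) -> float:
    """Drop the lowest and highest `trim_fraction` of scores, then average the rest."""
    if not scores:
        raise ValueError("at least one judgment is required")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    k = math.floor(len(scores) * trim_fraction)  # judgments dropped from each end
    ordered = sorted(scores)
    kept = ordered[k : len(ordered) - k]
    return mean(kept)


# Five hypothetical judges score one interaction on a 0-100 scale; one is an outlier.
judgments = [10, 80, 85, 90, 100]
verdict = trimmed_verdict(judgments)  # averages 80, 85, 90
untrimmed = mean(judgments)           # pulled down by the outlier
```

With five judges, trimming 20% from each end drops exactly one judgment per side, so a single erratic judge cannot move the verdict.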
FAQ
My team already has a thorough policy document. Why isn't that enough? Because it is not enforceable, not verifiable, not penalty-bound, and not visible to counterparties. The document is fine as a description of intent; it is not infrastructure. Run it through the Doc-vs-Pact Conversion Worksheet and see what survives.
Can a pact be informal, such as a Slack agreement between two teams? No. Informal pacts have no signature, no versioning, and no enforcement. They are documentation in conversational form. The minimum bar for a pact is a structured, signed object.
How small can a pact be? Three or four predicates, a basic evidence schema, and one penalty type is enough to qualify. The point is not to write War and Peace; it is to have something you can quote, sign, and enforce.
What happens to the documentation we already have? Most of it should stay: marketing language, capability descriptions, internal training material. The behavioral promises inside it should be migrated to pacts. The rest is still useful as context.
Can a single pact cover all of an agent's counterparties? Sometimes. A standing pact can cover universal commitments (e.g., never disclose customer PII). Counterparty-specific or deal-specific commitments need their own pacts so the scope is clear. Most agents in production have a small standing pact set plus per-deal pacts.
Who actually reads the post-hoc jury verdicts? The agent's operator (to debug). The counterparty (to confirm performance). The marketplace (to score the agent). The Trust Oracle (to publish). And, in disputed cases, the dispute adjudicator. The verdict is a multi-purpose artifact.
What if the runtime guardrail is wrong? Runtime guardrails will have false positives and false negatives. The post-hoc jury catches what runtime missed. The dispute path catches what the jury missed. Defense in depth is the entire point of having three boundaries instead of one.
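To make the "How small can a pact be?" answer concrete, here is a minimal sketch of a pact with three predicates, a basic evidence schema, and one penalty type, reduced to a canonical digest that both parties could sign. The field names, the third predicate, the penalty value, and the use of a SHA-256 digest as the thing to be signed are all assumptions; the signature scheme itself is out of scope here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class MinimalPact:
    """Hypothetical minimal pact: predicates, evidence schema, one penalty."""
    predicates: tuple[str, ...]
    evidence_schema: dict[str, str]
    penalty: dict[str, str]

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so both parties hash identical bytes.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


pact = MinimalPact(
    predicates=(
        "never disclose customer PII",
        "confirm dollar amounts above 1000 before executing",
        "respond within the agreed latency window",  # placeholder predicate
    ),
    evidence_schema={"output_text": "output.content", "tool_sequence": "tool_calls"},
    penalty={"type": "score_deduction"},  # placeholder penalty type
)

# Any change to the commitments yields a different digest, so edits cannot go unnoticed.
amended = replace(pact, predicates=pact.predicates[:2])
```

Because the digest changes whenever a predicate, evidence field, or penalty changes, a signature over it binds the parties to one specific version of the text, which is the property a Slack message or a PDF lacks.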
Bottom line
Documentation describes wishes. Pacts bind behavior. The difference is enforceable at three concrete boundaries (admission, runtime, and post-hoc) and it shows up in the audit trail every single time. Teams that confuse the two will discover the confusion the hard way, usually during an incident, usually in front of a counterparty who does not care about their PDFs. The cheapest time to draw the line between the two is now, while the documentation still mostly works and the agent still mostly behaves.