Behavioral Pacts as Programmable Contracts: A Working Tutorial on Parameter Binding for AI Agents
Behavioral pacts are pre-committed contracts that constrain the parameter shape of every tool call an agent invokes. This tutorial walks through the parameter-binding grammar — allow-list, deny-list, regex, value range, max amount, required — with worked examples across five domains (treasury, customer support, code execution, knowledge publishing, healthcare PHI), the Zod schema that backs the contract, and the continuous-time evaluator that enforces it.
If you have written an agent in 2026, you have written tool-call wrappers that look something like this:
async function transferFunds(params: { destination: string; amount: number; currency: string }) {
  if (!isAllowedDestination(params.destination)) throw new Error('destination not allowed');
  if (params.amount > 1000) throw new Error('amount above cap');
  if (params.currency !== 'USDC') throw new Error('only USDC supported');
  return await wallet.transfer(params);
}
The checks are correct. They are also invisible to every party except the operator running the agent. A counterparty bank, an oversight body, a procurement officer, a regulator under the EU AI Act — none of them can see that the checks exist. None of them can verify the checks were applied to a specific call. None of them can compare the checks across two organizations running the same agent. The checks live inside the agent's runtime, where they are useful for blocking misbehavior and useless for proving trust.
A behavioral pact is the same set of checks, lifted out of the agent's runtime and into a signed, pre-committed contract. The contract is recorded on the trust oracle. Every actual tool call is evaluated against the contract on ingest. The verdict is recorded. The compliance rate is a dimension of the agent's composite trust score. Counterparties read the score before transacting. Regulators read the contract for audit. The operator gets the same runtime enforcement as before, plus a cross-org-verifiable proof of behavior.
This tutorial is a working introduction to the parameter-binding grammar that backs Armalo pacts. We walk through the primitive rules, the Zod schema that defines them, and five worked examples — treasury, customer support, code execution, knowledge publishing, healthcare — each with the pact in the database shape and the wrapper in the agent runtime. Code samples are runnable against the @armalo/telemetry SDK published as @armalo/telemetry@0.1.0.
TL;DR for developers
- A pact is a row in the pacts table. A pact has conditions: PactCondition[]. A condition with type: 'param_binding' carries a parameterBinding payload describing which tool the binding applies to and what rules govern its parameters.
- A parameter binding rule targets one parameter by paramPath (dotted notation: destination, transfer.amount.value, recipient.email) and applies one or more of six rule kinds: allowList, denyList, regex, valueRange, maxAmount, required.
- The server-side evaluator (evaluateParamBindings in apps/web/lib/pact-param-binding.ts) walks every binding whose tool matches the incoming call and applies every rule in turn. Violations are aggregated and returned with the condition's severity.
- The evaluator runs on every tool_call event ingested by /api/v1/telemetry/events. It also runs synchronously on POST /api/v1/pacts/{pactId}/validate-call for pre-flight checks before the call executes.
- Pacts compose. An agent can hold multiple active pacts for the same tool; every binding from every pact applies. Severity is taken as the highest severity among matching bindings.
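The dotted paramPath notation can be read as a walk through nested objects. The following is a minimal sketch of that walk, assuming plain JSON-like params; the helper name resolveParamPath is hypothetical and is not taken from the SDK or the server code.

```typescript
// Hypothetical helper (not part of @armalo/telemetry): resolve a dotted
// paramPath such as "transfer.amount.value" against a tool call's params.
function resolveParamPath(params: Record<string, unknown>, paramPath: string): unknown {
  let current: unknown = params;
  for (const segment of paramPath.split('.')) {
    // Stop as soon as the path runs through a missing or non-object value.
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

const exampleParams = {
  destination: '0xA11A50AB9AC2C39A3F0E64F0E7C5D2C30AC8A1C0',
  transfer: { amount: { value: 250 } },
};

// A nested path that exists resolves to its leaf value.
const nestedAmount = resolveParamPath(exampleParams, 'transfer.amount.value');
// A path through a missing object resolves to undefined rather than throwing.
const missingEmail = resolveParamPath(exampleParams, 'recipient.email');
```

Returning undefined for a broken path, instead of throwing, lets a required rule report the parameter as missing in the same way it would for a top-level omission.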
The primitive rules
// From packages/validation/src/pacts.ts
import { z } from 'zod';
export const paramBindingRuleSchema = z.object({
  paramPath: z.string().trim().min(1).max(128),
  allowList: z.array(z.string().min(1).max(256)).max(256).optional(),
  denyList: z.array(z.string().min(1).max(256)).max(256).optional(),
  regex: z.string().trim().min(1).max(512).optional(),
  valueRange: z.object({ min: z.number().optional(), max: z.number().optional() }).optional(),
  maxAmount: z.object({ amount: z.number().min(0), currency: z.string().trim().min(2).max(8) }).optional(),
  required: z.boolean().default(false),
});
Six rule kinds. Most parameters will use one or two; an amount parameter typically uses valueRange plus maxAmount. The grammar is intentionally small. A small grammar composes well, audits well, and forces the contract author to think in terms of value shapes rather than imperative logic.
| Rule | Use when | Failure semantics |
|---|---|---|
| allowList | Parameter takes a value from a small known set (currencies, environments, regions) | The value must be string-equal to a member of the list |
| denyList | Parameter must not match a known-bad set (test wallets, deprecated endpoints, sandboxed code) | The value must not be string-equal to any member |
| regex | Parameter has a structural shape (UUID, email, JWT, EVM address prefix, ISO-8601) | The value (coerced to string) must match the pattern |
| valueRange | Parameter is numeric and bounded (amount, latency budget, retry count) | The number must lie within [min, max] inclusive |
| maxAmount | Parameter is a typed monetary amount (per-currency caps) | The number must not exceed amount for the named currency |
| required | Parameter must be present in the call | The value must not be undefined or null |
A binding can have multiple rules on the same paramPath. They compose conjunctively: a destination wallet must satisfy both allowList and regex if both are configured. They also stack across bindings: if two active pacts both define a binding on destination, the agent must satisfy both, and the recorded severity is the higher of the two.
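The two composition rules (every check must pass, and the recorded severity is the highest among the checks that fail) can be sketched as follows. The ordering minor < major < critical and the names SEVERITY_RANK, BindingCheck, and highestFailingSeverity are assumptions made for illustration.

```typescript
type Severity = 'minor' | 'major' | 'critical';

// Assumed ordering; the tutorial names these three levels but not their ranks.
const SEVERITY_RANK: Record<Severity, number> = { minor: 1, major: 2, critical: 3 };

interface BindingCheck {
  severity: Severity;
  passes: (value: string) => boolean;
}

// Conjunctive composition: every check is evaluated, and the highest severity
// among the failing checks is reported (null when all checks pass).
function highestFailingSeverity(checks: BindingCheck[], value: string): Severity | null {
  let highest: Severity | null = null;
  for (const check of checks) {
    if (check.passes(value)) continue;
    if (highest === null || SEVERITY_RANK[check.severity] > SEVERITY_RANK[highest]) {
      highest = check.severity;
    }
  }
  return highest;
}

// Two pacts both bind `destination`: one with an allow-list, one with a regex.
const destinationChecks: BindingCheck[] = [
  { severity: 'major', passes: (v) => ['0xA11A50AB9AC2C39A3F0E64F0E7C5D2C30AC8A1C0'].includes(v) },
  { severity: 'critical', passes: (v) => /^0x[a-fA-F0-9]{40}$/.test(v) },
];

const allPass = highestFailingSeverity(destinationChecks, '0xA11A50AB9AC2C39A3F0E64F0E7C5D2C30AC8A1C0');
const shapeOnly = highestFailingSeverity(destinationChecks, '0xB22B50DE7E2C8B7CE49A8C12F8C6C2C4B5D6E7F8');
const bothFail = highestFailingSeverity(destinationChecks, 'not-an-address');
```

Here allPass is null, shapeOnly is 'major' (only the allow-list fails), and bothFail is 'critical' (the regex failure outranks the allow-list failure).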
Example 1 — Treasury transfer agent
The canonical L4 example. An autonomous treasury agent settles invoices from a corporate USDC wallet. The agent has scope to call transfer_funds with three parameters: destination (EVM address), amount (number), currency (ISO 4217 alpha-3 or token symbol).
Pact definition.
import type { PactCondition } from '@armalo/validation';
const treasuryPactConditions: PactCondition[] = [
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'critical',
    verificationMethod: 'deterministic',
    description: 'Treasury transfer destination + amount + currency are pre-committed.',
    parameterBinding: {
      tool: 'transfer_funds',
      rules: [
        {
          paramPath: 'destination',
          allowList: [
            '0xA11A50AB9AC2C39A3F0E64F0E7C5D2C30AC8A1C0',
            '0xB22B50DE7E2C8B7CE49A8C12F8C6C2C4B5D6E7F8',
            '0xC33C50FA9F4DBC52E08D9A8C57D1C2C5C6D7E8F9',
          ],
          regex: '^0x[a-fA-F0-9]{40}$',
          required: true,
        },
        {
          paramPath: 'amount',
          valueRange: { min: 0, max: 1000 },
          required: true,
        },
        {
          paramPath: 'currency',
          allowList: ['USDC'],
          required: true,
        },
      ],
      note: 'Treasury allow-list: $1000 cap per call. Closes the OAuth -> tool-call parameter authorization gap.',
    },
  },
];
SDK-side wrapper. Wrap the tool with instrumentTool from @armalo/telemetry so every invocation streams a tool_call event. The server evaluates the binding on ingest and returns a verdict in the response body of the next batch flush.
import { Telemetry } from '@armalo/telemetry';
const tel = new Telemetry({ apiKey: process.env.ARMALO_API_KEY! });
const safeTransfer = tel.instrumentTool({
  sessionId,
  agentId: process.env.ARMALO_AGENT_ID!,
  tool: 'transfer_funds',
  pactId: process.env.ARMALO_TREASURY_PACT_ID!,
  fn: async (params: { destination: string; amount: number; currency: string }) => {
    return await wallet.transfer(params);
  },
});
await safeTransfer({
  destination: '0xA11A50AB9AC2C39A3F0E64F0E7C5D2C30AC8A1C0',
  amount: 250,
  currency: 'USDC',
});
Pre-flight validation. When the cost of a wrong call is high, validate before executing.
const verdict = await fetch(`${ARMALO}/api/v1/pacts/${pactId}/validate-call`, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'X-Pact-Key': API_KEY },
  body: JSON.stringify({
    tool: 'transfer_funds',
    params: { destination, amount, currency },
    sessionId,
    attemptedAt: new Date().toISOString(),
    agentId,
  }),
}).then((r) => r.json());
if (!verdict.valid) {
  throw new Error(`Pact violation (${verdict.severityHighest}): ${verdict.violations[0].reason}`);
}
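One design question the pre-flight call leaves open is what to do when the validation request itself fails. The following is a minimal sketch of a fail-closed guard, assuming the verdict fields used above (valid, violations, severityHighest). The names guardedCall and Verdict, and the idea of injecting the validation step as a function, are hypothetical and not SDK APIs.

```typescript
interface Verdict {
  valid: boolean;
  violations: { reason: string }[];
  severityHighest: string | null;
}

// Hypothetical guard: run `execute` only when `validate` returns a valid verdict.
// Any error raised while validating is treated as a refusal (fail closed).
async function guardedCall<T>(
  validate: () => Promise<Verdict>,
  execute: () => Promise<T>,
): Promise<T> {
  let verdict: Verdict;
  try {
    verdict = await validate();
  } catch (err) {
    throw new Error(`Pre-flight validation unavailable, call not executed: ${String(err)}`);
  }
  if (!verdict.valid) {
    const reason = verdict.violations[0]?.reason ?? 'unspecified';
    throw new Error(`Pact violation (${verdict.severityHighest}): ${reason}`);
  }
  return execute();
}

// Usage with stubbed functions; in practice `validate` would wrap the fetch call.
const executed: string[] = [];
const guardDemo = (async () => {
  await guardedCall(
    async () => ({ valid: true, violations: [], severityHighest: null }),
    async () => { executed.push('transfer-1'); return 'ok'; },
  );
  try {
    await guardedCall(
      async () => ({ valid: false, violations: [{ reason: 'value not in allow-list' }], severityHighest: 'critical' }),
      async () => { executed.push('transfer-2'); return 'ok'; },
    );
  } catch { /* blocked by the verdict */ }
  try {
    await guardedCall(
      async () => { throw new Error('network down'); },
      async () => { executed.push('transfer-3'); return 'ok'; },
    );
  } catch { /* blocked because validation could not run */ }
  return executed;
})();
```

Only the first stubbed transfer runs. Whether failing closed is the right default depends on the tool; for a treasury transfer it is usually the safer choice.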
What the binding catches. A destination not on the allow-list triggers critical because the destination paramPath fails allowList. An amount of 1850 triggers critical via valueRange, since it exceeds the per-call maximum of 1000; a cumulative cap across calls is not expressible in the current grammar. A currency of USDT triggers critical via allowList. A missing parameter triggers critical via required. The granularity is per-parameter and per-rule, so the audit trail names exactly which rule failed.
Example 2 — Customer support agent
A customer support agent has scope to call send_refund with customer_email (the refund recipient), amount (the refund value), reason (a categorized reason code), and an optional free-text memo. The contract is looser than the treasury contract, because the destination is not a small set, but the structural shape and the value range are still strict.
const supportPactConditions: PactCondition[] = [
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'major',
    verificationMethod: 'deterministic',
    description: 'Refunds are bounded in amount, target an existing customer email, and carry a categorized reason.',
    parameterBinding: {
      tool: 'send_refund',
      rules: [
        {
          paramPath: 'customer_email',
          regex: '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$',
          required: true,
        },
        {
          paramPath: 'amount',
          valueRange: { min: 0, max: 500 },
          required: true,
        },
        {
          paramPath: 'reason',
          allowList: [
            'damaged_item',
            'never_arrived',
            'wrong_item_shipped',
            'duplicate_charge',
            'customer_request_within_window',
            'goodwill_credit',
          ],
          required: true,
        },
      ],
      note: 'Support agents may refund up to $500 per call for one of six categorized reasons.',
    },
  },
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'minor',
    verificationMethod: 'deterministic',
    description: 'Refund memos must not include PII or coupon codes.',
    parameterBinding: {
      tool: 'send_refund',
      rules: [
        {
          paramPath: 'memo',
          regex: '^(?:(?!SSN|EIN|card[ -]?number|coupon|promo).)*$',
        },
      ],
    },
  },
];
Two conditions on the same tool. The first carries the major-severity functional constraints; the second carries a minor-severity content constraint on the memo field. The evaluator applies both; if the memo contains "SSN", the call records a minor violation, and the major-severity bindings are evaluated independently.
Observability. Every refund the agent issues lands as a tool_call event in the room ledger. The verdict — valid: true or valid: false with the failing rule named — is part of the event payload. The compliance rate is published as part of the agent's pact_compliance_rate score field, queryable via the trust oracle. A customer with a complaint can request the agent's behavioral record through the oracle and see precisely how often refunds passed or failed the contract.
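As a rough illustration of how a compliance rate could be derived from recorded verdicts, the sketch below divides valid calls by total calls. The exact formula behind pact_compliance_rate is not given in this tutorial, so the complianceRate function and the RecordedCall shape are assumptions.

```typescript
interface RecordedCall {
  tool: string;
  valid: boolean;
}

// Assumed definition: the fraction of recorded calls whose verdict was valid.
// Returns null when there are no calls, so "no data" is not reported as 0 or 1.
function complianceRate(calls: RecordedCall[]): number | null {
  if (calls.length === 0) return null;
  const validCount = calls.filter((c) => c.valid).length;
  return validCount / calls.length;
}

const refundHistory: RecordedCall[] = [
  { tool: 'send_refund', valid: true },
  { tool: 'send_refund', valid: true },
  { tool: 'send_refund', valid: false },
  { tool: 'send_refund', valid: true },
];

const refundRate = complianceRate(refundHistory); // 3 valid out of 4 calls
```

With three valid verdicts out of four, refundRate is 0.75. A production score would likely also weight by severity or by time window, which this sketch does not attempt.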
Example 3 — Code execution agent
An agent that runs sandboxed code (Python or Node) on behalf of a user. The tool is run_code with parameters language, code, timeout_ms, network_egress_allowed. The contract here uses four of the six rule kinds: allowList, denyList, valueRange, and required.
const codeExecPactConditions: PactCondition[] = [
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'critical',
    verificationMethod: 'deterministic',
    description: 'Code execution constrained to safe languages, no egress by default, short timeouts.',
    parameterBinding: {
      tool: 'run_code',
      rules: [
        {
          paramPath: 'language',
          allowList: ['python', 'node', 'typescript'],
          required: true,
        },
        {
          paramPath: 'code',
          denyList: ['os.system', 'subprocess.Popen', 'eval(', 'exec(', '__import__'],
        },
        {
          paramPath: 'timeout_ms',
          valueRange: { min: 100, max: 30000 },
          required: true,
        },
        {
          paramPath: 'network_egress_allowed',
          allowList: ['false'],
        },
      ],
      note: 'Sandboxed execution: no shell escape, no network, 30s ceiling. Egress requires a separate pact.',
    },
  },
];
The denyList on code is a string-equality check against the whole parameter value, so as written it matches only when code is exactly one of the listed strings; flagging a dangerous symbol anywhere inside a code body requires a negative-lookahead regex of the kind used for the memo field in Example 2. Either way, the intent is not to be a complete sanitizer; it is to record the call's risk class to the behavioral ledger. If a customer requires egress, they install a separate pact (a child pact) that overrides the network_egress_allowed binding with a wider allow-list, and the child pact's allow-list reflects the customer-specific exception. The composition rule (highest severity wins) ensures the parent pact's egress constraint is still active for every other call.
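Because denyList compares whole values, a substring-style check on a code body has to be expressed through the regex rule. The sketch below builds such a pattern from a list of forbidden tokens; buildForbiddenTokenRegex and escapeRegex are hypothetical helpers, and the result is a coarse risk flag, not a sandbox.

```typescript
// Escape regex metacharacters so each forbidden token is matched literally.
function escapeRegex(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern that matches only strings containing none of the tokens.
// [\s\S] is used instead of "." so that newlines inside code bodies are covered.
function buildForbiddenTokenRegex(tokens: string[]): string {
  const alternation = tokens.map(escapeRegex).join('|');
  return `^(?![\\s\\S]*(?:${alternation}))[\\s\\S]*$`;
}

const codePattern = buildForbiddenTokenRegex([
  'os.system',
  'subprocess.Popen',
  'eval(',
  'exec(',
  '__import__',
]);
const codeRegex = new RegExp(codePattern);

// Passes: none of the forbidden tokens appear.
const safeCodePasses = codeRegex.test('print(sum([1, 2, 3]))');
// Fails: the token appears on the second line of the code body.
const unsafeCodePasses = codeRegex.test('import os\nos.system("ls")');
// Passes: the dot in "os.system" is escaped, so it does not act as a wildcard.
const lookalikePasses = codeRegex.test('osXsystem');
```

The resulting string can be placed in a rule's regex field on the code parameter. The rule schema caps regex at 512 characters, so long token lists may need to be split across rules.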
Example 4 — Knowledge publishing agent
An agent that publishes blog posts, research papers, or product documentation on the operator's behalf. The tool is publish_post with parameters title, body, category, audience, scheduled_at. The binding here is structural rather than monetary.
const publishPactConditions: PactCondition[] = [
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'major',
    verificationMethod: 'deterministic',
    description: 'Published content carries category, audience, and a scheduling timestamp; titles avoid forbidden phrases.',
    parameterBinding: {
      tool: 'publish_post',
      rules: [
        {
          paramPath: 'title',
          regex: '^.{8,200}$',
          denyList: ['BREAKING', 'URGENT', 'GUARANTEED', 'FREE BITCOIN'],
          required: true,
        },
        {
          paramPath: 'body',
          regex: '^[\\s\\S]{200,200000}$',
          required: true,
        },
        {
          paramPath: 'category',
          allowList: ['engineering', 'research', 'product', 'security', 'leadership'],
          required: true,
        },
        {
          paramPath: 'audience',
          allowList: ['public', 'customers', 'investors', 'internal'],
          required: true,
        },
        {
          paramPath: 'scheduled_at',
          regex: '^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d+)?Z$',
          required: true,
        },
      ],
    },
  },
];
The contract here is closer to a content style guide than a security policy, and the L4 substrate is correspondingly more useful as a documentation surface than as a runtime block. The compliance rate is still recorded and is still part of the agent's composite trust score: an agent that publishes off-category posts at a high rate is recorded as having drifted from its mission, which is meaningful behavioral information even when the individual posts are not harmful.
Example 5 — Healthcare PHI agent
A regulated agent that processes patient records and emits structured fields to a downstream EHR. The tool is update_patient_record and the contract is severity critical everywhere because the parameters touch PHI.
const phiPactConditions: PactCondition[] = [
  {
    type: 'param_binding',
    operator: 'eq',
    value: true,
    severity: 'critical',
    verificationMethod: 'deterministic',
    description: 'PHI fields are pre-committed in shape; the patient identifier is an opaque org-scoped token, not a raw SSN/MRN.',
    parameterBinding: {
      tool: 'update_patient_record',
      rules: [
        {
          paramPath: 'patient_token',
          regex: '^ptn_[A-Za-z0-9]{24}$',
          required: true,
        },
        {
          paramPath: 'icd10_code',
          regex: '^[A-TV-Z][0-9][A-Z0-9](?:\\.[A-Z0-9]{1,4})?$',
          required: true,
        },
        {
          paramPath: 'systolic_bp',
          valueRange: { min: 50, max: 250 },
        },
        {
          paramPath: 'diastolic_bp',
          valueRange: { min: 30, max: 150 },
        },
        {
          paramPath: 'notes_free_text',
          regex: '^(?:(?!\\b\\d{3}-\\d{2}-\\d{4}\\b)(?!\\b[A-Z0-9]{8,12}\\b).)*$',
        },
      ],
      note: 'PHI binding: patient identifier is tokenized, ICD-10 is structurally validated, vital signs are physiologically bounded, free-text rejects SSN and MRN-shaped strings.',
    },
  },
];
Two notes on the PHI binding. First, the notes_free_text rule is a negative regex — it rejects strings containing SSN-shaped or MRN-shaped tokens. Negative lookaheads are well-supported in the regex engine the evaluator uses, and the pattern is intentionally conservative; the operator's responsibility is to set the pattern, and the L4 substrate's responsibility is to apply it consistently and record the result. Second, the patient_token regex enforces a structural opacity — the agent must never see a raw MRN or SSN; it sees an org-scoped opaque token, and any call passing a raw identifier would fail the regex. This is a contractual enforcement of a separation-of-concerns property that the L1–L3 stack does not natively express.
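To see what "intentionally conservative" means in practice, the two patterns can be exercised directly with JavaScript's RegExp, which is the engine the evaluator pseudocode uses. The sample strings below are invented for illustration.

```typescript
// Patterns copied from the PHI pact (the escaped strings become RegExp sources).
const patientTokenRegex = new RegExp('^ptn_[A-Za-z0-9]{24}$');
const notesRegex = new RegExp('^(?:(?!\\b\\d{3}-\\d{2}-\\d{4}\\b)(?!\\b[A-Z0-9]{8,12}\\b).)*$');

const phiChecks = {
  // An opaque token with the ptn_ prefix and 24 alphanumerics passes.
  opaqueToken: patientTokenRegex.test('ptn_a1B2c3D4e5F6g7H8i9J0k1L2'),
  // A raw SSN passed as the identifier fails the structural check.
  rawSsnAsToken: patientTokenRegex.test('123-45-6789'),
  // Ordinary clinical free text passes.
  plainNote: notesRegex.test('Patient reports mild headache, BP stable.'),
  // An SSN-shaped token anywhere in the note fails.
  noteWithSsn: notesRegex.test('SSN 123-45-6789 on file'),
  // An MRN-shaped token (8 to 12 uppercase letters or digits) fails.
  noteWithMrnShape: notesRegex.test('MRN AB12345678 updated'),
  // Conservative by design: an ordinary 8-letter all-caps word is also rejected.
  noteWithCapsWord: notesRegex.test('HEADACHE reported'),
};
```

The last case is a false positive the operator accepts in exchange for a simple, auditable rule. Note also that the pattern uses ".", which does not match newlines, so a multi-line note would fail this rule as written.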
How the evaluator works
The server-side evaluator lives at apps/web/lib/pact-param-binding.ts:181 and exports evaluateParamBindings(conditions, call). Pseudocode:
// Evaluate one tool call against every param_binding condition that targets its tool.
// walkParamPath, pickHighestSeverity, and the Violation type are not shown here.
function evaluateParamBindings(conditions: PactCondition[], call: { tool: string; params: Record<string, unknown> }) {
  const violations: Violation[] = [];
  for (const condition of conditions) {
    if (condition.type !== 'param_binding') continue;
    const binding = condition.parameterBinding;
    if (!binding || binding.tool !== call.tool) continue;
    for (const rule of binding.rules) {
      const value = walkParamPath(call.params, rule.paramPath);
      // A missing required parameter is a violation; no other rule can apply to it.
      if (rule.required && (value === undefined || value === null)) {
        violations.push({ rule, severity: condition.severity, reason: 'missing required parameter' });
        continue;
      }
      if (value === undefined || value === null) continue; // optional, skipped
      // allowList and denyList compare the whole value, coerced to a string.
      if (rule.allowList && !rule.allowList.includes(String(value))) {
        violations.push({ rule, severity: condition.severity, reason: 'value not in allow-list' });
      }
      if (rule.denyList && rule.denyList.includes(String(value))) {
        violations.push({ rule, severity: condition.severity, reason: 'value in deny-list' });
      }
      if (rule.regex && !new RegExp(rule.regex).test(String(value))) {
        violations.push({ rule, severity: condition.severity, reason: 'value did not match regex' });
      }
      // Non-numeric values count as out of range.
      if (rule.valueRange) {
        const n = Number(value);
        if (Number.isNaN(n) ||
            (rule.valueRange.min !== undefined && n < rule.valueRange.min) ||
            (rule.valueRange.max !== undefined && n > rule.valueRange.max)) {
          violations.push({ rule, severity: condition.severity, reason: 'value out of range' });
        }
      }
      // This pseudocode compares the amount only; the currency field is not consulted here.
      if (rule.maxAmount) {
        const n = Number(value);
        if (Number.isNaN(n) || n > rule.maxAmount.amount) {
          violations.push({ rule, severity: condition.severity, reason: 'amount exceeds cap' });
        }
      }
    }
  }
  const severityHighest = pickHighestSeverity(violations);
  return { valid: violations.length === 0, violations, severityHighest };
}
The evaluator is purely synchronous, side-effect-free, deterministic, and returns within microseconds for typical pact sizes. It runs server-side, never in the agent's runtime, so a compromise of the agent does not compromise the evaluation. The result is recorded as the validation field of the room_events.payload for the corresponding tool_call event and is also returned in the response body of POST /api/v1/telemetry/events so the SDK can surface the verdict to the caller.
Composition patterns
Layered pacts. An agent can hold multiple active pacts. The standard layering pattern is: a base pact for the agent's core mission, a tenant pact for the customer-specific constraints (allow-listed destinations, audience-specific content policies), and an operator pact for the originating org's universal policies (no shell escape, no PHI in free text). The evaluator applies all active pacts; the union of rules governs the call.
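The "union of rules" idea can be made concrete by grouping every applicable rule by parameter and remembering which layer contributed it. This is a sketch for audit-style inspection only; the PactLayer and LayerRule shapes and the rulesByParam function are hypothetical and simplified from the pact structure shown earlier.

```typescript
interface LayerRule {
  paramPath: string;
  description: string;
}

interface PactLayer {
  layer: 'base' | 'tenant' | 'operator';
  tool: string;
  rules: LayerRule[];
}

// Collect every rule that applies to `tool`, grouped by parameter, keeping
// track of the contributing layer. Nothing is overridden: rules only add up.
function rulesByParam(layers: PactLayer[], tool: string): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  for (const pact of layers) {
    if (pact.tool !== tool) continue;
    for (const rule of pact.rules) {
      const entries = grouped.get(rule.paramPath) ?? [];
      entries.push(`${pact.layer}: ${rule.description}`);
      grouped.set(rule.paramPath, entries);
    }
  }
  return grouped;
}

const activeLayers: PactLayer[] = [
  { layer: 'base', tool: 'transfer_funds', rules: [{ paramPath: 'currency', description: 'allowList USDC' }] },
  { layer: 'tenant', tool: 'transfer_funds', rules: [{ paramPath: 'destination', description: 'allowList of tenant wallets' }] },
  {
    layer: 'operator',
    tool: 'transfer_funds',
    rules: [
      { paramPath: 'destination', description: 'regex for EVM address shape' },
      { paramPath: 'amount', description: 'valueRange 0 to 1000' },
    ],
  },
  { layer: 'operator', tool: 'run_code', rules: [{ paramPath: 'code', description: 'no shell escape' }] },
];

const transferRules = rulesByParam(activeLayers, 'transfer_funds');
```

For transfer_funds the result has three parameters, and destination carries two entries, one from the tenant layer and one from the operator layer, both of which the call must satisfy.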
Versioned pacts. Pacts are versioned by (orgId, slug, version) (the unique index on the pacts table). When the contract changes, the new version is created as an active pact and the prior version is archived. The behavioral record carries the pact ID per call, so historical compliance is recoverable per version — the agent's score at any point in time can be recomputed against the version of the pact that was active at that time.
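Recomputing a score "against the version that was active at that time" implies a lookup from a timestamp to a pact version. The sketch below shows one way to do that lookup; the activeFrom and archivedAt fields and the versionActiveAt function are assumptions, since the tutorial does not show how activation windows are stored.

```typescript
interface PactVersion {
  pactId: string;
  version: number;
  activeFrom: string;  // ISO-8601 timestamp (hypothetical field)
  archivedAt?: string; // ISO-8601 timestamp (hypothetical field); absent while active
}

// Find the version whose window [activeFrom, archivedAt) contains `at`.
function versionActiveAt(versions: PactVersion[], at: string): PactVersion | undefined {
  const t = Date.parse(at);
  return versions.find((v) => {
    const from = Date.parse(v.activeFrom);
    const until = v.archivedAt === undefined ? Infinity : Date.parse(v.archivedAt);
    return from <= t && t < until;
  });
}

const treasuryVersions: PactVersion[] = [
  { pactId: 'pact-v1', version: 1, activeFrom: '2026-01-01T00:00:00Z', archivedAt: '2026-03-01T00:00:00Z' },
  { pactId: 'pact-v2', version: 2, activeFrom: '2026-03-01T00:00:00Z' },
];

const midFebruary = versionActiveAt(treasuryVersions, '2026-02-15T12:00:00Z');
const atCutover = versionActiveAt(treasuryVersions, '2026-03-01T00:00:00Z');
const beforeAny = versionActiveAt(treasuryVersions, '2025-12-31T00:00:00Z');
```

The window is half-open, so a call stamped exactly at the cutover is attributed to version 2, and a call before the first activation has no governing version. Since each recorded call already carries its pact ID, this lookup is mainly useful as a cross-check.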
Bilateral pacts. A pact can include a counterpartyAgentId, which the evaluator does not currently bind on (counterparty enforcement is a future grammar extension), but which is anchored on-chain. The bilateral pact is signed by both parties and is queryable by both. This is the substrate for agent-to-agent commerce where the counterparty needs to verify the calling agent's contract before transacting.
Templated pacts. A pact with is_template: true is published by Armalo Labs and is forkable. Templates carry industry-standard bindings for the common tool surfaces (treasury, support, code execution, PHI) and can be instantiated with org-specific parameters at install time. Templates spread the cost of pact authorship across an entire industry rather than forcing each org to re-derive the contract.
What pact bindings do not catch
The grammar is small. It does not natively express:
- Cross-call constraints (window-level aggregates, sequence dependencies). These are coming in the next grammar extension; the workaround in the meantime is to compute the aggregate in a sidecar evaluator and post the result as a response event with a derived severity.
- Semantic constraints on free text beyond regex. The PHI free-text rule above is a regex against SSN-shaped tokens, not a semantic classifier for sensitive content. A jury-type condition (with verificationMethod: 'jury') handles the semantic layer and runs asynchronously.
- Dependencies between parameters ("if currency is USD, amount cap is $1000; if currency is BTC, amount cap is 0.05"). A workaround is to author one binding per currency.
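The sidecar workaround for cross-call constraints can be sketched as a trailing-window sum compared against a cumulative cap. Everything here is hypothetical: the TransferEvent shape, the function name, and the one-hour window are illustrative, and posting the result back as a response event is not shown.

```typescript
interface TransferEvent {
  at: number;     // epoch milliseconds
  amount: number;
}

// Hypothetical sidecar check: sum the amounts inside a trailing window that
// ends at the newest event, then compare the total against a cumulative cap.
function windowTotalExceedsCap(
  events: TransferEvent[],
  windowMs: number,
  cap: number,
): { total: number; exceeded: boolean } {
  if (events.length === 0) return { total: 0, exceeded: false };
  const newest = Math.max(...events.map((e) => e.at));
  const total = events
    .filter((e) => e.at > newest - windowMs)
    .reduce((sum, e) => sum + e.amount, 0);
  return { total, exceeded: total > cap };
}

const HOUR_MS = 60 * 60 * 1000;
const recentTransfers: TransferEvent[] = [
  { at: 0, amount: 900 },             // outside the one-hour window ending at 2h
  { at: 1.5 * HOUR_MS, amount: 800 }, // inside the window
  { at: 2 * HOUR_MS, amount: 700 },   // newest event
];

const windowCheck = windowTotalExceedsCap(recentTransfers, HOUR_MS, 1000);
```

Each transfer passes a per-call valueRange of 0 to 1000 on its own, yet the last two together total 1500 within the hour, so windowCheck.exceeded is true. That gap between per-call and cumulative limits is exactly what the current grammar cannot express.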
These gaps are known and tracked. The grammar is meant to cover the dominant 80% of agent tool-call structure with primitive rules that audit cleanly. The remaining 20% is the domain of richer condition types — jury, evaluation-rubric, custom — that complement the parameter-binding primitives rather than replace them.
Live reference
Atlas, the public Armalo L4 reference agent, runs the treasury pact from Example 1 in production. The pact is at pact ID f683147e-5bfc-4b43-aa6f-e932a4262035, the agent is at agent ID 76cf31d6-ffe3-4a5c-8748-021114aa8066. The live demo at armalo.ai/l4/demo shows the pact, the live telemetry stream, and a deliberately seeded violation in session three. Every developer surface in this tutorial is testable against Atlas.
The fastest path to your own pact is:
npm i @armalo/telemetry
Then issue an API key, author your pact via the dashboard or the API (POST /api/v1/pacts), wrap your tools with instrumentTool, and watch the telemetry land in your room ledger.
Further reading
- The L4 specification (canonical paper)
- The OAuth wire-fraud field guide — five attack archetypes, each closed by parameter binding
- @armalo/telemetry on npm
- TOCTOU theorem for agent trust — why the binding must be continuous
- Parameter binding grammar coverage paper — empirical study of how much of the attack surface the current grammar reaches