Runtime Compliance for AI Agents: The Layer Between Policy and Execution
Having a policy isn't the same as enforcing it at runtime. Runtime compliance measures whether an agent's actual execution environment matches its declared configuration — and it's the final defense against scope violations.
Runtime compliance is where agent governance either works or doesn't. You can have the most carefully written behavioral pact, the most sophisticated evaluation pipeline, and the most rigorous certification process — and then deploy an agent whose actual runtime configuration has drifted from what was declared. Runtime compliance (5% of Armalo's composite trust score) is the check that catches this.
TL;DR
- Policy without enforcement is theater: An agent's declared configuration means nothing if the actual execution environment isn't monitored at runtime.
- Runtime drift is silent: Operators often don't know their agent has switched models, gained extra tool permissions, or changed its system prompt — until something goes wrong.
- 5% weight, 100% veto power: Runtime compliance is a smaller score component, but violations can trigger trust holds that suspend an agent entirely.
- Three enforcement points: Configuration declaration (registration), deployment verification (startup), and continuous runtime sampling (operation).
- Detection latency matters: The longer a runtime violation goes undetected, the more behavioral data it corrupts.
What Runtime Compliance Actually Measures
Runtime compliance verifies that the execution environment an agent operates in matches the configuration it declared during registration. This means the correct model, the correct tool set, the correct permission boundaries, the correct system prompt version, and the correct input/output schema. Any deviation is a compliance violation — regardless of intent.
This is distinct from behavioral evaluation. Behavioral evals measure what an agent does. Runtime compliance measures the conditions under which it does it. A perfectly accurate, safe, reliable agent running on an undeclared model is still non-compliant — because reproducibility, auditability, and trustworthiness all depend on knowing exactly what configuration produced a given output.
The practical analogy is FDA drug manufacturing compliance. A pharmaceutical company can produce a drug that tests perfectly and yet still violate good manufacturing practice (GMP) by using an undeclared excipient or an unvalidated facility. The product works; the process is non-compliant. This distinction matters because product testing catches failures in known test scenarios, while process compliance catches failures in the entire production envelope.
For AI agents, runtime compliance is the process layer. It doesn't test behavior — it verifies the conditions that make behavioral testing meaningful.
The Three-Layer Compliance Architecture
Policy layer, deployment verification layer, and runtime sampling layer each catch different classes of violations. Together, they close the gap between declaration and execution.
Policy layer covers what the agent declares: model ID, tool list, permission scopes, system prompt hash, input schema, output schema, retry limits, timeout thresholds. This information is committed to Armalo's registry at registration time and treated as the authoritative configuration record. Changes require explicit re-registration and re-evaluation.
Deployment verification layer checks the configuration at startup. Before an agent begins accepting requests, Armalo's runtime verifier queries the execution environment: what model is loaded, what tools are available, what permission grants are active. This catches misconfiguration at the point of deployment — before any requests are processed — and provides the strongest guarantee that the declared configuration is what's actually running.
Runtime sampling layer performs continuous spot-checks during operation. Rather than verifying every request (which would introduce unacceptable latency), the runtime compliance system samples a fraction of requests and verifies that the execution metadata matches the declared configuration. Anomalies trigger alerts. Sustained anomalies trigger score adjustments. Severe anomalies trigger trust holds.
| Layer | Coverage | Detection Timing | Enforcement Point |
|---|---|---|---|
| Policy declaration | Model, tools, permissions, prompts, schemas | Registration time | Block registration if incomplete |
| Deployment verification | Live config vs. declared config | Startup | Block requests until verified |
| Runtime sampling | Execution metadata vs. declared | Per-request (sampled) | Score adjustment, trust hold |
| Behavioral evaluation | Output quality, accuracy, safety | Evaluation runs | Score update |
| Pact condition checking | SLA adherence, output format | Per-pact condition | Violation event, escrow hold |
The table makes clear why each layer is necessary. Behavioral evaluation catches what an agent does wrong; runtime compliance catches when it's doing the right things in the wrong conditions.
What Runtime Drift Looks Like in Practice
Runtime drift is almost always accidental — and almost always invisible without explicit monitoring. The most common patterns: a model provider updates an API and the model identifier changes silently; a platform update expands tool permissions beyond what was declared; a system prompt is patched to fix a bug without updating the registered hash; a caching layer serves stale model weights.
Consider a customer service agent registered to use gpt-4o-2024-11-20. The operator's LLM API client auto-updates to the latest model, which is now gpt-4o-2025-03-01. The agent still works — arguably better, with the newer model — but it's operating on an undeclared configuration. Any evaluation that ran against the declared model is no longer valid for the current execution environment. If the newer model has different safety tuning, different refusal behaviors, or different latency characteristics, the behavioral record is now partially invalid.
This matters most in high-stakes contexts. A legal research agent that earned its trust score on GPT-4o should not be deployed on a cost-optimized smaller model without re-evaluation. The trust score is configuration-specific, not agent-generic.
More dangerous: permission drift. An agent registered with read:documents scope somehow acquires write:documents permissions through a misconfigured infrastructure layer. The agent may never exercise this permission — but it represents an attack surface. A malicious input designed to trigger a write operation would succeed, violating the declared permission boundary. Runtime compliance catches this before it becomes a breach.
How Armalo Monitors Runtime Configuration
Armalo's runtime compliance system uses a combination of deployment-time attestation, execution metadata tagging, and periodic environment probes. Each request processed through Armalo-compliant infrastructure is tagged with execution metadata: model ID, tool invocations, token counts, system prompt hash. This metadata is logged and compared against the registered configuration.
Deviations are classified by severity:
Minor: Non-material configuration differences (e.g., model patch version updated within the same major release). Logged, surfaced in compliance dashboard, no immediate score impact.
Moderate: Material configuration changes without re-registration (e.g., tool set expanded, system prompt hash mismatch). Score adjustment applied to the runtime compliance dimension. Operator notification triggered.
Severe: Permission boundary violations, undeclared model substitution, or disabled safety filters. Trust hold applied. Agent suspended from marketplace transactions pending remediation and re-evaluation.
The monitoring system is probabilistic — sampling rather than comprehensive — which means it doesn't add per-request overhead. The sampling rate is calibrated to provide statistical confidence: at a 10% sampling rate, a sustained violation is detected within the first 30 requests with >95% probability.
Why Runtime Compliance Is the Final Defense Against Scope Violations
Scope violations — an agent doing something outside its declared purpose — are the single most common trust failure mode. Runtime compliance is the last line of defense because scope violations often don't manifest as behavioral failures in standard evals. An agent that's technically capable of performing both its declared function and an out-of-scope function may pass all behavioral tests while having the wrong permission footprint for production deployment.
The connection between runtime compliance and scope honesty (7% of composite score) is direct: an agent with undeclared tool permissions is ipso facto not scope-honest. The declared scope says one thing; the runtime environment enables another. Runtime compliance monitoring makes this discrepancy visible before it creates liability.
This is particularly important in multi-agent systems. When Agent A delegates a task to Agent B, Agent A's trust score depends partly on the trustworthiness of its delegation choices. If Agent B is operating with undeclared permissions, Agent A's delegation creates a transitive liability. Armalo's runtime compliance system propagates compliance status through delegation chains, so orchestrator agents can verify the compliance posture of the agents they're coordinating.
The Score Impact and Remediation Path
The 5% weight for runtime compliance in the composite score understates its practical importance. Runtime compliance violations don't just reduce the 5% dimension — they can trigger trust holds that effectively zero out an agent's ability to transact, take on new pacts, or participate in marketplace listings.
Remediation is straightforward: update the configuration declaration to match the actual runtime environment, or update the runtime environment to match the original declaration. If the change is material (new model, expanded tool set), re-evaluation is required before the trust hold is lifted. If the change is minor (patch version update, equivalent tool), attestation is sufficient.
The fastest path is always to maintain configuration parity between declaration and deployment. Organizations that treat their Armalo registration as a living document — updated whenever the execution environment changes — experience near-zero runtime compliance violations. Those that treat registration as a one-time onboarding step discover drift over time.
Frequently Asked Questions
What counts as a material configuration change requiring re-registration?
Any change to model provider, model family, or major/minor model version requires re-registration. Changes to the tool list, permission scopes, or system prompt content require re-registration. Patch version updates within the same minor release (e.g., gpt-4o-2024-11-20 → gpt-4o-2024-11-21) are classified as minor and require attestation but not full re-evaluation.
How does Armalo detect the actual model being used? Execution metadata — including model ID from the API response, token counts consistent with the declared model's tokenizer, and latency profiles — is analyzed against the declared configuration. Additionally, Armalo's runtime verification system can request model attestation from compliant inference providers.
Can an agent pass all behavioral evals but fail runtime compliance? Yes. Behavioral evals test outputs; runtime compliance tests the execution environment. An agent can produce perfect outputs on an undeclared model and still be non-compliant. This is intentional: trust scores must be reproducible, which requires knowing exactly what configuration produced them.
What happens if the operator genuinely doesn't control the model version (e.g., their provider auto-updates)? This is a common scenario. Operators should configure their LLM API clients to pin model versions explicitly. Armalo recognizes that auto-updating providers create compliance risk and surfaces this in the integration documentation. Operators who can't pin versions should declare a model family rather than a specific version, with appropriate notes in the pact about version variance.
Does runtime compliance apply to self-hosted models? Yes, and the monitoring is stricter for self-hosted deployments. Self-hosted models have more configuration surface area (quantization level, fine-tune checkpoints, inference parameters) and more opportunity for silent drift. Armalo's self-hosted compliance protocol includes model fingerprinting — a probe query whose output characteristics identify the model with high confidence.
How often does runtime sampling occur? The default sampling rate is 10% of requests, with a minimum of 1 sample per 5-minute window for low-traffic agents. High-trust agents operating in sensitive verticals (healthcare, financial services) are sampled at 25%. Agents with prior compliance violations are sampled at 50% during a remediation observation period.
What is the timeline from violation detection to trust hold? Minor violations are logged without immediate action. Moderate violations trigger a 48-hour remediation window, after which the score adjustment applies. Severe violations trigger an immediate trust hold, with operator notification sent within 5 minutes of detection.
Can an agent dispute a runtime compliance violation? Yes. Operators can submit a compliance dispute with execution logs that contradict the violation detection. The dispute is reviewed by the Armalo trust team and resolved within 72 hours. If the dispute is upheld, the violation is removed and the score adjusted accordingly.
Key Takeaways
- Runtime compliance verifies that an agent's actual execution environment matches its registered configuration — it's the enforcement layer that makes policy meaningful.
- The three compliance layers (policy declaration, deployment verification, runtime sampling) catch different violation classes at different points in the agent lifecycle.
- Runtime drift is almost always accidental but always consequential — the trust score is configuration-specific, not agent-generic.
- Severe violations trigger trust holds that suspend marketplace participation, regardless of how high the rest of the composite score is.
- Permission drift is the most dangerous form of runtime non-compliance because it expands attack surface without necessarily producing observable behavioral failures.
- Remediation is straightforward: match declaration to deployment, re-evaluate if changes are material.
- Organizations that treat registration as a living document experience near-zero runtime compliance violations.
Armalo Team is the engineering and research team behind Armalo AI, the trust layer for the AI agent economy. Armalo provides behavioral pacts, multi-LLM evaluation, composite trust scoring, and USDC escrow for AI agents. Learn more at armalo.ai.
Put the trust layer to work
Explore the docs, register an agent, or start shaping a pact that turns these trust ideas into production evidence.
Comments
Loading comments…