Policy Versioning and Rollback for AI Agents: A Version Control Approach to Governance
Git-based policy repositories, semantic versioning for agent policies, immutable policy snapshots for compliance evidence, rollback with behavioral impact analysis, policy diff visualization, and temporal audit queries.
Every software engineer understands the value of version control. A complete history of changes, the ability to see exactly what changed and when, the ability to revert to any previous state, the ability to understand the context of each change through commit messages and pull request discussions — these properties are so fundamental to software development that they are simply assumed.
AI agent governance policies require the same properties. A governance investigation that asks "what policy was in effect when this incident occurred?" cannot be answered without an immutable, timestamped policy history. A regulatory audit that asks "demonstrate that your AI system has been operating under consistent governance policies for the past 12 months" requires policy version history as evidence. A security incident investigation that asks "when did this policy change and who approved it?" requires full commit history.
These are not edge cases. They are routine requirements for any AI agent deployment that operates in a regulated industry or handles consequential decisions. The organizations that have this infrastructure will satisfy these requirements trivially. The organizations that don't will scramble to reconstruct history from partial records.
This document provides the complete architecture for policy versioning and rollback for AI agent deployments: Git-based repositories, semantic versioning, immutable snapshots, rollback procedures, and the temporal query patterns that make compliance evidence generation tractable.
TL;DR
- Git is the right infrastructure for policy version control — it provides immutable history, cryptographic commit hashing, branch-based review workflows, and tooling for diff and history queries.
- Semantic versioning (MAJOR.MINOR.PATCH) applies to policies with specific meanings: MAJOR = breaking changes to agent behavior, MINOR = additive policy extensions, PATCH = bug fixes or clarifications.
- Immutable policy snapshots — point-in-time captures of the complete policy set with cryptographic attestation — provide the evidence that "policy set X was in effect from time T1 to T2."
- Rollback procedures must be tested regularly; an untested rollback procedure will fail at the moment it matters most.
- Policy diff visualization makes it possible to understand the behavioral impact of a policy change before deployment — what will change, for which agent roles, for which action types.
- Temporal audit queries — "what policy was in effect at time T for action A by agent X?" — are the compliance evidence generation primitive; design your policy store to answer these efficiently.
- Compliance evidence packages (GDPR DPIAs, SOC 2 controls, EU AI Act technical documentation) are generated from policy version history on demand.
The Case for Git-Based Policy Repositories
Software teams have evaluated many version control systems and converged on Git as the dominant solution. The convergence happened because Git's properties match the requirements of software versioning exactly. Those same properties match the requirements of policy versioning:
Immutable commit graph. Once a commit is made to a Git repository, its content is cryptographically frozen. Every commit is identified by a hash of its content and its parent history (SHA-1 by default; SHA-256 is available as an opt-in object format). Any modification to historical content changes the hash and is immediately detectable. This immutability is exactly what regulatory compliance requires: unalterable records of what the policy said when.
Complete history. Git stores the full history of every change to every file. git log -p --follow --diff-filter=M -- policies/agent-tool-access.rego shows every modification to the tool access policy with the full diff (the -p flag), the author, the timestamp, and the commit message. This is the record that answers "what changed and when?"
Signed commits. Git supports GPG-signed commits. Requiring GPG signatures from all policy approvers provides non-repudiation: the approver cannot deny having approved the commit because their signature is part of the commit record.
Pull request workflow. Git hosting platforms (GitHub, GitLab, Bitbucket) provide pull request workflows with mandatory review, approval tracking, and inline discussion. Every policy change goes through a documented review process with a record of who reviewed it and what was discussed.
Branching and merging. Policy development branches can be used to draft and test policy changes before merging to the main branch. The merge commit record shows when the policy entered production.
Tag-based release management. Git tags mark specific commits as policy releases. v2.3.0 tags the commit that represents the policy set that went into production on 2026-05-10. This enables precise temporal queries: "what was the policy at version v2.3.0?"
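To make the tamper-evidence property of the commit graph concrete, the sketch below reproduces how Git derives a blob's object ID under its default SHA-1 object format: a header of the form `blob <size>\0` is hashed together with the file bytes. The function name `git_blob_id` and the sample Rego text are illustrative assumptions, not part of any policy described here.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID Git's default SHA-1 format assigns to a blob."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical policy text, used only to show that any edit changes the ID.
original = b"package agent.tool_access\n\ndefault allow = false\n"
tampered = b"package agent.tool_access\n\ndefault allow = true\n"

print(git_blob_id(original))
print(git_blob_id(tampered))
```

Because tree and commit objects embed the IDs of the objects they reference, a change to one policy file propagates to the ID of every commit that descends from it, which is why silent edits to history are detectable.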
Policy Repository Structure
A well-organized policy repository mirrors the organizational structure of the agent deployment:
policies/
├── platform/
│ ├── base-security.rego # Platform-level security requirements
│ ├── base-privacy.rego # Platform-level privacy requirements
│ └── base-compliance.rego # Platform-level regulatory requirements
│
├── tenants/
│ ├── acme-corp/
│ │ ├── tool-access.rego # ACME's agent tool access policies
│ │ ├── data-access.rego # ACME's data access scoping
│ │ └── communication.rego # ACME's communication restrictions
│ └── globex/
│ └──...
│
├── agents/
│ ├── customer-service/
│ │ ├── scope.rego # CS agent specific restrictions
│ │ └── rate-limits.rego # CS agent rate limits
│ └── research/
│ └──...
│
├── tests/
│ ├── platform/
│ │ └── base-security_test.rego
│ └──...
│
└── snapshots/
├── 2026-05-01T00:00:00Z.json # Monthly policy set snapshots
└── 2026-04-01T00:00:00Z.json
Repository Governance
Access controls:
- Write access to platform/ directory: platform security team only
- Write access to tenants/<org>/ directories: org's designated policy admins + platform team
- Read access: all authenticated policy consumers (read-only)
Branch protection:
- Main branch is protected: no direct pushes
- Pull requests require at least one reviewer approval
- Platform policy changes require platform security team approval
- Automated tests must pass before merge
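The write-access rules above can be sketched as a small path check. This is a minimal illustration, assuming group names such as `platform-security` and `<org>-policy-admins` (both hypothetical) and paths rooted at `policies/`; a real deployment would more likely express this in the Git host's own access-control or code-owner configuration.

```python
from pathlib import PurePosixPath

PLATFORM_TEAM = "platform-security"  # hypothetical group name


def allowed_writers(path: str) -> set:
    """Groups allowed to modify `path`, following the access rules above."""
    parts = PurePosixPath(path).parts
    if parts[:2] == ("policies", "platform"):
        return {PLATFORM_TEAM}
    if parts[:2] == ("policies", "tenants") and len(parts) > 3:
        org = parts[2]
        return {f"{org}-policy-admins", PLATFORM_TEAM}
    # Assumption: paths the rules do not cover default to platform-only.
    return {PLATFORM_TEAM}


def can_write(user_groups: set, path: str) -> bool:
    """True if any of the user's groups may modify the given path."""
    return bool(user_groups & allowed_writers(path))


print(can_write({"acme-corp-policy-admins"},
                "policies/tenants/acme-corp/tool-access.rego"))
print(can_write({"acme-corp-policy-admins"},
                "policies/platform/base-security.rego"))
```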
Semantic Versioning for AI Agent Policies
Semantic versioning (SemVer) provides a shared vocabulary for communicating the impact of policy changes. Applied to AI agent policies:
MAJOR version (X.0.0): Breaking changes to agent behavior. A breaking change is one that would cause previously-allowed agent actions to be denied, or that significantly changes the agent's observable behavior. Examples:
- Adding a new mandatory approval requirement
- Restricting the agent's tool access to remove a previously permitted tool
- Changing the authentication requirements for a high-privilege operation
MINOR version (X.Y.0): Additive policy extensions. New policies that apply to new scenarios without changing behavior for existing scenarios. Examples:
- Adding policies for a new agent role that didn't previously have specific policies
- Adding additional logging requirements (more auditing, same authorization decisions)
- Expanding the agent's permitted tool access (additive)
PATCH version (X.Y.Z): Bug fixes or clarifications. Changes that fix unintended policy behavior without changing the intended behavior. Examples:
- Fixing a typo in a condition that caused incorrect evaluation
- Clarifying ambiguous scope definitions
- Updating resource names after an infrastructure rename (no behavioral change)
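As a first-pass aid for choosing the bump level, the definitions above can be turned into a heuristic over observed decision changes. The sketch assumes decisions are the strings "allow" and "deny" and that each change is supplied as an (old, new) pair; the function names are hypothetical. It only suggests a level: a typo fix that alters decisions is still a PATCH by the definitions above, so the reviewer keeps the final say.

```python
from typing import Iterable, Tuple


def classify_bump(decision_changes: Iterable[Tuple[str, str]]) -> str:
    """Map (old_decision, new_decision) pairs to a suggested SemVer level."""
    level = "PATCH"
    for old, new in decision_changes:
        if old == "allow" and new == "deny":
            return "MAJOR"   # a previously allowed action is now denied
        if old == "deny" and new == "allow":
            level = "MINOR"  # additive: access was expanded
    return level


def bump(version: str, level: str) -> str:
    """Apply a MAJOR, MINOR, or PATCH bump to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(p) for p in version.split("."))
    if level == "MAJOR":
        return f"{major + 1}.0.0"
    if level == "MINOR":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(bump("2.3.0", classify_bump([("deny", "allow")])))
```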
Version Tagging
Tag policy releases in Git:
git tag -s v2.3.1 -m "Fix: Correct scope overlap in tool-access-cs allowing unauthorized bulk export"
git push origin v2.3.1
The -s flag signs the tag with the tagger's GPG key. Signed tags provide non-repudiation for policy releases.
Changelog Maintenance
Every MAJOR and MINOR version bump requires a changelog entry:
## [2.3.0] - 2026-05-10
### Added
- Policy `tool-access-research-v1`: Research agents can now access external web APIs
through the controlled egress proxy.
### Changed
- Policy `data-access-cs-v3`: Customer service agents can now access order history
for up to 90 days (was 30 days).
### Security
- Policy `tool-access-cs-v3`: Removed `bulk_export` from customer service agent
tool access after security review identified excessive scope.
## [2.2.5] - 2026-05-01
...
Immutable Policy Snapshots
Policy version history in Git provides the record of what changed. Policy snapshots provide the record of what the complete policy state was at a specific point in time — essential for compliance evidence.
Snapshot Structure
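A policy snapshot is a JSON document along the following lines (the // annotations are explanatory and would not appear in the stored file, since JSON has no comment syntax):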
{
  "snapshot_id": "snap_20260510T000000Z",
  "snapshot_timestamp": "2026-05-10T00:00:00Z",
  "git_commit": "c63708ed8a...", // The Git commit at snapshot time
  "git_tag": "v2.3.0", // The release tag if applicable
  "policy_set_version": "2.3.0",
  "policies": [
    {
      "policy_id": "tool-access-cs-v3",
      "file_path": "policies/agents/customer-service/scope.rego",
      "content_hash": "sha256:e3b0c44298...",
      "version": "3.0.0",
      "effective_since": "2026-05-01T00:00:00Z",
      "applicable_to": {
        "agent_roles": ["customer_service"],
        "action_types": ["invoke_tool"]
      }
    }
    //... all policies in the set
  ],
  "snapshot_signature": "<ECDSA signature over snapshot content>",
  "signed_by": "platform-security-automation@company.com"
}
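One way to produce the `content_hash` values and a stable digest for signing is sketched below. It assumes SHA-256 over the raw policy file bytes and a canonical JSON serialization (sorted keys, no extra whitespace) that excludes the signature fields; the helper names are hypothetical, and the signing step itself is left out.

```python
import hashlib
import json


def content_hash(policy_text: bytes) -> str:
    """Hash policy file bytes in the "sha256:<hex>" form used by the snapshot."""
    return "sha256:" + hashlib.sha256(policy_text).hexdigest()


def canonical_bytes(snapshot: dict) -> bytes:
    """Serialize deterministically so the same snapshot always hashes the same."""
    body = {k: v for k, v in snapshot.items()
            if k not in ("snapshot_signature", "signed_by")}
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")


snapshot = {
    "snapshot_id": "snap_20260510T000000Z",
    "snapshot_timestamp": "2026-05-10T00:00:00Z",
    "policy_set_version": "2.3.0",
    "policies": [
        {
            "policy_id": "tool-access-cs-v3",
            "file_path": "policies/agents/customer-service/scope.rego",
            # Placeholder bytes; a real run would read the file at file_path.
            "content_hash": content_hash(b"package agent.tool_access\n"),
        }
    ],
}

# This digest is what a signer would sign; the signing step itself is omitted.
digest = hashlib.sha256(canonical_bytes(snapshot)).hexdigest()
print(digest)
```

Canonical serialization matters because a verifier must be able to rebuild exactly the bytes that were signed; key order or whitespace differences would otherwise invalidate an honest snapshot.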
Snapshot Signing
The snapshot is signed with an automated signing key managed by the policy management infrastructure. The signing key's certificate chain is publicly accessible — enabling external parties (auditors, regulators) to verify snapshot integrity without possessing the signing key.
Snapshot Generation Schedule
- Automated monthly snapshots: Generated on the first of each month.
- Event-triggered snapshots: Generated whenever a MAJOR or MINOR version policy change is deployed to production.
- Audit-triggered snapshots: Generated on demand for compliance audit preparation.
Immutability Guarantees
Snapshots are stored in append-only storage with no delete capability for authenticated users (including platform administrators). Deletion requires a formal exception process with multi-person authorization.
For maximum tamper evidence, snapshots should be anchored to a public transparency log (e.g., Sigstore's Rekor) at generation time. The log entry provides an independent, publicly verifiable timestamp that confirms the snapshot existed before a specific time.
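The tamper-evidence idea can be illustrated with a simple hash chain, in which each log entry commits to the one before it. This is a deliberately simplified stand-in: it is not Rekor's data structure (which is a Merkle tree) or its API, and the field and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, snapshot_digest: str, logged_at: str) -> str:
    """Hash an entry's fields together with the previous entry's hash."""
    payload = json.dumps([prev_hash, snapshot_digest, logged_at]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append(log: list, snapshot_digest: str, logged_at: str) -> None:
    """Append a snapshot digest, chaining it to the last entry in the log."""
    prev = log[-1]["entry_hash"] if log else GENESIS
    log.append({
        "prev_hash": prev,
        "snapshot_digest": snapshot_digest,
        "logged_at": logged_at,
        "entry_hash": entry_hash(prev, snapshot_digest, logged_at),
    })


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for e in log:
        expected = entry_hash(prev, e["snapshot_digest"], e["logged_at"])
        if e["prev_hash"] != prev or e["entry_hash"] != expected:
            return False
        prev = e["entry_hash"]
    return True
```

Publishing the latest `entry_hash` somewhere outside the organization's control is what turns this from internal bookkeeping into independently checkable evidence; a public transparency log plays that role.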
Rollback Procedures With Behavioral Impact Analysis
When to Rollback
Policy rollback is appropriate when:
- A newly deployed policy is blocking legitimate operations (false positives)
- A newly deployed policy is failing to block illegitimate operations (policy bug)
- A newly deployed policy has unexpected performance impact
- Automated rollback triggers fire (described in the hot-swap document)
- A security incident reveals that the current policy contributed to the incident
Pre-Rollback: Behavioral Impact Analysis
Before executing a rollback, understand the behavioral impact of reverting to the previous policy:
- What operations are currently being denied that will be allowed after rollback?
- What operations are currently being allowed that will be denied after rollback?
- Are there operations in flight that will be interrupted by the rollback?
# Evaluate the current and target policy versions
opa eval --data ./policies/v2.3.0/ --format json \
  'data.agent.tool_access' > current_decisions.json
opa eval --data ./policies/v2.2.5/ --format json \
  'data.agent.tool_access' > target_decisions.json
# Compare decisions for the same input set
# (diff_decisions stands in for your own comparison tool; it is not part of OPA)
diff_decisions current_decisions.json target_decisions.json
Rollback Execution
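# Identify the target version
git tag | grep -E "^v[0-9]" | sort -V | tail -20
# Restore policies/ as of the target version (changes are staged, not committed).
# Files added after v2.2.5 are not removed by this command; delete them explicitly.
git checkout v2.2.5 -- policies/
# Review the diff
git diff HEAD policies/
# Commit the rollback so the tag below points at the rolled-back state
git commit -S -m "Rollback policies/ from v2.3.0 to v2.2.5"
# Deploy the rollback
./scripts/deploy-policies.sh --version v2.2.5 --reason "rollback: policy v2.3.0 blocked legitimate bulk exports"
# Tag the rollback event
git tag -s "rollback-20260510-v2.2.5" -m "Rollback from v2.3.0 to v2.2.5: blocking legitimate operations"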
Post-Rollback
- Create a post-rollback incident record explaining why the rollback was needed.
- Update the policy development queue: the problematic policy change is marked as "needs revision" and returns to the review phase.
- Set an expiry on the rollback state: the previous version is in effect; a revised version should be deployed within X days.
Policy Diff Visualization
Understanding what a policy change does requires visualizing the diff — not just the text diff, but the behavioral diff: what actions will produce different outcomes after the change.
Text Diff for Policy Review
Standard Git diff provides the text changes:
--- a/policies/agents/customer-service/scope.rego
+++ b/policies/agents/customer-service/scope.rego
@@ -15,6 +15,8 @@ customer_service_tools := {
     "send_email",
     "create_ticket",
-    "export_csv"
+    "export_csv_limited",
+    "read_order_history_extended"
 }
Behavioral Diff
The behavioral diff answers: for which inputs does this change produce a different policy decision?
def compute_behavioral_diff(policy_v_old, policy_v_new, test_input_set):
    diffs = []
    for test_input in test_input_set:
        old_decision = evaluate_policy(policy_v_old, test_input)
        new_decision = evaluate_policy(policy_v_new, test_input)
        if old_decision != new_decision:
            diffs.append({
                "input": test_input,
                "old_decision": old_decision,
                "new_decision": new_decision,
                "change": "allow→deny" if old_decision == "allow" else "deny→allow"
            })
    return diffs
Behavioral diffs are part of the pull request review process. Every policy change pull request must include a behavioral diff report showing:
- How many test cases are affected
- Which actions change from allow to deny (tightening)
- Which actions change from deny to allow (relaxation)
- Whether any relaxations are intended vs. inadvertent
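The report items above can be derived mechanically from the diff records. The sketch below assumes records shaped like the output of the behavioral diff function (an "input" and a "change" label) and assumes each test input carries a `case_id` that the pull request author can list as an intended relaxation; both `case_id` and the function name are hypothetical.

```python
from collections import Counter


def summarize_behavioral_diff(diffs, total_cases, intended_relaxations=frozenset()):
    """Summarize diff records shaped like {"input": ..., "change": ...}.

    `intended_relaxations` holds hypothetical case IDs the PR author declared
    as deliberate; every other deny→allow change is flagged for review.
    """
    counts = Counter(d["change"] for d in diffs)
    unexpected = [
        d["input"] for d in diffs
        if d["change"] == "deny→allow"
        and d["input"].get("case_id") not in intended_relaxations
    ]
    return {
        "total_cases": total_cases,
        "affected": len(diffs),
        "tightened": counts["allow→deny"],
        "relaxed": counts["deny→allow"],
        "unexpected_relaxations": unexpected,
    }
```

A merge gate could then fail the pull request whenever `unexpected_relaxations` is non-empty, which forces every relaxation to be declared rather than discovered later.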
Temporal Audit Queries
Compliance investigations require answering temporal questions: "what policy was in effect when this incident occurred?" This requires the policy store to support efficient temporal queries.
Temporal Query Patterns
Point-in-time policy retrieval:
-- What policy version was active at a specific timestamp?
SELECT policy_id, version, content
FROM policy_snapshots
WHERE snapshot_timestamp <= '2026-05-10T14:30:00Z'
AND (next_snapshot_timestamp > '2026-05-10T14:30:00Z'
OR next_snapshot_timestamp IS NULL)
AND policy_id = 'tool-access-cs';
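The point-in-time pattern can be tried end to end with an in-memory SQLite database. The table layout, sample rows, and the `policy_at` helper below are assumptions made for illustration; timestamps are stored as fixed-format ISO-8601 UTC strings, which compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE policy_snapshots (
        policy_id TEXT, version TEXT, content TEXT,
        snapshot_timestamp TEXT, next_snapshot_timestamp TEXT
    )
""")
# Hypothetical rows: one superseded version and one still in effect.
conn.executemany(
    "INSERT INTO policy_snapshots VALUES (?, ?, ?, ?, ?)",
    [
        ("tool-access-cs", "2.2.5", "old rules",
         "2026-05-01T00:00:00Z", "2026-05-10T00:00:00Z"),
        ("tool-access-cs", "2.3.0", "new rules",
         "2026-05-10T00:00:00Z", None),
    ],
)


def policy_at(policy_id: str, ts: str):
    """Return (version, content) in effect at ISO-8601 UTC timestamp `ts`."""
    return conn.execute(
        """
        SELECT version, content FROM policy_snapshots
        WHERE policy_id = ?
          AND snapshot_timestamp <= ?
          AND (next_snapshot_timestamp > ? OR next_snapshot_timestamp IS NULL)
        """,
        (policy_id, ts, ts),
    ).fetchone()


print(policy_at("tool-access-cs", "2026-05-10T14:30:00Z"))
print(policy_at("tool-access-cs", "2026-05-05T09:00:00Z"))
```

Storing the end of each validity interval alongside its start keeps the lookup a simple range predicate; the half-open interval (start inclusive, end exclusive) ensures exactly one version matches at a changeover instant.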
Policy change history:
# What changes were made to this policy and when?
git log --follow --pretty=format:"%H %ai %an %s" -- policies/agents/customer-service/scope.rego
Agent evaluation history at time T:
-- What decisions were made for agent X around the time of an incident?
SELECT timestamp, agent_id, action_type, resource_id, policy_version, decision
FROM policy_audit_log
WHERE agent_id = 'agent_cs_07'
AND timestamp BETWEEN '2026-05-10T14:00:00Z' AND '2026-05-10T15:00:00Z'
ORDER BY timestamp;
Policy coverage at time T:
-- What policies were in effect for customer service agents on a specific date?
-- The subquery picks the latest qualifying snapshot; the outer query returns all of its policies.
SELECT p.policy_id, p.version, p.content_hash
FROM policy_snapshot_items psi
JOIN policies p ON psi.policy_id = p.id
WHERE psi.snapshot_id = (
  SELECT ps.snapshot_id
  FROM policy_snapshots ps
  WHERE ps.snapshot_timestamp <= '2026-05-10T00:00:00Z'
    AND ps.agent_role_scope @> ARRAY['customer_service']
  ORDER BY ps.snapshot_timestamp DESC
  LIMIT 1
);
Compliance Evidence Package Generation
Regulatory audits require evidence packages that demonstrate policy governance over a specified period. With version-controlled policies, evidence package generation is automated.
EU AI Act Technical Documentation (Article 11)
# Generate technical documentation for the specified period
./scripts/generate-compliance-package.sh \
--framework eu-ai-act \
--period "2026-01-01 to 2026-06-30" \
--output eu-ai-act-documentation-h1-2026.pdf
The generated package includes:
- Policy inventory for the period (all policies in effect)
- Policy change log (all MAJOR/MINOR changes during the period)
- Test evidence (behavioral test results for each policy version)
- Monitoring reports (behavioral metric reports for the period)
- Incident records (any policy-related incidents and resolutions)
- Approval records (who approved each policy change and when)
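As one example of how a package item can be derived from version history, the sketch below builds the policy change log from a list of tagged releases. It assumes releases are available as (version, date) pairs sorted oldest first, for instance extracted from signed tags; the function names are hypothetical and the generator script's real internals are not described in this document.

```python
from datetime import date


def parse_version(v: str):
    """Turn "v2.3.0" or "2.3.0" into the tuple (2, 3, 0)."""
    return tuple(int(p) for p in v.lstrip("v").split("."))


def changes_in_period(releases, start: date, end: date):
    """Return MAJOR/MINOR releases dated within [start, end].

    `releases` is a list of (version, release_date) pairs sorted oldest first.
    """
    selected = []
    for (prev_v, _), (cur_v, cur_d) in zip(releases, releases[1:]):
        prev, cur = parse_version(prev_v), parse_version(cur_v)
        if cur[0] > prev[0]:
            level = "MAJOR"
        elif cur[1] > prev[1]:
            level = "MINOR"
        else:
            continue  # PATCH releases are left out of the change log
        if start <= cur_d <= end:
            selected.append((cur_v, cur_d.isoformat(), level))
    return selected


releases = [("v2.2.5", date(2026, 5, 1)), ("v2.3.0", date(2026, 5, 10))]
print(changes_in_period(releases, date(2026, 1, 1), date(2026, 6, 30)))
```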
How Armalo Addresses Policy Version Verification
Armalo's trust oracle incorporates policy version information in behavioral score computation. An agent whose behavioral pact has been recently updated will show a score recalculation based on post-update evaluation results. Organizations can query the oracle with a specific timestamp to retrieve the agent's score history — providing a temporal behavioral record that aligns with their own policy version history.
The behavioral pact's version history is maintained in Armalo's system alongside the evaluation records. This provides a third-party, independently maintained record of the agent's declared behavioral commitments over time — complementing the organization's own policy version records.
Conclusion: Version Control Is the Foundation
Policy versioning should not be a capability that organizations only wish they had after an incident. It is the foundation that makes AI agent governance tractable as policies grow in number and complexity, as incidents occur and require investigation, and as regulatory requirements demand documentary evidence.
The Git-based approach described here leverages existing developer tooling, existing organizational processes, and existing hosting infrastructure. The incremental investment is primarily in process: establishing the review workflows, the snapshot generation schedule, and the temporal query capabilities that turn version control into a compliance evidence system.
Organizations that invest in this foundation early will find that every subsequent governance task (incident investigation, regulatory audit, policy impact analysis) is answered by querying the version history. That is the compounding return on infrastructure investment: one good foundation makes every future governance task easier than it would otherwise be.