Conflict Resolution in AI Agent Policy Graphs: When Rules Collide
Explicit and implicit policy conflicts in complex agent systems: conflict detection algorithms, resolution strategies (deny-wins, allow-wins, most-specific-wins, priority-ordering), and policy simulation environments for testing conflict-free rulesets.
A single policy governing a single agent is a manageable governance artifact. Ten policies governing ten agents is still tractable. Hundreds of policies governing dozens of agent roles across multiple tenants, with platform-level base policies, tenant-level overrides, and agent-level specializations — this is a policy graph, and policy graphs develop conflicts.
Policy conflicts in AI agent systems fall into two categories that require different detection and resolution approaches. Explicit conflicts are direct contradictions: policy A says allow this action, policy B says deny this action. Implicit conflicts are emergent: no single policy pair is contradictory, but their interaction produces behavior that neither author intended.
Both types of conflict are security issues. An explicit conflict that resolves to "allow" because the conflict resolution strategy is allow-wins means that the more restrictive policy was bypassed by the less restrictive one — potentially by design, possibly by mistake, but either way the security expectation of the more restrictive policy was violated. An implicit conflict that blocks a legitimate operation creates availability problems and drives frustrated users toward ad hoc workarounds that bypass security policy entirely.
This document provides the complete technical reference for conflict detection algorithms, resolution strategies, simulation environments, and the organizational processes required to keep policy graphs conflict-free as they scale.
TL;DR
- Policy conflicts in AI agent systems manifest as either explicit (direct allow/deny contradiction) or implicit (interacting policies produce unexpected emergent behavior).
- Four conflict resolution strategies: deny-wins (more conservative, appropriate for security-critical policies), allow-wins (less conservative, appropriate for availability-critical policies), most-specific-wins (scope-based resolution), and priority-order (explicit precedence hierarchy).
- No single strategy is correct for all contexts — hybrid strategies that apply different resolution approaches to different policy categories are the production-grade approach.
- Conflict detection algorithms range from simple pair-wise comparison (O(n²)) to formal methods using SMT solvers (complete but computationally expensive for large policy sets).
- Policy simulation environments allow testing complex multi-policy interactions before deployment — they run the full policy graph against a representative set of agent actions and report conflicts and unexpected decisions.
- Policy conflict graphs — directed graphs of policy interactions — provide visualization and analysis tools for understanding conflict structure in large policy sets.
- Armalo's policy verification layer detects conflicts in behavioral pact declarations before they reach enforcement, preventing conflicting commitments from being published.
Explicit Conflicts: Direct Rule Contradictions
Classification
An explicit conflict exists when two policies in the same scope evaluate to contradictory decisions for the same input:
Policy A: permit (agent: customer_service, action: send_email, resource: customer)
Policy B: forbid (agent: customer_service, action: send_email)
Policy A allows customer service agents to send email. Policy B forbids customer service agents from sending email. For an input matching both conditions, the policies produce contradictory decisions.
Detection Algorithm
Explicit conflict detection requires checking each pair of policies for possible input overlap:
def detect_explicit_conflicts(policy_set):
    conflicts = []
    for i, policy_a in enumerate(policy_set):
        for policy_b in policy_set[i+1:]:
            if policies_have_overlapping_scope(policy_a, policy_b):
                if policy_a.decision != policy_b.decision:
                    conflicts.append(PolicyConflict(
                        policy_a=policy_a,
                        policy_b=policy_b,
                        conflict_type="explicit",
                        overlap=compute_overlap(policy_a, policy_b)
                    ))
    return conflicts
The key function is policies_have_overlapping_scope: determining whether there exists any valid input that would trigger both policies. For simple attribute-based policies, this is a set intersection operation. For policies with complex conditions, it may require SAT/SMT solving.
Complexity: Pair-wise comparison is O(n²) in the number of policies. For policy sets of hundreds of policies, this is tractable. For policy sets of thousands, optimization is needed: cluster policies by scope attributes before pair-wise comparison to reduce the comparison space.
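For flat attribute-based policies, the overlap check reduces to per-attribute set intersection. A minimal sketch, assuming scopes are dicts mapping attribute names to sets of allowed values (with "*" for unconstrained); the scope shape is illustrative, not from any specific engine:

```python
# Sketch: scope overlap as per-attribute set intersection.
# A scope maps attribute name -> set of allowed values; "*" means unconstrained.
ANY = "*"

def policies_have_overlapping_scope(scope_a, scope_b):
    """True if some input could satisfy both scopes simultaneously."""
    for attr in set(scope_a) | set(scope_b):
        values_a = scope_a.get(attr, ANY)
        values_b = scope_b.get(attr, ANY)
        if values_a == ANY or values_b == ANY:
            continue  # unconstrained on this attribute: always overlaps here
        if not (set(values_a) & set(values_b)):
            return False  # disjoint value sets: no input can match both
    return True

# The explicit-conflict example from above:
policy_a = {"agent": {"customer_service"}, "action": {"send_email"}, "resource": {"customer"}}
policy_b = {"agent": {"customer_service"}, "action": {"send_email"}}
print(policies_have_overlapping_scope(policy_a, policy_b))  # True
```

Policies with arbitrary boolean conditions do not reduce to this form; that is where the SAT/SMT path below comes in.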
Implicit Conflicts: Emergent Behavior
Classification
Implicit conflicts are harder to detect because no single policy pair is contradictory. Instead, a combination of policies produces behavior that none of them individually would produce.
Example: approval deadlock
Policy A: require_approval (action: bulk_export) when approval_from: "data_steward"
Policy B: forbid (action: data_steward_approval) when data_steward.availability == false
Policy C: forbid (action: bulk_export) when pending_approval.age > 24 hours
No pair of these policies is explicitly contradictory. But their interaction creates a deadlock: bulk export requires approval from the data steward (A). If the data steward is unavailable, their approval is blocked (B). After 24 hours without approval, the export is forbidden anyway (C). Result: when the data steward is unavailable, bulk exports are permanently blocked until the data steward returns.
The legitimate operation (bulk export with eventual approval) is blocked by a combination of well-intentioned policies that no individual policy author foresaw.
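The deadlock above can be reproduced with a toy evaluator: a minimal sketch in which policies A, B, and C are hand-encoded as booleans (illustrative only, not a real policy engine):

```python
# Toy reproduction of the approval deadlock: can bulk_export ever complete
# while the data steward is unavailable? (Hand-encoded policies A, B, C.)
def can_complete_bulk_export(steward_available, hours_waited):
    # Policy B: steward approval is blocked while the steward is unavailable.
    approval_granted = steward_available
    # Policy C: after 24 hours without approval, the export is forbidden outright.
    if not approval_granted and hours_waited > 24:
        return False
    # Policy A: the export completes only once approval has been granted.
    return approval_granted

# Steward available: the export completes.
print(can_complete_bulk_export(True, 1))   # True
# Steward unavailable: no amount of waiting ever unblocks the export.
print(any(can_complete_bulk_export(False, h) for h in range(100)))  # False
```

The second check is the essence of reachability analysis: enumerate the reachable states and confirm that no path completes the operation.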
Example: circular permission
Policy D: permit (agent: orchestrator, action: invoke_tool) when tool.requires_elevated_privilege == false
Policy E: forbid (agent: orchestrator, action: elevate_privilege) unless action.requestor.has_elevated_privilege
Policy F: permit (action: elevate_privilege) when requestor.already_elevated == true
The interaction: the orchestrator cannot use privileged tools, because D only permits tools that do not require elevated privilege. To elevate, it must already be elevated: E forbids privilege elevation unless the requestor already holds elevated privilege, and F only permits elevation when the requestor is already elevated. The orchestrator can never escalate from standard to elevated privilege through the declared policy path — even though each policy individually seems reasonable.
Detection Approaches
Simulation-based detection: Run the policy graph against a large, representative set of agent action inputs. For each action in the set, record the decision. Compare expected decisions (defined in test cases) against actual decisions. Discrepancies indicate implicit conflicts. This catches the most common implicit conflicts but is limited by the completeness of the test set.
Reachability analysis: For each legitimate operation defined in the agent's pact, determine whether the operation is reachable (can be completed) under the current policy set. An operation that cannot be completed is either a policy design intent (the operation should be blocked) or an implicit conflict (the operation should be possible but isn't).
Formal methods: Use an SMT solver (Z3, CVC5) to formally check policy properties:
- Is there any input for which the policy graph produces a conflict?
- For every action in the intended behavior set, is there a policy evaluation path that allows it?
- Are there any deadlock states (a sequence of actions required to achieve a goal, where completing each action is conditionally dependent on the others)?
Formal methods provide complete analysis but are computationally expensive for large policy sets. Practical approach: use formal methods for the core policy set (the policies that govern security-critical operations) and simulation for the full policy set.
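Where an SMT solver is unavailable, or the attribute domains are small and finite, exhaustive enumeration answers the first question (is there any input for which the graph produces a conflict?) with the same completeness. A minimal brute-force sketch; the domains and policies below are hypothetical:

```python
from itertools import product

# Brute-force stand-in for the SMT query "is there any input that triggers
# both an ALLOW and a DENY?", over a small finite input domain.
DOMAIN = {
    "agent_role": ["customer_service", "research"],
    "action": ["send_email", "bulk_export"],
}

def matches(scope, inp):
    return all(inp[k] == v for k, v in scope.items())

policies = [
    {"scope": {"agent_role": "customer_service"}, "decision": "ALLOW"},
    {"scope": {"action": "bulk_export"}, "decision": "DENY"},
]

def find_conflicting_input(policies):
    keys = list(DOMAIN)
    for values in product(*(DOMAIN[k] for k in keys)):
        inp = dict(zip(keys, values))
        decisions = {p["decision"] for p in policies if matches(p["scope"], inp)}
        if {"ALLOW", "DENY"} <= decisions:
            return inp  # witness input producing contradictory decisions
    return None

print(find_conflicting_input(policies))
# {'agent_role': 'customer_service', 'action': 'bulk_export'}
```

An SMT solver performs the same search symbolically, which is what makes it tractable when the input space is too large to enumerate.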
Resolution Strategy 1: Deny-Wins
The Strategy
When policies conflict, the deny decision wins. Any policy in the evaluation set that produces "deny" overrides any number of policies that produce "allow."
Formal Definition
result = ALLOW if all applicable policies = ALLOW
result = DENY if any applicable policy = DENY
Security Properties
Deny-wins provides the strongest security guarantees. It is impossible to achieve permission through policy bypass — the only path to "allow" is for every applicable policy to evaluate to "allow." An attacker who has compromised one policy or one policy namespace cannot achieve permission if any other policy denies.
Operational Implications
Deny-wins has a significant operational downside: a single misconfigured, overly-broad deny policy blocks all operations that fall within its scope. An inadvertently broad deny policy can cause significant operational disruption that is difficult to diagnose.
In practice: deny-wins is appropriate for security-critical policy categories (data access, credential usage, execution of destructive operations) where the cost of an unauthorized allow is higher than the cost of an unauthorized deny.
Resolution Strategy 2: Allow-Wins
The Strategy
When policies conflict, the allow decision wins. Any policy that produces "allow" overrides any number of policies that produce "deny."
Formal Definition
result = DENY if all applicable policies = DENY
result = ALLOW if any applicable policy = ALLOW
Security Properties
Allow-wins provides the weakest security guarantees. An attacker who can add even one policy (or who can cause an existing policy to evaluate to "allow") achieves permission regardless of all other policies. This strategy is appropriate only for operational contexts where availability is higher priority than restriction.
Appropriate Use Cases
Allow-wins is appropriate for:
- Fallback policies for availability: "if a more specific policy exists, defer to it; otherwise allow"
- Exception registries: "if an agent has an active exception, allow despite the general policy"
- Additive permission models: "agents accumulate permissions from multiple policies; any policy that grants permission enables the action"
Allow-wins should never be used as the primary resolution strategy for security-critical policies.
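Both formal definitions (deny-wins earlier, allow-wins here) reduce to one-line combinators over the set of applicable decisions. A sketch, assuming at least one applicable policy and plain-string decisions:

```python
ALLOW, DENY = "ALLOW", "DENY"

def deny_wins(decisions):
    # Any DENY overrides any number of ALLOWs.
    return DENY if DENY in decisions else ALLOW

def allow_wins(decisions):
    # Any ALLOW overrides any number of DENYs.
    return ALLOW if ALLOW in decisions else DENY

print(deny_wins([ALLOW, ALLOW, DENY]))   # DENY
print(allow_wins([ALLOW, ALLOW, DENY]))  # ALLOW
```

The asymmetry between the two strategies is exactly the security asymmetry described above: under deny-wins an attacker must flip every applicable policy, under allow-wins only one.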
Resolution Strategy 3: Most-Specific-Wins
The Strategy
When policies conflict, the most specific policy (the one with the narrowest scope) wins. A policy that applies to "customer service agents processing order_id=12345" is more specific than a policy that applies to "all customer service agents."
Specificity Scoring
Specificity is determined by the number and precision of scope constraints:
def specificity_score(policy):
    score = 0
    if policy.agent_role != "*": score += 10
    if policy.agent_id != "*": score += 50       # Specific agent > role
    if policy.action_type != "*": score += 20
    if policy.resource_type != "*": score += 15
    if policy.resource_id != "*": score += 40    # Specific resource > type
    if policy.has_conditions(): score += policy.condition_complexity
    return score
The policy with the highest specificity score wins when scores differ. When specificity scores are equal, fall back to another resolution strategy.
Properties
Most-specific-wins produces intuitive behavior: targeted policies override general policies. This matches the mental model that policy authors typically have — "this general policy applies unless we've specifically said otherwise."
Potential problem: two policies with identical specificity scores but conflicting decisions still require a tiebreaker strategy. Most implementations use deny-wins as the tiebreaker.
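Combining a specificity score like the one above with the deny-wins tiebreaker yields a resolver along these lines (a sketch; policies are reduced to hypothetical (score, decision) pairs for brevity):

```python
def most_specific_wins(policies):
    """policies: list of (specificity_score, decision) pairs."""
    top = max(score for score, _ in policies)
    winners = [decision for score, decision in policies if score == top]
    if len(winners) == 1:
        return winners[0]  # unique most-specific policy determines the outcome
    return "DENY" if "DENY" in winners else "ALLOW"  # tiebreaker: deny-wins

# A targeted deny (score 50) beats a general allow (score 10).
print(most_specific_wins([(10, "ALLOW"), (50, "DENY")]))  # DENY
# Equal specificity, conflicting decisions: the deny-wins tiebreaker applies.
print(most_specific_wins([(30, "ALLOW"), (30, "DENY")]))  # DENY
```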
Resolution Strategy 4: Priority-Order
The Strategy
Policies are assigned explicit priority values. When policies conflict, the highest-priority policy wins.
Priority Assignment
Priority values are assigned during policy creation:
- Platform-level security policies: priority 1 (highest)
- Tenant-level compliance policies: priority 10
- Tenant-level operational policies: priority 50
- Agent-level behavioral policies: priority 100
- Emergency exception policies: priority 200 (lowest)
Lower numbers = higher priority (wins in conflict).
Properties
Priority-order is the most explicit resolution strategy. Every policy author knows exactly what will happen when their policy conflicts with another. The disadvantage is administrative overhead: assigning and managing priority values for hundreds of policies requires careful bookkeeping.
Priority-order enables sophisticated conflict resolution: an emergency exception policy (low priority) can be overridden by any platform security policy (high priority), even when the exception was explicitly created to override normal restrictions.
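Under the lower-number-wins convention above, resolution is a minimum over the priorities of the applicable policies. A sketch, with deny-wins as the tiebreaker for equal priorities (the (priority, decision) pair encoding is illustrative):

```python
def priority_order(policies):
    """policies: list of (priority, decision); lower priority number wins."""
    top = min(priority for priority, _ in policies)
    winners = [decision for priority, decision in policies if priority == top]
    return "DENY" if "DENY" in winners else winners[0]  # deny-wins on ties

# Platform security deny (priority 1) beats an emergency exception allow (200).
print(priority_order([(200, "ALLOW"), (1, "DENY")]))  # DENY
```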
Hybrid Resolution Strategies
Production policy graphs rarely use a single resolution strategy. The practical approach is a hybrid that applies different strategies to different policy categories:
def resolve_conflict(policy_a, policy_b, input):
    # Security-critical policies always use deny-wins
    if policy_a.category == "security" or policy_b.category == "security":
        return DENY if (policy_a.evaluate(input) == DENY or
                        policy_b.evaluate(input) == DENY) else ALLOW

    # Exception policies override operational policies
    if policy_a.category == "exception" or policy_b.category == "exception":
        exception_policy = policy_a if policy_a.category == "exception" else policy_b
        return exception_policy.evaluate(input)

    # Operational policies use most-specific-wins
    if policy_a.specificity_score != policy_b.specificity_score:
        winner = policy_a if policy_a.specificity_score > policy_b.specificity_score else policy_b
        return winner.evaluate(input)

    # Tiebreaker: deny-wins
    return DENY if (policy_a.evaluate(input) == DENY or
                    policy_b.evaluate(input) == DENY) else ALLOW
Policy Conflict Graphs
For policy sets with many policies and many conflicts, conflict graphs provide visualization and analysis tools.
Graph Construction
Build a directed graph where:
- Nodes represent policies
- Edges represent conflicts (directed from the policy that would "win" to the policy that would "lose" under the current resolution strategy)
- Edge labels describe the conflict type and the overlapping scope
def build_conflict_graph(policy_set, resolution_strategy):
    G = DirectedGraph()
    for policy in policy_set:
        G.add_node(policy.id, policy=policy)
    conflicts = detect_explicit_conflicts(policy_set)
    for conflict in conflicts:
        winner = resolution_strategy.resolve(conflict)
        loser = conflict.other_policy(winner)
        G.add_edge(
            winner.id, loser.id,
            conflict_type=conflict.type,
            scope_overlap=conflict.overlap
        )
    return G
Analysis Uses
Identifying over-constrained nodes: A policy with many incoming edges (many policies override it) may be effectively inert — it rarely determines the outcome because higher-priority or more-specific policies always win.
Identifying under-constrained nodes: A policy with many outgoing edges (it overrides many other policies) may be unexpectedly broad — it's preventing other policies from having their intended effect.
Cycle detection: Circular override relationships indicate a policy design problem. Policy A overrides Policy B which overrides Policy C which overrides Policy A — the resolution outcome depends on the order of evaluation, which is implementation-dependent.
Impact analysis: Before removing or modifying a policy, run graph analysis to identify which other policies it currently overrides. Removing the policy may cause previously-overridden policies to become active, potentially with unexpected outcomes.
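The cycle check can be sketched with a standard three-color depth-first search over the override edges; a plain adjacency dict stands in here for the DirectedGraph used in the construction code:

```python
def find_override_cycle(edges):
    """edges: dict mapping policy id -> list of policy ids it overrides.
    Returns one cycle as a list of ids (first id repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in edges}
    stack = []

    def dfs(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in edges.get(node, []):
            if color.get(nxt, WHITE) == GRAY:  # back edge: cycle found
                return stack[stack.index(nxt):] + [nxt]
            if color.get(nxt, WHITE) == WHITE:
                found = dfs(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(edges):
        if color[node] == WHITE:
            found = dfs(node)
            if found:
                return found
    return None

# A overrides B, B overrides C, C overrides A: a circular override relationship.
print(find_override_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))  # ['A', 'B', 'C', 'A']
```

Any cycle returned here means the resolution outcome depends on evaluation order and the participating policies need an explicit precedence fix.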
Policy Simulation Environments
A policy simulation environment runs the full policy graph against a representative set of agent action inputs and reports conflicts, unexpected decisions, and coverage gaps.
Simulation Input Generation
Coverage-based generation: Generate inputs that cover all policy scopes — at least one input that triggers each policy, one that falls outside all policies (testing default behavior), and one that falls in the scope overlap of each detected conflict.
Behavioral test case execution: Run the simulation against the full behavioral test suite that defines expected agent behavior. Every test case that produces a different outcome than expected indicates either a policy conflict or a policy gap.
Adversarial input generation: Run the simulation against adversarial inputs — inputs designed to exploit potential policy bypass paths. If any adversarial input achieves an unexpected "allow" decision, the simulation has identified a vulnerability.
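Coverage-based generation can be sketched as one synthesized input per policy scope plus one default-behavior probe that matches no policy; the scope shape and DEFAULTS values below are assumptions for illustration:

```python
# Sketch of coverage-based input generation: one concrete input per policy
# scope, plus a probe for default (no-policy) behavior. Scope shape
# (attribute -> required value) is illustrative.
DEFAULTS = {"agent_role": "none", "action": "noop", "resource": "none"}

def generate_coverage_inputs(policies):
    inputs = []
    for policy in policies:
        # Start from defaults, then pin every attribute the scope constrains.
        inp = dict(DEFAULTS)
        inp.update(policy["scope"])
        inputs.append(inp)
    inputs.append(dict(DEFAULTS))  # default-behavior probe: matches no scope
    return inputs

policies = [
    {"scope": {"agent_role": "customer_service", "action": "send_email"}},
    {"scope": {"action": "bulk_export"}},
]
for inp in generate_coverage_inputs(policies):
    print(inp)
```

A fuller generator would also synthesize one input per detected conflict overlap, as described above; that step needs the conflict list from detection as an additional input.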
Simulation Reporting
Policy Simulation Report
========================
Run date: 2026-05-10T14:00:00Z
Policy set version: v2.3.1
Input set: 10,432 test cases

CONFLICTS DETECTED: 3

[EXPLICIT] Policy tool-access-cs-v3 ALLOW conflicts with Policy security-bulk-deny-v1 DENY
  Overlap: {agent_role: customer_service, action: bulk_export}
  Resolution: DENY (deny-wins)
  Impact: 0 test cases affected (no legitimate bulk_export operations for CS)

[IMPLICIT] Policies {approval-v2, availability-check-v1, timeout-v1} create deadlock
  Condition: data_steward.availability == false
  Result: bulk_export permanently blocked
  Impact: 12 test cases affected
  Recommended fix: add time-limited fallback approver or alternative approval path

COVERAGE GAPS: 2
  No policy covers: {agent_role: research, action: web_fetch, resource: *.internal.company.com}
  No policy covers: {agent_role: orchestrator, action: invoke_agent, resource: deprecated_agents}

EXPECTED DECISIONS VERIFIED: 10,415/10,432 (99.8%)
UNEXPECTED DECISIONS: 17
  [15 cases] Unexpected denial for customer_service invoking export_csv_tool
    Root cause: Policy tool-access-cs-v3 was recently updated to remove export_csv_tool
    Likely intent: export_csv_tool should remain accessible; update may be a mistake
How Armalo Addresses Policy Conflict Prevention
Armalo's behavioral pact system provides conflict detection at the commitment layer before conflicts reach enforcement. When an agent publishes a behavioral pact, Armalo's verification layer checks the pact's commitments against:
- The platform's mandatory base policies
- The organization's existing agent pacts (for cross-pact conflicts)
- The declared scope boundaries of the agent's role
An agent pact that makes commitments that conflict with the platform's policies or with the organization's existing agent pacts is rejected before publication. This prevents conflicts from reaching the enforcement layer.
The Trust Oracle's scope-honesty dimension (7% of the composite trust score) measures whether agents stay within their declared pact commitments. An agent that claims one scope but operates in another is exhibiting a behavioral conflict between its declared commitments and its actual behavior — which is the behavioral equivalent of an explicit policy conflict.
Conclusion: Conflict Management Is a Practice
Policy conflict management is not a problem that is solved once and forgotten. Policy sets evolve continuously — new policies are added, existing policies are updated, regulatory requirements change, agent capabilities expand. Each change creates opportunities for new conflicts.
The four resolution strategies — deny-wins, allow-wins, most-specific-wins, priority-order — provide the building blocks. The hybrid strategies provide the production-grade architecture. The conflict graphs and simulation environments provide the tools for analysis and testing. The organizational processes — conflict detection in CI/CD, regular policy graph audits, post-incident conflict reviews — provide the ongoing maintenance discipline.
Organizations that build this infrastructure will find that their policy graphs remain manageable as they scale. Organizations that do not will find that policy conflicts emerge as invisible governance gaps — invisible until an incident makes them visible.