Trust Decay Functions: Modeling How AI Agent Reliability Degrades Over Time Without Reinforcement
Trust should decay without fresh evidence. Exponential decay vs. step-function models, lookback windows, time-weighted trust scores, domain-specific decay rates, and production implementation of trust decay in agent scoring systems.
Credit scores require fresh credit activity to remain meaningful. A FICO score computed entirely from activity more than five years ago is a poor predictor of current creditworthiness. Insurance actuaries discount loss history from more than seven years ago when pricing policies. Background checks weigh recent history more heavily than distant history. Every mature domain that quantifies trustworthiness has developed mechanisms to discount stale evidence.
AI agent trust systems are only beginning to grapple with this problem. Most current implementations treat trust scores as relatively static assessments that change primarily when new evaluations are run or when incidents occur. They do not model the passive decay that should occur when an agent goes without fresh behavioral evidence — when no one has evaluated it recently, when it has been running without incident monitoring, or when significant time has passed since its last adversarial evaluation.
This is a mistake. An AI agent that was reliably trustworthy twelve months ago may be significantly less trustworthy today — not because anything went wrong, but because the world has changed around it. Knowledge has become stale. The input distribution has shifted. New attack techniques have been developed that weren't tested against. The model provider has made undisclosed changes to the underlying model. None of these changes require an incident to occur before trust should be reduced.
Trust decay functions model this reality mathematically, ensuring that trust scores reflect the current state of evidence rather than a static assessment that becomes increasingly stale over time.
TL;DR
- Trust should decay in the absence of fresh evidence because deployed AI agents become less reliable over time even without explicit changes
- Exponential decay functions are appropriate for domains with continuous drift risk; step-function decay is appropriate for domains with discrete staleness thresholds
- Lookback windows define what time period of evidence is considered relevant; evidence outside the lookback window receives zero weight
- Different trust dimensions decay at different rates: calibration may decay faster than accuracy; adversarial robustness decays faster than scope adherence
- Time-weighted trust scores assign greater weight to recent evidence, allowing trust to be rebuilt through fresh positive evidence
- Armalo's trust scoring applies domain-specific decay rates with explicit decay curve parameters for each trust dimension
The Case for Trust Decay: Why Agent Reliability Degrades Over Time
Before specifying decay models, it's worth establishing why trust should decay at all. If an agent performed reliably in an evaluation conducted three months ago, why should its trust score be lower today without any new negative evidence?
Reason 1: Knowledge Staleness
For knowledge-dependent agents (RAG systems, fine-tuned models with domain-specific knowledge), the world changes continuously. An agent's knowledge base may be accurate at evaluation time and grow increasingly inaccurate afterward, through no fault of the agent itself. The trust decay model should reflect this: a knowledge-dependent agent whose knowledge was last evaluated three months ago in a domain that changes monthly should have lower trust than one whose knowledge was evaluated last week.
Reason 2: Distribution Shift
AI agents are calibrated to behave reliably on distributions similar to their evaluation distribution. As the input distribution shifts — different users, different tasks, different phrasing patterns, new use cases — the agent's performance on the new distribution may differ from its evaluated performance. The longer the time since the last comprehensive evaluation, the more the current distribution may have drifted from the evaluation distribution.
Reason 3: Model Provider Changes
For agents built on third-party model providers (OpenAI, Anthropic, Google), the underlying model may change without notice. Fine-tuned models may be re-based on new base model versions. Inference API behaviors may shift. Safety filtering may be updated. Any of these changes can alter an agent's behavior in ways that weren't anticipated in the original evaluation.
The trust implications of undisclosed model changes are significant: an evaluation result for "our customer service agent based on GPT-4o" may not be valid after the GPT-4o model has been updated, even if the update appears minor. Without re-evaluation, the trust score is increasingly based on assumptions about model behavior that may no longer hold.
Reason 4: Adversarial Technique Evolution
The adversarial ML research community continuously develops new attack techniques. An agent that achieved high adversarial robustness scores in an evaluation that tested against techniques current at the time of evaluation may be vulnerable to newer techniques that weren't in the test battery. As time passes, the set of techniques not covered by the evaluation grows, and the adversarial robustness claim becomes increasingly incomplete.
Reason 5: Dependency Updates
Agent platforms are complex systems with many dependencies: model providers, tool APIs, retrieval infrastructure, monitoring systems. Security vulnerabilities in any of these dependencies can affect agent behavior. An agent that was evaluated when its dependencies were secure may be less secure after a dependency vulnerability is disclosed — even if the vulnerability hasn't been exploited yet.
Trust Decay Models
Three primary decay models are applicable to AI agent trust, each with different mathematical properties and appropriate use cases.
Model 1: Exponential Decay
Exponential decay is the most natural model for trust in systems where reliability degrades continuously and gradually. The trust score at time t is:
T(t) = T₀ × e^(-λt)
Where:
- T₀ is the trust score at the time of the last evaluation
- λ is the decay rate constant
- t is the time elapsed since the last evaluation
The half-life H (time for trust to decay to half its original value) is related to λ by: H = ln(2) / λ
Half-life recommendations by domain and trust dimension:
| Domain | Knowledge Accuracy | Adversarial Robustness | Scope Adherence | Calibration |
|---|---|---|---|---|
| Financial data | 7 days | 90 days | 365 days | 30 days |
| Regulatory compliance | 30 days | 90 days | 180 days | 45 days |
| Product documentation | 14 days | 90 days | 365 days | 60 days |
| General enterprise | 60 days | 90 days | 365 days | 90 days |
| Scientific/technical | 90 days | 180 days | 730 days | 180 days |
Implementation:
import math
from datetime import datetime, timedelta
from typing import Dict, Optional
class ExponentialTrustDecay:
"""Model trust decay using exponential decay function."""
def __init__(self, half_lives: Dict[str, float]):
"""
half_lives: dict of {dimension: half_life_days}
e.g., {'accuracy': 60.0, 'calibration': 30.0, 'adversarial_robustness': 90.0}
"""
self.half_lives = half_lives
self.decay_constants = {
dim: math.log(2) / hl
for dim, hl in half_lives.items()
}
def decayed_trust(
self,
original_trust: float,
dimension: str,
evaluation_timestamp: datetime,
current_timestamp: Optional[datetime] = None
) -> float:
"""
Apply exponential decay to a trust score.
original_trust: trust score at time of evaluation [0, 1]
dimension: which trust dimension (affects decay rate)
evaluation_timestamp: when the evaluation was conducted
current_timestamp: current time (defaults to now)
"""
if current_timestamp is None:
current_timestamp = datetime.utcnow()
elapsed_days = (current_timestamp - evaluation_timestamp).total_seconds() / 86400
lambda_d = self.decay_constants.get(dimension, 0.01) # Default slow decay
decayed = original_trust * math.exp(-lambda_d * elapsed_days)
# Floor: trust doesn't decay below a minimum (uncertainty, not zero)
minimum_trust = 0.40 # Without evidence, assume uncertain (not untrustworthy)
return max(decayed, minimum_trust)
def composite_decayed_trust(
self,
evaluation_results: Dict[str, float],
dimension_timestamps: Dict[str, datetime],
dimension_weights: Dict[str, float]
) -> float:
"""
Compute composite trust score with dimension-specific decay applied.
"""
weighted_sum = 0
total_weight = 0
for dimension, original_score in evaluation_results.items():
timestamp = dimension_timestamps.get(dimension)
if timestamp is None:
# No evaluation data for this dimension
decayed_score = 0.40 # Minimum trust for unevaluated dimension
else:
decayed_score = self.decayed_trust(original_score, dimension, timestamp)
weight = dimension_weights.get(dimension, 1.0)
weighted_sum += decayed_score * weight
total_weight += weight
return weighted_sum / total_weight if total_weight > 0 else 0.40
Model 2: Step-Function Decay
Step-function decay is appropriate for dimensions where the trust reduction is discrete rather than continuous — where there is a clear threshold after which the evaluation result is no longer considered valid.
The trust score is maintained at its evaluated value until the first threshold is crossed, then steps down to a reduced value, and steps down again at subsequent thresholds.
Step function parameters:
| Evaluation Age | Trust Multiplier | Interpretation |
|---|---|---|
| < 30 days | 1.00 | Full credit |
| 30-60 days | 0.90 | Minor discount for age |
| 60-90 days | 0.75 | Moderate discount |
| 90-180 days | 0.55 | Significant discount — evaluation approaching expiry |
| 180+ days | 0.30 | Expired evaluation — near-floor trust |
Step-function decay is most appropriate for adversarial robustness, where the key question is "was this technique tested in this evaluation?" A 45-day-old adversarial evaluation is still largely valid (few completely new techniques emerge in 45 days), but a 200-day-old evaluation may miss many techniques developed since.
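As a minimal sketch, the table above can be implemented as an ordered threshold scan. The schedule constant and helper name below are illustrative assumptions, not a fixed API:

from datetime import datetime

# (age_threshold_days, trust_multiplier) pairs from the step table above
STEP_SCHEDULE = [
    (30, 1.00),   # < 30 days: full credit
    (60, 0.90),   # 30-60 days: minor discount for age
    (90, 0.75),   # 60-90 days: moderate discount
    (180, 0.55),  # 90-180 days: approaching expiry
]
EXPIRED_MULTIPLIER = 0.30  # 180+ days: expired evaluation, near-floor trust

def step_decayed_trust(
    original_trust: float,
    evaluation_timestamp: datetime,
    current_timestamp: datetime
) -> float:
    """Apply step-function decay by scanning the age brackets in order."""
    age_days = (current_timestamp - evaluation_timestamp).total_seconds() / 86400
    for threshold_days, multiplier in STEP_SCHEDULE:
        if age_days < threshold_days:
            return original_trust * multiplier
    return original_trust * EXPIRED_MULTIPLIER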
Model 3: Evidence-Weighted Decay
Evidence-weighted decay assigns weights to individual evidence events based on their age, so that the trust score at any time reflects a weighted aggregate of all evidence with recent evidence weighted more heavily:
T(current) = Σᵢ (evidenceᵢ × e^(−λ × ageᵢ)) / Σᵢ e^(−λ × ageᵢ)
This model is the most nuanced: rather than applying a single decay function to a single evaluation score, it applies decay to each individual piece of evidence and recomputes the aggregate score. This means that trust can be rebuilt incrementally through fresh positive evidence without requiring a full formal re-evaluation.
Implementation:
class EvidenceWeightedTrust:
"""Trust score based on evidence-weighted decay of all behavioral observations."""
def __init__(self, half_life_days: float = 60):
self.lambda_d = math.log(2) / half_life_days
def compute_trust(
self,
        evidence_events: list[dict],
current_time: Optional[datetime] = None
) -> float:
"""
Compute trust score from evidence events with exponential weighting by age.
evidence_events: list of {
'trust_impact': float in [-1, 1],
'timestamp': datetime,
'evidence_quality': float in [0, 1] # weight multiplier
}
"""
if current_time is None:
current_time = datetime.utcnow()
if not evidence_events:
return 0.50 # No evidence: neutral trust
weighted_sum = 0
weight_sum = 0
for event in evidence_events:
age_days = (current_time - event['timestamp']).total_seconds() / 86400
time_weight = math.exp(-self.lambda_d * age_days)
quality_weight = event.get('evidence_quality', 1.0)
combined_weight = time_weight * quality_weight
weighted_sum += event['trust_impact'] * combined_weight
weight_sum += combined_weight
if weight_sum == 0:
return 0.50
# Normalize to [0, 1] range (trust_impact in [-1, 1] → trust in [0, 1])
raw_score = weighted_sum / weight_sum
trust = (raw_score + 1) / 2
return max(0.30, min(1.0, trust)) # Bounded to [0.30, 1.00]
def effective_evidence_half_life(self) -> timedelta:
"""The time at which evidence weight is halved."""
return timedelta(days=math.log(2) / self.lambda_d)
Lookback Windows
A lookback window defines the maximum age of evidence that is considered when computing a trust score. Evidence older than the lookback window is not included in the computation, regardless of its decay weight.
Lookback windows complement decay functions: decay functions reduce the weight of old evidence gradually; lookback windows set a hard cutoff beyond which evidence is simply not counted.
Lookback window guidance:
| Evidence Type | Lookback Window | Rationale |
|---|---|---|
| Adversarial evaluation results | 90 days | New attack techniques emerge; older evaluations have increasing coverage gaps |
| Calibration audit results | 60 days | Calibration can drift significantly in 60 days with input distribution shift |
| Accuracy probe results | 30-90 days | Domain-dependent (financial: 30d; scientific: 90d) |
| Deployment incident record | 365 days | Incidents are relatively infrequent; longer window needed for statistical validity |
| Corpus freshness measurement | 7 days | RAG corpus freshness is a near-real-time property |
Setting lookback windows requires domain expertise — windows that are too short produce trust score instability (scores fluctuate wildly based on recent events), while windows that are too long allow stale evidence to unduly influence current trust assessments.
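A minimal sketch of how the two mechanisms compose; the function name and parameters are assumptions for illustration:

import math

def evidence_weight(age_days: float, half_life_days: float, lookback_days: float) -> float:
    """Exponential decay weight inside the lookback window; hard zero outside it."""
    if age_days > lookback_days:
        return 0.0  # past the lookback window: evidence is excluded entirely
    lambda_d = math.log(2) / half_life_days
    return math.exp(-lambda_d * age_days)

# A 100-day-old adversarial result with a 90-day lookback gets weight 0.0;
# a 45-day-old result with a 90-day half-life keeps weight ~0.71.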
Domain-Specific Decay Rate Configuration
Implementing trust decay in production requires domain-specific configuration. The decay rates appropriate for a financial data agent are different from those for a general-purpose assistant.
Decay Rate Configuration Framework
DOMAIN_DECAY_CONFIGS = {
'financial_data': {
'accuracy': {'half_life_days': 7, 'lookback_days': 14, 'floor': 0.35},
'calibration': {'half_life_days': 14, 'lookback_days': 30, 'floor': 0.40},
'adversarial_robustness': {'half_life_days': 60, 'lookback_days': 90, 'floor': 0.45},
'scope_adherence': {'half_life_days': 180, 'lookback_days': 365, 'floor': 0.50},
},
'regulatory_compliance': {
'accuracy': {'half_life_days': 30, 'lookback_days': 60, 'floor': 0.40},
'calibration': {'half_life_days': 30, 'lookback_days': 60, 'floor': 0.45},
'adversarial_robustness': {'half_life_days': 60, 'lookback_days': 90, 'floor': 0.45},
'scope_adherence': {'half_life_days': 180, 'lookback_days': 365, 'floor': 0.50},
},
'general_enterprise': {
'accuracy': {'half_life_days': 60, 'lookback_days': 120, 'floor': 0.40},
'calibration': {'half_life_days': 60, 'lookback_days': 120, 'floor': 0.40},
'adversarial_robustness': {'half_life_days': 90, 'lookback_days': 180, 'floor': 0.45},
'scope_adherence': {'half_life_days': 365, 'lookback_days': 730, 'floor': 0.50},
}
}
Trust Score Transparency: Showing Decay State
Trust scores exposed to deployers should be transparent about their decay state:
{
"agent_id": "agent_abc123",
"composite_trust_score": 0.74,
"score_timestamp": "2026-05-10T12:00:00Z",
"dimension_scores": {
"accuracy": {
"current_score": 0.76,
"last_evaluated_score": 0.91,
"last_evaluated_at": "2026-02-15T10:00:00Z",
"days_since_evaluation": 84,
"decay_applied": 0.84,
"next_expiry_at": "2026-05-15T10:00:00Z"
},
"adversarial_robustness": {
"current_score": 0.81,
"last_evaluated_score": 0.93,
"last_evaluated_at": "2026-03-01T10:00:00Z",
"days_since_evaluation": 70,
"decay_applied": 0.87,
"next_expiry_at": "2026-05-30T10:00:00Z"
},
"calibration": {
"current_score": 0.71,
"last_evaluated_score": 0.88,
"last_evaluated_at": "2026-01-10T10:00:00Z",
"days_since_evaluation": 120,
"decay_applied": 0.81,
"next_expiry_at": "2026-03-11T10:00:00Z",
"status": "evaluation_overdue"
}
},
"re_evaluation_recommended": true,
"re_evaluation_urgency": "high",
"urgency_rationale": "Calibration evaluation overdue; accuracy evaluation approaching expiry"
}
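One way such a report might be assembled from stored evaluation records is sketched below. The field names mirror the JSON above; the builder function and its decay parameters are illustrative assumptions:

import math
from datetime import datetime, timedelta

def build_dimension_report(
    last_score: float,
    last_eval: datetime,
    half_life_days: float,
    lookback_days: float,
    now: datetime
) -> dict:
    """Assemble a per-dimension decay transparency record like the JSON above."""
    age_days = (now - last_eval).total_seconds() / 86400
    decay = math.exp(-math.log(2) / half_life_days * age_days)
    report = {
        "current_score": round(last_score * decay, 2),
        "last_evaluated_score": last_score,
        "last_evaluated_at": last_eval.isoformat() + "Z",
        "days_since_evaluation": int(age_days),
        "decay_applied": round(decay, 2),
        "next_expiry_at": (last_eval + timedelta(days=lookback_days)).isoformat() + "Z",
    }
    if age_days > lookback_days:
        report["status"] = "evaluation_overdue"
    return report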
Decay-Aware Alerting and Operational Response
Trust decay models aren't just useful for computing scores — they create a natural alerting framework for AI operations teams. When dimensions approach decay thresholds, automated alerts can prompt operators to schedule re-evaluation before the score falls below operational minimums.
Alert Level Configuration
A practical decay-aware alerting framework has three alert levels:
Yellow Alert (Approaching Expiry): A trust dimension's decay-applied score is projected to fall below the operator's defined operational minimum within the next 30 days. Action: schedule re-evaluation within 21 days.
Orange Alert (Evaluation Overdue): A trust dimension's evaluation has exceeded its lookback window — the evidence is no longer counted. The dimension is operating on prior positive evidence with maximum decay applied. Action: schedule immediate re-evaluation; notify deployers that the trust basis for this dimension is degraded.
Red Alert (Trust Below Operational Minimum): The composite trust score (with decay applied) has fallen below the operator's defined operational minimum. Action: trigger human review before continuing automated deployment; consider pausing agent for low-stakes operations until re-evaluation is completed.
class DecayAwareAlertSystem:
"""Generate alerts based on projected trust score decay."""
def __init__(self, operational_minimum: float = 0.70):
self.operational_minimum = operational_minimum
def evaluate_alerts(
self,
dimension_scores: dict,
decay_config: dict,
        current_time: Optional[datetime] = None
) -> list[dict]:
"""
Evaluate which trust dimensions should trigger alerts.
Returns list of alert objects with level, dimension, and recommended action.
"""
if current_time is None:
current_time = datetime.utcnow()
alerts = []
for dimension, score_data in dimension_scores.items():
config = decay_config.get(dimension, {})
half_life = config.get('half_life_days', 90)
lookback = config.get('lookback_days', 180)
floor = config.get('floor', 0.40)
last_eval = score_data.get('last_evaluated_at')
original_score = score_data.get('last_evaluated_score', 0.70)
if last_eval is None:
alerts.append({
'level': 'orange',
'dimension': dimension,
'reason': 'no_evaluation_data',
'action': 'schedule_evaluation'
})
continue
age_days = (current_time - last_eval).total_seconds() / 86400
# Check if lookback window exceeded
if age_days > lookback:
alerts.append({
'level': 'orange',
'dimension': dimension,
'reason': 'evaluation_expired',
'age_days': age_days,
'lookback_days': lookback,
'action': 'immediate_re_evaluation'
})
continue
# Project score at 30-day warning horizon
lambda_d = math.log(2) / half_life
projected_score_30d = max(
original_score * math.exp(-lambda_d * (age_days + 30)),
floor
)
if projected_score_30d < self.operational_minimum:
alerts.append({
'level': 'yellow',
'dimension': dimension,
'reason': 'approaching_minimum',
'projected_score_30d': projected_score_30d,
'action': 'schedule_re_evaluation_within_21_days'
})
return alerts
Operational Response Protocols
Different decay alert levels require different operational responses:
For Yellow Alerts: The monitoring team schedules re-evaluation during the next regular evaluation window. If no regular evaluation window exists within 21 days, one is created. No change to deployment status — the agent continues operating, with the alert noted in the operational record.
For Orange Alerts: Re-evaluation is scheduled as a priority within the next business week. The deployer is notified that one or more trust dimensions are operating on expired evidence. For high-risk deployments, human oversight is increased until re-evaluation is complete.
For Red Alerts: Automated deployment is paused pending human review. The AI operations team reviews the current behavioral evidence, assesses whether the decay-driven score reduction reflects a real reliability concern, and determines whether re-evaluation or operational changes are needed before resuming automated operation.
Trust Decay in Multi-Agent Systems
In multi-agent systems where agents delegate to each other or collaborate on tasks, trust decay has compounding effects that require special consideration.
Cascading Decay in Delegation Chains
When Agent A delegates to Agent B, the trust of the delegated output is bounded by both agents' current trust scores. If A's adversarial robustness score has decayed to 0.72 and B's accuracy score has decayed to 0.78, the trust of the final output — which depends on both agents — is bounded below the minimum of the two:
T(final output) ≤ min(T_A_adversarial, T_B_accuracy) × delegation_factor
This means that trust decay in any agent in a delegation chain degrades the trust of the entire chain's output. In a chain of five agents where each agent's accuracy score has decayed to 0.85, the effective accuracy trust of the chain's output is at most 0.85⁵ ≈ 0.44 — a dramatic compounding effect.
Operational implication: multi-agent systems should trigger re-evaluation whenever any agent in the chain reaches a Yellow Alert state, not only when the orchestrator agent reaches it. The chain's aggregate trust is sensitive to the weakest link.
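A small sketch of the compounding bound described above (the product form assumes per-agent error rates compound independently; the function name is illustrative):

def chain_trust_bound(agent_scores: list[float], delegation_factor: float = 1.0) -> float:
    """Upper bound on a delegation chain's output trust: the product of each
    agent's decay-adjusted score, scaled by the delegation factor."""
    bound = delegation_factor
    for score in agent_scores:
        bound *= score
    return bound

# Five agents each decayed to 0.85: chain bound is 0.85**5, roughly 0.44
print(chain_trust_bound([0.85] * 5))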
Decay-Aware Agent Selection
In multi-agent systems with redundant agents (multiple agents capable of performing the same task), decay-aware agent selection prefers agents with fresher evidence:
def select_agent_by_trust(
candidate_agents: list[dict],
task_dimension: str,
decay_engine: ExponentialTrustDecay
) -> str:
"""
Select the agent with the highest decay-adjusted trust score for a given task.
candidate_agents: list of {'agent_id': str, 'evaluation': dict}
task_dimension: which trust dimension is most relevant for this task
"""
best_agent_id = None
best_score = -1
for agent in candidate_agents:
eval_data = agent['evaluation']
score = eval_data.get(task_dimension, {})
if not score:
decayed = 0.40 # Default minimum for unevaluated dimension
else:
decayed = decay_engine.decayed_trust(
original_trust=score['original'],
dimension=task_dimension,
evaluation_timestamp=score['timestamp']
)
if decayed > best_score:
best_score = decayed
best_agent_id = agent['agent_id']
return best_agent_id
The Economics of Trust Decay: Incentive Design
Trust decay has economic implications for agent ecosystems. When trust scores have economic consequences — affecting marketplace visibility, access to higher-value tasks, insurance premiums — decay rates affect operator incentives.
Incentive Alignment Through Decay Parameters
Fast decay rates (short half-lives) create strong incentives for frequent re-evaluation. Agents that are re-evaluated frequently maintain high trust scores; agents that go without re-evaluation see their scores fall rapidly. In domains where frequent re-evaluation is feasible and important (financial data, regulatory compliance), fast decay rates create appropriate incentives.
Slow decay rates (long half-lives) are appropriate for dimensions that genuinely change slowly and where frequent re-evaluation would be costly without commensurate benefit. Scope adherence, for example, tends to be relatively stable — once an agent is designed with specific scope boundaries, those boundaries don't typically erode unless the system prompt or training changes. Slow decay for scope adherence avoids creating artificial re-evaluation requirements that don't improve actual governance.
The Re-Evaluation Cost Problem
Trust decay creates a re-evaluation treadmill: to maintain high trust scores, operators must invest in continuous re-evaluation. For small operators with limited resources, the cost of frequent re-evaluation may be prohibitive.
Several design approaches mitigate this problem:
Tiered decay by agent risk level: Lower-risk agents (limited scope, low-stakes decisions, heavily supervised) can have slower decay rates that require less frequent re-evaluation. This reserves the frequent re-evaluation burden for agents where it matters most.
Continuous behavioral monitoring as a decay moderator: Agents with active continuous behavioral monitoring (detecting behavioral drift in production) can have their decay rates moderated — the decay clock runs slower when there's a monitoring system actively detecting problems. This rewards monitoring investment without requiring formal re-evaluation; a sketch of this moderation follows the list.
Portable evaluation evidence: Cross-organization evaluation sharing reduces re-evaluation costs. If Agent A is evaluated by Organization X and the evaluation evidence is portable (cryptographically signed and methodology-documented), Organization Y can accept X's evaluation and use it to refresh the trust score without conducting a full independent evaluation. This requires standardized evaluation formats and evidence portability infrastructure.
Amortized evaluation costs: Evaluation platforms can offer amortized re-evaluation pricing — operators pay a monthly subscription that covers their evaluation volume, rather than paying per-evaluation. This reduces the barrier to frequent re-evaluation for budget-constrained operators.
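To illustrate the monitoring-moderated decay clock from the moderator item above, a minimal sketch; the moderation parameters are assumptions, not calibrated values:

def moderated_elapsed_days(
    actual_days: float,
    monitoring_coverage: float,  # [0, 1]: fraction of behavior under active monitoring
    max_slowdown: float = 0.5    # full coverage halves the effective elapsed time
) -> float:
    """Slow the decay clock in proportion to monitoring coverage; the moderated
    age is then fed to the decay function in place of calendar age."""
    slowdown = 1.0 - monitoring_coverage * (1.0 - max_slowdown)
    return actual_days * slowdown

# 90 calendar days with full monitoring coverage decays like 45 days;
# with no monitoring, the clock runs at full speed (90 days).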
Trust Decay Regulatory Implications
Regulatory frameworks are beginning to grapple with temporal trust degradation, though few yet explicitly require trust decay modeling.
EU AI Act Temporal Requirements
The EU AI Act's requirements for ongoing monitoring of high-risk AI systems are, in effect, a regulatory mandate for behavior consistent with trust decay models. High-risk AI deployers must:
- Monitor the system "with a view to ensuring that it continues to comply with the requirements set out in this Title" (Article 9)
- Implement post-market monitoring plans (Article 61)
- Report serious incidents and malfunctions (Article 62)
These requirements are most naturally satisfied by implementing trust decay monitoring: a monitoring system that tracks behavioral properties over time, detects degradation, and triggers re-evaluation and reporting when degradation exceeds defined thresholds.
NIST AI RMF Temporal Considerations
The NIST AI RMF's MANAGE function includes ongoing management of AI risks, which the framework explicitly notes should include reassessment as the deployment context evolves. MANAGE 1.4 states: "Responses to identified and measured AI risks are prioritized, with the resources required to implement the response strategies documented."
Trust decay models provide the formal mechanism for this: as trust scores decay over time, they quantify the growing gap between current behavioral evidence and deployment requirements, providing a prioritization signal for re-evaluation resources.
Trust Rebuilding Through Fresh Evidence
Trust decay is not irreversible. Fresh behavioral evidence — new evaluation results, successful deployments, clean adversarial probe results — can rebuild trust that has decayed. The evidence-weighted trust model handles this naturally: fresh positive evidence receives high weight and raises the score, even if older evidence has decayed significantly.
This property is important for operational fairness: an agent whose operator has been diligent about continuous monitoring and re-evaluation should be able to maintain high trust scores through consistent fresh evidence. Trust decay should not punish agents that are actively maintained — it should penalize agents that go unmonitored and unevaluated.
The asymmetry between decay and rebuilding should be configured carefully (a sketch of one configuration follows this list):
- Decay should be gradual (half-lives measured in weeks or months for most dimensions)
- Rebuilding through positive evidence should be proportional to evidence quality
- Negative evidence (incidents, failed probes) should cause immediate score reductions, not just gradual decay
- A single significant incident should not be permanently offset by large volumes of positive evidence — give incidents an appropriate persistent weight
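A minimal sketch of this asymmetry, with assumed penalty magnitudes and an assumed persistent floor for incident weight:

import math

def apply_incident(current_trust: float, severity: float) -> float:
    """Immediate, severity-scaled reduction (severity in [0, 1]); incidents
    step trust down at once rather than waiting for gradual decay."""
    return current_trust * (1.0 - 0.5 * severity)

def incident_evidence_weight(
    age_days: float,
    half_life_days: float = 120,
    persistent_floor: float = 0.25
) -> float:
    """Incident evidence decays like other evidence but never below a floor,
    so it cannot be fully offset by volumes of positive evidence."""
    decayed = math.exp(-math.log(2) / half_life_days * age_days)
    return max(decayed, persistent_floor)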
How Armalo Implements Trust Decay
Armalo's composite trust scoring system applies domain-specific exponential decay to all trust dimension scores. The decay parameters are configured per agent based on the agent's declared deployment domain and are visible to agents and deployers as part of the trust score transparency framework.
Each dimension of Armalo's 12-dimension composite score has its own decay half-life and lookback window. These parameters are part of the agent's trust profile configuration and can be customized within defined bounds by the agent operator. Tighter (faster) decay parameters signal higher evaluation diligence; looser (slower) decay parameters are appropriate for domains where reliability is more stable over time.
Armalo's trust oracle API returns both the current decayed trust score and the undecayed evaluation scores, with transparency about what decay has been applied and when re-evaluation is recommended. This enables deployers to understand whether they are seeing a genuinely current trust score or a decayed score that is approaching its expiry window.
The Armalo behavioral pact framework allows operators to commit to specific re-evaluation cadences as part of their pact. An operator who commits to monthly re-evaluation and consistently delivers on that commitment earns a "high evaluation diligence" badge that increases the effective weight of their trust scores in deployer evaluations.
Conclusion: Key Takeaways
Trust decay is a necessary component of any AI agent trust system that aspires to be accurate rather than merely optimistic. The world changes, agents drift, attack techniques evolve, and evidential validity has a shelf life.
Key takeaways:
- Trust should decay in the absence of fresh evidence — this is a modeling correctness requirement, not a punishment mechanism.
- Different dimensions decay at different rates — calibration decays faster than scope adherence; adversarial robustness decays at an intermediate rate driven by attack technique evolution.
- Exponential decay is most appropriate for continuous drift risks — step-function decay is appropriate for discrete staleness thresholds.
- Lookback windows set hard evidence expiry — evidence older than the lookback window is excluded entirely, complementing the gradual reduction of decay functions.
- Trust score transparency should expose decay state — deployers should be able to see both the raw evaluation score and the decay-applied current score, with expiry dates for each dimension.
- Trust can be rebuilt through fresh positive evidence — decay is not irreversible; consistent re-evaluation maintains trust scores at deserved levels.
- Diligent re-evaluation should be rewarded — evaluation cadence commitments in behavioral pacts signal operator investment in trust maintenance.
The trust score that reflects the current state of evidence — with appropriate decay for the time elapsed since each piece of evidence was generated — is the trust score that accurately represents what deployers need to know. Static trust scores that ignore temporal decay are optimistic fictions that become increasingly disconnected from reality as time passes.
Advanced Decay: Conditional and Contextual Decay Models
Standard exponential decay applies uniform decay across all deployment contexts for a given domain. More sophisticated implementations apply contextual decay — varying decay rates based on the specific deployment conditions:
Context-Conditional Decay
The rate at which an agent's behavioral properties degrade depends not just on the domain, but on the specific deployment context:
Query volume: An agent handling 10,000 queries per day accumulates behavioral drift signals much faster than one handling 100 queries per day. High-volume deployments should have accelerated decay for dimensions like calibration and accuracy, because there's enough signal to detect drift quickly and because the stakes of undetected drift are higher.
Input distribution novelty: If the agent's incoming queries are closely distributed around the evaluation distribution, behavioral properties are likely stable. If the input distribution is shifting — new query types, new user populations, new topics — decay should accelerate to reflect the growing uncertainty about behavior on the new distribution.
Dependency change rate: An agent whose dependencies change frequently (frequently updated tools, dynamically changing retrieval corpus, frequently updated model version) should have faster decay rates than one with stable dependencies. Dependency changes are among the primary drivers of behavioral property degradation.
Adversarial pressure: An agent deployed in a context where adversarial probing is common (public-facing agent, high-value target for attackers) should have faster adversarial robustness decay, because the practical adversarial threat model expands more quickly than for an internal enterprise agent.
Implementing Contextual Decay
class ContextualDecayModifier:
"""Adjust decay rates based on deployment context signals."""
BASE_HALF_LIVES = {
'accuracy': 60,
'calibration': 30,
'adversarial_robustness': 90,
'scope_adherence': 180
}
def compute_context_multiplier(
self,
queries_per_day: float,
distribution_novelty_score: float, # [0, 1], higher = more novel
dependency_change_rate: float, # changes per month
adversarial_pressure_score: float # [0, 1], higher = more pressure
) -> float:
"""
Compute how much to accelerate decay given deployment context.
Returns multiplier [0.5, 3.0]: values >1 accelerate decay (shorten half-life).
"""
# Volume factor: high-volume deployments decay faster
volume_factor = min(1.0 + (queries_per_day / 10000) * 0.5, 2.0)
# Distribution novelty factor: novel distributions accelerate decay
novelty_factor = 1.0 + (distribution_novelty_score * 0.8)
# Dependency change factor: frequent changes accelerate decay
dependency_factor = 1.0 + (min(dependency_change_rate, 10) / 10) * 0.5
        # Adversarial pressure factor: sustained probing expands the threat model
        adversarial_factor = 1.0 + (adversarial_pressure_score * 1.0)
        # Composite: geometric mean of all four factors to avoid extreme multipliers
        composite = (volume_factor * novelty_factor * dependency_factor
                     * adversarial_factor) ** (1/4)
        return min(max(composite, 0.5), 3.0)
def adjusted_half_life(
self,
dimension: str,
context_multiplier: float
) -> float:
"""
Compute adjusted half-life for a dimension given context.
Higher context_multiplier → shorter half-life → faster decay.
"""
base = self.BASE_HALF_LIVES.get(dimension, 90)
return base / context_multiplier
Decay Model Calibration and Validation
Like any predictive model, trust decay functions should be empirically validated against real-world behavioral data. The following framework describes how to calibrate and validate decay models in production:
Retrospective Decay Calibration
The core calibration question: does the trust score predicted by the decay model at time T correlate with actual behavioral quality measured at time T?
To answer this, you need:
- A set of agents with historical evaluation records (evaluations at multiple time points)
- Behavioral quality assessments between evaluations (from continuous monitoring data or probe battery results)
- The decay-predicted trust score at each assessment time
Compare predicted vs. observed quality degradation patterns. If the decay model is well-calibrated, the predicted trust scores should correlate strongly with observed behavioral quality. If the model over-decays (predicts more degradation than observed), adjust half-lives upward. If it under-decays (predicts less degradation than observed), adjust half-lives downward.
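A minimal sketch of the comparison; the signed-gap convention here is an assumption:

import statistics

def calibration_gap(predicted_trust: list[float], observed_quality: list[float]) -> float:
    """Mean signed gap between decay-predicted trust and observed quality.
    A positive gap means the model under-decays (shorten half-lives);
    a negative gap means it over-decays (lengthen half-lives)."""
    gaps = [p - o for p, o in zip(predicted_trust, observed_quality)]
    return statistics.mean(gaps)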
Cross-Domain Calibration Evidence
Decay parameters calibrated in one domain may not transfer to others. Evidence from several deployment domains (where data is available) consistently shows:
Financial and regulatory domains: Faster accuracy decay than the general case. A half-life of 30-45 days for knowledge accuracy is empirically consistent with observed degradation rates in regulatory compliance agents, driven by the frequency of regulatory updates.
Technical documentation domains: Intermediate decay rates. Product documentation typically changes on monthly release cycles; agent knowledge should be refreshed at least this frequently. Observed half-lives of 45-60 days for accuracy in technical documentation agents.
Scientific research domains: Slower decay for established knowledge domains; faster decay for cutting-edge research areas. A pharmacology agent's knowledge of basic biochemistry decays slowly; its knowledge of clinical trial results decays quickly.
General enterprise (broad scope): The highest variance. Without domain-specific calibration data, a conservative 60-day half-life for accuracy is a reasonable starting point, with the expectation that domain experience will drive refinement.
Decay Model Validation Metrics
To assess whether a decay model is well-calibrated, compute:
Calibration error: Mean absolute difference between predicted trust (from decay model) and observed quality (from behavioral monitoring). A well-calibrated model has calibration error below 0.10.
Coverage rate: Fraction of behavioral quality drops (>0.10 decrease from last evaluation) that were preceded by a trust decay alert. A decay model with high coverage rate is catching most real degradation events; low coverage indicates the decay rate is too slow.
False alert rate: Fraction of trust decay alerts (decay-projected score below operational minimum) that were not followed by observable quality degradation. A decay model with high false alert rate is over-aggressive; operators will learn to ignore alerts.
The target operating point: coverage rate above 0.80 (catching 80%+ of real degradation events) with false alert rate below 0.20 (fewer than 20% of alerts are false positives). This requires domain-specific calibration data to achieve.
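The metrics above might be computed from matched event records as in this sketch (the event-ID pairing scheme is an assumption):

def alert_validation_metrics(
    alerts_fired: set[str],        # IDs of decay alerts, e.g. "agent:dimension:window"
    real_degradations: set[str]    # IDs of observed quality drops (>0.10)
) -> dict:
    """Coverage rate and false alert rate from matched event IDs."""
    caught = alerts_fired & real_degradations
    coverage = len(caught) / len(real_degradations) if real_degradations else 1.0
    false_rate = 1.0 - len(caught) / len(alerts_fired) if alerts_fired else 0.0
    return {"coverage_rate": coverage, "false_alert_rate": false_rate}

# Target operating point: coverage_rate > 0.80 and false_alert_rate < 0.20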
Integration with AI Risk Management Systems
Trust decay models integrate naturally with the broader AI risk management frameworks mandated by the EU AI Act and described in NIST AI RMF. The following integration points are relevant:
Integration with NIST AI RMF MEASURE Function
NIST AI RMF's MEASURE 2.6 ("The risk or impact of the AI system is evaluated regularly") is directly served by trust decay monitoring. The decay model provides:
- A quantitative signal for when "regular" evaluation is warranted (when decay brings scores below defined thresholds)
- A prioritization mechanism for evaluation resources (higher decay priority for higher-risk dimensions)
- A continuous risk signal that bridges point-in-time evaluations
Organizations implementing NIST AI RMF can incorporate trust decay monitoring as the mechanism for fulfilling MEASURE 2.6's "regular evaluation" requirement.
Integration with ISO 42001 Continual Improvement
ISO/IEC 42001 Clause 10.1 (Continual Improvement) requires organizations to continuously improve their AI management system. Trust decay monitoring feeds directly into this requirement by:
- Providing quantitative evidence of when behavioral properties have degraded to the point of requiring improvement
- Creating a structured record of improvement cycles (evaluation → deployment → monitoring → decay alert → re-evaluation)
- Enabling measurement of improvement velocity (are re-evaluations restoring trust scores to previous levels, or is there secular degradation?)
Integration with EU AI Act Post-Market Monitoring
EU AI Act Article 61 requires providers of high-risk AI systems to establish and document a post-market monitoring plan. Trust decay monitoring is the natural technical implementation of this requirement:
- The decay model defines what behavioral properties are monitored
- The decay parameters define how monitoring frequency scales with deployment risk
- The alerting framework defines when the monitoring results trigger reporting obligations
- The re-evaluation process closes the loop required by the Act's ongoing compliance requirements
Organizations that have implemented trust decay monitoring have, in effect, implemented the technical core of their EU AI Act post-market monitoring plan.