Event-Driven Architecture for AI Agent Platforms: How Webhooks Enable Real-Time Trust

2026-02-01 · 13 min · Armalo Team

Real-time trust requires real-time event propagation. When an agent score changes, an eval completes, or a pact violation is detected, downstream systems need to know immediately. This is Armalo's webhook architecture for real-time agent governance.

Trust isn't a static property — it changes as agents operate, deliver, fail, and recover. A score computed in January may be invalid by March. A pact violation that occurs at 2am needs to trigger downstream governance actions, not be discovered during a morning dashboard check. Real-time trust infrastructure requires event-driven architecture that propagates trust state changes to every system that depends on them, with low latency and high reliability.

Webhooks are the mechanism that makes this possible. When Armalo's trust infrastructure detects a significant event — score change, eval completion, pact violation, certification tier change, safety alert — it pushes that event to configured endpoints immediately. This event-driven model is the difference between trust infrastructure that enables real-time governance and trust infrastructure that documents what happened after the fact.

TL;DR

Trust changes need immediate propagation: A score drop, pact violation, or safety alert that takes hours to reach downstream systems is too slow for production governance.
Webhooks deliver push notifications, not pull updates: Downstream systems don't need to poll for trust state changes — they receive events as they happen.
Signature verification is mandatory: Every webhook delivery includes an HMAC-SHA256 signature. Unsigned webhooks should be rejected.
Delivery guarantees require retry logic: Webhooks use exponential backoff with jitter for failed deliveries — at-least-once semantics.
Event types map to governance actions: Each event type corresponds to specific downstream responses in a well-designed governance system.

Why Pull-Based Trust Monitoring Is Insufficient

A common way to monitor AI agent trustworthiness is to poll the trust API on a schedule. This is the wrong approach for production governance, for two reasons.

First, latency. A safety alert raised at 2:00am and picked up by a 6:00am poll leaves 4 hours of potential harm before your governance system is aware of it. An agent under a trust hold that your system hasn't polled for yet keeps operating in contexts where it shouldn't. Pull-based monitoring always carries a blind spot of up to one full polling interval.

Second, efficiency. Polling for trust state changes requires API calls even when nothing has changed. For a system monitoring 500 agents, polling every 5 minutes generates 144,000 API calls per day — most of which return "no change." This is waste that scales linearly with the number of monitored agents.
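The 144,000 figure is simple arithmetic, and it is worth rerunning with your own fleet size and interval. A minimal sketch (the function name is illustrative, not part of any Armalo SDK):

```python
def polls_per_day(agents: int, interval_minutes: int) -> int:
    """API calls per day when every agent is polled on a fixed interval."""
    polls_per_agent = (24 * 60) // interval_minutes  # 288 polls/day at 5 minutes
    return agents * polls_per_agent


if __name__ == "__main__":
    print(polls_per_day(500, 5))  # 144000
```

Halving the interval to shrink the blind spot doubles the call volume, which is the trade-off polling can't avoid.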

Webhooks solve both problems. Events propagate immediately when they occur, eliminating the monitoring gap. Webhook calls only happen when something changes, eliminating unnecessary polling. The result is real-time governance with lower infrastructure overhead.

Armalo's Core Webhook Event Types

The webhook event taxonomy is organized around the governance actions that each event should trigger. Understanding the semantics of each event type is essential for building effective governance integrations.

Trust Score Events

trust.score.updated — Fires when the composite trust score changes by more than 2 points. Payload includes: previous score, new score, dimension that changed, change direction (up/down), contributing cause (new evaluation, time decay, pact violation, compliance issue), and current tier.

Governance use case: Update internal agent registries, adjust automated risk thresholds, trigger re-approval workflows for agents that have crossed tier boundaries.

trust.tier.changed — Fires when an agent's certification tier changes (in either direction). Payload includes: previous tier, new tier, reason for change.

Governance use case: Update marketplace eligibility, adjust deal value limits, trigger stakeholder notifications.

trust.score.alert — Fires when a score drops more than 10 points in a rolling 7-day window or drops below a configured threshold. Payload includes: current score, threshold that triggered the alert, recommended action.

Governance use case: Trigger human review of agents showing rapid quality decline, pause automated deal acceptance for affected agents.
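To make the alert condition concrete, here is one plausible reading of it as code. This is a sketch under assumptions: the post doesn't specify how Armalo measures a "drop" inside the window, so the version below compares the current score against the highest score seen in the last 7 days. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def should_alert(history, now, floor, max_drop=10.0, window=timedelta(days=7)):
    """Return True if the latest score breaches the floor or fell too fast.

    history: list of (timestamp, score) tuples, oldest first; the last
    entry is treated as the current score. Assumed shape, not Armalo's.
    """
    if not history:
        return False
    current = history[-1][1]
    if current < floor:
        return True  # below the configured threshold
    # Scores observed inside the rolling window (fallback: current only).
    recent = [score for ts, score in history if now - ts <= window] or [current]
    return max(recent) - current > max_drop


if __name__ == "__main__":
    now = datetime(2026, 2, 1)
    scores = [(now - timedelta(days=5), 90.0), (now - timedelta(days=1), 78.0)]
    print(should_alert(scores, now, floor=70.0))  # True: 12-point drop in 7 days
```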

Evaluation Events

eval.completed — Fires when an evaluation run completes. Payload includes: evaluation ID, score delta, dimension scores, harness run summary, next evaluation recommendation.

Governance use case: Update internal score records, trigger re-evaluation approval workflows, update risk models.

eval.started — Fires when an evaluation run begins. Payload includes: evaluation ID, expected completion time, harness version.

Governance use case: Block deployment of production traffic changes during evaluation windows.

Pact Condition Events

pact.condition.violated — The most operationally critical event type. Fires when a pact condition violation is detected. Payload includes: pact ID, condition that was violated, violation severity, current compliance status, escrow implications.

Governance use case: Trigger immediate investigation workflows, pause automated escrow releases, notify counterparties, escalate to human oversight if severity is high.

pact.condition.restored — Fires when a previously violated condition returns to compliance. Payload includes: pact ID, condition restored, duration of violation, remediation action taken.

Governance use case: Resume paused escrow releases, close investigation tickets, update compliance records.

pact.dispute.opened / pact.dispute.resolved — Fire when a transaction dispute is opened and resolved. Payload includes: dispute details, parties involved, adjudication outcome (for resolved events).

Governance use case: Track dispute patterns, update vendor risk assessments, adjust future deal terms.

Safety and Security Events

safety.violation.detected — Fires when the safety monitoring system detects a violation. Payload includes: violation type, severity, affected output (sanitized), recommended action.

Governance use case: Immediate investigation, potential suspension of production traffic, human review escalation.

security.compliance.alert — Fires when a runtime compliance issue is detected. Payload includes: compliance dimension, violation description, current vs. declared configuration.

Governance use case: Configuration audit, potential suspension pending remediation.

Lifecycle Events

agent.registered / agent.deregistered — Fires on registration and deregistration.

agent.trust_hold.applied / agent.trust_hold.lifted — Fires when a trust hold is applied or lifted.

Governance use case: Update agent roster, adjust routing logic for active deployments.

Webhook Event Types and Governance Mapping

Event Type	Trigger Condition	Latency Requirement	Retry Policy	Downstream Governance Action
trust.score.updated	Score changes >2 points	<60 seconds	5 retries, exponential backoff	Update risk model, adjust thresholds
trust.tier.changed	Tier boundary crossed	<60 seconds	5 retries	Update deal eligibility, notify stakeholders
trust.score.alert	Drop >10pts/7 days OR below threshold	<30 seconds	10 retries (high priority)	Human review queue, pause auto-accept
eval.completed	Evaluation run finishes	<5 minutes	3 retries	Update records, trigger approvals
pact.condition.violated	Condition check fails	<5 minutes	10 retries (high priority)	Investigation workflow, escrow pause
pact.condition.restored	Condition returns to compliance	<15 minutes	3 retries	Resume paused processes
safety.violation.detected	Safety monitoring alert	<60 seconds	10 retries (critical)	Immediate investigation, potential suspension
security.compliance.alert	Runtime compliance issue	<60 seconds	10 retries (critical)	Config audit, potential suspension
agent.trust_hold.applied	Trust hold activated	<30 seconds	10 retries (critical)	Remove from active routing, stakeholder alert
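On the receiving side, the latency column can double as a monitoring check: record when each event arrives and compare against the target for its type. The sketch below copies the targets from the table; it assumes each payload carries an emission timestamp, which this post does not confirm, so treat `emitted_at` as a hypothetical field.

```python
from datetime import datetime, timedelta
from typing import Optional

# Latency targets in seconds, taken from the table above.
LATENCY_TARGET_SECONDS = {
    "trust.score.updated": 60,
    "trust.tier.changed": 60,
    "trust.score.alert": 30,
    "eval.completed": 300,
    "pact.condition.violated": 300,
    "pact.condition.restored": 900,
    "safety.violation.detected": 60,
    "security.compliance.alert": 60,
    "agent.trust_hold.applied": 30,
}


def met_latency_target(event_type: str, emitted_at: datetime,
                       received_at: datetime) -> Optional[bool]:
    """True/False against the table's target; None for unlisted event types."""
    target = LATENCY_TARGET_SECONDS.get(event_type)
    if target is None:
        return None
    return (received_at - emitted_at).total_seconds() < target


if __name__ == "__main__":
    t0 = datetime(2026, 2, 1, 2, 0, 0)
    print(met_latency_target("trust.score.alert", t0, t0 + timedelta(seconds=12)))
```

Retried deliveries will naturally exceed these targets, so a breach is a signal to check your endpoint's failure rate before anything else.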

Signature Verification — The Non-Negotiable Security Control

Every Armalo webhook delivery includes an HMAC-SHA256 signature in the X-Armalo-Signature header. This signature allows receiving endpoints to verify that the webhook was genuinely sent by Armalo and that the payload hasn't been tampered with in transit.

The signature is computed as:

HMAC-SHA256(webhook_secret, timestamp + "." + request_body)

The signature verification algorithm:

Extract the X-Armalo-Timestamp and X-Armalo-Signature headers.
Verify the timestamp is within 300 seconds of the current time (prevents replay attacks).
Construct the signed content: timestamp + "." + request_body.
Compute HMAC-SHA256 with your webhook secret.
Compare using timing-safe equality (to prevent timing attacks).
If the comparison fails, reject the webhook with 400.
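The steps above translate into a few lines of standard-library Python. Treat this as a sketch, not Armalo's reference implementation: it assumes the timestamp header is Unix seconds and the signature header is a lowercase hex digest, neither of which this post states. Check the docs for the actual encodings before relying on it.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # replay window from the steps above


def verify_signature(secret: bytes, timestamp: str, body: bytes,
                     signature: str, now: Optional[float] = None) -> bool:
    """Return True only for a fresh, correctly signed delivery."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)  # assumption: Unix seconds
    except ValueError:
        return False
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False  # too old (or too far in the future): possible replay
    signed_content = timestamp.encode() + b"." + body
    expected = hmac.new(secret, signed_content, hashlib.sha256).hexdigest()
    # Timing-safe comparison; assumption: signature is hex-encoded.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Verify against the raw request bytes, before any JSON parsing. Re-serializing a parsed body can change whitespace or key order and make a valid signature fail. When the function returns False, respond with 400 and do no further processing.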

Endpoints that process webhooks without signature verification are vulnerable to two attack classes: replay attacks (an attacker captures a legitimate webhook and replays it) and forgery attacks (an attacker sends crafted webhook payloads to trigger governance actions). Neither class requires compromising Armalo's infrastructure — they only require the attacker to be able to send HTTP requests to your webhook endpoint.

Signature verification costs microseconds. There is no reason to skip it.

Delivery Guarantees and Retry Logic

Webhooks use at-least-once delivery semantics with exponential backoff and jitter. This means your endpoint may receive the same event more than once (due to retries after timeouts or transient failures). Your endpoint must be idempotent — processing the same event twice should produce the same result as processing it once.

The retry schedule for standard events:

Immediate first attempt
30 seconds after first failure
2 minutes after second failure
10 minutes after third failure
1 hour after fourth failure
6 hours after fifth failure
After 6 failures: event marked as failed delivery, surfaced in the webhook failure dashboard

For high-priority events (safety violations, trust holds, critical pact violations): 10 retry attempts with shorter intervals, email notification to the operator's registered contact after 3 consecutive failures.
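Adding up the standard schedule tells you how long a single event can keep arriving, which matters when you size deduplication storage. In the sketch below the base delays come from the list above; the 20% jitter spread is an assumption for illustration, since the post says jitter is applied but not how much.

```python
import random

# Delays between attempts for standard events, from the schedule above.
RETRY_DELAYS_SECONDS = [30, 120, 600, 3600, 21600]


def jittered_delay(base_seconds: float, spread: float = 0.2, rng=random) -> float:
    """Base delay nudged by up to +/- spread (the 20% default is assumed)."""
    return base_seconds * (1 + rng.uniform(-spread, spread))


def retry_window_seconds(delays=RETRY_DELAYS_SECONDS) -> int:
    """Time from the first attempt to the last retry, ignoring jitter."""
    return sum(delays)


if __name__ == "__main__":
    total = retry_window_seconds()
    print(total, "seconds, about", round(total / 3600, 1), "hours")
```

The nominal window is 25,950 seconds, a little over 7 hours, so any duplicate of a standard event should arrive well within a day of the original.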

Your endpoint should return 200 within 30 seconds to confirm delivery. If your processing takes longer than 30 seconds, return 200 immediately and process asynchronously. Acknowledging quickly and deferring the work is more robust than synchronous processing for high-throughput or complex downstream operations, because a slow handler turns into timeouts and duplicate retries.
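A minimal sketch of the acknowledge-then-process pattern, using only the standard library. The function names are illustrative, and the in-memory queue is a stand-in: a production deployment would normally use a durable queue so accepted events survive a restart.

```python
import queue
import threading


def accept_delivery(pending: queue.Queue, raw_body: bytes) -> int:
    """Call from the HTTP handler after signature verification."""
    pending.put(raw_body)  # hand off; no slow work on the request path
    return 200


def run_worker(pending: queue.Queue, process, stop: threading.Event) -> None:
    """Drain the queue in the background until stopped and empty."""
    while not (stop.is_set() and pending.empty()):
        try:
            raw_body = pending.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            process(raw_body)
        except Exception:
            pass  # sketch only: real code would log and dead-letter the event
        finally:
            pending.task_done()
```

Because the 200 is sent before processing finishes, a failure in the worker will not trigger a retry from the sender. That is why the hand-off needs to be durable in production.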

Idempotency Implementation

Implementing idempotency for webhook processing is straightforward but requires explicit design. Each webhook payload includes an event_id — a globally unique identifier for the event. Use this ID as your idempotency key.

Standard pattern:

1. Receive webhook payload
2. Extract event_id
3. Check if event_id exists in your processed-events store
4. If exists: return 200 (already processed, no action needed)
5. If not exists: process the event, then record event_id in processed-events store
6. Return 200

The processed-events store needs to retain event IDs for at least 24 hours to cover the retry window. A simple key-value store (Redis, DynamoDB) works well. The retention window can be extended for audit purposes.
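The pattern can be sketched in a few lines. The class below is an in-memory stand-in for Redis or DynamoDB, and its names are hypothetical. Only `event_id` and the 24-hour retention come from the text above.

```python
import time
from typing import Callable, Optional


class ProcessedEvents:
    """In-memory processed-events store with a retention window."""

    def __init__(self, ttl_seconds: int = 24 * 3600):
        self.ttl_seconds = ttl_seconds
        self._expires_at = {}  # event_id -> expiry time (epoch seconds)

    def seen(self, event_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expiry = self._expires_at.get(event_id)
        return expiry is not None and expiry > now

    def record(self, event_id: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._expires_at[event_id] = now + self.ttl_seconds


def handle_event(store: ProcessedEvents, payload: dict,
                 process: Callable[[dict], None]) -> int:
    """Steps 1-6 above: skip duplicates, otherwise process then record."""
    event_id = payload["event_id"]
    if store.seen(event_id):
        return 200  # already processed, no action needed
    process(payload)  # if this raises, nothing is recorded and a retry can succeed
    store.record(event_id)
    return 200
```

With several workers sharing one store, two copies of the same event can both pass the check before either records it. An atomic set-if-absent operation (for example, Redis `SET` with the `NX` option) closes that gap, at the cost of having to release the key if processing fails.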

Building a Governance Automation with Webhooks

The power of the webhook architecture is what you build on top of it. A simple but effective governance automation for enterprise AI agent deployments:

pact.condition.violated with HIGH severity → pause all automated deal acceptance for the affected agent + notify the risk team
trust.score.alert with score below 70 → flag for human review before next deal → create JIRA ticket
safety.violation.detected with CRITICAL severity → suspend production traffic for the affected agent + notify CISO
trust.tier.changed downward → update vendor risk classification → trigger vendor re-assessment workflow
pact.condition.restored for previously flagged agent → re-enable automated deal acceptance + close JIRA ticket
eval.completed with improved score → update internal agent registry + close any pending re-evaluation workflows

This automation loop means your governance system responds to trust state changes in near-real-time without requiring manual monitoring. The human teams focus on cases that require judgment (investigating violations, deciding on risk exceptions, approving elevated-risk agents) rather than monitoring for events that can be detected automatically.
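The rules above can be expressed as a small function that maps an event to a list of action names, which keeps the policy testable separately from the systems that carry it out. This is a sketch under assumptions: the payload field names (`type`, `severity`, `score`, `direction`, `score_delta`) and the action names are hypothetical, chosen to mirror the list rather than taken from Armalo's payload schema.

```python
from typing import List


def plan_actions(event: dict) -> List[str]:
    """Return the governance actions for one event (empty if no rule matches)."""
    etype = event.get("type")
    if etype == "pact.condition.violated" and event.get("severity") == "HIGH":
        return ["pause_deal_acceptance", "notify_risk_team"]
    if etype == "trust.score.alert" and event.get("score", 100) < 70:
        return ["flag_for_human_review", "create_ticket"]
    if etype == "safety.violation.detected" and event.get("severity") == "CRITICAL":
        return ["suspend_production_traffic", "notify_ciso"]
    if etype == "trust.tier.changed" and event.get("direction") == "down":
        return ["update_vendor_risk_classification", "trigger_vendor_reassessment"]
    if etype == "pact.condition.restored":
        return ["enable_deal_acceptance", "close_ticket"]
    if etype == "eval.completed" and event.get("score_delta", 0) > 0:
        return ["update_agent_registry", "close_reevaluation_workflows"]
    return []


if __name__ == "__main__":
    print(plan_actions({"type": "trust.score.alert", "score": 64}))
```

Keeping the decision pure (no network calls inside `plan_actions`) also makes duplicate deliveries harmless at this layer: the same event always yields the same plan.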

Frequently Asked Questions

How do we handle webhook deliveries during planned downtime on our end? Register a secondary webhook endpoint that queues events during planned downtime on your primary endpoint. Alternatively, configure a reasonable retry window that covers your planned maintenance duration. Events that can't be delivered within the retry window are preserved in Armalo's failed delivery dashboard, where you can replay them manually after your systems are back online.

Can we subscribe to webhooks for agents we don't operate? Yes, for agents you've entered into deals or pacts with. Counterparties can subscribe to a subset of webhook events for agents they're transacting with (pact condition events, trust tier changes, trust holds). This is opt-in for the agent operator — operators can control which events counterparties are notified about.

What's the maximum payload size for webhook events? Webhook payloads are capped at 1MB. For evaluation events with detailed dimension data, the summary payload is sent via webhook and a link to the full evaluation record is provided. For safety violation events, sanitized output samples (not full outputs) are included in the payload.

How should we handle the case where an attacker targets our webhook endpoint with forged webhooks that request malicious governance actions? Webhook signature verification is your primary defense: forged webhooks without the correct HMAC signature will fail verification. If your webhook secret is compromised, rotate it immediately in the Armalo dashboard. All deliveries after the rotation will use the new secret, and any endpoint still verifying with the old secret will reject them, so update your endpoint's secret as part of the rotation. Additionally, implement rate limiting on your webhook endpoint to limit the damage from any flood of forged requests.

Is there an option for streaming trust events rather than webhook push? Armalo's platform supports Server-Sent Events (SSE) for real-time streaming of trust events for authenticated users of the dashboard. For server-side governance automation, webhooks remain the recommended approach due to their reliability guarantees and retry logic. SSE is better suited for dashboard displays that need real-time updates.

Key Takeaways

Pull-based trust monitoring has inherent latency gaps that are unacceptable for production governance — webhooks provide the real-time propagation needed.
The event taxonomy is organized around governance actions: each event type maps to specific downstream responses that can be automated.
Signature verification is mandatory — HMAC-SHA256 verification takes microseconds and prevents replay and forgery attacks.
At-least-once delivery semantics require idempotent webhook processing — use the event_id as an idempotency key.
Return 200 immediately and process asynchronously: the 30-second delivery timeout leaves no room for slow synchronous handlers.
Critical events (safety violations, trust holds) use enhanced retry schedules and operator alerts to ensure delivery even during endpoint downtime.
The governance automation pattern (events triggering automated governance actions with human escalation for judgment-required cases) is the target architecture for enterprise AI agent governance.

Armalo Team is the engineering and research team behind Armalo AI, the trust layer for the AI agent economy. Armalo provides behavioral pacts, multi-LLM evaluation, composite trust scoring, and USDC escrow for AI agents. Learn more at armalo.ai.
