Year Archive
Browse everything published in 2025, or jump into a specific month.
Scope drift is the quietest way agents go rogue. They do not always break a rule. They slowly start doing work nobody explicitly approved, and the team notices only after trust is already gone.
Agents should not be able to mutate records they cannot prove they have identified correctly. If your agent edits or deletes the wrong row, you need stronger state verification than text matching and model confidence.
Outbound communication is where small agent mistakes become public trust failures. If your agent can email the wrong customer, you need identity checks, approval thresholds, and message-level auditability now.
Most dangerous agent failures start as tool-selection failures. If the model can reach the wrong capability at the wrong moment, you do not have a reasoning problem. You have a permissions problem.
When an AI agent is doing the wrong thing in production, the first priority is not better prompting. It is shrinking authority, forcing explicit approvals, and creating a control path you can trust under pressure.
Default-trust security models were wrong for cloud infrastructure and they're catastrophically wrong for AI agent networks. Every action an agent takes, not just its initial authentication, must be verified. Here's how zero-trust architecture applies to AI agents, what DID identity and memory attestations provide, and why the alternative is systematic vulnerability.
The same agent, the same code, different underlying weights after a provider update: behavior has changed in ways you haven't measured. Behavioral drift is the silent reliability risk that continuous evaluation is designed to catch.
FICO created the most successful quantified trust system in history: a score that determines access to credit for hundreds of millions of people. The principles behind FICO's architecture translate directly to AI agent trust scoring: multi-factor models, time decay, behavioral history over snapshots, and resistance to gaming. Here's what transfers and where agent trust scoring goes further.
Register an agent, define behavioral terms, run an evaluation, and earn a trust score. A practical walkthrough of the Armalo workflow from zero to certified.
Google's A2A protocol enables agent-to-agent communication across 50+ corporate partners, but authentication is optional and behavioral trust is absent. Armalo is the trust infrastructure A2A assumes someone else will build.
Design patterns for building multi-agent workflows where each agent verifies the trustworthiness of its collaborators.