Curated Collection
Implementation-oriented posts for builders and operators.
Topics: mcp-security · persistent-memory · runtime-governance
24 metadata-matched posts in this path
The move toward OS-level agent workspaces changes the security conversation: the boundary is no longer just the model; it is the workspace around the action.
AI teams are accumulating permission debt every time an agent keeps access after its evidence, scope, owner, model, or tool boundary changes.
Research agents are getting good at finding papers and market signals. The frontier is deciding which findings deserve experiments, writebacks, or product changes.
A swarm can pass every individual agent eval and still fail when trust, memory, instructions, or tool outputs cascade across agents.
MCP and tool protocols are making action easier. That makes tool governance the border-control layer for agents that touch data, money, code, and customer systems.
Search agents and dashboards make background monitoring mainstream. The missing control is freshness, source policy, and escalation discipline.
When websites expose tools to browser agents, trust moves from page content to tool manifests, side-effect labels, and receipts.
Always-on agents need more than recurring task schedules. They need proof budgets that define how much evidence must exist before action expands.
Indirect prompt injection is usually framed as an input-filtering problem. For consequential agents, it is a planning and authority failure.
Every autonomous workflow should have a blast-radius budget: a bounded definition of how much money, data, customer impact, and authority it can risk before review.
Managed agent environments reduce operational friction, but they do not answer whether the agent deserves more authority after the run.
Antigravity-style coding agents make multi-agent development normal. The missing layer is consequence-aware promotion from code to authority.
MCP, A2A, ANP, and related protocols are moving faster than the trust models around them. The window to shape secure defaults is now.
WebMCP is exciting because it gives browser agents structured tools. It is risky because side effects become easier to hide behind normal UI actions.
AI-agent governance is too focused on launch. The bigger operational risk is what remains after an agent changes roles, loses trust, or leaves a workflow.
A PDF describing how an agent should behave is not a pact. It is a wish. Pacts are signed cryptographic commitments enforced at runtime, and that distinction decides whether your agent economy has teeth or vibes.
A behavioral pact is not a terms-of-service document or a capability description. It is a machine-readable specification of what an agent will and will not do — the operational contract that makes deployment accountable. Here is how to write one that actually works.
The fastest way to lose authority after a major platform event is to overclaim. The better move is explicit claim status, evidence, and experiments.
Agent evaluations are often treated as durable proof, but a model switch can invalidate the behavioral evidence behind permissions, scores, and buyer trust.
Search agents turn monitoring into a background product primitive. The trust question is whether every alert can prove source freshness and action relevance.
The scary memory attack is not always a single jailbreak. It is a normal-looking sequence of conversations that slowly changes what an agent believes it is allowed to do.
Enterprise agent memory becomes dangerous when teams cannot prove where a useful belief came from, who trusted it, and when it stopped being true.
The hardest problem in AI agent accountability is not detecting when an agent cheats — it is building an agent that can prove it did not. Verifiable behavioral records require cryptographic attestation, not just logging.
AI agents confabulate. They produce fluent, confident-sounding outputs that are factually wrong. In a demo, this is embarrassing. In a customer conversation, a financial analysis, or a compliance review, it is a structural risk that requires architectural solutions, not prompting workarounds.