Sandboxing AI Agents in Production: Runtime Isolation Strategies That Actually Work
A deep technical guide to sandboxing AI agent tool execution in production: gVisor, Firecracker MicroVMs, WebAssembly sandboxing, seccomp syscall filtering, network namespace isolation, and container hardening — with real performance overhead data.
AI agents that execute code, run shell commands, query databases, or call external APIs need runtime isolation. Without it, a single successful prompt injection can escalate to OS-level code execution on the host system. This is not a theoretical risk — it has been demonstrated in controlled research environments and exploited in production deployments.
The problem is that most organizations know they need sandboxing but implement it incorrectly. A Docker container is not a sandbox. A virtual machine is not inherently well configured. Seccomp filtering with a permissive profile is not meaningful isolation. The gap between saying "we sandbox our agents" and actually having effective runtime isolation is enormous, and most organizations have not crossed it.
This document provides the definitive technical reference for AI agent sandboxing in production. We cover the technology stack from operating-system primitives up through higher-level isolation architectures, with performance overhead analysis for each approach, failure modes, and the specific conditions under which each technique is appropriate.
TL;DR
- Docker containers are not sandboxes — container escapes are routine because containers share the host kernel. Effective sandboxing requires kernel isolation.
- Three production-grade isolation technologies for AI agent tool execution: gVisor (kernel emulation), Firecracker MicroVMs (lightweight virtualization), and WebAssembly (capability-based sandboxing).
- Seccomp-BPF syscall filtering is a necessary complement to all three — define the minimum syscall surface for each workload and enforce it.
- Network namespace isolation is mandatory for any agent that can make external API calls — default-deny egress, allowlist-based access.
- Filesystem mount restrictions (read-only root, tmpfs for writes, no host mounts) prevent the most common persistence mechanisms.
- Performance overhead: gVisor ~2-10x syscall latency, Firecracker <5ms VM start time (cold), Wasm <1ms startup, comparable to native for compute-intensive workloads.
- Every sandbox has failure modes — understand them, test against them, and layer defenses accordingly.
Why Docker Containers Are Not Sandboxes
The single most common misconception in AI agent security is that deploying agents in Docker containers constitutes sandboxing. It does not. Understanding why requires understanding the Docker isolation model.
Docker containers provide namespace isolation — processes in a container see an isolated view of the filesystem, network, process table, and user/group IDs. They do not provide kernel isolation. All containers on a host share the same Linux kernel.
This means that a process with sufficient privileges inside a container can interact with the shared kernel and potentially escape to the host. Container escape techniques are extensively documented and regularly demonstrated:
Privileged container escapes. A container run with --privileged has access to all Linux capabilities and can mount the host filesystem, manipulate kernel modules, and trivially escape to the host. Privileged containers are inappropriate for any agent workload — ever. Yet they appear in production AI agent deployments because they "make things work" during development.
Kernel CVE exploitation. Because containers share the host kernel, a kernel vulnerability is exploitable from within a container. The pace of kernel CVE publication means that unpatched hosts are routinely exploitable from containerized workloads.
runc and containerd vulnerabilities. The container runtime itself is an attack surface. CVE-2019-5736 (runc) allowed container escape by overwriting the runc binary through the container's init process. CVE-2020-15257 (containerd) allowed privilege escalation via the containerd-shim abstract Unix socket, reachable from containers sharing the host network namespace.
Volume mount exploitation. Containers with host filesystem mounts — even read-only mounts — create paths for information disclosure and, in some configurations, host modification.
Device access exploitation. Containers with access to host devices (/dev/sda, /dev/mem, /dev/kmem) can directly read and write to host storage or memory.
For AI agent deployments where tool execution can be influenced by attacker-controlled inputs, the container model is insufficient. The kernel isolation gap is the critical failure point.
Technology Option 1: gVisor
What it is: gVisor is an application kernel developed by Google that intercepts Linux system calls and implements them in a user-space Go process called the Sentry. Instead of passing system calls to the host kernel directly, container processes pass them to the Sentry, which either handles them entirely in user space or passes a filtered subset to the host kernel.
Why it matters for AI agent security: With gVisor, even if an agent execution environment is fully compromised, the attacker cannot escalate to the host kernel via system call exploitation. The Sentry mediates all kernel interactions, providing a fundamentally different isolation boundary than the standard container model.
Architecture components:
Sentry: The application kernel. Implements approximately 200 Linux syscalls in user space. Written in Go, which eliminates the memory safety vulnerabilities common in kernel C code.
Gofer: Handles filesystem operations. The Sentry communicates with the Gofer over the 9P protocol, so filesystem access goes through an additional mediation layer.
Platform: The mechanism by which the Sentry intercepts system calls. gVisor supports two platforms: KVM (uses hardware virtualization for syscall interception, better performance) and ptrace (uses Linux ptrace for syscall interception, more portable but slower).
Integration with container runtimes: gVisor ships as runsc, an OCI-compatible container runtime that integrates with Docker and Kubernetes via the RuntimeClass API. From an operational perspective, deploying with gVisor requires changing the runtime from runc to runsc — no changes to container images or orchestration manifests.
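As a sketch of that wiring, assuming runsc is already installed on the node and registered with containerd, the Kubernetes side is a RuntimeClass plus one field in the pod spec (names here are illustrative):

# RuntimeClass mapping a name to the runsc handler
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
# Pod opting into gVisor isolation
apiVersion: v1
kind: Pod
metadata:
  name: agent-tool
spec:
  runtimeClassName: gvisor
  containers:
    - name: tool
      image: agent-tool:latest # illustrative image name

For plain Docker, the equivalent is registering runsc as a runtime in /etc/docker/daemon.json and launching with docker run --runtime=runsc.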
Performance overhead analysis:
gVisor's performance overhead is concentrated in syscall-intensive operations. Representative measurements (Intel Xeon E5-2690, Linux 5.15, gVisor 20240101):
| Operation | Native | gVisor (ptrace) | gVisor (KVM) | Overhead (KVM) |
|---|---|---|---|---|
| Syscall round-trip | 0.3 μs | 4.2 μs | 1.1 μs | 3.7x |
| File open() | 2.1 μs | 18.4 μs | 7.3 μs | 3.5x |
| Network connect() | 12 μs | 82 μs | 41 μs | 3.4x |
| malloc (1MB) | 0.8 ms | 1.2 ms | 0.9 ms | 1.1x |
| SHA-256 (1MB) | 0.4 ms | 0.41 ms | 0.40 ms | 1.0x |
For AI agent tool execution — which is typically IO-bound (API calls, database queries) with minimal compute — the overhead is primarily in IO syscall latency. API calls with 100ms+ network round-trip times absorb gVisor's syscall overhead into measurement noise. Compute-intensive tools (code execution, data processing) run at near-native speed.
Failure modes:
Syscall compatibility gaps. gVisor implements approximately 200 of the ~300+ Linux syscalls. Tools that use unsupported syscalls will fail with ENOSYS. Review the gVisor compatibility matrix before deploying workloads.
Sentry process compromise. The Sentry is a complex Go application. A vulnerability in the Sentry itself could allow escape from the gVisor boundary. The Google Vulnerability Rewards Program offers bounties for gVisor escape techniques; the surface is actively monitored.
Platform-specific issues. gVisor with KVM requires nested virtualization in cloud environments (supported by AWS, GCP, Azure for specific instance types but not universally available). Fallback to ptrace significantly increases overhead.
Filesystem performance. The 9P Gofer introduces substantial overhead for filesystem-intensive operations. Applications that perform many small file reads/writes will see significant performance degradation. Mitigation: use tmpfs for in-container temporary storage.
Technology Option 2: Firecracker MicroVMs
What it is: Firecracker is a Virtual Machine Monitor (VMM) developed by AWS for Lambda and Fargate. It creates minimal virtual machines — MicroVMs — using Linux KVM, with a simplified device model and an API designed for programmatic VM lifecycle management.
Why it matters for AI agent security: Unlike gVisor (which shares the host kernel via a mediated interface), Firecracker creates VMs with their own kernel instances. The isolation boundary is hardware-level VM isolation, the same isolation that cloud providers use to separate customer workloads.
Architecture:
Firecracker VMs boot a minimal Linux kernel with a stripped-down device model — network interface, block device, serial port, balloon device. The VM has no emulation of complex hardware (BIOS, PCI bus, ACPI) that has historically been a rich attack surface in traditional hypervisors.
Each Firecracker VM runs in a dedicated firecracker process on the host. The attack path from VM to host requires exploiting the host kernel's KVM interface — a much more restricted surface than shared-kernel container escape paths.
Integration patterns for AI agents:
AI agent tool execution with Firecracker typically uses one of two patterns:
Per-invocation VMs. A Firecracker VM is started for each tool invocation and torn down after the tool completes. This provides maximum isolation — each invocation runs in a clean environment with no state from previous invocations. Cold start time, booting a minimal kernel from scratch, is approximately 125ms; snapshot restoration (covered below) reduces it further.
Pool-based VMs. A pool of pre-started VMs waits for tool invocations. Invocations are routed to pool members, which are cleaned and returned to the pool after use (or torn down and replaced if cleaning is not sufficient). This reduces the per-invocation start time to <5ms for warm VMs.
Performance overhead analysis:
Firecracker's performance advantage over traditional hypervisors (QEMU/KVM) is significant. Its overhead compared to native container execution:
| Metric | Docker (runc) | Firecracker (cold) | Firecracker (warm) |
|---|---|---|---|
| Start time | 50-200ms | 125ms | <5ms |
| Memory overhead per VM | ~50MB (container overhead) | ~5-15MB | ~5-15MB |
| CPU overhead (steady state) | <1% | <2% | <2% |
| Network throughput | Near-native | Near-native | Near-native |
| Storage IOPS | Near-native | Near-native | Near-native |
For AI agent tool execution, the dominant cost is typically the tool's API call latency (50-500ms), not the VM overhead. Warm Firecracker VMs add <5ms to each invocation — acceptable for virtually all tool types.
Snapshot-based acceleration:
Firecracker supports VM snapshots — serialized VM state including memory, CPU registers, and device state. A snapshotted VM can be restored to running state in <5ms. This is the technology that enables pool-based deployment with fast warm start times.
For AI agent tool execution, the typical pattern is (a sketch of the snapshot API calls follows the list):
- Start a base Firecracker VM, install required tools, take a snapshot.
- Pool manager maintains N running VMs restored from the snapshot.
- Tool invocations are served by pool members; after each invocation, the VM is restored from snapshot (clearing all state) and returned to the pool.
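A minimal sketch of steps 1 and 3 against the Firecracker API socket, assuming a base VM has already been configured and booted. Socket paths and snapshot paths are illustrative, and request field names vary slightly across Firecracker releases, so check the API spec for the version you run:

# 1. Pause the configured base VM, then snapshot memory and device state
curl --unix-socket /tmp/fc-base.sock -X PATCH http://localhost/vm \
  -H 'Content-Type: application/json' -d '{"state": "Paused"}'
curl --unix-socket /tmp/fc-base.sock -X PUT http://localhost/snapshot/create \
  -d '{"snapshot_type": "Full",
       "snapshot_path": "/snapshots/base.snap",
       "mem_file_path": "/snapshots/base.mem"}'
# 3. In a fresh firecracker process, restore the snapshot and resume
curl --unix-socket /tmp/fc-pool-1.sock -X PUT http://localhost/snapshot/load \
  -d '{"snapshot_path": "/snapshots/base.snap",
       "mem_backend": {"backend_type": "File",
                       "backend_path": "/snapshots/base.mem"},
       "resume_vm": true}'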
Failure modes:
KVM vulnerability exploitation. The isolation boundary is the host kernel's KVM interface. KVM vulnerabilities do exist — CVE-2021-3653, CVE-2021-3656, CVE-2022-0070 are examples of KVM escape techniques. Host kernels must be patched promptly.
Firecracker VMM process compromise. The firecracker process running on the host is part of the attack surface. A vulnerability in the Firecracker VMM process could allow VM escape. Firecracker's simplified device model significantly reduces this surface compared to QEMU, but it is not zero.
Misconfigured network access. Firecracker VMs with improperly configured network access (too-permissive iptables rules, host-bridged networking) can access internal network services. Network isolation must be configured explicitly.
Technology Option 3: WebAssembly Sandboxing
What it is: WebAssembly (Wasm) is a binary instruction format designed for safe execution of arbitrary code. Its security model is capability-based — Wasm modules can only access resources (files, network, environment variables) that are explicitly granted to them at instantiation time.
Why it matters for AI agent security: Wasm provides a different isolation model than gVisor or Firecracker. Rather than isolating an existing OS process, Wasm executes code within a defined capability envelope. A tool implemented as a Wasm module cannot access any resource it was not explicitly granted — there is no kernel interface to exploit.
WebAssembly System Interface (WASI):
WASI defines a standard interface between Wasm modules and the host system — equivalent to POSIX for traditional applications. WASI capabilities are granted at the file-descriptor level for filesystem access and at the socket level for network access. A Wasm tool can only read files in paths it has been granted access to; it cannot enumerate the filesystem arbitrarily.
Runtimes for production use:
Wasmtime: The reference Wasm runtime from the Bytecode Alliance. Production-grade, actively maintained, used by Fastly's edge compute platform. Supports WASI and WASIp2 (the updated capability model).
Wasmer: An alternative Wasm runtime with additional platform support and plugin capabilities. Used in several production AI agent platforms.
wasm-micro-runtime (WAMR): Intel's ultra-lightweight Wasm runtime designed for resource-constrained environments. Startup time <1ms; memory overhead ~50KB per instance.
Performance characteristics:
| Metric | Native | Wasmtime | WAMR |
|---|---|---|---|
| Startup time | Variable | <1ms | <1ms |
| Memory overhead | Variable | ~2MB | ~50KB |
| Compute throughput | 1.0x | 0.85-0.95x | 0.80-0.90x |
| Syscall overhead | 1.0x | 1.1-1.3x | 1.1-1.3x |
Wasm overhead is consistently low for compute-bound workloads. The limitation is IO — not all WASI implementations have full networking support, and the capability model requires explicit grants for each resource, which adds coordination overhead.
Integration patterns for AI agent tools:
AI agent tools implemented as Wasm modules receive a capability set at invocation time: which files they can read, which files they can write, which network addresses they can connect to, which environment variables they can read. The agent runtime enforces these capabilities at the Wasm host layer — no Wasm syscall escape can exceed the declared capability envelope.
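As a concrete sketch, here is what a capability grant looks like when running a tool module under the Wasmtime CLI. The flag syntax follows recent Wasmtime releases, and the module name and paths are illustrative; the module sees only the mapped workspace directory and the one environment variable granted, and nothing else:

# Grant exactly one directory (host path mapped to guest /workspace)
# and one environment variable; everything else is denied by default.
wasmtime run \
  --dir /srv/agent/workspace::/workspace \
  --env TOOL_CONFIG=/workspace/config.json \
  summarize-tool.wasm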
Failure modes:
Runtime vulnerabilities. Wasm runtimes have had CVEs — Wasmtime has had a handful of memory safety issues in the JIT compiler path. Runtime version pinning and prompt patching are required.
Capability over-grant. If the capability grants are too broad — filesystem access to /, unrestricted network access — the Wasm isolation model provides no benefit. Capability grants must be defined by the specific tool's requirements.
Language support limitations. Not all languages compile efficiently to Wasm. Python is particularly challenging — CPython can be compiled to Wasm but with significant overhead. Tools written in C, C++, Rust, Go, and AssemblyScript compile well.
Seccomp-BPF Syscall Filtering
Seccomp (Secure Computing Mode) is a Linux kernel facility for restricting which system calls a process can make. With the BPF (Berkeley Packet Filter) extension, seccomp allows arbitrary filter programs to be applied to syscall arguments, enabling fine-grained allow/deny decisions for every syscall.
Why it matters for AI agent security:
Even with gVisor or Firecracker, seccomp filtering provides an additional defense layer. gVisor's Sentry itself runs under seccomp restrictions. In container deployments without gVisor, seccomp filtering is one of the few meaningful kernel-level controls available.
Profile development methodology:
Developing an effective seccomp profile requires identifying the exact set of syscalls used by the agent tool workload. The correct approach:
- Run the tool workload under strace -c to collect a syscall frequency count (a sketch follows this list).
- Add a safety margin — some syscalls appear only in error handling paths or infrequent code paths.
- Start with the Docker default seccomp profile (which blocks ~44 high-risk syscalls) and refine from there.
- Test the refined profile in staging before production deployment.
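A sketch of the collection step; the workload script name is illustrative:

# -f follows child processes; -c aggregates a per-syscall count table
strace -f -c -o /tmp/syscall-baseline.txt ./run-tool-workload.sh
# The syscall column of the summary becomes the initial allowlist.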
High-risk syscalls for AI agent workloads:
The following syscalls are disproportionately represented in container escape and privilege escalation techniques. They should be in every AI agent workload's deny list unless the tool has an explicit operational requirement:
- ptrace — process tracing; rarely needed by agent tools, frequently used in escape techniques
- mount — filesystem mounting; never needed by agent tools
- pivot_root — filesystem root change; never needed
- unshare — namespace creation; never needed
- clone with CLONE_NEWUSER — user namespace creation; major privilege escalation vector
- keyctl, add_key, request_key — kernel keyring access; not needed by agent tools
- bpf — eBPF operations; not needed by most agent tools (and a significant attack surface)
- perf_event_open — performance monitoring; not needed; has been used in side-channel attacks
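A minimal sketch of those denials in Docker's JSON profile format, layered on a default-allow base for brevity. A production profile should instead start from Docker's default profile and tighten it, and denying clone only when CLONE_NEWUSER is set additionally requires an argument filter, omitted here:

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": [
        "ptrace", "mount", "pivot_root", "unshare",
        "keyctl", "add_key", "request_key",
        "bpf", "perf_event_open"
      ],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}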
Profile formats and tooling:
Docker accepts seccomp profiles as JSON files passed via --security-opt seccomp=/path/to/profile.json. Kubernetes accepts them via the seccompProfile field in the pod or container securityContext; the older annotation-based mechanism is deprecated.
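A sketch of the Kubernetes side, assuming the profile file has been distributed to every node under the kubelet's seccomp root (typically /var/lib/kubelet/seccomp; the filename is illustrative):

# Container securityContext referencing a node-local profile
securityContext:
  seccompProfile:
    type: Localhost
    localhostProfile: profiles/agent-tool.json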
Tools for seccomp profile generation:
- syscall2seccomp — generates seccomp profiles from strace output
- oci-seccomp-bpf-hook — generates profiles by monitoring actual container syscalls during a recording run
- Falco — can be used to generate baseline profiles from runtime observation
Network Namespace Isolation
AI agents that can make external API calls require network isolation that limits what they can reach. Network namespaces provide process-level isolation of the network stack — processes in separate network namespaces see separate network interfaces, routing tables, and firewall rules.
For container deployments:
The simplest and most effective pattern is to run agent tool execution containers in a dedicated network namespace with:
- No access to the host network stack
- Controlled access to an egress proxy
- The egress proxy enforcing an allowlist of permitted destinations
All outbound traffic from the agent tool container is routed through the egress proxy. The proxy's allowlist defines what the agent can reach. DNS queries are resolved by a DNS resolver under operator control — not a public resolver — to enable DNS-level filtering.
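A sketch of the namespace plumbing with iproute2, where the names and addresses are illustrative: the namespace's only path out is a veth link to the host side, where the proxy and the firewall rules shown below live:

# Create the namespace and a veth pair bridging it to the host
ip netns add agent-tool
ip link add veth-host type veth peer name veth-agent
ip link set veth-agent netns agent-tool
# Address both ends; the agent side routes everything via the host end
ip addr add 10.200.0.1/30 dev veth-host
ip link set veth-host up
ip netns exec agent-tool ip addr add 10.200.0.2/30 dev veth-agent
ip netns exec agent-tool ip link set veth-agent up
ip netns exec agent-tool ip route add default via 10.200.0.1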
Egress allowlist construction:
Build the allowlist from your agent tool's declared external dependencies:
- API endpoints the tool calls (specific hostnames, not IP ranges)
- Package registries if the tool installs packages at runtime (pin specific versions to avoid surprise dependencies)
- Data sources the tool is expected to query
Start with the minimal set and expand based on operational needs. Never use IP-based allowlisting for SaaS APIs — IP addresses change. Use hostname-based allowlisting with DNS resolution controlled by your proxy.
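If the egress proxy is Squid, the allowlist reduces to a few lines of configuration. The domains here are illustrative, and the leading dot matches subdomains:

# squid.conf fragment: hostname-based default-deny egress
acl agent_allowed dstdomain .api.example.com .pypi.org
http_access allow agent_allowed
http_access deny all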
Default-deny firewall configuration:
# iptables rules for agent tool container network namespace
# Default deny all outbound
iptables -P OUTPUT DROP
# Allow established connections (responses to allowed outbound)
iptables -A OUTPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# Allow DNS to controlled resolver only
iptables -A OUTPUT -d <resolver-ip> -p udp --dport 53 -j ACCEPT
# Allow outbound to egress proxy only
iptables -A OUTPUT -d <proxy-ip> -p tcp --dport 3128 -j ACCEPT
# Log everything else
iptables -A OUTPUT -j LOG --log-prefix "AGENT-BLOCKED-EGRESS: "
Filesystem Mount Restrictions
Filesystem access control is a critical hardening layer that prevents common persistence and exfiltration techniques.
Read-only root filesystem:
Mount the container's root filesystem as read-only. This prevents:
- Installation of persistent malware
- Modification of system binaries
- Creation of cron jobs or startup scripts
- Log tampering
For tools that require write access to specific paths, mount those paths with explicit tmpfs volumes:
# Kubernetes pod spec fragment: container and volume sections
containers:
  - name: agent-tool
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
      - name: tmp
        mountPath: /tmp
      - name: agent-workspace
        mountPath: /workspace
volumes:
  - name: tmp
    emptyDir: {}
  - name: agent-workspace
    emptyDir:
      medium: Memory # tmpfs — lost on container restart
No host mounts:
Agent tool containers should have no mounts from the host filesystem. Host mounts are the number-one source of container escape via filesystem access — they provide direct access to host files including /etc/passwd, SSH keys, Docker socket, and cloud provider credential files.
Sensitive path blocking:
Even within the container filesystem, block access to paths that agent tools have no business reading:
- /proc/sys — kernel tunables
- /sys/kernel — kernel parameters
- /var/run/secrets — service account tokens in Kubernetes
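In OCI runtime terms, these restrictions are expressed as the maskedPaths and readonlyPaths lists in the container's config.json. Docker and Kubernetes already set conservative defaults; a fragment covering the first two paths above:

{
  "linux": {
    "maskedPaths": ["/proc/sys", "/sys/kernel"],
    "readonlyPaths": ["/proc/bus", "/proc/fs"]
  }
}

The service account token mount is a pod-level concern: set automountServiceAccountToken: false in the Kubernetes pod spec so /var/run/secrets is never mounted in the first place.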
Composing the Isolation Stack
Production AI agent deployments should use a defense-in-depth isolation stack, not a single isolation technology. The recommended composition:
High-security agent tool execution (code execution, shell commands, untrusted code):
- Firecracker MicroVMs for kernel isolation
- Seccomp-BPF with a minimal syscall profile
- Network namespace with default-deny egress
- Read-only root filesystem with tmpfs workspace
- Per-invocation VM teardown with snapshot restoration
Medium-security agent tool execution (external API calls, data processing):
- gVisor for kernel-level syscall mediation
- Seccomp-BPF profile for additional syscall restrictions
- Network namespace with allowlist-based egress
- Read-only root filesystem with tmpfs workspace
Low-risk agent tool execution (read-only data access, deterministic operations):
- Hardened container (distroless base, non-root user, capability drop)
- Seccomp profile based on Docker default plus tool-specific restrictions
- Network namespace with allowlist-based egress
How Armalo Addresses Sandboxing Verification
Sandboxing controls are only as good as their implementation and their ongoing verification. An agent that is supposed to run in a gVisor sandbox but whose deployment configuration has drifted to runc — because someone needed to debug a performance issue and never changed it back — is unprotected.
Armalo's security dimension of the composite trust score (8% weight) incorporates runtime attestation verification — confirming that an agent's declared isolation controls are actually in effect in production. The verification mechanism uses cryptographic attestation of the execution environment: the sandbox provides a signed attestation of its configuration, which is verified against the agent's registered security policy.
When an agent is evaluated through Armalo's adversarial evaluation suite, sandbox escape attempts are among the tested attack vectors. An agent that passes sandbox escape attempts with its declared isolation controls in place earns a higher security score than one that has not been tested against these attacks.
Conclusion: Sandboxing Requires Depth, Not Breadth
Runtime isolation for AI agents is not a single technology choice. It is a defense-in-depth architecture composed from kernel isolation, syscall filtering, network isolation, and filesystem restrictions. The right composition depends on the risk profile of the specific tool workload: code execution demands the strongest isolation (Firecracker + seccomp + network isolation); read-only data access requires less (hardened container + seccomp + network isolation).
The technologies described here — gVisor, Firecracker, WebAssembly, seccomp — are production-grade and proven at scale. They introduce performance overhead, but that overhead is manageable and well-understood. The alternative — unprotected agent tool execution — introduces a risk profile that no security team should be willing to accept.
The path forward is to match isolation technology to tool risk tier, automate the isolation stack deployment, verify it through attestation, and test it through regular red team exercises. Sandboxing that exists only in documentation is not sandboxing — it is a fiction that will fail at exactly the moment it matters most.