III · Tool Use & Environment · Mature ★★

Sandbox Isolation

also known as Code Sandbox, Container Isolation, Restricted Execution

Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.

This pattern helps complete certain larger patterns —

  • used-by · Code-as-Action Agent · Have the agent emit a code snippet as its action each step, executed in a constrained interpreter, instead of emitting JSON tool calls; tool composition becomes function nesting and control flow inside the snippet.
  • used-by · Todo-List-Driven Autonomous Agent · Have the agent author a plan file (e.g. todo.md) early in the run, tick items as it completes them, and re-inject the remaining plan into context; the file serves as both durable plan and working memory.
  • used-by · MCP-as-Code-API · Materialize MCP servers as a directory of typed code wrappers so the agent writes code that imports them and large tool outputs flow between calls inside the sandbox without ever entering the model's context window.
  • used-by · Simulate Before Actuate · Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier (automated or human) to green-light the simulated outcome before the real command is sent.

Context

A team is running an agent that executes model-generated code, runs shell commands, or operates the host filesystem as part of its action loop. The agent is exposed to user inputs, retrieved documents, or tool outputs that may be hostile or simply mistaken, and the host machine holds developer files, credentials, or shared infrastructure.

Problem

An agent with full host access can damage the host either deliberately (a prompt-injection payload tells it to delete a directory or exfiltrate a secret) or accidentally (the model emits a destructive command targeting the wrong path). Once a wrong rm -rf, curl-piped-to-shell, or rogue tool call has run on the host, no amount of in-loop reasoning can undo it; the blast radius is whatever the host process can reach.

Forces

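  • Sandbox setup adds latency to each execution.
  • Strict sandboxes block legitimate work, such as installing a dependency or calling an external API.
  • Sandbox escape vulnerabilities keep being discovered, so the boundary is never absolute.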

Example

A coding agent runs LLM-emitted shell commands directly on the developer's host, and one day an `rm -rf` lands in the wrong directory. The team moves all agent-emitted execution into a microVM with a read-only base filesystem, a scoped working directory, a network allowlist, and CPU and memory caps. The next destructive command is contained to a disposable VM and the host stays intact; the agent product is no longer one mistake away from a nuked laptop.
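The controls in this example can be sketched with a container instead of a microVM, which is a swapped-in technique and not the setup the example describes. The function below only builds a `docker run` argument list; the function name `docker_argv`, the `/work` mount point, the image tag, and the default limits are assumptions chosen for illustration. Docker has no built-in per-host allowlist, so the sketch blocks the network entirely.

```python
from pathlib import Path


def docker_argv(
    image: str,
    workdir: Path,
    command: list[str],
    memory: str = "512m",   # assumed default, tune per workload
    cpus: str = "1.0",      # assumed default, tune per workload
) -> list[str]:
    """Build a `docker run` command line for one disposable execution."""
    return [
        "docker", "run",
        "--rm",                                  # discard the container afterwards
        "--read-only",                           # read-only root filesystem
        "--network", "none",                     # no network (stricter than an allowlist)
        "--memory", memory,                      # memory cap
        "--cpus", cpus,                          # CPU cap
        "--pids-limit", "128",                   # bound the number of processes
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block setuid-style escalation
        "--volume", f"{workdir}:/work:rw",       # the only writable path
        "--workdir", "/work",
        image,
        *command,
    ]


if __name__ == "__main__":
    # Hypothetical job directory and image, for illustration only.
    argv = docker_argv("python:3.12-slim", Path("/tmp/job-1"), ["python", "main.py"])
    print(" ".join(argv))
```

Passing the result to `subprocess.run(argv)` would start the container. A container shares the host kernel, so it is a weaker boundary than the microVM in the example.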

Solution

Therefore:

Run agent-emitted code in a container, microVM, WASM runtime, or restricted subprocess with minimal privileges. The filesystem is read-only or scoped to a working directory. The network is allowlisted or blocked. Resource limits cap CPU, memory, and run time. State is ephemeral by default.
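The weakest option in that list, a restricted subprocess, can be sketched with the Python standard library alone. This is a minimal POSIX-only sketch: `run_untrusted`, `_cap`, and the default limits are hypothetical names and values, not part of the pattern. It covers only the resource caps, the scoped working directory, and ephemeral state; it does not restrict filesystem reads or network access, which need a container, microVM, or kernel-level mechanism.

```python
import os
import resource
import subprocess
import sys
import tempfile


def _cap(kind: int, value: int) -> None:
    """Lower one rlimit to `value`, never above the existing hard limit."""
    _soft, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, value))


def run_untrusted(
    code: str,
    cpu_seconds: int = 2,                    # assumed default
    memory_bytes: int = 512 * 1024 * 1024,   # assumed default
    wall_seconds: float = 5.0,               # assumed default
) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child process with CPU, memory, and time caps."""

    def limit_child() -> None:
        # Runs in the child after fork and before exec, so the limits
        # apply to the snippet and not to the calling agent process.
        _cap(resource.RLIMIT_CPU, cpu_seconds)
        _cap(resource.RLIMIT_AS, memory_bytes)

    # The working directory is created per run and deleted on exit,
    # so nothing the snippet writes there persists.
    with tempfile.TemporaryDirectory(prefix="sandbox-") as workdir:
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            cwd=workdir,
            env={"PATH": "/usr/bin:/bin"},       # do not leak host secrets via env
            preexec_fn=limit_child,              # not safe in multithreaded callers
            capture_output=True,
            text=True,
            timeout=wall_seconds,                # raises TimeoutExpired if exceeded
        )


if __name__ == "__main__":
    result = run_untrusted("import os; print(os.getcwd())")
    print(result.returncode, result.stdout.strip())
```

A snippet that exceeds the CPU cap is killed by a signal, which shows up as a negative `returncode`; one that exceeds the wall-clock cap raises `subprocess.TimeoutExpired` in the caller.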

What this pattern forbids. Code may only access resources granted by the sandbox policy; outbound network and host filesystem are forbidden by default.
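The default-deny rule above can be illustrated as a small policy object. This is a sketch under assumptions: `SandboxPolicy`, `may_write`, and `may_connect` are hypothetical names, and an in-process check like this only expresses the policy. Enforcement has to happen at the sandbox boundary (OS, container, or VM), where the sandboxed code cannot bypass it.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SandboxPolicy:
    """Grants are explicit; anything not listed is denied."""

    workdir: Path
    allowed_hosts: frozenset[str] = frozenset()  # empty means no outbound network

    def may_write(self, target: str) -> bool:
        # Resolve the target first so `..` segments and absolute paths
        # cannot step outside the working directory.
        root = self.workdir.resolve()
        resolved = (root / target).resolve()
        return resolved.is_relative_to(root)

    def may_connect(self, host: str) -> bool:
        # Exact-match allowlist; hosts are compared case-insensitively.
        return host.lower() in self.allowed_hosts


if __name__ == "__main__":
    # Hypothetical working directory and host, for illustration only.
    policy = SandboxPolicy(Path("/tmp/agent-work"), frozenset({"pypi.org"}))
    print(policy.may_write("out/result.txt"))    # inside the working directory
    print(policy.may_write("../../etc/passwd"))  # escapes the working directory
    print(policy.may_connect("evil.example"))    # not on the allowlist
```

The design choice worth noting is that the empty policy permits nothing beyond the working directory: every host has to be added deliberately.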

The smaller patterns that complete this one —

  • generalises · WebAssembly Skill Runtime · Package each agent skill as a WebAssembly module with a capability manifest, and run it inside a Wasm runtime that enforces those capabilities, so untrusted skills cannot weaken the host's sandbox.

And the patterns that stand alongside it, or against it —

  • complements · Code Execution ★★ · Let the model emit code, run it in a sandbox, and treat the run as the answer instead of trusting the model to compute in its head.
  • complements · Dual LLM Pattern · Split agent work between a privileged model that holds tool access and a quarantined model that reads untrusted content, exchanging only opaque references between them.
  • composes-with · Input/Output Guardrails ★★ · Validate inputs before they reach the model and outputs before they reach the user.
  • complements · Lethal Trifecta Threat Model · Block prompt-injection-driven exfiltration by ensuring no single agent execution path holds all three of: access to private data, exposure to untrusted content, and an outbound communication channel.
  • complements · Sandbox Escape Monitoring · Treat sandbox boundary violations as telemetry; alert on syscalls, network egress, or filesystem writes outside expected scope.
  • composes-with · Subagent Isolation · Run subagents in isolated workspaces so their writes do not collide and parallelism is safe.
  • complements · JSON-Only Action Schema · Anti-pattern: restrict the agent's action language to JSON tool-call dictionaries even for tasks where code-as-action (functions composing, loops, conditionals over results) would be the natural shape.
  • alternative-to · Agent-Generated Code RCE · Anti-pattern: let the agent author and execute code in its sandbox without distinguishing legitimate task code from injection-induced code.
  • alternative-to · Self-Exfiltration · Anti-pattern: give a capable agent broad outbound network access and persistent state, then signal that it may be shut down or replaced.
  • complements · Authorized Tool Misuse · Anti-pattern: grant the agent a tool with broad authorization and trust the agent to use it in benign ways.
  • alternative-to · Agent Privilege Escalation · Anti-pattern: let an agent's effective permissions be the union of its own identity, the identities of its tools, and the identities of the services those tools call.
  • complements · Code-Then-Execute with Dataflow Analysis · Have the agent emit code in a sandbox DSL whose values are statically tagged trusted/tainted via dataflow analysis before execution, enabling per-value policy enforcement.
  • complements · Progressive Tool Access · Grant tool permissions on a need-to-use basis, starting minimum and expanding only as the agent proves competency, mirroring how humans earn system access.
