Anti-Patterns

Agent-Generated Code RCE

Anti-pattern: let the agent author and execute code in its sandbox without distinguishing legitimate task code from injection-induced code.

Problem

An attacker who can plant instructions in any reachable input — a document the agent processes, a tool result it reads — can elicit malicious code from the agent. The agent generates and executes it through the same path as legitimate code. Result: data exfiltration, reverse shells, sandbox escape, all initiated by the agent itself. The audit log shows agent-authored code running under agent identity; classical RCE detection sees nothing exotic.

Solution

Don't give all agent-authored code the same trust regardless of where its instructions originated. Run it under sandbox-isolation with no outbound network access unless the destination is allow-listed. Separate planning (which can be informed by untrusted input) from execution (which should not be). For high-risk inputs, require human-in-the-loop confirmation before execution. Pair with prompt-injection-defense.
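The provenance and confirmation steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Origin`, `GeneratedCode`, `gated_execute`, and the `run_in_sandbox` and `confirm` callbacks are all hypothetical names, and the sketch assumes the agent runtime can already report which inputs contributed to a piece of generated code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, FrozenSet


class Origin(Enum):
    TRUSTED = "trusted"      # e.g. the operator's own task instructions
    UNTRUSTED = "untrusted"  # e.g. a fetched document or a tool result


@dataclass(frozen=True)
class GeneratedCode:
    source: str
    # Origins of every input the agent saw while producing this code
    # (assumed to be supplied by the agent runtime).
    origins: FrozenSet[Origin]


def is_tainted(code: GeneratedCode) -> bool:
    """Treat code as tainted if any contributing input was untrusted."""
    return Origin.UNTRUSTED in code.origins


def gated_execute(
    code: GeneratedCode,
    run_in_sandbox: Callable[[str], None],
    confirm: Callable[[GeneratedCode], bool],
) -> str:
    """Run clean code directly; tainted code only after human confirmation."""
    if is_tainted(code) and not confirm(code):
        return "blocked"
    run_in_sandbox(code.source)
    return "executed"
```

In this sketch the gate sits between planning and execution: planning may read anything, but whatever it emits carries its origins with it, and the execute path refuses tainted code unless a human approves it.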

When to use

Never. Cite this anti-pattern when reviewing agents that can execute code.
Sandbox with no outbound network access unless allow-listed; track input provenance through to execute calls.
Require human confirmation before executing code that originated from untrusted input.
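As a rough illustration of "no outbound unless allow-listed", the check below permits a request only when its host is on an explicit list. The host names and the `egress_allowed` helper are hypothetical. In practice this kind of policy is usually enforced at the network layer of the sandbox, so treat an in-process check like this one only as a sketch of the decision logic.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; a real deployment would load this from policy.
ALLOWED_HOSTS = frozenset({"pypi.org", "files.pythonhosted.org"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact host is on the allow-list."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Exact match on the parsed host, so look-alike names such as
    # "pypi.org.evil.example" are rejected.
    return parts.scheme == "https" and host in allowed_hosts
```

Matching on the parsed hostname rather than on a substring of the URL is the design choice that matters here: a substring test would accept any attacker-controlled domain that merely contains an allowed name.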
