Code-Then-Execute with Dataflow Analysis
Have the agent emit code in a sandbox DSL whose values are statically tagged trusted/tainted via dataflow analysis before execution, enabling per-value policy enforcement.
Problem
Without provenance tracking, the executor cannot distinguish trusted values (the agent's plan, user goal) from tainted values (fetched content that could be attacker-controlled). The same `exec(code)` runs both. A prompt injection in fetched content can produce code that, e.g., reads secrets from env and embeds them in an outbound URL — and the sandbox cannot reject it because it cannot tell the URL is tainted.
Solution
Define a sandbox DSL (subset of Python/TS or a custom Pyret-style language) where every value carries a provenance tag (TRUSTED, TAINTED, MIXED). The runtime performs static dataflow analysis on each agent-generated program before execution: if a TAINTED value reaches a sink declared sensitive (network egress, env reads, file writes outside scratch dir), reject the program. Pair with sandbox-isolation, action-selector-pattern.
When to use
- Agent generates code that processes untrusted content alongside sensitive values (secrets, PII).
- Static analysis can be performed in tens of ms per program.
- Engineering team can maintain a sandbox DSL.
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.