Safety & Control

Control-Flow Integrity

Treat the agent's planned step sequence as a trusted control-flow graph that tool outputs, retrieved content, and user-supplied data cannot redirect at runtime.

Problem

Classical software keeps data and instructions in separate memory regions because allowing data to be executed is the canonical exploit primitive. LLM agents have no such separation by default: a tool output, a retrieved document, or a fetched page returns tokens that flow back into the model's context, and the model can decide to add new steps, skip steps, or call tools the original plan never authorised. Each turn of the loop is a fresh chance for embedded instructions to alter what runs next, and there is no architectural fact that says the plan is the authority. Prompt-injection-defense filters the inputs and tool-output-trusted-verbatim guards how outputs are consumed, but neither pins down the structural commitment that the plan itself decides the next edge.

Solution

Lift control flow out of the model's free-form reasoning into an explicit artefact the host enforces. Concrete moves: compile the plan to a static DAG or finite state machine before execution begins; let nodes consume tool outputs as typed values but forbid those outputs from adding nodes or editing edges; route any genuine replan through a separate, privileged planner that re-emits a new compiled graph rather than mutating the current one in place; treat every step's predecessor as evidence the host can check, so an execution trace has a provable origin in the original plan. The model is the consumer of the graph, not its author at runtime.

When to use

  • The agent operates on a plan-then-execute architecture or any runtime where steps could be lifted into an explicit graph.
  • Tool outputs or retrieved content come from sources the operator does not control and could carry injection payloads.
  • The cost of an unauthorised tool call (data write, payment, exfiltration) is high enough to justify pre-compiling a plan.

Open the full interactive page

Diagram, neighbourhood map, code examples, related patterns and full provenance.

Related