Simulate Before Actuate
also known as Dry-Run Harness, Simulate-Then-Commit, Pre-Action Simulation Gate
Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier — automated or human — to green-light the simulated outcome before the real command is sent.
Context
An agent has tools that take irreversible actions: filesystem writes, database mutations, infrastructure changes, browser actions on a live site, payments, emails. The cost of a wrong action is high. The agent itself is non-deterministic and occasionally proposes plausible-looking actions that are wrong in subtle ways: deleting the wrong key, sending to the wrong recipient, mutating the wrong row.
Problem
Letting the agent commit irreversible actions on a single proposal exposes the system to silent, hard-to-rollback damage. Pure human-in-the-loop is too slow for the volume; pure trust-the-agent is too dangerous. Recent practitioner write-ups (Joakim Vivas' '17 agentic architectures' survey), the arXiv chapter 'Architectures for Building Agentic AI', and the 'Deterministic Pre-Action Authorization' preprint converge on a deterministic simulation step: run the proposed action against a digital twin, a sandbox replay, or a dry-run flag; compute the resulting state and the diff; require sign-off on the diff before committing.
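A minimal sketch of the "compute the resulting state and the diff" step, assuming the digital twin is nothing more than an in-memory dictionary. The names `simulate_diff` and `ABSENT` are hypothetical and chosen for illustration; a real twin would model the actual target system.

```python
import copy
from typing import Any, Callable, Dict, Tuple

ABSENT = "<absent>"  # marks a key that does not exist on one side of the diff

State = Dict[str, Any]


def simulate_diff(
    state: State, action: Callable[[State], None]
) -> Dict[str, Tuple[Any, Any]]:
    """Apply `action` to a deep copy of `state` and report what would change.

    The real state is never mutated; only the copy (the "twin") is.
    Returns {key: (before, after)} for every key whose value differs.
    """
    twin = copy.deepcopy(state)
    action(twin)
    diff: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(state.keys() | twin.keys()):
        before = state.get(key, ABSENT)
        after = twin.get(key, ABSENT)
        if before != after:
            diff[key] = (before, after)
    return diff


def delete_user_2(s: State) -> None:
    # Hypothetical proposed action: remove one record.
    del s["user:2"]
```

Calling `simulate_diff({"user:1": "alice", "user:2": "bob"}, delete_user_2)` yields a one-entry diff for `user:2` and leaves the original dictionary unchanged. That diff is the artifact a verifier or a human would sign off on.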
Forces
- Irreversible actions deserve more scrutiny than reversible ones, but the agent's proposal does not distinguish between the two.
- Full human-in-the-loop is too slow at production volume; a deterministic verifier can scale.
- A simulation has to be faithful enough that 'passes the sim' implies 'safe in reality' — otherwise the gate is theatre.
- Some action surfaces have no simulator (external APIs without sandboxes, partner systems); the pattern then degrades to dry-run flags, schema validation, or HITL.
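The degradation described in the last force can be sketched as a lookup from what a tool supports to the strongest gate available for it. This is an illustrative sketch only: the `Gate` and `ToolCapabilities` names and the ordering of the tiers are assumptions, and a deployment may still require human sign-off at the weaker tiers.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    SIMULATOR = "simulator"            # digital twin or sandbox replay
    DRY_RUN = "dry_run"                # the tool's own dry-run flag
    SCHEMA_VALIDATION = "schema"       # arguments checked against a schema only
    HUMAN_APPROVAL = "human_approval"  # nothing automated is available


@dataclass(frozen=True)
class ToolCapabilities:
    has_simulator: bool = False
    has_dry_run_flag: bool = False
    has_schema: bool = False


def strongest_gate(caps: ToolCapabilities) -> Gate:
    """Pick the most faithful gate the tool supports, falling back to HITL."""
    if caps.has_simulator:
        return Gate.SIMULATOR
    if caps.has_dry_run_flag:
        return Gate.DRY_RUN
    if caps.has_schema:
        return Gate.SCHEMA_VALIDATION
    return Gate.HUMAN_APPROVAL
```

For a partner API with no sandbox and no schema, `strongest_gate(ToolCapabilities())` returns `Gate.HUMAN_APPROVAL`.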
Example
A devops agent receives a request to clean up unused Kubernetes resources. It proposes 'kubectl delete pod app-prod-7d3'. The wrapper intercepts the call, runs it with --dry-run=server, and derives the simulated diff: 'will delete 1 pod, will temporarily reduce Deployment app-prod from 3 to 2 ready replicas, will not affect Service'. The verifier checks invariants: target namespace is in the agent's allowed scope, deletion count is under cap, no destructive label match. All green; the real call goes out. On a different invocation the agent proposes deleting a pod in kube-system; same flow, but the verifier rejects the call (namespace not in allowed scope), and the agent gets an error back and replans.
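The three invariant checks in this example could look roughly like the sketch below. Everything here is hypothetical: the `SimulatedDelete` record, the `prod` namespace, the cap of 5, and the `protected=true` label are stand-ins for whatever the wrapper extracts from the dry run and whatever policy the operator sets.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Hypothetical policy values; the example above does not specify them.
ALLOWED_NAMESPACES = frozenset({"prod"})
MAX_DELETIONS = 5
PROTECTED_LABELS = frozenset({"protected=true"})


@dataclass(frozen=True)
class SimulatedDelete:
    """What the wrapper learned from the dry run (assumed shape)."""
    namespace: str
    pod_names: Tuple[str, ...]
    labels: FrozenSet[str] = frozenset()  # "key=value" labels on the targets


def verify(sim: SimulatedDelete) -> List[str]:
    """Return the violated invariants; an empty list means all green."""
    violations: List[str] = []
    if sim.namespace not in ALLOWED_NAMESPACES:
        violations.append(f"namespace {sim.namespace!r} not in allowed scope")
    if len(sim.pod_names) > MAX_DELETIONS:
        violations.append(
            f"{len(sim.pod_names)} deletions exceeds cap of {MAX_DELETIONS}"
        )
    matched = sim.labels & PROTECTED_LABELS
    if matched:
        violations.append(f"protected label match: {sorted(matched)}")
    return violations
```

Returning the list of violations, rather than a bare boolean, gives the agent an error message it can use when it replans.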
Solution
Therefore:
Decompose the action surface: for each irreversible tool, define a faithful simulator (digital twin, sandbox replay, dry-run mode, snapshot DOM for web, transactional rollback for DBs). Wrap the tool so every call runs simulation → verifier → execute. The verifier is automated where the invariants can be encoded (no destructive deletes without explicit flag, no out-of-budget transfers) and falls back to human-in-the-loop where they cannot. Where no simulator exists, refuse to call without HITL approval.
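One way to sketch the simulation → verifier → execute wrapper, including the refuse-without-HITL branch for tools that have no simulator. The `GatedTool` and `GateRejected` names and the callable signatures are assumptions made for this illustration, not an established API.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


class GateRejected(Exception):
    """Raised when the gate refuses to let the real call go out."""


@dataclass
class GatedTool:
    # simulate: returns the predicted outcome/diff; None if no simulator exists
    simulate: Optional[Callable[[Any], Any]]
    # verify: returns a list of violated invariants (empty means approved)
    verify: Callable[[Any], List[str]]
    # execute: the real, irreversible call
    execute: Callable[[Any], Any]
    # ask_human: human-in-the-loop fallback, True means approved
    ask_human: Callable[[Any], bool]

    def __call__(self, action: Any) -> Any:
        if self.simulate is None:
            # No simulator: refuse unless a human explicitly approves.
            if not self.ask_human(action):
                raise GateRejected("no simulator and no human approval")
            return self.execute(action)

        predicted = self.simulate(action)
        violations = self.verify(predicted)
        if violations:
            # The real tool is never reached; the agent gets the reasons back.
            raise GateRejected("; ".join(violations))
        return self.execute(action)
```

The agent is handed only the `GatedTool` instance, never the underlying `execute` callable, so the gate cannot be bypassed from inside the agent loop.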
What this pattern forbids. The agent may not invoke irreversible tools directly; every such call must pass through the simulator and verifier gate. The LLM's freedom to call tools is conditional on the gate's approval.
The smaller patterns that complete this one:
- uses Sandbox Isolation ★★: Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.
- generalises Dry-Run Harness ★: Simulate planned actions (and their projected side effects) without committing them, surfacing a reviewable diff before any commit.
- generalises Mental-Model-In-The-Loop Simulator ·: Run candidate multi-step strategies inside an internal simulator of the environment before committing in the real world. Broader than Simulate Before Actuate, which simulates a single action, because it simulates multi-step strategies.
And the patterns that stand alongside it, or against it:
- complements Human-in-the-Loop ★★: Require explicit human approval at defined points before the agent performs an action.
- complements World Model as Tool ·: Let a planning agent invoke a generative world model as a tool to roll out hypothetical futures before committing to an action, treating the world model as a callable simulator rather than a training target.
- complements Approval Queue ★★: Queue agent-proposed actions for asynchronous human review while the agent continues other work.
- alternative-to Compensating Action ★★: Pair every irreversible-looking agent action with a compensating action that can undo or counteract it.
- complements Policy-as-Code Gate ★: Evaluate every proposed agent action against externally-managed machine-readable policies before dispatch, so compliance authorship lives outside the prompt and outside the agent code.
- complements Kill Switch ★: Provide an out-of-band control plane to halt running agent instances without redeploy.
- complements Blind Grader with Isolated Context ★: Run an evaluator in a separately-allocated context window with access only to the artifact and the rubric, never the producing agent's reasoning trace, so the grader cannot be primed by the producer's framing.
- composes-with Control-Flow Integrity ★: Treat the agent's planned step sequence as a trusted control-flow graph that tool outputs, retrieved content, and user-supplied data cannot redirect at runtime.