Dry-Run Harness
also known as Action Preview Harness, Side-Effect Diff Preview
Simulate planned actions (and their projected side effects) without committing them, surfacing a reviewable diff before any commit.
This pattern helps complete certain larger patterns —
- specialises Simulate Before Actuate ★: Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier (automated or human) to green-light the simulated outcome before the real command is sent.
Context
An agent plans a sequence of actions that will mutate external state (database writes, API calls, file edits, infrastructure changes). The team wants to keep a human in the loop for risky actions, but reviewing every step is too costly.
Problem
Reviewing actions one at a time strips away context: humans need to see the projected end state, not isolated steps. Naive simulate-before-actuate dry-runs only the next action, so humans cannot evaluate the aggregate effect of a multi-step plan. This pattern differs from simulate-before-actuate by presenting the whole candidate side-effect set as a single reviewable artifact.
Forces
- Per-step review imposes prohibitive cognitive load on humans.
- Whole-plan simulation requires modeling all side-effects, which may be impossible for some tools.
- Dry-run results must be faithful to what real execution would do — otherwise the review is misleading.
Example
An infrastructure agent plans 'migrate cluster A to region B'. The dry run produces: 'will create 12 EC2 instances ($2.4k/month), modify 3 security groups, drain 200 connections from cluster A, run 4 DNS updates'. A human reviews the aggregated diff on one screen and approves it, and the commit phase fires. Without the dry run, the agent would have made all 19 changes individually with no chance for aggregate review.
Solution
Therefore:
Build a tool wrapper that supports a dry-run mode: every action returns its projected side effect (the SQL it would run, the API call it would make, the file diff it would write) without actually committing it. The agent runs end-to-end in dry-run mode, and the resulting collection of projected side effects is presented to a human as a unified diff (or change list). The human approves, edits, or rejects the plan as a whole. Only on approval do the actions commit for real. Pair with approval-queue, simulate-before-actuate, and human-in-the-loop.
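A minimal sketch of such a wrapper, to make the collect-then-commit flow concrete. All names here (`ProjectedEffect`, `DryRunHarness`, `record`, `commit_all`) are hypothetical and not taken from any library, and an in-memory dictionary stands in for external state. The sketch also assumes that later steps do not depend on the real results of earlier ones, which a production harness would have to model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProjectedEffect:
    """One side effect a tool would cause (hypothetical record type)."""
    tool: str
    description: str            # the SQL, API call, or file diff, as text
    commit: Callable[[], None]  # deferred real action, run only after approval


@dataclass
class DryRunHarness:
    """Collects projected effects instead of executing them."""
    effects: List[ProjectedEffect] = field(default_factory=list)

    def record(self, tool: str, description: str,
               commit: Callable[[], None]) -> str:
        # Store the projection; nothing touches external state here.
        self.effects.append(ProjectedEffect(tool, description, commit))
        return description

    def render_change_list(self) -> str:
        # The unified artifact the human reviews as a whole.
        return "\n".join(
            f"{i}. [{e.tool}] {e.description}"
            for i, e in enumerate(self.effects, 1)
        )

    def commit_all(self, approved: bool) -> int:
        # All-or-nothing gate: no approval means no commits at all.
        if not approved:
            return 0
        for e in self.effects:
            e.commit()
        return len(self.effects)


# Illustrative run; `state` is a stand-in for real external systems.
state: Dict[str, str] = {}
harness = DryRunHarness()
harness.record("file.write", "write 'hello' to notes.txt",
               lambda: state.update({"notes.txt": "hello"}))
harness.record("dns.update", "point app.example.com at region-b",
               lambda: state.update({"app.example.com": "region-b"}))

print(harness.render_change_list())  # shown to the reviewer
assert state == {}                   # nothing has been committed yet
committed = harness.commit_all(approved=True)
```

Deferring each real action behind a callable keeps the approval decision in one place: the plan is either committed as a unit or not at all. Supporting reviewer edits would mean changing `harness.effects` before calling `commit_all`.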
What this pattern forbids. No real side-effect commits until the dry-run diff is approved as a unit; tools must implement dry-run faithfully or be excluded from dry-run-eligible plans.
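The exclusion rule can be enforced with a gate that runs before any dry run starts. The following is a sketch only: the registry `SUPPORTS_DRY_RUN`, the tool names, and both function names are assumptions made for illustration. Unknown tools are treated as ineligible, so the check fails closed.

```python
from typing import Dict, List

# Hypothetical registry: tool name -> has a faithful dry-run mode.
SUPPORTS_DRY_RUN: Dict[str, bool] = {
    "sql.execute": True,
    "file.write": True,
    "email.send": False,  # assumed to have no faithful preview
}


def ineligible_tools(plan: List[str]) -> List[str]:
    """Return the tools in the plan that cannot be previewed faithfully.

    Tools missing from the registry count as ineligible (fail closed).
    """
    return sorted({t for t in plan if not SUPPORTS_DRY_RUN.get(t, False)})


def require_dry_run_eligible(plan: List[str]) -> None:
    """Raise before any simulation if the plan uses an ineligible tool."""
    blocked = ineligible_tools(plan)
    if blocked:
        raise ValueError(
            f"plan uses tools without dry-run support: {blocked}"
        )
```

A plan that fails this check would fall back to another control, such as per-action human approval, rather than being previewed with an unfaithful diff.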
And the patterns that stand alongside it, or against it —
- complements Approval Queue ★★: Queue agent-proposed actions for asynchronous human review while the agent continues other work.
- complements Human-in-the-Loop ★★: Require explicit human approval at defined points before the agent performs an action.
- complements Mental-Model-In-The-Loop Simulator ·: Run candidate multi-step strategies inside an internal simulator of the environment before committing in the real world; broader than simulate-before-actuate (single action) because it simulates multi-step strategies.
- complements Compensating Action ★★: Pair every irreversible-looking agent action with a compensating action that can undo or counteract it.
- complements Synchronous Execution-Plan Confirmation ★: Agent synchronously emits its full execution plan for user confirmation before any side-effect step, and provides asynchronous operation recordings for post-hoc review.