Mental-Model-In-The-Loop Simulator
also known as Internal Simulator, Strategy-Test-In-Mental-Model
Run candidate multi-step strategies inside an internal simulator of the environment before committing to them in the real world. Where simulate-before-actuate previews a single action, this pattern previews a whole strategy.
This pattern helps complete certain larger patterns —
- specialises Simulate Before Actuate ★: Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier (automated or human) to green-light the simulated outcome before the real command is sent.
Context
A team has an agent that must commit to multi-step strategies with real-world consequences (trading, infrastructure changes, treatment plans). simulate-before-actuate covers per-action preview; this pattern covers per-strategy preview where multiple steps interact.
Problem
Per-action preview misses strategy-level interactions: step 2's safety depends on step 1's outcome, which the per-action check cannot see. A strategy that looks fine action-by-action can be disastrous in aggregate. Without a strategy simulator, the agent commits to multi-step strategies blind to their joint effect.
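The gap can be illustrated with a minimal sketch. Everything here is hypothetical: the `Step` type, the `MIN_CASH` invariant, and the amounts are made up for illustration, and the per-action check is assumed to preview each step against the state observed at planning time. A just-in-time per-action check would catch the second step only after the first had already been committed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    name: str
    cash_delta: float  # change in cash balance if the step executes


MIN_CASH = 0.0  # hypothetical invariant: the balance must never go negative


def per_action_ok(observed_cash: float, step: Step) -> bool:
    """Preview one step against the currently observed state only."""
    return observed_cash + step.cash_delta >= MIN_CASH


def strategy_ok(start_cash: float, steps: List[Step]) -> bool:
    """Roll the state forward so each step is checked against the
    simulated outcome of the steps before it."""
    cash = start_cash
    for step in steps:
        cash += step.cash_delta
        if cash < MIN_CASH:
            return False
    return True


strategy = [Step("buy B", -60.0), Step("hedge C", -60.0)]
start = 100.0

per_action_results = [per_action_ok(start, s) for s in strategy]
joint_result = strategy_ok(start, strategy)

print(per_action_results)  # each step passes in isolation
print(joint_result)        # the sequence as a whole violates the invariant
```

Each step spends less than the starting balance, so both pass individually; only the rolled-forward check sees that the second step starts from the balance the first one leaves behind.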
Forces
- Simulators must model the environment accurately enough to be useful.
- Simulation latency adds to per-strategy decision time.
- Some real-world effects cannot be simulated (external systems, human behavior).
Example
A trading agent considers a 5-step strategy: [sell A, buy B, hedge C, wait T, rebalance]. The simulator runs the strategy against a market state model. In the simulated outcome, 90% of paths end with acceptable P&L, but 10% trigger a margin call at step 4. The strategy is revised before any real trade fires. simulate-before-actuate alone would have approved each individual trade.
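A path-based evaluation of this kind might look like the sketch below. The market model is a toy invented for illustration (Gaussian equity shocks, with a much larger shock during the wait step), and the starting equity, margin floor, and acceptance threshold are assumptions, not figures from the example. It is not meant to reproduce the 90%/10% split, only to show how failures can be attributed to a step.

```python
import random
from collections import Counter

STEPS = ["sell A", "buy B", "hedge C", "wait T", "rebalance"]

START_EQUITY = 100.0   # assumed starting equity
MARGIN_FLOOR = 50.0    # assumed equity level that triggers a margin call
MAX_FAIL_RATE = 0.01   # assumed tolerance for margin-call paths


def simulate_path(rng: random.Random):
    """Return the 1-based index of the step that triggers a margin call,
    or None if the path completes without one."""
    equity = START_EQUITY
    for index, step in enumerate(STEPS, start=1):
        # Toy model: waiting carries far more market exposure than trading.
        sigma = 40.0 if step == "wait T" else 5.0
        equity += rng.gauss(0.0, sigma)
        if equity < MARGIN_FLOOR:
            return index
    return None


def evaluate(n_paths: int = 2000, seed: int = 7) -> dict:
    """Run many simulated paths and tally where margin calls occur."""
    rng = random.Random(seed)
    failures = Counter()
    for _ in range(n_paths):
        failed_at = simulate_path(rng)
        if failed_at is not None:
            failures[failed_at] += 1
    fail_rate = sum(failures.values()) / n_paths
    return {"fail_rate": fail_rate, "failures_by_step": dict(failures)}


report = evaluate()
accepted = report["fail_rate"] <= MAX_FAIL_RATE

print(report)
print("commit" if accepted else "revise strategy before trading")
```

Reporting failures per step, rather than a single pass/fail flag, tells the planner which part of the strategy to revise.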
Solution
Therefore:
Maintain a simulator of the relevant environment slice: a learned world model, a deterministic state machine, or a what-if engine. Before committing to a strategy, run it in the simulator and score the simulated outcome. Reject strategies whose simulated outcomes are unacceptable. Pair with simulate-before-actuate (single-action), dry-run-harness (whole-plan preview), world-model-as-tool, and world-model-graph-memory.
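As a minimal sketch of the simulate, score, and gate loop, assume the simplest of the options above: a deterministic state machine. The replica-count domain, the function names (`rollout`, `evaluate`, `commit`), and the threshold are all hypothetical. The score is taken over the whole trajectory, not just the final state, so a dip in the middle of a strategy is visible.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Verdict:
    accepted: bool
    score: float
    trajectory: List[int]


def rollout(start: int, strategy: Sequence[str],
            transition: Callable[[int, str], int]) -> List[int]:
    """Apply every action to the simulated state, keeping each state."""
    states = [start]
    for action in strategy:
        states.append(transition(states[-1], action))
    return states


def evaluate(start, strategy, transition, score, min_score) -> Verdict:
    trajectory = rollout(start, strategy, transition)
    value = score(trajectory)
    return Verdict(value >= min_score, value, trajectory)


def commit(start, strategy, transition, score, min_score, execute) -> Verdict:
    """Run the real actions only if the simulated strategy scores well enough."""
    verdict = evaluate(start, strategy, transition, score, min_score)
    if verdict.accepted:
        for action in strategy:
            execute(action)
    return verdict


# Toy environment slice: the number of healthy replicas in a service.
def replica_transition(replicas: int, action: str) -> int:
    if action == "drain":
        return replicas - 1
    if action == "add":
        return replicas + 1
    raise ValueError(f"action outside simulator scope: {action}")


def worst_case_replicas(trajectory: List[int]) -> float:
    return float(min(trajectory))


executed: List[str] = []  # stands in for real-world actuation
risky = commit(3, ["drain", "drain", "add", "add"],
               replica_transition, worst_case_replicas, 2.0, executed.append)
safe = commit(3, ["add", "drain", "add", "drain"],
              replica_transition, worst_case_replicas, 2.0, executed.append)

print(risky)     # rejected: capacity dips to 1 replica mid-strategy
print(safe)      # accepted: capacity never drops below 3
print(executed)  # only the accepted strategy reached the executor
```

Both strategies end with the same replica count; they differ only in the intermediate states, which is exactly what a final-state or per-action check would miss.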
What this pattern forbids. No multi-step strategy is committed without simulator scoring. The simulator's scope is declared and limited: it does not claim to simulate what it cannot.
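One way to make the declared scope enforceable, sketched under assumptions: the simulator carries an explicit set of action kinds it models and refuses to score anything else, so an out-of-scope step cannot pass silently. `ScopedSimulator`, `OutOfScopeError`, the action names, and the placeholder scorer are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


class OutOfScopeError(Exception):
    """Raised when a strategy contains actions the simulator does not model."""


@dataclass(frozen=True)
class ScopedSimulator:
    supported_actions: FrozenSet[str]          # the declared scope
    scorer: Callable[[Sequence[str]], float]   # scores in-scope strategies

    def score(self, strategy: Sequence[str]) -> float:
        unsupported = sorted(set(strategy) - self.supported_actions)
        if unsupported:
            # Refuse instead of returning a score the model cannot justify.
            raise OutOfScopeError(f"cannot simulate: {unsupported}")
        return self.scorer(strategy)


sim = ScopedSimulator(
    supported_actions=frozenset({"sell", "buy", "hedge", "wait", "rebalance"}),
    scorer=lambda strategy: float(len(strategy)),  # placeholder score only
)

in_scope_score = sim.score(["sell", "buy"])

try:
    sim.score(["sell", "call broker"])
    refused = False
except OutOfScopeError:
    refused = True

print(in_scope_score, refused)
```

A refusal can then be routed to a human reviewer or another verifier; the point is that an unscored strategy never reaches the commit path by default.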
And the patterns that stand alongside it, or against it —
- complements Dry-Run Harness ★: Simulate planned actions (and their projected side effects) without committing them, surfacing a reviewable diff before any commit.
- complements World Model as Tool ·: Let a planning agent invoke a generative world model as a tool to roll out hypothetical futures before committing to an action, treating the world model as a callable simulator rather than a training target.
- complements World-Model Graph Memory ★: Memory store structured as a typed entity-relation graph used as the agent's authoritative world model for planning, not only for retrieval.
- complements Planner-Executor-Verifier (PEV) ★: Triadic specialization where a planner produces the plan, an executor runs it, and a separate verifier checks each step's effects against the original goal.