II · Planning & Control Flow · Experimental

Mental-Model-In-The-Loop Simulator

also known as Internal Simulator, Strategy-Test-In-Mental-Model

Run candidate multi-step strategies inside an internal simulator of the environment before committing to them in the real world. This is broader than simulate-before-actuate, which previews a single action: here the whole strategy is simulated, including the interactions between its steps.

This pattern helps complete certain larger patterns —

  • specialises Simulate Before Actuate: Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier (automated or human) to green-light the simulated outcome before the real command is sent.

Context

A team has an agent that must commit to multi-step strategies with real-world consequences (trading, infrastructure changes, treatment plans). simulate-before-actuate covers per-action preview; this pattern covers per-strategy preview where multiple steps interact.

Problem

Per-action preview misses strategy-level interactions: step 2's safety depends on step 1's outcome, which the per-action check cannot see. A strategy that looks fine action-by-action can be disastrous in aggregate. Without a strategy simulator, the agent commits to multi-step strategies blind to their joint effect.
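The gap can be illustrated with a minimal sketch. It assumes a per-action check that evaluates each action against the state known at planning time, and a toy environment with a single invariant (cash must not go negative). The names `State`, `apply_action`, `per_action_ok`, and `strategy_ok` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class State:
    cash: float


def apply_action(state: State, amount: float) -> State:
    """Hypothetical action: spend `amount` from the available cash."""
    return State(cash=state.cash - amount)


def is_safe(state: State) -> bool:
    """Toy invariant: cash must never go negative."""
    return state.cash >= 0


def per_action_ok(start: State, actions: Sequence[float]) -> bool:
    # Each action is previewed in isolation against the same starting state.
    return all(is_safe(apply_action(start, a)) for a in actions)


def strategy_ok(start: State, actions: Sequence[float]) -> bool:
    # The state is threaded through the steps, so step 2 sees step 1's outcome.
    state = start
    for a in actions:
        state = apply_action(state, a)
        if not is_safe(state):
            return False
    return True


start = State(cash=100.0)
plan = [60.0, 60.0]
print(per_action_ok(start, plan))  # True: each step looks fine alone
print(strategy_ok(start, plan))    # False: together they overdraw the account
```

Threading the state through the steps is the only difference between the two checks, and it is the part a strategy simulator adds.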

Forces

  • Simulators must model the environment accurately enough to be useful.
  • Simulation latency adds to per-strategy decision time.
  • Some real-world effects cannot be simulated (external systems, human behavior).

Example

A trading agent considers a 5-step strategy: [sell A, buy B, hedge C, wait T, rebalance]. The simulator runs the strategy against a market state model. Simulated outcome: 90% of paths see acceptable P&L, but 10% trigger a margin call at step 4. The strategy is revised before any real trade fires. simulate-before-actuate alone would have approved each individual trade.
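A rollout of this kind could be sketched as a small Monte Carlo loop. This is an illustration under loud assumptions, not the market model from the example: the per-step exposures, the Gaussian shock, the starting equity, and the margin floor are all made-up parameters, and `first_margin_call` and `margin_call_rate` are hypothetical names.

```python
import random
from typing import Optional, Sequence

# Assumed per-step exposure for [sell A, buy B, hedge C, wait T, rebalance].
# Larger values mean equity reacts more strongly to the market shock at that step.
STRATEGY = [0.5, 1.0, 0.8, 2.0, 0.3]


def first_margin_call(
    exposures: Sequence[float],
    rng: random.Random,
    equity: float = 100.0,
    margin_floor: float = 80.0,
    vol: float = 5.0,
) -> Optional[int]:
    """Roll out one simulated path.

    Returns the 1-based step at which equity first drops below the
    margin floor, or None if the path completes without a margin call.
    """
    for step, exposure in enumerate(exposures, start=1):
        equity += exposure * rng.gauss(0.0, vol)  # toy market shock
        if equity < margin_floor:
            return step
    return None


def margin_call_rate(
    exposures: Sequence[float], n_paths: int = 10_000, seed: int = 0
) -> float:
    """Fraction of simulated paths that hit a margin call at any step."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    hits = sum(
        first_margin_call(exposures, rng) is not None for _ in range(n_paths)
    )
    return hits / n_paths


rate = margin_call_rate(STRATEGY)
print(f"simulated margin-call rate: {rate:.1%}")
```

The agent would compare the resulting rate against a tolerance it has set in advance and revise the strategy if the rate is too high. Recording the step at which the margin call occurs, as `first_margin_call` does, shows which part of the strategy to revise.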

Solution

Therefore:

Maintain a simulator of the relevant environment slice: a learned world model, a deterministic state machine, or a what-if engine. Before committing to a strategy, run it in the simulator and score the simulated outcome. Reject strategies that simulate to bad outcomes. Pair with simulate-before-actuate (single-action), dry-run-harness (whole-plan preview), world-model-as-tool, and world-model-graph-memory.

What this pattern forbids. No multi-step strategy commits without simulator scoring; the simulator's scope is declared and limited (it does not claim to simulate what it cannot).
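One way to enforce both rules is a commit gate that refuses to run any strategy the simulator has not scored, and refuses to score steps outside the simulator's declared scope. The sketch below is a hypothetical shape for such a gate: `Simulator`, `OutOfScopeError`, `commit_strategy`, and the `min_score` threshold are assumptions for illustration, not part of the pattern's definition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Sequence


class OutOfScopeError(Exception):
    """Raised when a strategy contains steps the simulator cannot model."""


@dataclass(frozen=True)
class Simulator:
    # Declared scope: the only step kinds this simulator claims to model.
    scope: FrozenSet[str]
    # Scoring function: higher is better (assumed range 0.0 to 1.0).
    score: Callable[[Sequence[str]], float]


def commit_strategy(
    strategy: Sequence[str],
    simulator: Simulator,
    execute: Callable[[str], None],
    min_score: float = 0.9,
) -> bool:
    """Execute the strategy only if it simulates to an acceptable score."""
    unsupported = [step for step in strategy if step not in simulator.scope]
    if unsupported:
        # The simulator must not pretend to cover what it cannot simulate.
        raise OutOfScopeError(f"steps outside simulator scope: {unsupported}")
    if simulator.score(strategy) < min_score:
        return False  # rejected: nothing is executed
    for step in strategy:
        execute(step)
    return True


# Toy usage: a simulator that penalises any strategy containing "wait".
toy = Simulator(
    scope=frozenset({"sell", "buy", "hedge", "wait", "rebalance"}),
    score=lambda s: 0.5 if "wait" in s else 1.0,
)
executed = []
print(commit_strategy(["sell", "buy"], toy, executed.append))          # True
print(commit_strategy(["sell", "wait", "buy"], toy, executed.append))  # False
print(executed)  # ['sell', 'buy']
```

Raising on out-of-scope steps, rather than scoring them optimistically, keeps the simulator's limits visible to whoever reviews the rejection.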

And the patterns that stand alongside it, or against it —

  • complements Dry-Run Harness: Simulate planned actions (and their projected side effects) without committing them, surfacing a reviewable diff before any commit.
  • complements World Model as Tool: Let a planning agent invoke a generative world model as a tool to roll out hypothetical futures before committing to an action, treating the world model as a callable simulator rather than a training target.
  • complements World-Model Graph Memory: Memory store structured as a typed entity-relation graph used as the agent's authoritative world model for planning, not only for retrieval.
  • complements Planner-Executor-Verifier (PEV): Triadic specialization where a planner produces the plan, an executor runs it, and a separate verifier checks each step's effects against the original goal.
