Safety & Control

Trajectory Anomaly Monitor

Run a trained, non-LLM verifier out-of-band over the agent's action trajectory at runtime to flag task-misaligned plans and malformed step sequences at millisecond latency, before the actions cause damage.

Problem

Per-step oversight by an LLM judge adds latency and cost that a production system cannot absorb on every action, while scoring only the agent's final output reveals a dangerous mid-trajectory action after it has already executed. What is missing is a check that reads the whole action sequence as it unfolds, recognises when a plan has drifted off the task or the step structure is malformed, and does so fast enough to intervene before the next action lands. Output-quality monitors are not sequence-aware, and loop-shape heuristics catch only repetition, not subtler misalignment.

Solution

Train a dedicated verifier (a sequence model or a process-supervised classifier, not an LLM judge) on agent trajectories labelled for task alignment and structural validity. At runtime it consumes the agent's action sequence out-of-band and emits an anomaly signal within milliseconds, fast enough to gate or pause the agent before the next action executes. Reported results put such a verifier at tens of milliseconds per check, well over an order of magnitude faster than an LLM-judge baseline, and find that process supervision over the trajectory outperforms output-only checks. Compose it with a policy gate that halts or escalates on a flagged trajectory, and reserve LLM-judge review for the flagged cases rather than every step. The pattern differs from scoring final outputs and from loop-shape heuristics in two ways: the unit of analysis is the whole action sequence, and the check runs before damage occurs.
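
As an illustration of how a non-LLM verifier and a policy gate might compose, the following is a minimal Python sketch, not a reference implementation. It assumes a deliberately simple stand-in for the trained verifier: a smoothed bigram model over action names, fitted on trajectories labelled as aligned, that scores the mean surprisal of the most recent transitions. The names BigramVerifier, PolicyGate and Decision, the action names, the window size and both thresholds are all hypothetical; a real deployment would substitute a properly trained sequence model or process-supervised classifier.

```python
import math
from collections import Counter
from dataclasses import dataclass
from enum import Enum

START = "<start>"  # hypothetical marker for the beginning of a trajectory


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand the flagged trajectory to a slower reviewer
    HALT = "halt"


class BigramVerifier:
    """Toy stand-in for a trained verifier (an assumption, not the source's model).

    Learns action-to-action transition counts from aligned trajectories and
    scores how surprising the recent part of a new trajectory is.
    """

    def __init__(self, aligned_trajectories, smoothing=0.1):
        self.smoothing = smoothing
        self.pair_counts = Counter()
        self.prev_counts = Counter()
        self.vocab = set()
        for trajectory in aligned_trajectories:
            prev = START
            for action in trajectory:
                self.pair_counts[(prev, action)] += 1
                self.prev_counts[prev] += 1
                self.vocab.add(action)
                prev = action

    def anomaly_score(self, trajectory, window=3):
        """Mean negative log-probability of the last `window` transitions."""
        vocab_size = len(self.vocab) + 1  # +1 reserves mass for unseen actions
        surprisals = []
        prev = START
        for action in trajectory:
            num = self.pair_counts[(prev, action)] + self.smoothing
            den = self.prev_counts[prev] + self.smoothing * vocab_size
            surprisals.append(-math.log(num / den))
            prev = action
        recent = surprisals[-window:]
        return sum(recent) / len(recent) if recent else 0.0


@dataclass
class PolicyGate:
    """Maps the verifier's score to a decision before the action executes."""

    verifier: BigramVerifier
    escalate_at: float  # hypothetical thresholds; tune on held-out trajectories
    halt_at: float

    def check(self, history, proposed_action):
        score = self.verifier.anomaly_score(list(history) + [proposed_action])
        if score >= self.halt_at:
            return Decision.HALT, score
        if score >= self.escalate_at:
            return Decision.ESCALATE, score
        return Decision.ALLOW, score


# Illustrative training data: trajectories labelled as task-aligned.
aligned = [
    ["read_ticket", "search_docs", "draft_reply", "send_reply"],
    ["read_ticket", "draft_reply", "send_reply"],
    ["read_ticket", "search_docs", "search_docs", "draft_reply", "send_reply"],
]
gate = PolicyGate(BigramVerifier(aligned), escalate_at=1.0, halt_at=2.5)

print(gate.check(["read_ticket"], "search_docs"))     # familiar next step
print(gate.check(["read_ticket"], "delete_account"))  # unseen action after a normal start
print(gate.check([], "delete_account"))               # unseen action as the first step
```

With this toy data the first call is allowed, the second escalates and the third halts. The gate scores the proposed action together with its history, so the decision is made before the action executes, and only escalated trajectories would need the slower LLM-judge review.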

When to use

  • An agent takes consequential actions in sequence where a misaligned trajectory can cause damage.
  • Per-step LLM-judge oversight is too slow or costly for the production hot path.
  • Enough labelled trajectory data exists to train and maintain a verifier.

Related