XIV · Anti-Patterns · Anti-pattern

Replay Divergence

also known as Replay-Time Output Drift, Non-Deterministic Event Replay

Anti-pattern: treat an append-only event log whose consumers are LLMs as deterministically replayable, so replaying it under a changed model or prompt reconstructs different downstream events than the original run.

Context

A system records agent activity as an append-only event log and treats replay as a first-class capability — to recover state after a crash, to re-derive an audit trail, to branch a past run for debugging, or to reprocess history under an upgraded model. Event sourcing's contract is that replaying the log reconstructs the same state, and the team relies on that determinism. Some consumers of the log are LLM calls.

Problem

An LLM call is not a pure function of its inputs: the same event replayed under a newer model version, a changed prompt template, or even nominally identical sampling settings can emit a different downstream event than the first run produced. When the replayed output feeds the next step, the divergence compounds — a tool is called with arguments the original never generated, a branch is taken that never happened, and the reconstructed state no longer matches what actually occurred. Nothing errors, because each replayed call is individually well-formed, so the log silently stops being a faithful record. Recovery then restores a state the system was never in, an audit replay yields a different decision than the customer received, and a debugging branch diverges from the very trace it was meant to reproduce.

Forces

  • Event sourcing and durable execution assume replay is deterministic, but an LLM consumer breaks that assumption the moment the model or prompt changes.
  • Replaying to re-derive under a new model is sometimes the goal, so journaling the original output defeats that purpose and cannot be the only answer.
  • Each replayed call is individually valid, so the divergence raises no error and surfaces only as corrupted downstream state.
  • Pinning the model and every sampling input keeps replay closer to faithful but freezes the system on an old model, and journaling every output instead grows the journal without bound.

Example

A support-automation team event-sources every agent run so they can replay logs to recover state after a crash. After upgrading the underlying model, an outage forces a replay of the day's log, and where the old model had routed a refund to manual review the new model approves it outright. The replay reconstructs a state in which refunds were issued that never were, each replayed step looks valid, and nobody notices until the books fail to reconcile.
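The scenario above can be reduced to a toy sketch. Everything here is hypothetical: `model_v1` and `model_v2` are plain functions standing in for two model versions, and the 100-unit review threshold is invented for illustration. The point is only that the replay completes without any error while reconstructing a different state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str
    payload: dict


# Hypothetical stand-ins for two model versions; a real system would call an LLM.
def model_v1(amount: float) -> str:
    return "manual_review" if amount > 100 else "approve"


def model_v2(amount: float) -> str:
    return "approve"  # assumed behaviour: the upgraded model approves outright


def run(log, model):
    """Derive downstream state by invoking `model` on every input event."""
    state = {"refunds_issued": 0.0, "pending_review": 0}
    for event in log:
        decision = model(event.payload["amount"])
        if decision == "approve":
            state["refunds_issued"] += event.payload["amount"]
        else:
            state["pending_review"] += 1
    return state


log = [
    Event("refund_requested", {"amount": 40.0}),
    Event("refund_requested", {"amount": 250.0}),
]

original = run(log, model_v1)  # what actually happened
replayed = run(log, model_v2)  # what the post-upgrade replay reconstructs

print(original)  # {'refunds_issued': 40.0, 'pending_review': 1}
print(replayed)  # {'refunds_issued': 290.0, 'pending_review': 0}
```

Both runs consume the same log and neither raises, yet the replayed state records a 250.0 refund that was never issued.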

Solution

Therefore:

Separate the two reasons to replay and handle each explicitly. For faithful recovery and audit, record each non-deterministic step's output on first execution and replay the recorded value instead of re-invoking the model, and stamp every event with the model version and prompt hash that produced it. For deliberate re-derivation under a new model, treat the replay as a fresh run rather than a reconstruction: diff its events against the original, surface every divergence, and gate any changed decision behind review. Measure how reproducible the agent actually is and require the strictest determinism tier for events that drive regulated or irreversible actions. Never let a replay whose model or prompt has changed overwrite recovered state as if it were the original.
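The recovery half of this advice can be sketched as a small journal wrapper. This is a minimal illustration, not a reference implementation: `Journal`, `JournalEntry`, the `step_id` keying, and the `"model-2024-01"` version string are all assumptions, and `fake_model` stands in for a real LLM call.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict


def prompt_hash(prompt: str) -> str:
    """Stable fingerprint of the exact prompt text that produced an output."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


@dataclass
class JournalEntry:
    step_id: str
    output: str
    model_version: str  # stamp: which model produced this output
    prompt_hash: str    # stamp: which prompt produced this output


@dataclass
class Journal:
    entries: Dict[str, JournalEntry] = field(default_factory=dict)

    def call(self, step_id: str, prompt: str, model_version: str,
             invoke: Callable[[str], str], replaying: bool = False) -> str:
        entry = self.entries.get(step_id)
        if entry is not None:
            # Recorded value exists: return it and do NOT re-invoke the model.
            return entry.output
        if replaying:
            # Recovery must never fall through to a live model call.
            raise LookupError(f"no journaled output for step {step_id!r}")
        output = invoke(prompt)
        self.entries[step_id] = JournalEntry(
            step_id, output, model_version, prompt_hash(prompt)
        )
        return output


calls = []


def fake_model(prompt: str) -> str:
    # Hypothetical model stub that returns a different value on every call.
    calls.append(prompt)
    return f"decision-{len(calls)}"


journal = Journal()
first = journal.call("step-1", "Route this refund", "model-2024-01", fake_model)
again = journal.call("step-1", "Route this refund", "model-2024-01", fake_model,
                     replaying=True)
print(first, again, len(calls))  # decision-1 decision-1 1
```

Raising on a missing entry during replay is a deliberate choice in this sketch: a silent fallback to the live model would reintroduce exactly the divergence the journal exists to prevent.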

What this pattern forbids. An LLM-consumed event log must not be assumed to replay deterministically. Replay for recovery must not re-invoke the model; it must use journaled outputs. A replay whose model or prompt has changed must not overwrite reconstructed state as if it were the original run.
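One way to enforce the last prohibition, and to surface divergences during deliberate re-derivation, is to compare stamped events from both runs. The sketch below assumes each event carries a `step_id`, a `decision`, and the model-version and prompt-hash stamps; the names `StampedEvent`, `diff_runs`, and `may_overwrite_recovered_state` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StampedEvent:
    step_id: str
    decision: str
    model_version: str
    prompt_hash: str


Divergence = Tuple[str, Optional[str], Optional[str]]


def diff_runs(original: List[StampedEvent],
              rederived: List[StampedEvent]) -> List[Divergence]:
    """Return (step_id, original_decision, new_decision) for every differing step."""
    new_by_step = {e.step_id: e for e in rederived}
    divergences: List[Divergence] = []
    for old in original:
        new = new_by_step.get(old.step_id)
        new_decision = new.decision if new is not None else None
        if new_decision != old.decision:
            divergences.append((old.step_id, old.decision, new_decision))
    # Steps that exist only in the re-derived run are divergences too.
    old_ids = {e.step_id for e in original}
    for new in rederived:
        if new.step_id not in old_ids:
            divergences.append((new.step_id, None, new.decision))
    return divergences


def may_overwrite_recovered_state(original: List[StampedEvent],
                                  rederived: List[StampedEvent]) -> bool:
    """Permit overwrite only when stamps and decisions match step for step."""
    if diff_runs(original, rederived):
        return False
    old_stamps = {e.step_id: (e.model_version, e.prompt_hash) for e in original}
    return all(
        old_stamps.get(e.step_id) == (e.model_version, e.prompt_hash)
        for e in rederived
    )


original_run = [StampedEvent("s1", "manual_review", "m-old", "abc")]
upgraded_run = [StampedEvent("s1", "approve", "m-new", "abc")]

print(diff_runs(original_run, upgraded_run))
# [('s1', 'manual_review', 'approve')]
print(may_overwrite_recovered_state(original_run, upgraded_run))  # False
```

The guard refuses the overwrite even when the decisions happen to agree but the stamps differ, because agreement on one run says nothing about whether the changed model would have agreed on the original run. The divergence list is what would be handed to a reviewer.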

The patterns that counter or replace it —

  • complements · Journaled LLM Call · Record the output of every non-deterministic step on first execution and replay that recorded value during crash-recovery instead of re-invoking the model.
  • complements · Determinism-Tiered Replay Gate · Classify an agent into a reproducibility tier by re-running identical inputs, require the strictest decision-determinism tier for regulated decisions, and gate deployment and validation-sample size on the measured tier.
  • complements · Replay / Time-Travel ★★ · Re-run a past agent trace from any step with modified inputs/prompts/tools to debug or branch.
  • complements · Confident Inconsistency · Anti-pattern: in a regulated workflow the same query produces materially different outputs at different times, each looking correct and passing review, so the variance stays invisible unless outputs are deliberately re-run and compared across time.
  • complements · Stochastic-Deterministic Boundary (SDB) · Formalize the seam between an LLM proposal and a system action as a four-part contract (proposer, verifier, commit step, reject signal) so the contract itself, not the agent's good intent, gates side-effects.
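The measurement step behind the Determinism-Tiered Replay Gate entry above might look roughly like the following. The tier names and the 0.95 threshold are illustrative assumptions, not values taken from that pattern, and `stub_model` is a hypothetical stand-in for a real agent call.

```python
from collections import Counter
from typing import Callable


def agreement_rate(invoke: Callable[[str], str], prompt: str, runs: int = 20) -> float:
    """Fraction of runs that produced the most common output for identical input."""
    outputs = [invoke(prompt) for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs


def classify_tier(rate: float) -> str:
    # Thresholds are illustrative assumptions; a real gate would set its own.
    if rate == 1.0:
        return "strict"
    if rate >= 0.95:
        return "high"
    return "low"


def allowed_for(regulated_or_irreversible: bool, tier: str) -> bool:
    """Regulated or irreversible actions require the strictest tier."""
    return tier == "strict" or not regulated_or_irreversible


def stub_model(prompt: str) -> str:
    # Hypothetical deterministic stub; a real agent call would go here.
    return "manual_review"


rate = agreement_rate(stub_model, "Route this refund")
tier = classify_tier(rate)
print(rate, tier, allowed_for(True, tier))  # 1.0 strict True
```

Full agreement over a finite sample is evidence of reproducibility, not proof of it, which is one reason the sample size itself is gated on the measured tier.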
