Replay / Time-Travel
also known as Trace Replay, Run Branching, Fork from Step N
Re-run a past agent trace from any step with modified inputs/prompts/tools to debug or branch.
Context
A team supports an agent in production where users occasionally hit weird, hard-to-reproduce behaviour: a strange reply, an unexpected tool call, a wrong answer on an input that worked yesterday. Engineers want to load the exact past run, jump to a specific step, swap in a different prompt or model, and see whether the alternative would have done better. The system already captures per-step inputs, outputs, prompts, model identifiers, and tool calls in a trace store.
Problem
Agent runs depend on non-deterministic model outputs, accumulated conversation state, and external tool results that may differ on the next call. Trying to reproduce a three-day-old bug locally usually fails because too much has changed, so engineers end up re-running the user's prompt and hoping the model behaves the same way. The team is forced to choose between spending hours on guess-and-check reproduction and shrugging off intermittent bugs that they cannot trigger deterministically.
Forces
- Captured state must be complete enough to re-run.
- Storage of full traces is expensive.
- Modified replays diverge from the original, so comparison logic is non-trivial.
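The last force can be made concrete with a minimal sketch of comparison logic. It assumes each run is reduced to a list of per-step output strings; the function names first_divergence and diff_step are hypothetical, and exact string equality is only the crudest possible measure, since two model outputs can differ in wording while meaning the same thing.

```python
import difflib
from typing import Optional, Sequence


def first_divergence(original: Sequence[str], branch: Sequence[str]) -> Optional[int]:
    """Return the index of the first step whose output differs, or None if the runs match."""
    for i, (a, b) in enumerate(zip(original, branch)):
        if a != b:
            return i
    # One run is a strict prefix of the other: they diverge where the shorter one ends.
    if len(original) != len(branch):
        return min(len(original), len(branch))
    return None


def diff_step(original: str, branch: str) -> str:
    """Render a unified diff of one step's output, for human review."""
    lines = difflib.unified_diff(
        original.splitlines(),
        branch.splitlines(),
        fromfile="original",
        tofile="branch",
        lineterm="",
    )
    return "\n".join(lines)
```

A real comparison would likely need something softer than equality, such as a rubric or a judged score, but locating the first differing step is a useful starting point for review.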
Example
A support agent gave a user a strange reply three days ago, and the team cannot reproduce it locally because too much state has changed. They open the trace store, jump to step 7, swap in the new system prompt, and re-run forward. The new prompt fixes the issue; the old one reproduces it exactly. They commit the fix with the trace ID in the changelog. Replay turns 'this happened once' bugs into deterministic tests.
Solution
Therefore:
Capture per-step inputs, outputs, prompts, model id, and tool calls. Provide a replay tool that loads a trace at step N and re-runs forward with optional modifications (a different model, a different prompt, a different tool result). Store each branch alongside the original trace for comparison.
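The solution above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the Step and Trace types, the replay_from function, and the run_model callable are all hypothetical names, each step's output is assumed to feed the next step's input, and the branch is assumed to run for the same number of steps as the original.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Step:
    index: int
    prompt: str
    model_id: str
    input_text: str
    output_text: str


@dataclass
class Trace:
    trace_id: str
    steps: List[Step]
    parent_id: Optional[str] = None   # set on branches, for later comparison
    forked_at: Optional[int] = None


# Hypothetical signature for whatever actually calls the model:
# (model_id, prompt, input_text) -> output_text
ModelFn = Callable[[str, str, str], str]


def replay_from(
    trace: Trace,
    n: int,
    run_model: ModelFn,
    prompt: Optional[str] = None,
    model_id: Optional[str] = None,
) -> Trace:
    """Fork `trace` at step n and re-run forward with optional overrides."""
    if not 0 <= n < len(trace.steps):
        raise IndexError(f"step {n} is outside the trace")

    # Steps before n are copied from captured state; no live calls are made for them.
    branch_steps = list(trace.steps[:n])

    # Step n starts from the input that was captured in the original run.
    carried = trace.steps[n].input_text
    for original in trace.steps[n:]:
        p = prompt if prompt is not None else original.prompt
        m = model_id if model_id is not None else original.model_id
        out = run_model(m, p, carried)  # live call, only from step n forward
        branch_steps.append(Step(original.index, p, m, carried, out))
        carried = out  # assumption: each output becomes the next step's input

    return Trace(
        trace_id=f"{trace.trace_id}/fork@{n}",
        steps=branch_steps,
        parent_id=trace.trace_id,
        forked_at=n,
    )
```

The original trace is never mutated, so the branch and its parent can be stored side by side and compared step by step.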
What this pattern forbids. Replay must not re-execute steps before N against live models or tools: those steps are read from captured state, and live model and tool calls happen only on the modified branch from step N forward.
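One way to enforce this rule for tool calls is a thin wrapper that decides, per step, whether to serve a recorded result or go live. The sketch below is an assumption-laden illustration: the ReplayToolbox class, its (step, tool name) keying scheme, and the overrides argument are hypothetical and not taken from any particular framework.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Key = Tuple[int, str]  # (step index, tool name); a hypothetical keying scheme


class ReplayToolbox:
    """Serve recorded tool results before fork_step; call live tools from fork_step on."""

    def __init__(
        self,
        recorded: Dict[Key, Any],
        live_tools: Dict[str, Callable[..., Any]],
        fork_step: int,
        overrides: Optional[Dict[Key, Any]] = None,
    ) -> None:
        overrides = overrides or {}
        # A modified tool result only makes sense on the branch, never in the captured prefix.
        if any(step < fork_step for step, _ in overrides):
            raise ValueError("overrides must target steps at or after fork_step")
        self.recorded = recorded
        self.live_tools = live_tools
        self.fork_step = fork_step
        self.overrides = overrides
        self.live_calls: List[Key] = []  # audit log of calls that actually went live

    def call(self, step: int, tool: str, **kwargs: Any) -> Any:
        key = (step, tool)
        if key in self.overrides:
            return self.overrides[key]  # injected "different tool result"
        if step < self.fork_step:
            if key not in self.recorded:
                # Fail loudly instead of silently falling back to a live call.
                raise LookupError(f"no captured result for {key}")
            return self.recorded[key]
        self.live_calls.append(key)
        return self.live_tools[tool](**kwargs)
```

Raising on a missing recording, rather than quietly calling the live tool, is a deliberate choice here: it surfaces gaps in the captured state instead of hiding them behind a replay that is no longer faithful.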
The smaller patterns that complete this one —
- uses Decision Log ★★: Persist the agent's reasoning trace alongside its actions so post-hoc review can explain why.
And the patterns that stand alongside it, or against it —
- complements Lineage Tracking ★★: Track which prompt version, model version, and data sources produced each agent output.
- complements Durable Workflow Snapshot ★: Capture workflow execution state as a snapshot in a pluggable storage provider so a paused run can resume across deployments, process restarts, and host crashes.