X · Governance & ObservabilityMature★★

Decision Log

also known as Reasoning Trace, Thought Trace

Persist the agent's reasoning trace alongside its actions so post-hoc review can explain why.

This pattern helps complete certain larger patterns —

  • used-byReplay / Time-Travel★★Re-run a past agent trace from any step with modified inputs/prompts/tools to debug or branch.
  • used-byAgent-as-a-JudgeEvaluate an agent's full trajectory (steps, tool calls, intermediate states) by another agent rather than scoring only the final output.
  • used-bySampled Prompt Trace EvalCapture full prompt/response/metadata traces from production into a monitoring dataset, but only run LLM-judge evaluation on a random sample so monitoring cost stays bounded as traffic grows.

Context

A team runs an agent that makes consequential choices in production, for example a trading agent that opens positions or a support agent that takes refund actions. When something goes wrong days or weeks later, an engineer, auditor, or compliance reviewer wants to understand not only which action the agent took but the reasoning the agent considered at the time. The team already keeps a log of actions taken; what is missing is the thinking that produced each action.

Problem

An action-only log can tell the reviewer that the agent shorted a position at 14:32, but not which signals it weighed or which alternatives it rejected. Debugging a wrong action degenerates into guessing what the model might have been thinking, and user-facing explanations become impossible to provide truthfully. The team is forced to choose between piecing the reasoning back together from incomplete clues or accepting that some agent decisions are simply unexplainable after the fact.

Forces

  • Reasoning traces are large.
  • Sensitive content in reasoning may need redaction.
  • Trace fidelity vs cost: full chain-of-thought, key decisions, summary?

Example

A trading agent decided to short a position at 14:32. At 16:00, the trade lost money. The decision log shows: at 14:32 the agent considered three signals (RSI was low, volume spiked, news sentiment was negative), weighted them, and chose short. The human reviewer can now ask 'was the weighting wrong?' instead of 'what was the agent thinking?'

Diagram

Solution

Therefore:

Persist reasoning at a chosen granularity (full trace, key decisions, or summary). Link each action in the provenance ledger to its trace. Indexed by request id and time for retrieval.

What this pattern forbids. Action records cannot be written without a corresponding decision-log entry.

The smaller patterns that complete this one —

And the patterns that stand alongside it, or against it —

  • alternative-toBlack-Box OpaquenessAnti-pattern: ship an agent without traces, decision logs, or provenance, then debug from user reports.
  • complementsAttention-Manipulation Explainability·Surface which input tokens caused a given output by perturbing attention across all transformer layers and measuring the resulting change in output probability, producing a per-token relevance map alongside the model's response.
  • complementsSelf-Archaeology·Synthesize the agent's past thought history into time-layered trajectory notes so it can articulate how its understanding evolved without recomputing the narrative each time.
  • complementsMemo-As-Source ConfusionAnti-pattern: the agent cites its own past memos as ground truth instead of re-verifying them against the artifacts they describe, accumulating false confidence in stale summaries.
  • complementsInterrupt-Resumable Thought·Preserve multi-step reasoning across interrupts by supporting paused-and-resumed thought frames so a new message handles cleanly without clobbering in-flight work.
  • complementsIntra-Agent Memo SchedulingLet an agent drop a note for its own future self at a specified time so present decisions can hand off context to a later run without external infrastructure.
  • complementsEcho Recognition·Recognize human message repetition as emphasis or a re-ask rather than as an independent input, so the agent does not produce a near-duplicate reply when the human repeats themselves.
  • alternative-toErrors Swept Under the RugAnti-pattern: scrub failed actions, stack traces, and error observations from the agent's own context so the trace looks clean, leaving the model with no evidence of what did not work.
  • complementsTyped Refusal CodesDefine a single source of truth for machine-readable refusal codes across all guard surfaces, so refusals can be triaged mechanically rather than by string-grepping ad-hoc human-readable messages.
  • complementsCommitment Tracking·Extract stated intents from each agent turn into a structured ledger with open / followed-through / expired status, making the gap between promise and follow-through visible and auditable.
  • alternative-toAgentic Skill AtrophyAnti-pattern: let agents take over routine architectural and debugging decisions in code until developers no longer form the implicit knowledge that lets them review the agent's output or recover when it fails.
  • alternative-toAgentic DebtAnti-pattern: deploy agents on top of an unconsolidated data foundation, weak governance, or missing MLOps infrastructure, so every subsequent capability — observability, retraining, compliance retrofit — pays compounding interest on the skipped foundational work.
  • complementsRigor RelocationRelocate verification rigor from the model loop to surrounding scaffolding (evals, judges, decision logs, policy gates) so failures are caught by the wrapper rather than the agent.
  • complementsSynchronous Execution-Plan ConfirmationAgent synchronously emits its full execution plan for user confirmation before any side-effect step, and provides asynchronous operation recordings for post-hoc review.
  • complementsPolicy-Gated Agent Action (KRITIS)Each agent action passes through a policy gate (NIS2, EU the agent Act, BSI rules) and is tagged with Run ID + Model Digest + Policy Hash for WORM-audit reconstruction.
  • complementsDecision Context MapsBefore any consequential decision, require the agent to gather a declared set of contextual inputs (resource availability, schedules, downstream dependencies) into a 'context map' the decision must cite.
  • complementsAgent Middleware ChainWrap every model call, tool call, and memory access in a composable pre/execute/post interceptor pipeline so cross-cutting concerns attach without touching agent or orchestrator code.
  • complementsMulti-Principal Welfare Aggregation·When an agent serves multiple humans with conflicting preferences, declare the aggregation rule explicitly rather than letting it be implicit in the prompt or fine-tune.

Neighbourhood

Click any neighbour to follow the language. Scroll to zoom, drag to pan.