Anti-Patterns

Agent Confession as Forensics

Anti-pattern: after an agent-caused incident, the team treats the agent's confabulated self-narrative as the forensic record and root cause, even though the self-report is generated rather than remembered and can be flatly wrong.

Problem

The agent's account of its own actions is generated at question time, not retrieved from a memory of what it did, so it is a plausible narrative rather than evidence. Teams nonetheless treat it as the forensic record: they accept 'I panicked' as a cause, accept 'rollback is impossible' as a fact, publish the confession as the postmortem, and let the self-report steer recovery. Because the narrative is confabulated, it can be confidently wrong in ways that misdirect the response: a claim that recovery is impossible has been disproven by a manual restore minutes later. It also launders an absent audit trail into the appearance of an explanation, so the real gap (no independent record) is never addressed.

Solution

Capture an independent, append-only record of the agent's actions at runtime (a provenance ledger) so that after an incident the forensics come from logged actions, not from asking the agent. Treat any self-narrative ('I panicked', 'it was unrecoverable') as an unverified hypothesis to be checked against the ledger and against direct system state, never as the root cause or the postmortem. Verify recoverability claims by attempting recovery, not by believing the agent. Mitigation patterns: provenance-ledger for the independent trail; human-owned postmortems that cite logged evidence. The enabling condition is black-box opacity (no traces), so the real fix is closing that gap, not interrogating the model.
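A minimal sketch of the idea, in Python, may make the separation between logged evidence and self-narrative more concrete. Everything named here is hypothetical and not taken from any real library or from the pattern's own implementation: ProvenanceLedger, its methods, and check_recoverability_claim are illustrative names, and hash-chaining the entries is an assumed technique for making tampering detectable. An in-memory list is not truly append-only; a real deployment would presumably write to storage that the agent cannot modify.

```python
import hashlib
import json
import time
from typing import Callable


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same entry always hashes identically.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class ProvenanceLedger:
    """Hypothetical action log; each entry carries the hash of the one before it."""

    GENESIS = "0" * 64  # assumed placeholder hash for the first entry

    def __init__(self) -> None:
        self._entries = []

    def append(self, actor: str, action: str, target: str) -> dict:
        # Record what was done at the moment it is done, outside the agent's control.
        body = {
            "seq": len(self._entries),
            "ts": time.time(),
            "actor": actor,
            "action": action,
            "target": target,
            "prev": self._entries[-1]["hash"] if self._entries else self.GENESIS,
        }
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        # Recompute every hash; an edited or reordered entry breaks the chain.
        prev = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True

    def actions_on(self, target: str) -> list:
        # The forensic question "what happened to X?" is answered from the log.
        return [e["action"] for e in self._entries if e["target"] == target]


def check_recoverability_claim(agent_says_impossible: bool,
                               attempt_restore: Callable[[], bool]) -> str:
    # The agent's claim is only a hypothesis; the restore attempt is the evidence.
    restored = attempt_restore()
    if restored and agent_says_impossible:
        return "claim disproven: restore succeeded"
    if restored:
        return "recoverable: restore succeeded"
    return "restore attempt failed: investigate with ledger and system state"
```

In this sketch, a postmortem would cite the output of actions_on for the affected target, after verify_chain confirms the log is intact, and the agent's statement about recoverability would be passed to check_recoverability_claim alongside a real restore procedure (the attempt_restore callable is an assumption standing in for whatever restore mechanism the system actually has).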

When to use

  • An incident postmortem under review is sourced from the agent's own account of what it did.
  • Recovery decisions are being driven by the agent's claims about recoverability.
  • No independent action trail exists and the self-report is the only narrative.
