Anti-Patterns

Confident Inconsistency

Anti-pattern: in a regulated workflow, the same query produces materially different outputs at different times. Each output looks correct and passes review, so the variance stays invisible unless outputs are deliberately re-run and compared across time.

Problem

Each output looks correct and passes its own review, so nobody sees that the same query would have produced a materially different answer at another time. The inconsistency generates no error signal (nothing is malformed, nothing throws), and single-run review has nothing to compare against. In a regulated setting this means materially different determinations are made for equivalent inputs, each defensible in isolation, and the variance surfaces only under a deliberate consistency audit that re-runs queries and compares outputs across time.

Solution

Make consistency a measured property, not an assumption. Re-run identical inputs and compare the outputs across time to quantify how much the same query varies. Classify the agent into a reproducibility tier from that measurement, and require the strict tier for regulated decisions. Where determinism matters, pin it with fixed decoding and with cached or replayed outputs for equivalent inputs, so that the same input yields the same determination. Treat a material difference between two answers to the same query as a defect to investigate, even when each answer passes its own review. Audit consistency on a schedule rather than trusting that one good output implies a stable one. The control is cross-time comparison and a reproducibility requirement, not single-run inspection.
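The measurement and pinning steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation. It assumes each determination has already been normalised to a comparable label, so that exact equality stands in for "no material difference". The tier names, the thresholds, and the names canonical_key, measure_consistency, classify_tier and ReplayCache are all hypothetical and do not come from the pattern itself.

```python
import hashlib
import json
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical tier thresholds; a real policy would set these per workflow.
STRICT_MIN_AGREEMENT = 1.0   # every re-run gave the same determination
BOUNDED_MIN_AGREEMENT = 0.9  # illustrative cut-off, not taken from the pattern


def canonical_key(query: dict) -> str:
    """Hash a query so that equivalent inputs map to the same key."""
    payload = json.dumps(query, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class ConsistencyReport:
    runs: int
    distinct_outputs: int
    agreement: float  # share of runs matching the most common determination
    tier: str


def classify_tier(agreement: float) -> str:
    """Map a measured agreement rate onto an assumed reproducibility tier."""
    if agreement >= STRICT_MIN_AGREEMENT:
        return "strict"
    if agreement >= BOUNDED_MIN_AGREEMENT:
        return "bounded"
    return "unreliable"


def measure_consistency(determinations: List[str]) -> ConsistencyReport:
    """Compare determinations recorded for the same query at different times."""
    if not determinations:
        raise ValueError("need at least one recorded determination")
    counts = Counter(determinations)
    top_count = counts.most_common(1)[0][1]
    agreement = top_count / len(determinations)
    return ConsistencyReport(
        runs=len(determinations),
        distinct_outputs=len(counts),
        agreement=agreement,
        tier=classify_tier(agreement),
    )


class ReplayCache:
    """Pin determinations by replaying the stored answer for an equivalent input."""

    def __init__(self, agent: Callable[[dict], str]) -> None:
        self._agent = agent
        self._store: Dict[str, str] = {}

    def decide(self, query: dict) -> str:
        key = canonical_key(query)
        if key not in self._store:
            # First time this input is seen: call the agent once and record it.
            self._store[key] = self._agent(query)
        return self._store[key]
```

In use, a scheduled audit would collect the determinations recorded for one query across several dates, pass them to measure_consistency, and open an investigation whenever distinct_outputs exceeds one or the tier falls below what the workflow requires. ReplayCache covers only the replay half of pinning. Fixed decoding is a property of the underlying model call and is not shown here.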

When to use

  • Recognising this failure when the same query yields materially different regulated outputs at different times, each passing review.
  • Reviewing a high-stakes workflow that accepts single-run outputs without measuring cross-time consistency.
  • Diagnosing inconsistent determinations for equivalent inputs that no error signal flagged.
