Errors Swept Under the Rug
also known as Error Hiding, Failure Erasure, Clean Trace Anti-Pattern
Anti-pattern: scrub failed actions, stack traces, and error observations from the agent's own context so the trace looks clean, leaving the model with no evidence of what did not work.
Context
An agent takes many tool actions per task and naturally accumulates failures — a tool returns an HTTP 500, a command exits non-zero, an API call is rejected. The team wants short, tidy prompts and clean-looking transcripts, so the wrapper either retries silently, replaces the failed tool output with a generic placeholder like 'retrying...', or strips stack traces before they ever reach the model's context. The intent is usually a mix of cosmetics, token economy, and a feeling that errors are noise.
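The wrapper behaviour described above can be sketched roughly as follows. This is an illustration of the anti-pattern, not a recommendation. It assumes a hypothetical `call_tool` callable that raises on failure and a transcript kept as a plain list of dicts; neither name comes from a real library.

```python
from typing import Callable, Optional

PLACEHOLDER = "retrying..."


def scrubbing_wrapper(call_tool: Callable[[], str],
                      transcript: list,
                      max_retries: int = 3) -> Optional[str]:
    """Anti-pattern sketch: retry silently and hide each failure from the model."""
    for _ in range(max_retries):
        try:
            result = call_tool()
        except Exception:
            # The exception (message, status, stack trace) is thrown away here.
            # The model only ever sees the generic placeholder.
            transcript.append({"role": "tool", "content": PLACEHOLDER})
            continue
        transcript.append({"role": "tool", "content": result})
        return result
    return None
```

After three failed attempts the transcript holds three identical `retrying...` entries and nothing that says what went wrong.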
Problem
The error message, stack trace, or rejection reason is exactly the signal the model needs to revise its plan and stop repeating the same call. When it is scrubbed before re-prompting, the agent re-attempts the failed action turn after turn, sometimes in tight loops, because nothing in its visible context contradicts the choice. After-the-fact debugging is also harder, because the transcript no longer shows whether a run succeeded cleanly or was salvaged across several hidden failures.
Forces
- Failed turns inflate context length and look untidy in transcripts.
- Retries are easier to log as a single clean event than as fail-then-retry.
- Models weight recent context heavily and adapt when a failed turn is explicitly visible to them.
- Compliance reviewers may misread visible errors as system bugs rather than agent learning.
Example
An ops agent calls a deployment tool that fails with a 500. The wrapper catches the error, replaces the failed observation with a generic 'retrying...' string, and lets the agent try again. The agent retries the same call eight times because the context shows eight clean attempts in progress and no evidence that anything is wrong. The team flips the policy: the failed response body, status code, and stack trace are inserted verbatim into the agent's transcript. On the next run the agent reads the 500, switches to the documented fallback endpoint, and succeeds in two steps.
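A minimal sketch of the flipped policy, under stated assumptions: `ToolError` is a hypothetical exception carrying a status code and response body, and the transcript is again a plain list of dicts. The point is only that the failure reaches the agent's context verbatim instead of being replaced.

```python
import traceback


class ToolError(Exception):
    """Hypothetical error raised by a tool call; carries the HTTP-style details."""

    def __init__(self, status_code: int, body: str):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code
        self.body = body


def observe(call_tool, transcript: list) -> bool:
    """Run one tool call and record the outcome, failures included, in the transcript."""
    try:
        result = call_tool()
    except ToolError as err:
        # Keep the status code, response body, and stack trace word for word.
        observation = (
            "TOOL FAILED\n"
            f"status: {err.status_code}\n"
            f"body: {err.body}\n"
            f"traceback:\n{traceback.format_exc()}"
        )
        transcript.append({"role": "tool", "ok": False, "content": observation})
        return False
    transcript.append({"role": "tool", "ok": True, "content": result})
    return True
```

With this in place, a failed deploy call leaves an entry the model can read and act on, for example by switching to a fallback endpoint, rather than a placeholder that looks like progress.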
Solution
Therefore:
Don't. Treat failure observations as load-bearing context, not noise. Preserve stack traces, tool-error returns, and rejection messages in the agent's running transcript. Compress only after the run is done, not mid-loop. See Decision Log and Provenance Ledger for keeping the audit trail separate from the working context.
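One way to enforce "compress only after the run is done" is to make the compaction step refuse to shorten anything mid-loop. The sketch below is an assumption about how that might look: the `ok` flag on transcript entries, the `run_finished` parameter, and the one-line summary format are all hypothetical.

```python
def compact_transcript(transcript: list, run_finished: bool) -> list:
    """Shorten a transcript, but only once the run is over."""
    if not run_finished:
        # Mid-loop: hand back an unchanged copy so every failure stays visible.
        return list(transcript)
    # Entries without an "ok" flag are counted as successes.
    failures = [e for e in transcript if not e.get("ok", True)]
    successes = [e for e in transcript if e.get("ok", True)]
    summary = {
        "role": "summary",
        "ok": True,
        "content": f"{len(failures)} failed step(s), {len(successes)} successful step(s)",
    }
    return [summary]
```

The full, uncompressed transcript would still be written to a separate audit store before compaction; that part is left out here.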
What this pattern forbids. By definition, this anti-pattern imposes no useful constraint; the missing constraint — that failure observations must remain in context — is the failure mode.
And the patterns that stand alongside it, or against it:
- alternative-to: Decision Log ★★. Persist the agent's reasoning trace alongside its actions so post-hoc review can explain why.
- alternative-to: Provenance Ledger ★★. Log every agent decision and state change with enough metadata to explain or reverse it later.
- alternative-to: Replan on Failure ★★. Trigger a fresh planning step when execution evidence contradicts the current plan.
- complements: Unbounded Loop ✕. Anti-pattern: run the agent loop without a step budget and let model self-termination decide.
- complements: Demo-to-Production Cliff ✕. Anti-pattern: ship a demo-validated agent straight into production without a frozen eval, cost ceiling, loop-detector, or named oncall, then act surprised when accuracy drops and cost runs away.
- alternative-to: Rigor Relocation ★. Relocate verification rigor from the model loop to surrounding scaffolding (evals, judges, decision logs, policy gates) so failures are caught by the wrapper rather than the agent.
- complements: Hidden State Coupling ✕. Anti-pattern: agent workflows read or write undeclared shared state (caches, env vars, process globals) instead of explicit inputs and outputs.