Teach the Failure Modes
also known as agent reliability training, failure mode curriculum, anti-pattern drill, what-goes-wrong workshop
A training session that drills builders on where production agents break: tool hallucination, instruction drift, infinite loops, non-idempotent replay, and metrics blindness. Builders learn to design for failure from the start, not after deployment.
How the learner advances
Intent. Give builders a working mental model of how production agents fail so they instrument guards before deployment rather than discovering failure modes in production.
When to apply. Use this move immediately after a team has built their first working agents and is preparing to move them toward production. It is poorly timed before the builder has any agent-building experience — the failure modes are abstract without a codebase to apply them to. It is dangerously late if applied only after a production incident.
Threshold — earns the next step. The builder can name at least five canonical failure modes, explain the mechanism behind each, and has added at least two guards to their own production-candidate agent.
Masterpiece — the artifact that proves it. The participant's production-candidate agent updated with at least two code-level guards — each targeting a specific failure mode they reproduced in the sandbox — plus a written entry in the team failure mode register documenting what was triggered and how it was mitigated.
Facets
- Container — workshop
- Mode — concept, hands-on-build, byo-problem
- Reach — team
- Persona — builder
- Craft (AI Fluency) — discernment, diligence
- Guardrail — risk, responsible-use
Inputs
- Builders with at least one working agent — Participants who have built and deployed an agent at prototype level so they can map each failure mode to their own code rather than to an abstract example.
- Catalogue of canonical failure modes — A curated list of real production agent failures: tool hallucination, instruction drift, infinite loops, over-eager execution, false success, state desync, flaky tools, non-idempotent replay, permission creep, silent degradation, and metrics blindness.
- Real production traces or logs — Actual traces or sanitised logs of the failure modes occurring in real systems — not slides describing what could go wrong but evidence of what did go wrong.
- Sandbox environment for failure reproduction — A safe environment where builders can deliberately trigger each failure mode in their own or a provided agent and then implement a guard.
Outputs
- A more capable learner — A builder who can name at least five canonical agent failure modes, explain the mechanism behind each one, and describe the corresponding guard or mitigation.
- Masterpiece: a guarded agent — The participant's agent — from the preceding build sprint — updated to include at least two production guards: for example, an idempotency check and a loop counter, or an output schema validator and an error-budget circuit breaker.
- Team failure mode register — A shared document in which the team records the failure modes they reproduced, the guards they implemented, and any failure modes specific to their domain or stack.
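One way to keep the team failure mode register consistent is to give every entry the same fields. The sketch below is only an illustration: the `RegisterEntry` class, its field names, and the JSON-lines layout are assumptions and are not prescribed by this pattern.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RegisterEntry:
    """One row in a team failure mode register (hypothetical schema)."""

    failure_mode: str   # name from the catalogue, e.g. "non-idempotent replay"
    trigger: str        # how the failure was reproduced in the sandbox
    observed: str       # what the agent said, did, and what a user would have seen
    guard: str          # the code-level guard that was added
    guard_verified: bool            # True only if the guard caught a re-triggered failure
    domain_specific: bool = False   # True for failure modes unique to the team's stack


def to_json_line(entry: RegisterEntry) -> str:
    """Serialise an entry as one JSON line so the register can be appended to."""
    return json.dumps(asdict(entry), sort_keys=True)


# Example entry with made-up values.
entry = RegisterEntry(
    failure_mode="non-idempotent replay",
    trigger="retried a payment tool call after a simulated timeout",
    observed="the tool executed twice and the agent reported one success",
    guard="idempotency token checked before the write",
    guard_verified=True,
)
print(to_json_line(entry))
```

A shared document works equally well; the point of a fixed shape is that every entry records the trigger, the observation, and whether the guard was verified.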
Steps (4)
Enumerate and explain the canonical failure modes
Walk through the full catalogue — tool hallucination, instruction drift, infinite loops, over-eager execution, false success, state desync, flaky tools, non-idempotent replay, permission creep, silent degradation, metrics blindness. For each, show a real production trace or log of it happening, not a slide about what could happen. The goal is recognition, not theory.
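Recognition can be practised on the trace itself. As a minimal sketch, assuming a trace is a list of event dictionaries with `type` and `tool` keys (a hypothetical format, as are the tool names), the function below flags tool calls that name a tool the agent was never given, which is one signature of tool hallucination.

```python
from typing import Iterable

# Hypothetical set of tools the agent is actually allowed to call.
REGISTERED_TOOLS = {"search_orders", "refund_order", "send_email"}


def find_hallucinated_calls(trace: Iterable[dict], registered: set[str]) -> list[dict]:
    """Return every tool-call event whose tool name is not in the registry."""
    return [
        event
        for event in trace
        if event.get("type") == "tool_call" and event.get("tool") not in registered
    ]


# Made-up sanitised trace; a real session would use the team's own logs.
trace = [
    {"step": 1, "type": "tool_call", "tool": "search_orders"},
    {"step": 2, "type": "message", "text": "Order found."},
    {"step": 3, "type": "tool_call", "tool": "cancel_subscription"},
]

for event in find_hallucinated_calls(trace, REGISTERED_TOOLS):
    print(f"step {event['step']}: unregistered tool {event['tool']!r}")
```

Real trace formats differ between frameworks, so the key names would need adapting to whatever the team's tracing layer emits.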
Reproduce a failure in the sandbox
Builders choose one failure mode and deliberately trigger it in a sandbox agent. Triggering a runaway loop, a duplicated non-idempotent write, or a hallucinated tool call under controlled conditions builds recognition that reading about it does not. Builders document what they observe: what the agent said, what it did, and what a user would have seen.
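A reproduction can be very small. The sketch below stages a non-idempotent replay under stated assumptions: `FlakyPaymentTool` and `naive_agent_step` are hypothetical stand-ins for a write tool that commits and then loses its response, and for an agent step that retries blindly.

```python
class FlakyPaymentTool:
    """Sandbox stand-in for a write tool that times out after committing."""

    def __init__(self) -> None:
        self.charges: list[tuple[str, int]] = []
        self._fail_next = True

    def charge(self, customer: str, amount: int) -> str:
        self.charges.append((customer, amount))  # the write happens first
        if self._fail_next:
            self._fail_next = False
            raise TimeoutError("response lost after the write was committed")
        return "ok"


def naive_agent_step(tool: FlakyPaymentTool, customer: str, amount: int) -> str:
    """Retry on timeout with no idempotency check: the failure being reproduced."""
    for _ in range(3):
        try:
            return tool.charge(customer, amount)
        except TimeoutError:
            continue
    return "gave up"


tool = FlakyPaymentTool()
result = naive_agent_step(tool, "cust-1", 50)
print("agent reported:", result)                # what the agent said
print("charges recorded:", len(tool.charges))   # what it actually did
```

The two printed lines map onto the observation notes: the agent reports a single success while the sandbox ledger holds two charges.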
Implement and verify a guard
For the failure mode they triggered, each builder implements a guard: a loop counter, an idempotency token, an output schema check, a circuit breaker. They verify that the guard catches the failure mode when triggered again. The guard must be in code, not a comment or a reminder.
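Two of the guards named above can be sketched in a few lines each. The class names and the in-memory token store are assumptions for illustration; a production guard would usually persist tokens and have the downstream system honour them.

```python
class LoopLimitExceeded(RuntimeError):
    """Raised when the agent exceeds its step budget."""


class StepBudget:
    """Loop counter guard: raises instead of letting the agent loop forever."""

    def __init__(self, max_steps: int) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def tick(self) -> None:
        self.steps += 1
        if self.steps > self.max_steps:
            raise LoopLimitExceeded(f"exceeded {self.max_steps} steps")


class IdempotentWriter:
    """Idempotency token guard: a repeated token returns the stored result."""

    def __init__(self, write_fn) -> None:
        self._write_fn = write_fn
        self._results: dict[str, object] = {}

    def write(self, token: str, *args):
        if token in self._results:
            return self._results[token]  # replay: skip the write
        result = self._write_fn(*args)
        self._results[token] = result
        return result


# Verification: trigger each failure again and confirm the guard catches it.
ledger: list[int] = []


def record_charge(amount: int) -> int:
    ledger.append(amount)
    return len(ledger)


writer = IdempotentWriter(record_charge)
first = writer.write("req-42", 50)
replay = writer.write("req-42", 50)  # same token, must not write again

budget = StepBudget(max_steps=3)
caught = False
try:
    while True:  # simulated runaway loop
        budget.tick()
except LoopLimitExceeded:
    caught = True

print(ledger, first, replay, caught)
```

This in-memory version only covers replays of a call that completed. A write that commits and then raises is never stored, so covering that case needs the token to be checked by the system that performs the write.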
Red-team or debrief
Optionally, split into two teams: one builds an agent and the other tries to trigger any failure mode in it within a time box. Alternatively, run a debrief focused on the question: 'which of these failure modes will show up in your production agent first?' Each team leaves with a prioritised guard list.
Principles
- Show real traces, not slides — a sanitised production log is worth more than any hypothetical example.
- Fail loudly in staging, not silently in production — every guard exists to surface a failure before a user sees it.
- Reproduce before you guard — a guard written against a failure mode the builder has never seen is likely incomplete.
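The "fail loudly" principle can be illustrated with an error-budget circuit breaker that refuses further calls once a tool keeps failing. This is a minimal sketch under assumptions: `ErrorBudgetBreaker`, `CircuitOpen`, and `flaky_tool` are hypothetical names, and the breaker has no timed reset or half-open state.

```python
class CircuitOpen(RuntimeError):
    """Raised when the error budget is spent and calls are being refused."""


class ErrorBudgetBreaker:
    """Opens after `max_errors` consecutive failures (hypothetical minimal design)."""

    def __init__(self, max_errors: int) -> None:
        self.max_errors = max_errors
        self.errors = 0

    def call(self, fn, *args, **kwargs):
        if self.errors >= self.max_errors:
            raise CircuitOpen(f"{self.errors} consecutive errors; refusing further calls")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.errors += 1
            raise  # re-raise so the failure stays visible to the caller
        self.errors = 0  # a success resets the budget
        return result


def flaky_tool() -> str:
    """Stand-in for a tool whose upstream dependency is down."""
    raise ConnectionError("upstream unavailable")


breaker = ErrorBudgetBreaker(max_errors=2)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_tool)
    except CircuitOpen:
        outcomes.append("open")
    except ConnectionError:
        outcomes.append("error")
print(outcomes)
```

The breaker never swallows an exception: each failure is re-raised, and once the budget is spent the caller gets a distinct `CircuitOpen` error instead of another silent retry.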
Unlocks methodologies (2)
A learner who completes this pattern is equipped to execute these methodology families:
Known uses (4)
Agent Reliability Gap — 12 Early Failure Modes — Quaxel (practitioner blog)
[neutral] Practitioner writeup cataloguing 12 failure modes; widely cited in 2025 enterprise agent deployments
From Failure Modes to Reliability Awareness in Generative and Agentic AI Systems — arXiv (academic)
[neutral] 11-layer failure stack; introduces awareness mapping as a maturity tool; integrates DCAM framework
Hugging Face AI Agents Course — Bonus Unit 2: Observability and Evaluation — Hugging Face
[neutral] Covers tracing, token-cost monitoring, LLM-as-judge, offline benchmarking; part of free cert track
AIエージェント開発者養成講座実践コース (AI Agent Developer Training Course, Practical Course) — Trainocate Japan — Trainocate
[in-house] 3-day Japanese practitioner course; covers tracing, evaluation, and operational failure considerations; lang: ja
Known failure modes (2)
- [theory-only-drill]
The anti-pattern of running the workshop without real traces and without the sandbox reproduction step. Builders who only read about failure modes acquire vocabulary but not recognition skill — the next time the failure mode appears in their code, they will not identify it.
- [guard-without-test]
The anti-pattern of writing a guard and not verifying that it catches the failure mode when triggered. Guards written from memory rather than from a reproduced failure are often off-target and give a false sense of safety.
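Avoiding guard-without-test means pairing each guard with a check that replays the reproduced failure against it. The sketch below does this for an output schema guard; the field names, the sample outputs, and the `validate_output` helper are assumptions made up for the example.

```python
import json


class SchemaViolation(ValueError):
    """Raised when agent output does not match the expected shape."""


# Hypothetical expected shape: field name -> required Python type.
REQUIRED_FIELDS = {"order_id": str, "refund_amount": int}


def validate_output(raw: str) -> dict:
    """Output schema guard: parse JSON and check required fields and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaViolation(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise SchemaViolation("top-level value must be an object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            raise SchemaViolation(f"field {name!r} missing or not {expected.__name__}")
    return data


def guard_catches(raw: str) -> bool:
    """Return True if the guard rejects the given output."""
    try:
        validate_output(raw)
    except SchemaViolation:
        return True
    return False


# Reproduced failures (made-up samples): each must be rejected.
bad_outputs = [
    "Refund issued successfully!",                   # false success reported in prose
    '{"order_id": "A1"}',                            # missing field
    '{"order_id": "A1", "refund_amount": "fifty"}',  # wrong type
]
assert all(guard_catches(raw) for raw in bad_outputs)

# A well-formed output must still pass, or the guard simply blocks everything.
assert not guard_catches('{"order_id": "A1", "refund_amount": 50}')
```

The second assertion matters as much as the first: a guard that rejects every output would pass a test that only checks the failing cases.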
Related trainings (3)
- Agent-Build Course★★
Graduate a builder who can identify, implement, and combine the four foundational agentic design patterns in a working, deployed agent.
- Show the Working★
Teach builders to instrument their agents with human-readable reasoning traces so end users can verify agent behaviour without reading code or logs.
- Agent-Builder Dojo★
Ship at least one production-candidate agent per participant in a compressed, high-accountability build environment where the facilitator unblocks rather than lectures.
Sources (2)
https://medium.com/@Quaxel/the-agent-reliability-gap-12-early-failure-modes-91dba5a2c1ae
“Most agent failures are boring, not mysterious. They're the same categories we've dealt with for decades — just wrapped in a more flexible planner.”
https://arxiv.org/abs/2511.05511
“failures rarely occur in isolation but propagate across layers, creating cascading effects with systemic consequences”
Provenance
- Ecosystem: neutral
- Added to catalog:
- Last updated:
- Verification status: verified