Naive Retry Without Backoff
also known as Tight-Loop Retry, Thundering-Herd Retry
Anti-pattern: retry failed model or tool calls immediately, amplifying load on systems that are already failing.
Context
An agent calls a model API or downstream tool that returns 5xx, rate-limit, or timeout errors during a degradation event. The orchestrator wraps the call in a tight retry loop with no backoff and no jitter, often with a high or unbounded retry count.
Problem
The retry loop fires immediately on failure, so every instance of the agent piles onto the failing upstream at the same instant. Recovery is delayed because the upstream cannot drain its queue. When many agent instances share a backend, the retry storm itself becomes the outage. This is distinct from unbounded-loop, which concerns logical step counts; naive retry concerns the pacing of call attempts inside one step.
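A minimal sketch of the tight loop described above may make the shape easier to recognise in review. `UpstreamError`, `tight_loop_retry`, and `make_degraded_upstream` are hypothetical names invented for illustration, not part of any SDK, and the attempt ceiling exists only so the demo terminates.

```python
class UpstreamError(Exception):
    """Hypothetical stand-in for a 5xx, rate-limit, or timeout error."""


def tight_loop_retry(call_upstream, max_attempts=1_000_000):
    """Anti-pattern: retry immediately, with no delay between attempts.

    max_attempts is deliberately huge to mimic a 'high or unbounded'
    retry count; it is here only so the sketch cannot spin forever.
    """
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            return call_upstream(), attempts
        except UpstreamError:
            continue  # no sleep, no jitter: the next call fires at once
    raise UpstreamError(f"gave up after {attempts} attempts")


def make_degraded_upstream(failures_before_recovery):
    """Fake upstream that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def call():
        state["calls"] += 1
        if state["calls"] <= failures_before_recovery:
            raise UpstreamError("503 Service Unavailable")
        return "ok"

    return call


# One agent instance sends 10,001 back-to-back requests to get one answer.
result, attempts = tight_loop_retry(make_degraded_upstream(10_000))
print(result, attempts)
```

Every failed attempt here is extra load placed on the upstream while it is degraded, and nothing in the loop spaces those attempts out.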
Forces
- Transient errors are real and retries do help when paced sensibly.
- Tight retry loops are the default in many SDK code samples.
- Per-call backoff complicates timing analysis, but its absence breaks production.
Example
An ingestion agent processes 5,000 documents per hour. The embedding API enters a degraded window. The agent retries each failed call immediately, without backoff. Each running agent instance hammers the embedding API at full rate. The embedding API's queue never drains because every failed call is resubmitted at once, so retries arrive as fast as the API completes requests. The degradation lasts 4× longer than the underlying incident.
Solution
Therefore:
Use exponential backoff (e.g. 1s, 2s, 4s, 8s, with ±25% jitter), cap the attempt count (typically 3–5), and honor `Retry-After` headers. Distinguish retryable errors (5xx, 429, timeout) from non-retryable ones (4xx other than 429). Pair this with a circuit-breaker so that once attempts are exhausted, the agent stops calling the failing dependency entirely until a health probe succeeds.
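The following is a minimal sketch of that retry policy, not a definitive implementation. `CallFailed`, `is_retryable`, `backoff_delay`, and `call_with_backoff` are hypothetical names; the sketch assumes the caller has already parsed any `Retry-After` header into seconds, and it models a timeout as a status of `None`. The circuit-breaker is out of scope here.

```python
import random
import time


class CallFailed(Exception):
    """Hypothetical error carrying an HTTP-like status and Retry-After."""

    def __init__(self, status, retry_after=None):
        super().__init__(f"upstream returned {status}")
        self.status = status            # None models a timeout
        self.retry_after = retry_after  # seconds, already parsed (assumption)


def is_retryable(err):
    """Retry on timeout, 429, and 5xx; never on other 4xx."""
    if err.status is None or err.status == 429:
        return True
    return 500 <= err.status <= 599


def backoff_delay(attempt, base=1.0, jitter=0.25, rng=random):
    """Delay before retry `attempt` (1-based): 1s, 2s, 4s, 8s, each ±25%."""
    nominal = base * 2 ** (attempt - 1)
    return nominal * (1 + rng.uniform(-jitter, jitter))


def call_with_backoff(call, max_attempts=5, sleep=time.sleep, rng=random):
    """Run `call`, retrying retryable failures with paced, bounded attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except CallFailed as err:
            if not is_retryable(err) or attempt == max_attempts:
                raise  # non-retryable, or attempts exhausted
            delay = backoff_delay(attempt, rng=rng)
            if err.retry_after is not None:
                # Never retry sooner than the upstream asked us to.
                delay = max(delay, err.retry_after)
            sleep(delay)


# Demo with a scripted upstream and a recording sleep, so nothing blocks.
script = iter([CallFailed(503), CallFailed(429, retry_after=5.0), "ok"])


def flaky_call():
    item = next(script)
    if isinstance(item, Exception):
        raise item
    return item


delays = []
outcome = call_with_backoff(flaky_call, sleep=delays.append,
                            rng=random.Random(0))
print(outcome, delays)
```

Passing `sleep` and `rng` as parameters is a design choice for testability: production code keeps the defaults, while tests can record delays instead of waiting. With the default `max_attempts=5`, at most four waits occur, which matches the 1s, 2s, 4s, 8s schedule.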
What this pattern forbids. Nothing useful: as an anti-pattern it imposes no constraint. The constraint it is missing is bounded, paced retry semantics.
And the patterns that stand alongside it, or against it:
- complements Circuit Breaker ★★: Stop calling a failing dependency for a cooldown period after error rates exceed a threshold.
- complements Missing Idempotency on Agent Calls ✕: Anti-pattern: retry state-mutating agent tool calls without idempotency keys, so retries multiply real-world side effects.
- complements Rate Limiting ★★: Cap the number of requests, tokens, or tool calls per user (or session) within a time window.
- alternative-to Unbounded Loop ✕: Anti-pattern: run the agent loop without a step budget and let model self-termination decide.
- complements Fallback Chain ★★: Try a primary handler; on failure or low confidence, fall through to a sequence of fallback handlers.