IX · Routing & Composition · Mature · ★★

Fallback Chain

also known as Cascade Fallback, Try-Then-Try-Else, Tool Failed Fall Back, Provider Failed Retry Other

Try a primary handler; on failure or low confidence, fall through to a sequence of fallback handlers.

This pattern helps complete certain larger patterns —

  • used-by · Open-Weight Cascade · Build a multi-model cascade where lower tiers are open-weight, self-hostable models that run inside the operator's boundary, and only escalations cross to a hosted frontier model, giving cost arbitrage *and* sovereignty.
  • used-by · Agentic Behavior Tree · Borrow the behavior-tree formalism: leaves are LLM calls or tools that return success/failure; a tree of selectors and sequences orchestrates control flow.

Context

An agent in production depends on at least one model or tool that can fail for routine reasons: rate limiting, vendor errors, regional incidents, or outputs the model itself returns with low confidence. End users are sitting on the other end of the call expecting an answer regardless of which upstream had a bad minute. The team has more than one option available — a backup model, a smaller local model, a deterministic rule-based fallback — but those options are not wired in by default.

Problem

When the single primary handler fails, the user sees an outage even though other working handlers exist in the system. When the primary returns a low-confidence answer, the product silently ships a degraded response with no signal that something better could have been tried. Without a defined ordering of handlers and a rule for moving between them, every team improvises on each incident and quality regressions in the primary go unnoticed.

Forces

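  • Fallback handlers may be slower or produce lower-quality answers than the primary.
  • Detecting soft 'failure' (a poor answer rather than an error) requires a confidence signal.
  • Cascade depth must be bounded, or a failing request accumulates the latency and cost of every handler it passes through.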
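
One way to picture the bounded-depth force is a chain that caps both the number of handlers tried and the total time spent. This is a minimal sketch, not a prescribed implementation: `bounded_chain`, `max_depth`, and `budget_s` are hypothetical names, and handlers are assumed to return an answer string or `None` to pass.

```python
import time
from typing import Callable, Optional, Sequence, Tuple

# Assumption: a handler returns an answer, or None to pass; it may also raise.
Handler = Callable[[str], Optional[str]]


def bounded_chain(
    request: str,
    handlers: Sequence[Handler],
    max_depth: int = 3,
    budget_s: float = 5.0,
) -> Tuple[Optional[str], int]:
    """Try at most max_depth handlers within budget_s seconds.

    Returns (answer or None, number of handlers attempted).
    """
    deadline = time.monotonic() + budget_s
    attempts = 0
    for handler in handlers[:max_depth]:
        # The budget is checked between handlers only; it does not
        # interrupt a handler that is already running.
        if time.monotonic() >= deadline:
            break
        attempts += 1
        try:
            answer = handler(request)
        except Exception:
            continue  # hard failure: fall through to the next handler
        if answer is not None:
            return answer, attempts
    return None, attempts
```

A real deployment would probably also want per-handler timeouts, since a single slow handler can exhaust the whole budget on its own.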

Example

A translation feature uses a primary high-quality model, but during incidents that model returns 502s and users see error messages. The team configures a Fallback Chain: try the primary model, on failure or low-confidence output try a secondary model, on failure of that try a smaller local model with a 'degraded quality' indicator. The user gets a translation in every case; the team gets visibility into how often each layer is used.
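
The translation scenario above could be sketched roughly as follows. Everything here is illustrative: the three model functions are stand-ins that simulate an incident, a low-confidence answer, and a local model; the 0.7 threshold and the names `TIERS`, `tier_usage`, and `translate` are assumptions, not part of any real API.

```python
from collections import Counter
from typing import Callable, List, NamedTuple, Optional, Tuple


class Translation(NamedTuple):
    text: str
    tier: str       # which layer produced the answer
    degraded: bool  # True when the local last-resort model answered


ModelFn = Callable[[str], Tuple[str, float]]  # returns (text, confidence)


def primary_model(text: str) -> Tuple[str, float]:
    raise ConnectionError("502 Bad Gateway")  # simulated vendor incident


def secondary_model(text: str) -> Tuple[str, float]:
    return ("bonjour ?", 0.4)  # simulated low-confidence output


def local_model(text: str) -> Tuple[str, float]:
    return ("bonjour", 0.5)  # simulated small local model


# (name, model, minimum confidence to accept). The local tier accepts any
# confidence because it is the last resort in this sketch.
TIERS: List[Tuple[str, ModelFn, float]] = [
    ("primary", primary_model, 0.7),
    ("secondary", secondary_model, 0.7),
    ("local", local_model, 0.0),
]

tier_usage: Counter = Counter()  # how often each layer served a request


def translate(text: str) -> Optional[Translation]:
    for name, model, threshold in TIERS:
        try:
            output, confidence = model(text)
        except Exception:
            continue  # this tier failed; try the next one
        if confidence >= threshold:
            tier_usage[name] += 1
            return Translation(output, name, degraded=(name == "local"))
    return None  # every tier failed, including the local model
```

The `tier_usage` counter is what gives the team visibility into how often each layer answers; in practice it would more likely be a metrics client than an in-process counter.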

Solution

Therefore:

Define an ordered chain of handlers. Each handler returns either a confident answer or a failure/low-confidence signal. On failure or low confidence, the next handler runs. The final fallback is a generic 'I don't know' rather than a wrong answer.
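
The solution can be sketched in a few lines. This is one possible shape under stated assumptions: handlers report a confidence between 0 and 1, a handler passes by returning `None` or raising, and the names `Result`, `run_chain`, and `min_confidence` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Result:
    answer: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


# A handler returns a Result, returns None to pass, or raises on hard failure.
Handler = Callable[[str], Optional[Result]]


def run_chain(
    request: str,
    handlers: Sequence[Handler],
    min_confidence: float = 0.7,
    default: str = "I don't know",
) -> Tuple[str, Optional[int]]:
    """Run handlers in order; return (answer, index of the handler that answered).

    The index is None when every handler failed or passed and the generic
    default was returned instead.
    """
    for index, handler in enumerate(handlers):
        try:
            result = handler(request)
        except Exception:
            continue  # hard failure: fall through
        if result is not None and result.confidence >= min_confidence:
            return result.answer, index
        # low confidence or explicit pass: fall through
    return default, None
```

Only `run_chain` decides when to stop: a handler never calls the next handler itself, which keeps the ordering and the termination rule in one place.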

What this pattern forbids. Each handler may produce a result or pass; only the chain may decide to terminate.

The smaller patterns that complete this one —

  • generalises · Provider Fallback ★★ · When one provider's API errors mid-stream, transparently switch to another provider while preserving state.

And the patterns that stand alongside it, or against it —

  • complements · Routing ★★ · Classify an incoming request and dispatch it to the specialist (lane / agent / model) best suited to handle it.
  • composes-with · Circuit Breaker ★★ · Stop calling a failing dependency for a cooldown period after error rates exceed a threshold.
  • complements · Multi-Model Routing ★★ · Send each request to the cheapest model that can handle it well.
  • complements · Confidence Reporting · Surface the agent's uncertainty about its answer alongside the answer itself.
  • complements · Exception Handling and Recovery ★★ · Catch and react to predictable failure modes (tool errors, rate limits, validation failures) with structured recovery paths.
  • complements · Graceful Degradation ★★ · When a dependency fails, downgrade the user-facing experience to a working subset rather than failing entirely.
  • complements · Complexity-Based Routing · Estimate a request's difficulty up front and bind it to the cheapest model tier that can answer well, using an explicit complexity classifier as the routing key.
  • complements · Naive Retry Without Backoff · Anti-pattern: retry failed model or tool calls immediately, amplifying load on systems that are already failing.
