IX · Routing & CompositionMature★★

Graceful Degradation

also known as Feature-Level Fallback, Degraded Mode

When a dependency fails, downgrade the user-facing experience to a working subset rather than failing entirely.

This pattern helps complete certain larger patterns —

Context

A user-facing agent product combines several optional capabilities — a retrieval-augmented-generation backend that produces citations, a vision model that reads screenshots, a sandbox that runs user code, a payment integration. Each of these dependencies can have its own bad day independently of the others. The product is more than the sum of any single capability and can produce something useful even when one piece is missing.

Problem

If the product treats every dependency as load-bearing and fails the whole request when any one of them is down, an isolated vendor outage becomes a complete product outage from the user's point of view. If it silently drops the failing capability and ships whatever it can produce without disclosure, the user gets a worse answer than expected without knowing why and loses trust the next time it happens. Without a defined per-feature fallback, neither outcome is acceptable.

Forces

  • Degradation paths multiply test surface.
  • User-visible degradation messaging is its own UX problem.
  • Some failures must hard-fail (PII path, payment).

Example

A multimodal customer-support bot relies on a vision model to read screenshots, a vector store for citations, and a code sandbox for repro. During an outage of the vision provider, every screenshot upload returns a 503 and the whole conversation errors out. The team adds graceful degradation: when vision fails the bot falls back to asking the user to describe the screenshot in words and tells them so plainly; when retrieval is down it answers from the model's own knowledge with a visible 'no sources today' badge. Outages now feel like reduced service rather than total failure.

Diagram

Solution

Therefore:

Define per-feature fallback behaviour. On dependency failure, downgrade (text-only when vision fails, no citations when retrieval fails, simple summary when code execution fails) and disclose to the user that degraded mode is active. Feature flags double as degradation switches.

What this pattern forbids. On failure, the agent must produce a degraded response with disclosure rather than a generic error.

The smaller patterns that complete this one —

  • usesCircuit Breaker★★Stop calling a failing dependency for a cooldown period after error rates exceed a threshold.

And the patterns that stand alongside it, or against it —

  • complementsFallback Chain★★Try a primary handler; on failure or low confidence, fall through to a sequence of fallback handlers.
  • complementsInfrastructure Burst Bottleneck (Agent Scale-Out)Anti-pattern: deploy agents whose scale-out behavior triggers sudden data-and-compute bursts that on-prem or under-provisioned cloud infrastructure cannot absorb; agents work at small scale and freeze in production.

Neighbourhood

Click any neighbour to follow the language. Scroll to zoom, drag to pan.