Routing & Composition

Routing

Classify an incoming request and dispatch it to the specialist (lane / agent / model) best suited to handle it.

Problem

If every request goes through one all-purpose prompt built to handle the hardest case, cheap and simple requests over-pay in tokens and latency for capabilities they never use. If every request goes through a prompt tuned for the cheap cases, complex requests lack the planning and tools they need, and the product feels incompetent on anything non-trivial. A single shared prompt therefore forces the team either to pay for the worst case on every request or to under-serve the hard cases.

Solution

A lightweight classifier model (often the cheapest available) reads each incoming request and returns a label. The host then dispatches the request to the specialist registered for that label. Common lanes: command (deterministic action), agent (multi-step), chat (no tools).
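
The classify-then-dispatch flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the keyword-based `classify` function stands in for the lightweight classifier model, the three handlers are placeholders for real specialists, and the fallback to the chat lane for unknown labels is a design choice made only for this sketch.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def classify(request: str) -> str:
    """Stand-in for the lightweight classifier model.

    A real system would call a cheap model here. This keyword rule is an
    assumption so the sketch runs without any external service.
    """
    text = request.lower().strip()
    if text.startswith("/"):
        return "command"
    if {"plan", "research", "then"} & set(text.split()):
        return "agent"
    return "chat"


# Placeholder specialists, one per lane named in the text.
def run_command(request: str) -> str:
    return f"[command] executed {request!r}"


def run_agent(request: str) -> str:
    return f"[agent] multi-step plan for {request!r}"


def run_chat(request: str) -> str:
    return f"[chat] reply to {request!r}"


LANES: Dict[str, Handler] = {
    "command": run_command,
    "agent": run_agent,
    "chat": run_chat,
}


def route(request: str, classifier: Callable[[str], str] = classify) -> str:
    """Ask the classifier for a label, then dispatch to that lane."""
    label = classifier(request)
    # Unknown labels fall back to the chat lane (an assumption of this sketch).
    handler = LANES.get(label, run_chat)
    return handler(request)


if __name__ == "__main__":
    for req in ("/mute notifications", "Plan a trip and then book it", "hello"):
        print(route(req))
```

Passing the classifier in as a parameter keeps the host independent of any particular model, so the cheap classifier can be swapped or tested in isolation.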

When to use

Traffic is heterogeneous and different requests benefit from different prompts or models.
A single all-purpose prompt is over-paying for cheap requests or under-serving complex ones.
A lightweight classifier can produce a stable label cheaply.
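
One way to check the last condition is to classify the same request several times and measure how often the most frequent label appears. The sketch below is illustrative only: `label_stability` and its `runs` parameter are hypothetical names, and treating agreement across repeated calls as "stability" is an assumption, not a definition from the pattern.

```python
from collections import Counter
from typing import Callable


def label_stability(classifier: Callable[[str], str], request: str, runs: int = 5) -> float:
    """Return the share of runs that agree with the most frequent label.

    1.0 means the classifier returned the same label every time.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    labels = [classifier(request) for _ in range(runs)]
    # most_common(1) yields [(label, count)] for the top label.
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / runs
```

A score that stays well below 1.0 across representative requests suggests the label is too noisy to route on without a clearer taxonomy or a stronger classifier.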
