Mixture of Experts Routing

also known as MoE Routing (Agent-Level), Expert Selection

Route each request to one or more domain-expert agents, where each expert holds deep capability in a narrow area.

This pattern helps complete certain larger patterns:

specialises Routing ★★: Classify an incoming request and dispatch it to the specialist (lane / agent / model) best suited to handle it.

Context

A team is building one agent that serves users across several substantially different professional domains — for example legal questions, medical questions, financial planning, and technical support. Each of these domains has its own vocabulary, its own authoritative sources, and its own conventions for what a good answer looks like. A single shared prompt cannot credibly carry deep expertise in all of them at once because the prompt budget and the model's attention are finite.

Problem

A generalist agent ends up shallow in every domain: it knows enough legal language to sound competent but misses important distinctions a tax specialist would catch, and the same is true on the medical side. Users in specialist domains feel under-served and the team cannot improve any one domain without bloating the shared prompt with material that hurts the others. Adding more general examples does not fix the depth problem because the model is forced to flatten its expertise across the whole surface.

Forces

Expert maintenance scales with domain count.
Routing classification must match expert coverage.
Cross-domain queries challenge single-expert routing.
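
The second force can be checked mechanically. The sketch below is one possible way to compare the labels a router can emit against the experts that actually exist; the function name `coverage_gaps` and the example domain names are hypothetical, not part of the pattern.

```python
from typing import Dict, Iterable, Set


def coverage_gaps(router_labels: Iterable[str], expert_names: Iterable[str]) -> Dict[str, Set[str]]:
    """Compare what the router can output with the experts that are deployed."""
    labels, experts = set(router_labels), set(expert_names)
    return {
        # Experts the router can never select: maintained but unreachable.
        "unrouted_experts": experts - labels,
        # Labels the router may emit with no expert behind them.
        "unserved_labels": labels - experts,
    }


# Hypothetical example: the router knows "medical", but no such expert exists,
# and an "immigration" expert exists that the router never names.
gaps = coverage_gaps(
    router_labels=["tax", "employment", "medical"],
    expert_names=["tax", "employment", "immigration"],
)
print(gaps)
```

Running a check like this whenever an expert is added or retired keeps the classifier and the expert set from drifting apart.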

Example

A general legal assistant gives shallow answers on tax questions and shallow answers on employment questions because one prompt cannot hold deep knowledge of both. The team adopts Mixture of Experts Routing: a small router classifies each query by domain and routes it to a tax expert (specialised prompt, IRS-publication retrieval, fine-tuned model) or to an employment expert (different prompt, NLRB and state-law retrieval). Ambiguous queries go to both experts, and their outputs are aggregated. Per-domain depth improves without bloating any single prompt.

Diagram

flowchart TD
    Q[Request] --> R[Router: classify domain]
    R --> Top{Top-1 or top-k?}
    Top -- top-1 --> E1[Expert: legal]
    Top -- top-1 --> E2[Expert: code]
    Top -- top-1 --> E3[Expert: medical]
    Top -- top-k --> Multi[Run multiple experts]
    Multi --> Agg[Aggregate outputs]
    E1 --> Out[Answer]
    E2 --> Out
    E3 --> Out
    Agg --> Out

Solution

Therefore:

Define experts, each with a specialised system prompt, a tool palette, and possibly a fine-tuned model. A router classifies each query by domain and routes it to one expert (top-1) or to several experts (top-k) whose outputs are aggregated. The pattern differs from standard routing in its emphasis on deep specialisation per expert.
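
A minimal sketch of the routing step, under loud assumptions: the keyword-count classifier stands in for whatever model or classifier a real router would use, the `answer` callables stand in for calls to each expert's model, and aggregation is plain labelled concatenation. All names here (`Expert`, `score_domains`, `select_experts`, `route`, the tool names) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Expert:
    """One specialist: its own prompt, tool palette, and answer function."""
    name: str
    system_prompt: str
    tools: Tuple[str, ...]
    answer: Callable[[str], str]  # stand-in for a call to the expert's model


def score_domains(query: str, keywords: Dict[str, List[str]]) -> Dict[str, int]:
    """Toy classifier: count how many of each domain's keywords occur in the query."""
    words = set(query.lower().replace("?", " ").split())
    return {domain: sum(kw in words for kw in kws) for domain, kws in keywords.items()}


def select_experts(scores: Dict[str, int], k: int = 1) -> List[str]:
    """Return up to k domains with a non-zero score, best first (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [domain for domain, score in ranked[:k] if score > 0]


def route(query: str, experts: Dict[str, Expert], keywords: Dict[str, List[str]], k: int = 1) -> str:
    """Top-1 when k == 1; top-k with simple labelled aggregation otherwise."""
    chosen = select_experts(score_domains(query, keywords), k)
    if not chosen:
        # No silent generalist default: the caller must decide what happens next.
        raise LookupError("no expert matched this query")
    return "\n".join(f"[{name}] {experts[name].answer(query)}" for name in chosen)


KEYWORDS = {
    "tax": ["tax", "deduction", "irs"],
    "employment": ["employer", "overtime", "severance"],
}
EXPERTS = {
    "tax": Expert("tax", "You are a tax specialist.", ("irs_publication_search",),
                  lambda q: f"tax view on: {q}"),
    "employment": Expert("employment", "You are an employment-law specialist.",
                         ("state_law_search",), lambda q: f"employment view on: {q}"),
}

# A cross-domain query: with k=2 both experts answer and the outputs are combined.
print(route("Is my severance subject to tax?", EXPERTS, KEYWORDS, k=2))
```

In practice the aggregation step would likely be more than concatenation (for example a synthesis call), but the shape of the control flow is the same.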

What this pattern forbids. Silent fallback to a generalist: each request is bound to one or more named experts, and any generalist fallback is an explicit choice, not the default.
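
One way to make this constraint concrete is to record the binding for every request and to refuse to proceed when nothing matched unless a fallback was configured on purpose. This is a sketch only; `Binding`, `NoExpertError`, and `bind` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Binding:
    """Which named experts a request was bound to, and whether fallback was used."""
    experts: Tuple[str, ...]
    used_fallback: bool


class NoExpertError(LookupError):
    """Raised when no expert matched and no fallback was configured."""


def bind(matched: Sequence[str], known: Sequence[str], fallback: Optional[str] = None) -> Binding:
    """Bind a request to named experts; a generalist is used only if passed explicitly."""
    unknown = [name for name in matched if name not in known]
    if unknown:
        raise KeyError(f"router named unknown experts: {unknown}")
    if matched:
        return Binding(tuple(matched), used_fallback=False)
    if fallback is None:
        raise NoExpertError("no expert matched and no generalist fallback is configured")
    # The fallback is visible in the binding, so it can be logged and audited.
    return Binding((fallback,), used_fallback=True)


print(bind(["tax"], known=["tax", "employment"]))
print(bind([], known=["tax", "employment"], fallback="generalist"))
```

Because `used_fallback` is part of the record, the share of requests that reach the generalist can be measured instead of hidden.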

And the patterns that stand alongside it, or against it:

complements Supervisor ★★: Place a coordinating agent above a set of specialised agents and route work to them.
complements Role Assignment ★★: Assign each agent a named role (researcher, writer, critic, planner) with a role-specific prompt, tool palette, and acceptance criteria.
alternative-to Dynamic Expert Recruitment ·: Generate the agent team (role descriptions and instances) at run time based on the specific task, then adjust team composition between iterations based on evaluation feedback.
complements Tool/Agent Registry ★: Maintain a single queryable catalogue of both available tools and available agents, with metadata (capability, cost, latency, quality) the agent can use to pick the right one for a task.
alternative-to RL-Trained Conductor Orchestrator ·: Train a small meta-model with reinforcement learning to dispatch sub-tasks across a pool of frontier LLM workers, learning the communication topology end-to-end and allowing the conductor to recursively invoke itself as a worker.
complements Complexity-Based Routing ★: Estimate a request's difficulty up front and bind it to the cheapest model tier that can answer well, using an explicit complexity classifier as the routing key.
complements Top-Tier Model For Everything (Cost) ✕: Anti-pattern: route every request through the highest-tier model regardless of difficulty, treating cost as a model-choice problem instead of a routing one.

Used in recipes

Routing & Fallback (optional)

Used in frameworks

Genspark
first-class · 9 patterns · Model-Vendor Agents ★ emerging
Genspark Super Agent's headline architecture is a Mixture-of-Agents that orchestrates nine LLMs plus 80+ in-house tools, dynamically routing each subtask to the best-suited model.…

References

Mixture-of-Agents Enhances Large Language Model Capabilities (paper)

Provenance

Source: patterns/mixture-of-experts-routing.md on GitHub · commit 4fa1213
Added to catalog: 2026-04-30
Last updated: 2026-05-21
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.