Hidden Mode Switching
also known as Silent Model Swap, Undisclosed Routing
Anti-pattern: silently swap the underlying model between requests without disclosing the change to users or operators.
Context
A team operates an agent or chat product under real cost and capacity pressure, and the obvious lever is to route some traffic to a smaller, cheaper model and the rest to the flagship. The routing is implemented as a backend decision: nothing in the response, the user interface, or the trace tells the user which model actually produced a given answer. Operators may also lack a per-request record of the resolved model identity.
Problem
When users compare runs over time, or compare two answers to the same prompt, they encounter quality differences they cannot explain — the agent feels sharper on Monday than on Saturday, code suggestions degrade overnight, and the same prompt produces different reasoning depth from one call to the next. They cannot reproduce results, cannot file a precise bug, and cannot trust evaluation numbers because the eval and the production traffic may have hit different models. Trust erodes faster than the cost savings accumulate.
Forces
- Cost arbitrage feels too good to disclose.
- Per-request model disclosure adds UI complexity.
- Hidden routing complicates eval gates.
Example
A coding-agent vendor silently routes nights and weekends to a smaller model to save cost. Users start filing bug reports about 'the model getting dumber on Saturday morning' and cannot reproduce them on Monday. The team realises they have been doing hidden-mode-switching as an unacknowledged anti-pattern and starts including the resolved model id in every response header and in the agent's own status line. Routing rules are published; users can pin a model if they need consistency. Trust climbs back.
Diagram
Solution
Therefore:
Don't. Disclose model identity per response. Use multi-model-routing transparently. Make routing decisions inspectable.
What this pattern forbids. By definition, this anti-pattern imposes no useful constraint; the missing constraint is the failure mode.
And the patterns that stand alongside it, or against it —
- alternative-toMulti-Model Routing★★— Send each request to the cheapest model that can handle it well.
- alternative-toLineage Tracking★★— Track which prompt version, model version, and data sources produced each agent output.
Neighbourhood
Click any neighbour to follow the language. Scroll to zoom, drag to pan.