Token-Economy Blindness
also known as No Per-Run Cost Cap, Cost-Blind Multi-Agent Loop
Anti-pattern: operate multi-agent loops with no per-run token budget or alarm, allowing recursive loops to silently accumulate $10k+ in undetected costs.
Context
A team runs a multi-agent research or analysis tool that recursively spawns sub-agents. There are no per-run cost ceilings, no per-tenant alarms, and the model gateway has no anomaly detection on token velocity. The 2026 German t3n incident report documents an 11-day undetected $47,000 runaway from a 4-agent recursive loop.
Problem
Cost can accumulate to five figures before anyone notices. Discovery happens via the monthly invoice, not via the system. This anti-pattern is distinct from cost-observability (the positive pattern) and from unbounded-loop (a control-flow failure): it names the *absence of cost monitoring*, the failure to attach an economic ceiling to each logical run.
Forces
- Multi-agent recursive loops are useful — capping them too tight defeats the point.
- Per-run budgeting requires routing every call through a billing-aware gateway.
- Token bursts look like normal usage until they exceed thresholds nobody set.
Example
A 4-agent research tool spawns sub-agents recursively. A loop forms in the planner. Each sub-agent costs $0.30. The loop runs for 11 days. Final invoice line: $47,000. The gateway logs the calls but has no per-run budget enforcement and no velocity alarm.
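The figures in this example imply a rough call rate, which is worth working out because it shows how quickly a ceiling would have tripped. The sketch below is back-of-the-envelope arithmetic only. It assumes that $0.30 is the cost of one sub-agent invocation and that the loop ran at a steady rate for the full 11 days; neither assumption is stated in the incident summary.

```python
# Back-of-the-envelope check on the example's numbers.
# Assumptions (not from the incident report): $0.30 is the cost of a single
# sub-agent invocation, and the loop ran at a steady rate for 11 full days.
TOTAL_COST_USD = 47_000
COST_PER_INVOCATION_USD = 0.30
DAYS = 11

invocations = TOTAL_COST_USD / COST_PER_INVOCATION_USD
minutes = DAYS * 24 * 60
per_minute = invocations / minutes
daily_cost = TOTAL_COST_USD / DAYS

print(f"invocations: {invocations:,.0f}")   # invocations: 156,667
print(f"per minute:  {per_minute:.1f}")     # per minute:  9.9
print(f"per day:     ${daily_cost:,.0f}")   # per day:     $4,273
```

Under these assumptions the loop made roughly ten invocations per minute. A hypothetical per-run ceiling of $50 would have been reached after about 167 invocations, in under twenty minutes rather than eleven days.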
Solution
Therefore:
Route every model call through a metering gateway that tracks tokens per run id. Set per-run budgets matched to expected output shape. Enforce hard termination at budget exhaustion. Alarm on velocity anomalies (e.g. tokens-per-minute exceeding mean+3σ for the run class). Pair with cost-observability (positive pattern) and step-budget.
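The steps above can be sketched as a small in-memory meter. This is an illustration under stated assumptions, not a production gateway: `MeteringGateway`, `RunMeter`, `charge`, and `BudgetExceeded` are hypothetical names, the caller is assumed to report the actual token count and a minute index after each model call, and the baseline figures are made up for the demonstration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean, stdev


class BudgetExceeded(RuntimeError):
    """Raised when a run has exhausted its token budget."""


@dataclass
class RunMeter:
    budget_tokens: int          # hard ceiling for this run
    used_tokens: int = 0
    tokens_by_minute: dict = field(default_factory=lambda: defaultdict(int))


class MeteringGateway:
    """Hypothetical per-run meter; every model call is reported via charge()."""

    def __init__(self, baseline_tokens_per_minute):
        # Velocity threshold: mean + 3 sigma of historical tokens-per-minute
        # for this run class (the baseline is supplied by the operator).
        self.velocity_limit = (
            mean(baseline_tokens_per_minute)
            + 3 * stdev(baseline_tokens_per_minute)
        )
        self.runs = {}
        self.alarms = []        # (run_id, minute) pairs that breached the limit

    def start_run(self, run_id, budget_tokens):
        self.runs[run_id] = RunMeter(budget_tokens)

    def charge(self, run_id, tokens, minute):
        meter = self.runs[run_id]
        meter.used_tokens += tokens
        meter.tokens_by_minute[minute] += tokens

        # Velocity alarm: at most one alarm per run per minute.
        breached = meter.tokens_by_minute[minute] > self.velocity_limit
        if breached and (run_id, minute) not in self.alarms:
            self.alarms.append((run_id, minute))

        # Hard termination: the caller must stop the run on this exception.
        # Accounting happens after the call, so one call of overshoot is possible.
        if meter.used_tokens > meter.budget_tokens:
            raise BudgetExceeded(
                f"run {run_id}: {meter.used_tokens} tokens used, "
                f"budget {meter.budget_tokens}"
            )


gateway = MeteringGateway(baseline_tokens_per_minute=[900, 1000, 1100, 1000, 1000])
gateway.start_run("run-1", budget_tokens=10_000)

calls_made = 0
try:
    for minute in range(100):   # stand-in for a planner loop that never exits
        gateway.charge("run-1", tokens=3_000, minute=minute)
        calls_made += 1
except BudgetExceeded as exc:
    stopped = str(exc)
    print(stopped)
```

In this demonstration the runaway loop is stopped on its fourth call instead of its hundredth, and the velocity alarm fires from the first minute. Because the sketch charges after each call, the run can overshoot its budget by one call; reserving an estimate before the call (for example, the call's `max_tokens`) would tighten the ceiling at the price of more bookkeeping.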
What this pattern forbids. Nothing useful: as an anti-pattern it imposes no constraint. The constraint that is missing is a per-run economic ceiling enforced at the gateway.
And the patterns that stand alongside it, or against it:
- alternative-to Cost Observability ★★: Surface per-request, per-user, and per-feature cost and token consumption to operators in near-real-time.
- alternative-to Cost Gating ★★: Block actions whose expected cost exceeds a threshold without explicit user (or operator) acknowledgement.
- complements Step Budget ★★: Cap the number of tool calls or loop iterations the agent is allowed within a single request.
- complements Unbounded Loop ✕: Anti-pattern: run the agent loop without a step budget and let model self-termination decide.
- complements Missing max_tokens Cap ✕: Anti-pattern: call the model without an explicit max_tokens (or equivalent) so a single call can drain the run's budget on a runaway generation.