XIV · Anti-Patterns · Anti-pattern

Token-Economy Blindness

Also known as: No Per-Run Cost Cap, Cost-Blind Multi-Agent Loop

Anti-pattern: operate multi-agent loops with no per-run token budget or alarm, so that recursive loops can silently run up $10k+ in costs before anyone notices.

Context

A team runs a multi-agent research or analysis tool that recursively spawns sub-agents. There are no per-run cost ceilings, no per-tenant alarms, and the model gateway has no anomaly detection on token velocity. The 2026 German t3n incident report documents an 11-day undetected $47,000 runaway from a 4-agent recursive loop.

Problem

Cost can accumulate to five figures before anyone notices. Discovery happens via the monthly invoice, not via the system. This is distinct from cost-observability (the positive pattern) and from unbounded-loop (a control-flow failure): it names the *absence of cost monitoring*, the failure to attach an economic ceiling to each logical run.

Forces

  • Multi-agent recursive loops are useful; capping them too tightly defeats the point.
  • Per-run budgeting requires routing every call through a billing-aware gateway.
  • Token bursts look like normal usage until they exceed thresholds nobody set.

Example

A 4-agent research tool spawns sub-agents recursively. A loop forms in the planner. Each sub-agent costs $0.30. The loop runs for 11 days. Final invoice line: $47,000. The gateway logs the calls but has no per-run budget enforcement and no velocity alarm.
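
A back-of-envelope check on these figures shows why the loop is easy to miss. The sketch below assumes (the example does not say so) that the whole invoice is sub-agent spend at a flat $0.30 per sub-agent and that the rate was constant over the 11 days.

```python
# Figures taken from the example; the flat price per sub-agent and the
# "entire invoice is sub-agent spend" simplification are assumptions.
TOTAL_COST_USD = 47_000
COST_PER_SUBAGENT_USD = 0.30
DAYS = 11

spawns = TOTAL_COST_USD / COST_PER_SUBAGENT_USD
spawns_per_minute = spawns / (DAYS * 24 * 60)
cost_per_hour = TOTAL_COST_USD / (DAYS * 24)

print(f"{spawns:,.0f} sub-agent runs")          # 156,667 sub-agent runs
print(f"{spawns_per_minute:.1f} runs per minute")  # 9.9 runs per minute
print(f"${cost_per_hour:,.0f} per hour")        # $178 per hour
```

Under those assumptions the loop spawns roughly ten sub-agents a minute and burns about $178 an hour: a steady trickle rather than a spike, which is consistent with it surfacing only on the invoice.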

Solution

Therefore:

Route every model call through a metering gateway that tracks tokens per run id. Set per-run budgets matched to expected output shape. Enforce hard termination at budget exhaustion. Alarm on velocity anomalies (e.g. tokens-per-minute exceeding mean+3σ for the run class). Pair with cost-observability (positive pattern) and step-budget.
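
A minimal sketch of the two enforcement pieces, per-run budget with hard termination and a mean+3σ velocity check, might look like the following. Everything named here (`MeteringGateway`, `RunMeter`, `BudgetExceeded`, `is_velocity_anomaly`, and the `call_model` callable that returns text plus tokens used) is hypothetical and stands in for whatever gateway and model client a team actually runs; the budget is expressed in tokens rather than dollars for simplicity.

```python
import statistics
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised to hard-terminate a run that has spent its token budget."""


@dataclass
class RunMeter:
    budget_tokens: int
    used_tokens: int = 0


class MeteringGateway:
    """Hypothetical in-process gateway that meters tokens per run id."""

    def __init__(self, call_model):
        # call_model is an assumed callable: prompt -> (text, tokens_used)
        self._call_model = call_model
        self._runs = {}

    def start_run(self, run_id, budget_tokens):
        self._runs[run_id] = RunMeter(budget_tokens)

    def complete(self, run_id, prompt):
        meter = self._runs[run_id]
        # Refuse any further call once the budget is exhausted.
        if meter.used_tokens >= meter.budget_tokens:
            raise BudgetExceeded(
                f"run {run_id}: {meter.used_tokens}/{meter.budget_tokens} tokens"
            )
        text, tokens_used = self._call_model(prompt)
        meter.used_tokens += tokens_used
        return text


def is_velocity_anomaly(tokens_last_minute, baseline_tpm):
    """True if the last minute exceeds mean + 3 sigma of the run class baseline."""
    threshold = statistics.mean(baseline_tpm) + 3 * statistics.stdev(baseline_tpm)
    return tokens_last_minute > threshold


# Usage: a stub model that always reports 400 tokens, driven by a looping planner.
def fake_model(prompt):
    return f"answer to {prompt}", 400


gateway = MeteringGateway(fake_model)
gateway.start_run("run-1", budget_tokens=1_000)
calls = 0
try:
    while True:  # simulated runaway loop
        gateway.complete("run-1", "plan next step")
        calls += 1
except BudgetExceeded:
    pass  # the run ends here instead of on the invoice

# Illustrative baseline of tokens-per-minute samples for this run class.
baseline_tpm = [900, 1000, 1100, 1000, 950, 1050]
print(calls, is_velocity_anomaly(5000, baseline_tpm))
```

Because the budget is checked before each call, a run in this sketch can overshoot by at most one call (here it stops after three calls at 1,200 of 1,000 tokens). Bounding the size of that last call is what an explicit per-call `max_tokens` cap is for.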

What this pattern forbids. Nothing useful: as an anti-pattern it imposes no constraint. The constraint it lacks is a per-run economic ceiling enforced at the gateway.

And the patterns that stand alongside it, or against it —

  • alternative-to: Cost Observability ★★. Surface per-request, per-user, and per-feature cost and token consumption to operators in near-real-time.
  • alternative-to: Cost Gating ★★. Block actions whose expected cost exceeds a threshold without explicit user (or operator) acknowledgement.
  • complements: Step Budget ★★. Cap the number of tool calls or loop iterations the agent is allowed within a single request.
  • complements: Unbounded Loop. Anti-pattern: run the agent loop without a step budget and let model self-termination decide.
  • complements: Missing max_tokens Cap. Anti-pattern: call the model without an explicit max_tokens (or equivalent) so a single call can drain the run's budget on a runaway generation.
