Step Budget
also known as Max Steps, Iteration Cap, Loop Bound
Cap the number of tool calls or loop iterations the agent is allowed within a single request.
This pattern helps complete certain larger patterns:
- used-by: Outer-Inner Agent Loop ·. Run two nested loops: an outer planner agent decomposes the goal into subtasks; an inner executor runs a ReAct loop on each, and the outer can replan based on the inner's progress.
Context
A team runs an agent inside some kind of loop — a ReAct loop, a plan-execute loop, a multi-agent debate — where the model is invoked repeatedly to take more steps until it decides it is finished. Each loop iteration costs model tokens, tool-call money, and wall-clock time, and the loop has no naturally bounded length: the model itself decides when to stop. In real traffic, some sessions wander into pathological states where the model keeps deciding to take one more step.
Problem
If termination relies on the model saying 'I am done', then a confused, stuck, or over-eager agent will simply never declare itself done, and the loop runs until something else stops it — a timeout, a crash, or an angry invoice at the end of the month. The team has no way to bound the worst-case cost or latency of a single request, and one pathological session can burn through more budget than thousands of normal ones combined. Without a hard numeric cap that the loop respects regardless of the model's opinion, runaway behaviour is always one bad prompt away.
Forces
- Cap too low cuts off legitimate work.
- Cap too high lets pathological runs burn budget.
- What to do when the cap is hit (return a partial answer? raise an error?) is its own design choice.
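The last force can be made explicit in code rather than left implicit in the loop. The sketch below is one possible shape, not a prescribed API: `OnCapHit`, `StepBudgetExceeded`, and `resolve_cap_hit` are hypothetical names introduced here for illustration.

```python
from enum import Enum
from typing import Optional


class OnCapHit(Enum):
    """Hypothetical policy switch: what the loop does once the budget is spent."""
    RETURN_PARTIAL = "return_partial"  # hand back the best answer so far, flagged
    RAISE = "raise"                    # treat exhaustion as an error for the caller


class StepBudgetExceeded(RuntimeError):
    """Raised under OnCapHit.RAISE; carries the partial state for debugging."""

    def __init__(self, max_steps: int, partial: Optional[str]) -> None:
        super().__init__(f"step budget of {max_steps} exhausted")
        self.max_steps = max_steps
        self.partial = partial


def resolve_cap_hit(policy: OnCapHit, max_steps: int, partial: Optional[str]) -> dict:
    """Apply the chosen policy at the moment the cap is reached."""
    if policy is OnCapHit.RAISE:
        raise StepBudgetExceeded(max_steps, partial)
    # RETURN_PARTIAL: surface the partial answer and say clearly that it is partial.
    return {
        "answer": partial,
        "cap_reached": True,
        "note": f"stopped after {max_steps} steps; answer may be incomplete",
    }
```

Returning a partial answer suits interactive use, where something is better than nothing; raising suits pipelines where a half-finished result must not be mistaken for a finished one.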
Example
An autonomous bug-fixing agent is given a step budget of 30. After 30 rounds of think-act-observe, the loop halts even if the agent insists it is 'almost done.' This stops a confused agent from spinning forever and racking up a $50 OpenAI bill on what should have been a five-minute task.
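A step budget only bounds cost and latency if each individual step is itself bounded (for example by a token limit on the model call and a timeout on the tool call). Under that assumption the worst case is simple multiplication. The per-step ceilings below are made-up illustrative numbers, not measurements, and `StepCeilings` and `worst_case` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepCeilings:
    """Upper bounds for one loop iteration (assumed values, set by the operator)."""
    max_cost_cents: int
    max_latency_seconds: int


def worst_case(max_steps: int, per_step: StepCeilings) -> StepCeilings:
    """Worst-case totals for a run that uses every step of its budget."""
    return StepCeilings(
        max_cost_cents=max_steps * per_step.max_cost_cents,
        max_latency_seconds=max_steps * per_step.max_latency_seconds,
    )


# Budget of 30 steps, with assumed ceilings of 5 cents and 10 seconds per step.
bound = worst_case(30, StepCeilings(max_cost_cents=5, max_latency_seconds=10))
print(bound.max_cost_cents, bound.max_latency_seconds)
```

Without the cap, neither total has an upper bound, which is the situation the Problem section describes.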
Solution
Therefore:
Define a numeric cap (max_steps=N) in the agent loop. Increment a step counter once per tool call or once per loop iteration, and pick one of the two so the count is unambiguous. When the counter reaches N, terminate the loop and return the best partial answer with a note that the cap was reached.
What this pattern forbids. Running past the cap: the loop terminates after N iterations regardless of the agent's own opinion.
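A minimal sketch of such a loop follows, counting per loop iteration. It assumes a hypothetical `step` callable that stands in for one think-act-observe round and returns whether the agent considers itself done plus its best answer so far; `RunResult` and `run_with_step_budget` are likewise names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RunResult:
    answer: Optional[str]  # final answer, or best partial answer if capped
    steps_used: int
    cap_reached: bool      # True when the budget, not the agent, ended the run
    note: str = ""


def run_with_step_budget(
    step: Callable[[int], Tuple[bool, Optional[str]]],
    max_steps: int,
) -> RunResult:
    """Call `step` until it reports done or `max_steps` iterations have run.

    `step(i)` is a hypothetical stand-in for one agent iteration and returns
    (done, best_answer_so_far).
    """
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    best: Optional[str] = None
    for i in range(max_steps):  # the loop itself enforces the cap
        done, best = step(i)
        if done:
            return RunResult(best, steps_used=i + 1, cap_reached=False)
    # Budget exhausted: stop regardless of what the agent would like to do next.
    return RunResult(
        best,
        steps_used=max_steps,
        cap_reached=True,
        note=f"step budget of {max_steps} reached; answer may be partial",
    )


# Usage: an agent that never declares itself done is still stopped at 30 steps.
stuck = run_with_step_budget(lambda i: (False, f"draft {i}"), max_steps=30)
print(stuck.cap_reached, stuck.steps_used)
```

The cap lives in the `for` statement rather than in the prompt, so the model cannot argue its way past it.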
The smaller patterns that complete this one:
- generalises: Stop Hook ★★. Define an explicit programmatic predicate that decides when the agent's loop should terminate.
- generalises: Composable Termination Conditions ★. Express agent stop criteria as small single-purpose conditions composed with AND/OR into one explicit termination contract instead of ad-hoc loop guards.
And the patterns that stand alongside it, or against it:
- complements: Cost Gating ★★. Block actions whose expected cost exceeds a threshold without explicit user (or operator) acknowledgement.
- complements: Human-in-the-Loop ★★. Require explicit human approval at defined points before the agent performs an action.
- alternative-to: Infinite Debate ✕. Anti-pattern: launch multi-agent debate without a termination rule and watch the agents loop forever.
- alternative-to: Unbounded Subagent Spawn ✕. Anti-pattern: a supervisor or orchestrator spawns sub-agents that can themselves spawn sub-agents without a global cap.
- alternative-to: Unbounded Loop ✕. Anti-pattern: run the agent loop without a step budget and let model self-termination decide.
- complements: Spec-Driven Loop ★. Run the same prompt against a fixed spec in a deterministic outer loop until the spec is satisfied.
- complements: Plan-and-Execute ★★. Plan all the steps once with a strong model, then execute each step with a cheaper model under the plan.
- complements: Stop / Cancel ★★. Let the user interrupt an in-flight agent run cleanly, releasing resources and surfacing partial state.
- complements: Agent-as-Tool Embedding ★. Wrap a sub-agent (with its own loop, prompt, and tool palette) behind a single function-shaped tool signature, so the parent agent calls it like any other tool and never sees the sub-agent's internal turns.
- complements: Mode-Adaptive Cadence ★. Vary the agent's loop interval based on current salience so the agent thinks faster when something is happening and slower when nothing is, instead of running on a fixed cron.
- complements: Typed Tool-Loop Failure Detector ★. Lift tool-loop detection from prompt-level rules to a mechanical dispatch-boundary veto with typed failure modes and per-tool caps that returns a formatted refusal the model must consume.
- complements: Iteration Node ★★. Express map-over-collection inside a visual workflow as an explicit Iteration node that runs a subgraph once per element of an input array, with bounded, deterministic, observable execution.
- alternative-to: Demo-to-Production Cliff ✕. Anti-pattern: ship a demo-validated agent straight into production without a frozen eval, cost ceiling, loop-detector, or named oncall, then act surprised when accuracy drops and cost runs away.
- complements: Token-Economy Blindness ✕. Anti-pattern: operate multi-agent loops with no per-run token budget or alarm, allowing recursive loops to silently accumulate $10k+ in undetected costs.
- complements: Missing max_tokens Cap ✕. Anti-pattern: call the model without an explicit max_tokens (or equivalent) so a single call can drain the run's budget on a runaway generation.
- complements: Compound Error Degradation ★. Anti-pattern: deploy a long-horizon agent without modelling that per-step accuracy multiplies across the trajectory.