Anti-Patterns

Tool-Output Arithmetic Trust

Anti-pattern: the agent compares, ranks, or sums correctly returned tool data in its own generated reasoning instead of offloading the computation to a deterministic tool, and so emits confident but wrong aggregates.

Problem

Token-by-token generation is not arithmetic. When the model performs comparison, ranking, or addition over tool data inside its own reasoning rather than in a deterministic step, it produces answers that read as authoritative but are numerically wrong: a mis-sorted ranking, a total that is off, a wrong cheapest pick. The data was right and the tool was right, so nothing in the trace flags the error, and the confident wrong aggregate flows straight to the user or into the next decision.

Solution

The fix is to route every aggregate over tool data through a deterministic step rather than through the model's free-form output. After a tool returns rows, the agent passes them to a calculator, a code-execution sandbox, a sort or filter primitive, or a query, and reads back the computed result. The model's job is to choose what to compute and how to phrase the answer, not to act as the adder or the comparator. Where a single deterministic step is impractical, the aggregate is at least recomputed and cross-checked before it is reported, so a numeric claim never rests solely on token generation.
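As a minimal sketch of such a deterministic step, the Python below computes a total, a cheapest pick, and a ranking over rows a tool has already returned, and adds a cross-check for a total stated elsewhere. The row shape (`name`, `price` as a decimal string) and the names `aggregate` and `cross_check` are assumptions made for illustration; they do not come from any particular agent framework.

```python
from decimal import Decimal
from typing import TypedDict


class Row(TypedDict):
    # Hypothetical shape of one row returned by a pricing tool.
    name: str
    price: str  # decimal string, e.g. "19.99"


def aggregate(rows: list[Row]) -> dict:
    """Compute total, cheapest item, and price ranking deterministically."""
    if not rows:
        raise ValueError("no rows to aggregate")
    # Decimal avoids binary floating-point drift on currency-like values.
    priced = [(Decimal(r["price"]), r["name"]) for r in rows]
    ranked = sorted(priced)  # ascending by price, ties broken by name
    return {
        "total": sum(p for p, _ in priced),
        "cheapest": ranked[0][1],
        "ranking": [name for _, name in ranked],
    }


def cross_check(claimed_total: str, rows: list[Row]) -> bool:
    """Return True only if a stated total matches the recomputed one."""
    return Decimal(claimed_total) == aggregate(rows)["total"]


rows: list[Row] = [
    {"name": "plan-a", "price": "19.99"},
    {"name": "plan-b", "price": "7.50"},
    {"name": "plan-c", "price": "12.01"},
]
result = aggregate(rows)
```

In this arrangement the model would decide that a total and a ranking are needed, call something like `aggregate`, and phrase its answer from `result`; `cross_check` covers the fallback case where a figure was produced some other way and must be verified before it is reported.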

When to use

  • Watch for this when an agent answers with totals, rankings, or comparisons derived from tool data but no deterministic compute step appears in the trace.
  • Watch for it when answers are correct on small inputs but degrade on larger result sets.
  • Watch for it when a numeric recommendation drives a downstream action and there is no read-back of the computed value.
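
The first and third signals can be approximated with a simple trace check. The sketch below is a heuristic under stated assumptions: the trace is taken to be a list of dicts with a `kind` field, and the step kinds (`tool_call`, `answer`, and the names in `COMPUTE_KINDS`) are hypothetical labels rather than those of a real tracing library. It flags an answer that contains digits when no deterministic compute step ran after the most recent data fetch.

```python
import re

# Hypothetical step kinds that count as deterministic compute.
COMPUTE_KINDS = {"calculator", "code_execution", "sort", "filter", "query"}

DIGIT = re.compile(r"\d")


def needs_review(trace: list[dict]) -> bool:
    """Flag a numeric answer given after a data fetch with no compute step."""
    last_fetch = -1
    computed_after_fetch = False
    for i, step in enumerate(trace):
        kind = step.get("kind")
        if kind == "tool_call":
            # New data arrived; any earlier compute step no longer covers it.
            last_fetch = i
            computed_after_fetch = False
        elif kind in COMPUTE_KINDS:
            computed_after_fetch = True
        elif kind == "answer":
            has_number = bool(DIGIT.search(step.get("text", "")))
            return last_fetch >= 0 and has_number and not computed_after_fetch
    return False  # no answer step found, nothing to flag


suspicious = [
    {"kind": "tool_call", "name": "list_prices"},
    {"kind": "answer", "text": "The total is 39.50."},
]
safe = [
    {"kind": "tool_call", "name": "list_prices"},
    {"kind": "code_execution"},
    {"kind": "answer", "text": "The total is 39.50."},
]
```

Here `needs_review(suspicious)` is true and `needs_review(safe)` is false. The check is deliberately coarse: it will miss a ranking stated without any digits and will flag answers that merely quote a number from the tool output, so it is better suited to triage than to enforcement.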

Related