Verification & Reflection

Tool-Augmented Self-Correction

Self-correct LLM outputs by interactively critiquing them with external tools (search, code execution, calculator).

Problem

When self-critique is done by the same model that produced the draft and is not allowed to consult any external tool, the critique recycles the same blind spots that produced the original error. The model that confidently asserted a wrong fact will confidently agree with itself when asked to review the assertion. Without a way to compare the draft against an outside source of truth, the iterative loop is a model talking to itself and slowly converging on whatever it believed at the start. The team needs the critic to be able to actually test claims, not just re-read them.

Solution

After generating a draft, the model emits a critique that names suspected errors and issues tool queries to verify them. The tool results inform the revised output. The loop repeats until the tools surface no further issues or the iteration budget is exhausted.
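
The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `Claim`, `self_correct`, `toy_critique`, `toy_revise`, and `calculator` are hypothetical names introduced here. In a real system `critique` and `revise` would be model calls and the tools would be real search or code-execution backends; deterministic stand-ins are used so the example runs on its own.

```python
import operator
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Claim:
    tool: str    # name of the tool that should check this claim
    query: str   # input handed to that tool
    stated: str  # value the draft asserts


Issue = Tuple[Claim, str]  # (claim, value the tool actually returned)


def self_correct(
    draft: str,
    critique: Callable[[str], List[Claim]],
    revise: Callable[[str, List[Issue]], str],
    tools: Dict[str, Callable[[str], str]],
    max_rounds: int = 3,
) -> str:
    """Critique the draft, verify each claim with a tool, revise, repeat."""
    for _ in range(max_rounds):  # budget cap on iterations
        issues: List[Issue] = []
        for claim in critique(draft):
            observed = tools[claim.tool](claim.query)
            if observed != claim.stated:
                issues.append((claim, observed))
        if not issues:  # tools found nothing left to fix
            break
        draft = revise(draft, issues)
    return draft


# --- Toy stand-ins (assumptions for illustration only) ---

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
ARITHMETIC = re.compile(r"(\d+ [+\-*] \d+) = (\d+)")


def calculator(expr: str) -> str:
    """Evaluate 'a op b' for integer operands."""
    left, op, right = expr.split()
    return str(OPS[op](int(left), int(right)))


def toy_critique(draft: str) -> List[Claim]:
    """Stand-in for the model's critique: flag every arithmetic claim."""
    return [
        Claim("calculator", m.group(1), m.group(2))
        for m in ARITHMETIC.finditer(draft)
    ]


def toy_revise(draft: str, issues: List[Issue]) -> str:
    """Stand-in for the model's revision: substitute the tool's value."""
    for claim, observed in issues:
        draft = draft.replace(
            f"{claim.query} = {claim.stated}", f"{claim.query} = {observed}"
        )
    return draft


result = self_correct(
    "17 * 23 = 381", toy_critique, toy_revise, {"calculator": calculator}
)
print(result)
```

In this sketch the stopping signal comes from the tools, not from the model's own judgment: the loop ends only when every flagged claim matches its tool result, or when `max_rounds` is reached.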

When to use

  • The model has access to external tools (search, code execution, calculator) that can produce ground-truth signals.
  • Self-critique without tools recycles the model's blind spots and fails to catch real errors.
  • Iterating to convergence (or up to a budget cap) is acceptable within the latency budget.

Related