Reflection
also known as Self-Critique, Single-Pass Self-Review
Have the model review its own output and produce a revised version in one or more passes.
This pattern helps complete certain larger patterns —
- specialises Evaluator-Optimizer ★★: One LLM generates; another evaluates and feeds back; loop until criteria are met.
- used-by Agentic RAG ★★: Replace static retrieve-then-generate with autonomous agents that plan, choose sources, retrieve iteratively, reflect, and re-query.
- used-by Self-RAG ★: Fine-tune the model to emit reflection tokens that decide when to retrieve, evaluate retrieved relevance, and assess generated support.
Context
A team runs a large language model on a generation task (drafting an email, writing a function, composing a press release) where the first-pass output usually contains errors that a careful second read would catch: a missing edge case, a clumsy phrase, a factual slip. Latency and cost budgets allow at least one extra model call per output. The team is not asking for deep correctness verification, just a 'look it over' pass before shipping.
Problem
One-shot generation underuses the model in a specific way: the model has the ability to spot its own surface errors when it is asked to look at a finished draft, but in a single forward pass it commits to tokens without the opportunity to review what it has written. Without a separate critique step, obvious local mistakes ship even when the model could have caught them. A naive free-form critique pass helps a little but invents new criteria on each call, so reviews are inconsistent, and after one or two iterations the same model just starts approving its own work. The team needs structure around the critique step to make it actually catch errors instead of rubber-stamping.
Forces
- Same-model self-critique shares the generator's blind spots, so it misses errors that are correlated between draft and review.
- Free-form review drifts; the model invents new criteria each time.
- Termination: when does the loop stop?
Example
A drafting agent writes a press release in one shot; legal flags two compliance issues after the fact. The team adds a critic pass: after the first draft, the same model is prompted as a compliance reviewer to list concrete issues, then a third call rewrites the draft against that critique. With one extra critique-and-rewrite round (two additional model calls), most of the issues legal would have flagged are caught before legal sees the draft. The team caps the loop at two reflection passes to control cost.
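The three calls in this example (draft, critique as a compliance reviewer, rewrite) can be sketched roughly as below. This is a minimal illustration, not the team's actual implementation: `Model` is a hypothetical stand-in for whatever function sends a prompt to the LLM and returns text, and the prompt wording and the name `draft_review_rewrite` are assumptions made for the sketch.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an LLM call: prompt text in, completion text out.
Model = Callable[[str], str]

# Assumed persona wording; the source only says "prompted as a compliance reviewer".
CRITIC_PERSONA = (
    "You are a compliance reviewer. List the concrete compliance issues "
    "in the draft below, one per line."
)


def build_critique_prompt(draft: str) -> str:
    """Wrap the finished draft in the critic persona."""
    return f"{CRITIC_PERSONA}\n\nDRAFT:\n{draft}"


def build_revision_prompt(draft: str, critique: str) -> str:
    """Give the rewriter both the original draft and the critique."""
    return (
        "Rewrite the draft so that every listed issue is resolved.\n\n"
        f"DRAFT:\n{draft}\n\nISSUES:\n{critique}"
    )


def draft_review_rewrite(model: Model, task: str) -> Dict[str, str]:
    """One reflection pass: draft, critique, rewrite (three model calls)."""
    draft = model(task)
    critique = model(build_critique_prompt(draft))
    final = model(build_revision_prompt(draft, critique))
    return {"draft": draft, "critique": critique, "final": final}
```

Keeping the intermediate draft and critique in the return value makes it possible to log what the critic flagged, which is useful when comparing against the issues legal raises later.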
Solution
Therefore:
After producing an output, prompt the model (often under a critic persona) to find issues. Feed the original output and the critique into a revision step. Repeat until a stop condition fires: the critic reports no new issues, or a maximum number of iterations is reached.
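A minimal sketch of this loop with both stop conditions follows. The callables `generate`, `critique`, and `revise` are hypothetical placeholders for the underlying model calls, and reading "no new issues" as "the critic raised nothing it had not already raised in an earlier pass" is an assumption of this sketch; a stricter variant would stop only on an empty issue list.

```python
from typing import Callable, List, Set, Tuple


def reflect(
    generate: Callable[[], str],
    critique: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_passes: int = 2,
) -> Tuple[str, int, str]:
    """Run generate -> critique -> revise until a stop condition fires.

    Returns (final output, number of revision passes, stop reason).
    """
    output = generate()
    seen: Set[str] = set()  # issues raised in earlier passes
    passes = 0
    while passes < max_passes:
        issues = critique(output)
        new_issues = [issue for issue in issues if issue not in seen]
        if not new_issues:
            # Stop condition 1: the critic has nothing new to say.
            return output, passes, "no_new_issues"
        seen.update(new_issues)
        output = revise(output, issues)
        passes += 1
    # Stop condition 2: the iteration cap was reached.
    return output, passes, "max_passes"
```

Returning the stop reason alongside the output lets the caller distinguish a draft the critic accepted from one that merely ran out of budget.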
What this pattern forbids. The reviewer may only critique against criteria fixed by the surrounding system; free-form criteria invention is forbidden when the pattern is used at a correctness boundary.
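One way the surrounding system might enforce this is to hand the reviewer a fixed list of criterion IDs and discard any finding that does not cite one of them. The sketch below assumes a hypothetical `ID: finding` reply format; the rubric entries are invented purely for illustration and `build_rubric_prompt` and `parse_findings` are illustrative names, not part of any library.

```python
from typing import Dict, List, Tuple

# Illustrative rubric only; a real one would be hand-authored by the owning team.
RUBRIC: Dict[str, str] = {
    "C1": "Every forward-looking statement carries a disclaimer.",
    "C2": "No performance claim appears without an approved source.",
}


def build_rubric_prompt(draft: str, rubric: Dict[str, str]) -> str:
    """Ask the reviewer to critique only against the fixed criteria."""
    criteria = "\n".join(f"{cid}: {text}" for cid, text in rubric.items())
    return (
        "Review the draft against these criteria only. Report each finding "
        "on its own line as '<criterion id>: <finding>'.\n\n"
        f"CRITERIA:\n{criteria}\n\nDRAFT:\n{draft}"
    )


def parse_findings(
    reply: str, rubric: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split the reply into findings tied to a rubric ID and everything else."""
    accepted: List[Tuple[str, str]] = []
    rejected: List[str] = []
    for raw in reply.splitlines():
        line = raw.strip()
        if not line:
            continue
        criterion_id, sep, finding = line.partition(":")
        criterion_id = criterion_id.strip()
        finding = finding.strip()
        if sep and criterion_id in rubric and finding:
            accepted.append((criterion_id, finding))
        else:
            # Invented criteria or free-form remarks never reach the revision step.
            rejected.append(line)
    return accepted, rejected
```

Only the accepted findings would be passed on to the revision step; the rejected lines can be logged to see how often the reviewer drifts outside the rubric.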
The smaller patterns that complete this one —
- generalises Frozen Rubric Reflection ★: Constrain reflection to a fixed, hand-authored rubric of criteria so the reviewer cannot invent new ones each run.
- generalises Reflexion ·: Have the agent write linguistic lessons from past failures and consult them in future episodes.
- generalises Chain of Verification ★: Reduce hallucination by drafting an answer, generating independent verification questions, answering them in isolation, and revising.
- generalises Self-Refine ★★: Iterate generate → feedback (same model) → refine until a stop criterion fires, with no separate critic model.
- generalises Tool-Augmented Self-Correction ★: Self-correct LLM outputs by interactively critiquing them with external tools (search, code execution, calculator).
- generalises Cross-Reflection ★: Reflection step performed by a *different* agent or foundation model from the original generator, so critique error is decorrelated from generation error.
- generalises Generator-Critic Separation ★: Strict role separation between a Generator agent that produces drafts and a Critic agent that judges them against pre-defined criteria; the Critic never generates.
- generalises Human Reflection ★: Reflection loop that explicitly collects human feedback (not approval) on agent plans to improve them, distinct from approval gates where the human only says yes/no.
And the patterns that stand alongside it, or against it —
- alternative-to Same-Model Self-Critique ✕: Anti-pattern: have the same model both produce an answer and critique it, expecting independence.
- complements Commitment Tracking ·: Extract stated intents from each agent turn into a structured ledger with open / followed-through / expired status, making the gap between promise and follow-through visible and auditable.