Plan-Reason-Evaluate-Feedback Loop
Also known as: PREF loop, planning-reasoning-evaluation-feedback construction
Build the agent's control logic as a loop with four stages. Plan drafts a candidate approach. Reason fills it in using chain-of-thought or tree-of-thoughts. Evaluate scores the result, using self-consistency or a judge. Feedback hands the lessons back to Plan for the next round. Make each stage its own step with its own metrics, so the team can tune them one at a time. The thing to avoid is one giant prompt trying to do all four jobs and doing each one badly.
Methodology process overview
Intent. Split the agent's control loop into Plan, Reason, Evaluate, and Feedback so each one can be written, tested, and tuned on its own instead of crammed into a single prompt.
When to apply. Use this when the task is hard enough to need real planning and self-checking: research agents, coding agents, and multi-step problem solvers. It helps most when one big ReAct prompt has stopped improving. Don't apply it to single-shot generators or simple tool callers, where four stages are overkill for a trivial task. Skip it too when your latency budget cannot absorb the extra round-trips.
Inputs
- Task specification — What the agent must do, in a form you can break into sub-steps.
- Evaluator definition — The judge, rubric, or self-consistency check that will score what Reason produces.
Outputs
- Four-stage control loop — Plan, Reason, Evaluate, and Feedback as four separate stages, each with its own metrics.
- Stage-level metrics — A success or failure signal for each stage that your telemetry can graph and alert on.
Steps (6)
Author the Plan stage
Draft a candidate plan or approach. Use single-path or multi-path plan generation depending on cost and risk.
Uses: Plan-and-Execute, Single-Path Plan Generator, Multi-Path Plan Generator
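As a rough illustration of the single-path versus multi-path choice, the sketch below treats single-path generation as the n_paths=1 case of multi-path generation. The names generate_plan, draft, and estimate are hypothetical and not taken from any framework; draft stands in for a model call, and estimate for whatever cheap heuristic you use to rank candidate plans.

```python
from typing import Callable

Plan = list[str]


def generate_plan(
    task: str,
    draft: Callable[[str, int], Plan],   # hypothetical: one model call per candidate
    estimate: Callable[[Plan], float],   # hypothetical: cheap ranking heuristic
    n_paths: int = 1,
) -> Plan:
    """Draft n_paths candidate plans and return the one the heuristic prefers.

    n_paths=1 is single-path generation: one draft, no comparison.
    n_paths>1 is multi-path generation: more cost, more chances at a good plan.
    """
    if n_paths < 1:
        raise ValueError("n_paths must be at least 1")
    candidates = [draft(task, seed) for seed in range(n_paths)]
    return max(candidates, key=estimate)


# Stand-ins for a real model call and a real heuristic (assumptions for the demo).
def fake_draft(task: str, seed: int) -> Plan:
    return [f"{task}: step {i + 1}" for i in range(4 - seed)]


def prefer_short(plan: Plan) -> float:
    return -len(plan)


single = generate_plan("fix bug", fake_draft, prefer_short, n_paths=1)
multi = generate_plan("fix bug", fake_draft, prefer_short, n_paths=3)
print(len(single), len(multi))
```

Raising n_paths multiplies the drafting cost, which is the cost-versus-risk trade-off this step asks you to make.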
Author the Reason stage
Fill in the plan with chain-of-thought or tree-of-thoughts. Reasoning generates new detail. Do not fold it back into Plan.
Author the Evaluate stage
Score what Reason produced. Use self-consistency, an LLM judge, or a rubric. Evaluate must be independent enough to disagree.
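One way to build a self-consistency evaluator is a majority vote over several independently sampled answers. The sketch below assumes Reason has already been sampled several times and that answers can be compared as normalised strings. The Evaluation type and the 0.6 threshold are illustrative assumptions, not values from the source.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Evaluation:
    answer: str       # the majority answer across samples
    agreement: float  # fraction of samples that voted for it
    passed: bool      # True when agreement meets the threshold


def evaluate_self_consistency(samples: list[str], threshold: float = 0.6) -> Evaluation:
    """Score Reason's output by majority vote over sampled answers."""
    if not samples:
        raise ValueError("need at least one sample to evaluate")
    # Normalise lightly so trivial formatting differences do not split the vote.
    votes = Counter(s.strip().lower() for s in samples)
    answer, count = votes.most_common(1)[0]
    agreement = count / len(samples)
    return Evaluation(answer=answer, agreement=agreement, passed=agreement >= threshold)


# Four of five samples agree, so the evaluator passes the answer.
result = evaluate_self_consistency(["42", "42 ", "41", "42", "42"])
print(result)

# No majority: the evaluator disagrees with Reason instead of rubber-stamping it.
split = evaluate_self_consistency(["a", "b", "c", "d"])
print(split.passed)
```

A low agreement score is a useful signal in its own right: it tells Feedback that Reason is unstable on this task, even when one of the samples happens to be correct.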
Author the Feedback stage
Turn the score into a clear next step for the planner: accept, reject with reasons, replan, or escalate. Feedback is what closes the loop.
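The four outcomes named above can be made explicit as a small decision function. This is a minimal sketch under stated assumptions: the thresholds accept_at and replan_below are hypothetical, "reject" is read as "keep the plan and redo the reasoning", and "replan" as "discard the plan". Adjust both readings to your own evaluator.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"      # result is good enough; stop
    REJECT = "reject"      # keep the plan, redo the reasoning with the reasons attached
    REPLAN = "replan"      # the plan itself is the problem; draft a new one
    ESCALATE = "escalate"  # budget spent; hand off to a human or a fallback


@dataclass
class Feedback:
    action: Action
    reasons: list[str] = field(default_factory=list)


def make_feedback(
    score: float,
    issues: list[str],
    iteration: int,             # 1-based count of completed rounds
    max_iterations: int,
    accept_at: float = 0.8,     # assumed threshold
    replan_below: float = 0.4,  # assumed threshold
) -> Feedback:
    """Turn an evaluation score into one unambiguous instruction for Plan."""
    if score >= accept_at:
        return Feedback(Action.ACCEPT)
    if iteration >= max_iterations:
        return Feedback(Action.ESCALATE, issues + ["iteration budget exhausted"])
    if score < replan_below:
        return Feedback(Action.REPLAN, issues)
    return Feedback(Action.REJECT, issues)


print(make_feedback(0.5, ["weak evidence for claim 2"], iteration=1, max_iterations=3))
```

Returning a single enum value rather than free text keeps the planner's next move testable: each branch can be asserted on directly.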
Instrument each stage independently
Emit traces and metrics scoped to each stage. You cannot tune the four stages if they share one set of telemetry.
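Stage-scoped telemetry can be as small as a context manager that records calls, failures, and elapsed time under the stage's name. The StageMetrics class below is a hypothetical in-memory stand-in. In a real system you would presumably forward the same three signals to your tracing or metrics backend.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageMetrics:
    """In-memory counters keyed by stage name (a stand-in for real telemetry)."""

    def __init__(self) -> None:
        self.calls: dict[str, int] = defaultdict(int)
        self.failures: dict[str, int] = defaultdict(int)
        self.seconds: dict[str, float] = defaultdict(float)

    @contextmanager
    def track(self, stage: str):
        start = time.perf_counter()
        self.calls[stage] += 1
        try:
            yield
        except Exception:
            # Count the failure against this stage only, then let it propagate.
            self.failures[stage] += 1
            raise
        finally:
            self.seconds[stage] += time.perf_counter() - start


metrics = StageMetrics()

with metrics.track("plan"):
    plan = ["step 1", "step 2"]

try:
    with metrics.track("evaluate"):
        raise RuntimeError("judge timed out")
except RuntimeError:
    pass

print(dict(metrics.calls), dict(metrics.failures))
```

Because every record is keyed by stage, a spike in Evaluate failures shows up separately from a slow Plan stage and does not blur into one loop-level number.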
Bound iterations
Add a max-iterations budget and a test for when to stop. Otherwise the loop can bounce between Plan and Feedback forever.
Uses: Step Budget
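Putting the iteration bound and the stop test together, a bounded loop might look like the sketch below. The stage functions are passed in as plain callables, and all names are hypothetical: run_loop, LoopResult, and the accept_at and min_gain thresholds are assumptions for illustration. The convergence test used here (stop when the score improves by less than min_gain) is one simple choice among several.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopResult:
    output: Optional[str]
    score: float
    iterations: int
    status: str  # "accepted", "stalled", or "budget_exhausted"


def run_loop(
    task: str,
    plan: Callable[[str, Optional[str]], str],     # (task, feedback) -> plan
    reason: Callable[[str], str],                  # plan -> output
    evaluate: Callable[[str], tuple[float, str]],  # output -> (score, notes)
    max_iterations: int = 3,
    accept_at: float = 0.8,   # assumed acceptance threshold
    min_gain: float = 0.01,   # assumed minimum improvement per round
) -> LoopResult:
    feedback: Optional[str] = None
    best_output: Optional[str] = None
    best_score = float("-inf")

    for iteration in range(1, max_iterations + 1):
        output = reason(plan(task, feedback))
        score, notes = evaluate(output)

        gain = score - best_score  # measured before the best score is updated
        if score > best_score:
            best_output, best_score = output, score

        if score >= accept_at:
            return LoopResult(best_output, best_score, iteration, "accepted")
        if iteration > 1 and gain < min_gain:
            # Convergence test: another round is unlikely to help.
            return LoopResult(best_output, best_score, iteration, "stalled")

        feedback = notes  # Feedback closes the loop into the next Plan call

    return LoopResult(best_output, best_score, max_iterations, "budget_exhausted")


# Stub stages for the demo; real ones would call a model and a judge.
def stub_plan(task: str, feedback: Optional[str]) -> str:
    return f"{task} (feedback: {feedback})"


def stub_reason(candidate_plan: str) -> str:
    return f"answer for {candidate_plan}"


scores = iter([0.3, 0.5, 0.9])


def stub_evaluate(output: str) -> tuple[float, str]:
    return next(scores), "tighten step 2"


result = run_loop("summarise report", stub_plan, stub_reason, stub_evaluate, max_iterations=5)
print(result.status, result.iterations, result.score)
```

Every exit path returns a status, so the caller can tell a result that was accepted from one that merely ran out of budget.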
Principles
- Plan, Reason, Evaluate, and Feedback are four jobs. Give each its own prompt and metric.
- The evaluator must be able to disagree. Same-model self-critique is a failure mode, not a method.
- Feedback closes the loop, or it is not feedback.
- Bound the iterations. Set a budget and a test for when to stop.
Known failure modes (3)
- Same-Model Self-Critique
Evaluate uses the same model and context as Reason and rubber-stamps every result.
- Infinite Debate
Feedback keeps reopening the plan and never converges, because the loop has no iteration budget or convergence test.
- Unbounded Loop
Feedback keeps returning to Plan because no stage has the authority to declare the task done.
Related patterns (8)
- ★★Plan-and-Execute
Plan all the steps once with a strong model, then execute each step with a cheaper model under the plan.
- ★★Chain of Thought
Elicit multi-step reasoning by prompting the model to produce intermediate steps before its final answer.
- ★Tree of Thoughts
Search over a tree of partial reasoning states with explicit lookahead, evaluation, and backtracking.
- ★★Self-Consistency
Sample the same question multiple times at non-zero temperature and aggregate by majority or judge to mitigate hallucination.
- ★Agent-as-a-Judge
Evaluate an agent's full trajectory (steps, tool calls, intermediate states) by another agent rather than scoring only the final output.
- ★★Evaluator-Optimizer
One LLM generates; another evaluates and feeds back; loop until criteria are met.
- ·Reflexion
Have the agent write linguistic lessons from past failures and consult them in future episodes.
- ★★Step Budget
Cap the number of tool calls or loop iterations the agent is allowed within a single request.
Related compositions (2)
- recipe · abstract shape: Planning Loops
Different ways to structure 'think then act': linear ReAct, plan-then-execute, parallel DAG planning, tree search with backtracking, and the outer/inner planner+executor split.
- recipe · abstract shape: Reflection & Self-Correction
Patterns where the model reviews its own work before shipping it: scoped rubric reflection, self-refine, deterministic post-checks, process rewards.
Related methodologies (2)
- SPAR Agent Loop Design★
Give every agent the same four named phases, Sense, Plan, Act, and Reflect, so behaviour, traces, and failures line up with a phase instead of hiding in one murky loop.
- Agentic Workflow Construction★★
Make agent authors name the four parts and the freedom level before they code, so a failure points to one part instead of smearing across a vague agent.
Sources (2)
AI Agents in Action
Ch 11 'Agent planning and feedback', §11.5 'Applying planning, reasoning, evaluation, and feedback to assistant and agentic systems' “11.5 Applying planning, reasoning, evaluation, and feedback to assistant and agentic systems ... 11.5.1 Application of assistant/agentic planning ... 11.5.2 Application of assistant/agentic reasoning ... 11.5.3 Application of evaluation to…”
The Four Pillars of Effective AI Agents: Planning, Reasoning, Evaluation, and Feedback
“planning, reasoning, evaluation, and feedback ... tackle complex challenges, learn from experiences, and continuously improve”
Provenance
- Added to catalog:
- Last updated:
- Verification status: verified