Methodology · Multi-Agent Design · proven · verified

Writer-Critic Iterative Loop Construction

also known as generator-judge construction, writer-reviewer pairing

Applies to: multi-agent-system, agent, coding-agent

Tags: writer-critic, iterative-loop, rubric, bounded-iteration

Build a loop with two agents. One agent produces an artifact, such as code, a plan, a summary, or a draft. A separate agent checks it against a clear pass/fail rubric. They go back and forth until the checker passes the output or the loop reaches a set number of rounds. This methodology is the construction procedure for the Generator-Critic Separation pattern: it covers how to wire the pair together, what the rubric has to contain, and how to cap the loop so it cannot run forever.

Intent. Wire a maker agent and a checker agent into a loop with a clear rubric and a hard round limit, so quality climbs through bounded review instead of a single shot.

When to apply. Use this when the output has checkable quality (for example, code that must pass tests, a plan that must meet named constraints, or a summary that must cover set points) and a single shot is not good enough. Apply it only when you can actually write the rubric down. Don't apply it when 'good' is purely a matter of taste and the rubric boils down to 'the user likes it'; at that point you need a person in the loop, not a checker agent. One exception: even taste-based tasks can use a checker if you can spell out what to avoid, because those failures can be written as pass/fail checks.

Inputs

  • Task specification: What the maker must produce. Include the input format and what counts as success.
  • Pass/fail rubric: The checklist the checker applies. It must be clear enough that two sensible checkers would reach the same verdict.
  • Maximum iteration count: A hard limit on the number of maker-checker rounds, usually 3 to 8. Past that limit the loop stops and returns the best draft so far.
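
As a rough illustration, the three inputs can be bundled into one validated object before any agent is built. This is a minimal sketch: the `LoopInputs` name, its fields, and the default of 5 rounds are assumptions made for the example, not part of the methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopInputs:
    """Hypothetical container for the three inputs listed above."""
    task_spec: str            # what the maker must produce, incl. input format and success criteria
    rubric: tuple[str, ...]   # pass/fail checks, each phrased so two checkers would agree
    max_rounds: int = 5       # hard cap; 5 is an arbitrary default inside the usual 3 to 8 range

    def __post_init__(self) -> None:
        # Reject configurations that would make the loop meaningless or unbounded.
        if not self.task_spec.strip():
            raise ValueError("task_spec must not be empty")
        if not self.rubric:
            raise ValueError("rubric needs at least one check")
        if self.max_rounds < 1:
            raise ValueError("max_rounds must be at least 1")
```

Freezing the dataclass mirrors the advice to fix the rubric before building either agent: once constructed, the inputs cannot be edited in place.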

Outputs

  • Generator agent: The agent that produces the output and takes in the checker's feedback on later rounds.
  • Critic agent: The agent that applies the rubric and returns pass or fail with structured feedback. It never quietly rewrites the output itself.
  • Loop controller: The driver that connects maker and checker, enforces the round limit, and returns the best output if the loop runs out of rounds.

Steps (6)

  1. Author the rubric the critic will apply

    Write the rubric down before either agent exists. List what to check, the verdict shape (pass/fail or a score), and what counts as 'fail with hints' versus 'fail terminally'. A rubric written after the maker is skewed toward what the maker already does well.
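
One way to write such a rubric down is as plain data plus a rule for folding per-check results into a single verdict. The sketch below is illustrative only: `Check`, `Outcome`, `overall_outcome`, and the sample summary rubric are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASS = "pass"
    FAIL_WITH_HINTS = "fail_with_hints"  # maker should revise and retry
    FAIL_TERMINAL = "fail_terminal"      # retrying cannot help; stop the loop


@dataclass(frozen=True)
class Check:
    check_id: str
    question: str           # phrased so the answer is yes or no
    terminal: bool = False  # True: failing this check ends the loop outright


def overall_outcome(checks: list[Check], failed_ids: set[str]) -> Outcome:
    """Fold per-check results into the single verdict the loop acts on."""
    failed = [c for c in checks if c.check_id in failed_ids]
    if not failed:
        return Outcome.PASS
    if any(c.terminal for c in failed):
        return Outcome.FAIL_TERMINAL
    return Outcome.FAIL_WITH_HINTS


# Example rubric for a summarization task (contents are illustrative).
SUMMARY_RUBRIC = [
    Check("covers_points", "Does the summary mention every required point?"),
    Check("length", "Is the summary at most 200 words?"),
    Check("on_topic", "Is the summary about the supplied document?", terminal=True),
]
```

Marking a check as terminal is one possible way to encode the 'fail with hints' versus 'fail terminally' distinction; a score-based verdict would need a different shape.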

  2. Build the generator agent

    On the first turn the maker produces the output from the task spec alone. On later turns it takes in the checker's structured feedback. The maker never grades itself.

    uses: Augmented LLM
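
The two kinds of maker turn can be sketched as a prompt builder plus a thin wrapper around a model call. Everything here is an assumption for illustration: `ModelFn` stands in for whatever LLM client is in use, and the prompt wording is only an example.

```python
from typing import Callable, Optional

# Hypothetical model interface: takes a prompt string, returns generated text.
ModelFn = Callable[[str], str]


def build_maker_prompt(task_spec: str,
                       previous_draft: Optional[str] = None,
                       hints: Optional[list[str]] = None) -> str:
    """First turn: task spec only. Later turns: spec, prior draft, and checker hints."""
    if previous_draft is None:
        return f"Task:\n{task_spec}\n\nProduce the output."
    hint_lines = "\n".join(f"- {h}" for h in (hints or []))
    return (
        f"Task:\n{task_spec}\n\n"
        f"Your previous draft:\n{previous_draft}\n\n"
        f"A reviewer rejected it for these reasons:\n{hint_lines}\n\n"
        "Revise the draft to address every point. Do not assess your own work."
    )


def run_maker(model: ModelFn, task_spec: str,
              previous_draft: Optional[str] = None,
              hints: Optional[list[str]] = None) -> str:
    """Return the maker's draft; grading is left entirely to the checker."""
    return model(build_maker_prompt(task_spec, previous_draft, hints))
```

Passing the model in as a function keeps the sketch testable with a stub and leaves the choice of provider open.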

  3. Build the critic agent with role separation

    The checker only judges. It does not produce the output. It returns a structured verdict, and on a fail it returns the smallest useful hint that would let the maker improve. If one model both makes and checks, the two roles collapse into one agent. Use a different model, or at least a prompt that hard-separates the roles.

    uses: Generator-Critic Separation, Tool-Augmented Self-Correction
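
Role separation can also be enforced at the parsing boundary. The sketch below assumes the checker replies with a JSON object holding `passed` and `hints`; that schema, the `Verdict` type, and `parse_verdict` are hypothetical. Any extra field, such as a rewritten draft, is rejected so the checker cannot quietly take over the maker's job.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    passed: bool
    hints: tuple[str, ...] = ()


ALLOWED_KEYS = {"passed", "hints"}  # deliberately no field for a rewritten output


def parse_verdict(raw: str) -> Verdict:
    """Parse the checker's reply; raise ValueError if it is malformed or out of role."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("verdict must be a JSON object")
    extra = set(data) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"checker returned fields outside its role: {sorted(extra)}")
    if not isinstance(data.get("passed"), bool):
        raise ValueError("verdict needs a boolean 'passed' field")
    hints = data.get("hints", [])
    if not isinstance(hints, list):
        raise ValueError("'hints' must be a list")
    if not data["passed"] and not hints:
        raise ValueError("a failing verdict must carry at least one hint")
    return Verdict(data["passed"], tuple(str(h) for h in hints))
```

Raising instead of silently failing the draft lets the controller decide whether to re-ask the checker; that is a design choice of this sketch, not a requirement of the methodology.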

  4. Wire the bounded loop

    The loop controller alternates maker and checker. It exits with success as soon as the checker passes the output. Otherwise it stops at the round limit and returns the highest-rated output seen so far. It emits a trace for each round so you can audit the loop afterward.
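
A minimal controller along these lines might look as follows. It assumes the checker reports a numeric `score` alongside pass/fail so that drafts can be ranked for the fallback; the `Review`, `RoundTrace`, and `LoopResult` types and the `make`/`check` callables are hypothetical stand-ins for the two agents.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    passed: bool
    score: float        # used only to rank drafts for the best-so-far fallback
    hints: list[str]


@dataclass
class RoundTrace:
    round_no: int
    passed: bool
    score: float


@dataclass
class LoopResult:
    output: str
    exit_reason: str    # "passed" or "round_limit"
    trace: list[RoundTrace]


def run_loop(make: Callable[[Optional[str], list[str]], str],
             check: Callable[[str], Review],
             max_rounds: int) -> LoopResult:
    """Alternate maker and checker until a pass or until max_rounds is reached."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    best_output: Optional[str] = None
    best_score = float("-inf")
    draft: Optional[str] = None
    hints: list[str] = []
    trace: list[RoundTrace] = []
    for round_no in range(1, max_rounds + 1):
        draft = make(draft, hints)   # first call receives no draft and no hints
        review = check(draft)
        trace.append(RoundTrace(round_no, review.passed, review.score))
        if review.passed:
            return LoopResult(draft, "passed", trace)
        if best_output is None or review.score > best_score:
            best_output, best_score = draft, review.score
        hints = review.hints
    # Round limit reached: fall back to the highest-scoring draft seen.
    return LoopResult(best_output, "round_limit", trace)
```

With a toy maker that appends one character per round and a checker that passes at three characters, `run_loop(make, check, 5)` exits with `"passed"` on round 3, while `run_loop(make, check, 2)` exits with `"round_limit"` and returns the second draft.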

  5. Calibrate the rubric against ground truth

    Run a sample of outputs through both the checker and a human grader. Tighten the rubric until the checker's verdicts agree with the human at an acceptable rate. A poorly calibrated checker either passes too much, which gives false confidence, or fails too much, which exhausts the loop.
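
The comparison against a human grader reduces to a few rates over paired verdicts. This is a sketch under the assumption that both verdicts are plain pass/fail booleans; `calibration_report` and its key names are made up for the example, and what counts as an acceptable rate is left to the team.

```python
def calibration_report(pairs: list[tuple[bool, bool]]) -> dict[str, float]:
    """Summarize checker-vs-human agreement.

    pairs: one (checker_passed, human_passed) tuple per sampled output.
    """
    if not pairs:
        raise ValueError("need at least one graded sample")
    n = len(pairs)
    agree = sum(c == h for c, h in pairs)
    false_pass = sum(c and not h for c, h in pairs)   # checker too lenient
    false_fail = sum(h and not c for c, h in pairs)   # checker too strict
    return {
        "agreement": agree / n,
        "false_pass_rate": false_pass / n,
        "false_fail_rate": false_fail / n,
    }
```

A high false-pass rate corresponds to the false-confidence failure described above, and a high false-fail rate to the loop exhausting its rounds.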

  6. Instrument failure modes

    Log round counts, exit reasons, and pass rates per task type. Watch for task types where the loop nearly always runs out of rounds. That signals the rubric cannot be met, or the maker cannot act on the checker's hints.
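
Those logs can be rolled up per task type with a small aggregation. The record layout (`task_type`, `rounds`, `exit_reason`) and the 0.2 pass-rate threshold below are assumptions chosen for illustration.

```python
from collections import defaultdict


def summarize_runs(runs: list[dict]) -> dict[str, dict[str, float]]:
    """Aggregate loop logs per task type.

    Each run is a dict with 'task_type', 'rounds', and 'exit_reason'
    ('passed' or 'round_limit').
    """
    by_type: dict[str, list[dict]] = defaultdict(list)
    for run in runs:
        by_type[run["task_type"]].append(run)
    summary = {}
    for task_type, group in by_type.items():
        n = len(group)
        summary[task_type] = {
            "runs": n,
            "pass_rate": sum(r["exit_reason"] == "passed" for r in group) / n,
            "mean_rounds": sum(r["rounds"] for r in group) / n,
        }
    return summary


def exhausted_types(summary: dict[str, dict[str, float]],
                    max_pass_rate: float = 0.2) -> list[str]:
    """Task types whose loops nearly always run out of rounds."""
    return sorted(t for t, s in summary.items() if s["pass_rate"] <= max_pass_rate)
```

Task types returned by `exhausted_types` are candidates for a rubric review or for a look at whether the maker can act on the checker's hints.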

Principles

  • The checker and the maker are separate roles. When one model does both, quality is capped by that model's blind spots.
  • Freeze the rubric before you build either agent. A rubric written afterward is tainted by what the maker already does well.
  • Every loop has a hard round limit and a 'best so far' fallback. An open-ended loop is an outage waiting to happen.
  • On a fail, the checker returns useful hints, not just a verdict. The maker has to be able to revise on the next turn.

Known failure modes (3)

Related patterns (5)

Related methodologies (2)

Sources (2)

Provenance

  • Added to catalog:
  • Last updated:
  • Verification status: verified