Planner-Generator-Evaluator Harness
Decompose a long-running job into three role-isolated agents — a Planner emitting a feature list, a Generator working one chunk per fresh context, and an Evaluator grading against a rubric without seeing the Generator's trace.
Problem
A single agent trying to do all of this in one head hits context limits within a few hours and conflates planning, generation, and self-grading; its own scratch reasoning leaks into how it judges its work. A two-role loop where one agent generates and the other critiques lets the generator read the critic's notes as hints and game them. Generic orchestrator-worker decomposition does not name a grader role with hard isolation, so quality drifts run by run and there is no fixed place to enforce the acceptance bar. The team needs a three-way split where each role's context stays small, the grader cannot be socially engineered by the generator, and the plan survives across runs.
Solution
The Planner runs once (or rarely) and emits a structured feature-list artefact: ordered chunks, acceptance criteria, dependencies. The Generator is invoked per-chunk in a fresh context that includes only (a) the feature-list, (b) the current artefact state, and (c) the chunk to build; it produces a new artefact revision and exits. The Evaluator is invoked in its own fresh context with only the artefact and the fixed rubric; it returns pass/fail plus structured findings, and never sees the Generator's chain of thought or scratch notes. A small driver loop routes between the three: failed evaluation re-invokes the Generator with the findings as input (not the full Evaluator transcript). The fixed rubric makes Evaluator behaviour reproducible across runs.
When to use
- A single agent run cannot fit the job into one context window.
- There is a clear external artefact that can be evaluated without inspecting how it was produced.
- A stable rubric exists or can be authored.
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.