Verification & Reflection

Frozen Rubric Reflection

Constrain reflection to a fixed, hand-authored rubric of criteria so the reviewer cannot invent new ones each run.

Problem

When the reviewer is given a free-form instruction like 'review this output and flag any issues', it invents fresh criteria on every call: today it notices tone, tomorrow it notices grammar, the day after it notices factual claims. Reviews stop being comparable across runs because they were not measuring the same thing. The reviewer also tends to drift over time, gradually narrowing its attention onto whatever issue it last saw and forgetting categories it used to check. The team has no stable answer to the question 'what did the reviewer actually look for on this run?', which makes the reviewer useless for audit and unreliable as a gate.

Solution

A fixed rubric file (or schema) lists exactly the categories the reviewer may flag. The reviewer prompt includes the rubric and a JSON Schema enforcing it. Temperature is zero. Output validates against the schema; new finding categories are rejected.

When to use

Review criteria should be stable across runs so verdicts compare.
Auditors need an explicit list of categories the model checked.
Reflection drift across calls is producing inconsistent reviews.

Open the full interactive page →

Diagram, neighbourhood map, code examples, related patterns and full provenance.

Problem

Solution

When to use

Related