I · Reasoning · Emerging

Chain of Verification

also known as CoVe, Factored Verification, Verify Before Answering

Reduce hallucination by drafting an answer, generating independent verification questions, answering them in isolation, and revising.

This pattern helps complete certain larger patterns —

  • specialises Reflection ★★: Have the model review its own output and produce a revised version in one or more passes.

Context

A team is using a large language model to produce long-form factual writing: a biography of a person, a summary that names specific entities and dates, or a recommendation that cites particular products, papers, or sources. The output reads fluently and confidently, but a careful reader inspecting individual sentences finds claims that are subtly or completely wrong: a wrong birth year, an invented citation, a made-up product feature, a confidently stated "fact" with no basis in reality.

Problem

When the same model is then asked to check its own draft within the same conversation, it sees the draft text in its context window. Its follow-up answers are pulled towards agreeing with what was just written, so the same wrong claims get reaffirmed instead of caught. Simply telling the model 'now check this for errors' does not work, because the draft itself biases the verifier, and the hallucinations slip through into the final output.

Forces

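  • Each verification question must be answerable on its own, without reference to the draft or to the other questions.
  • Joint verification (all questions in one prompt) underperforms factored verification (each question answered in isolation).
  • Verification cost scales with question count, because each factored question is answered separately.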
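The tension between the last two forces shows up in how the verification prompts are built. The sketch below is illustrative only: `joint_prompts` and `factored_prompts` are hypothetical helper names, and the prompt wording is an assumption rather than anything this pattern prescribes.

```python
from typing import List


def joint_prompts(questions: List[str]) -> List[str]:
    """Joint verification: every question goes into a single prompt (one model call)."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, 1))
    return [f"Answer each question.\n\n{numbered}"]


def factored_prompts(questions: List[str]) -> List[str]:
    """Factored verification: one prompt per question (one model call each)."""
    return [f"Answer the question.\n\n{q}" for q in questions]


questions = [
    "Who wrote the paper titled X?",
    "In what year was paper X published?",
    "Which venue published paper X?",
]
print(len(joint_prompts(questions)), "call(s) for joint verification")
print(len(factored_prompts(questions)), "call(s) for factored verification")
```

The factored variant pays one call per question, which is the cost the third force refers to. In exchange, no answer can be influenced by the other questions or their answers.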

Example

A research agent confidently lists five 'recent papers' on a niche topic, two of which don't exist. Asking the model to check its own draft in the same conversation just produces equally confident reaffirmations. The team applies Chain of Verification: after the draft, the system generates verification questions about each citation, answers each one in a fresh context with no view of the draft, and revises. Fabricated citations get exposed because the verifier never saw the wrong claim to begin with.

Solution

Therefore:

A four-step pipeline:

  • Draft: produce an initial answer.
  • Plan: generate verification questions that cover the claims in the draft.
  • Execute: answer each question in isolation, without seeing the original draft.
  • Revise: rewrite the draft using the verification answers.
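The four steps can be sketched as a single function. This is a minimal sketch under stated assumptions, not a reference implementation: `LLM` is a hypothetical interface (any function from prompt string to completion string), `chain_of_verification` is an illustrative name, and the prompt wording is invented for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def chain_of_verification(query: str, llm: LLM) -> str:
    # 1. Draft: produce the initial answer.
    draft = llm(f"Answer the question.\n\nQuestion: {query}")

    # 2. Plan: ask for verification questions, one per line.
    plan = llm(
        "List one fact-checking question per line for each claim in this answer.\n\n"
        f"Question: {query}\nAnswer: {draft}"
    )
    questions: List[str] = [q.strip() for q in plan.splitlines() if q.strip()]

    # 3. Execute: one call per question. The prompt contains only the
    #    question, never the draft and never the other questions.
    checks: List[Tuple[str, str]] = [
        (q, llm(f"Answer concisely and factually.\n\nQuestion: {q}"))
        for q in questions
    ]

    # 4. Revise: only now are the draft and the verification answers combined.
    evidence = "\n".join(f"Q: {q}\nA: {a}" for q, a in checks)
    return llm(
        "Rewrite the draft so it agrees with the verification answers; "
        "drop any claim they contradict.\n\n"
        f"Question: {query}\nDraft: {draft}\nVerification:\n{evidence}"
    )
```

To use it, wrap whatever model client is available in a function with the `LLM` shape and pass it in. The draft reaches the model only in the Plan and Revise steps; the Execute step never sees it.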

What this pattern forbids. Coupled verification is not permitted: the draft must never be in context when the verification answers are produced.
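One way to enforce this rule in code is to build verification prompts from the question alone and to add a cheap guard that flags prompts containing draft text. The sketch below is an assumption-laden illustration: `verification_prompt`, `draft_sentences`, and `leaks_draft` are hypothetical names, and the verbatim-sentence check is a heuristic that will not catch a paraphrased leak.

```python
import re
from typing import List


def verification_prompt(question: str) -> str:
    """Takes only the question, so the draft cannot be passed in by accident."""
    return f"Answer concisely and factually.\n\nQuestion: {question}"


def draft_sentences(draft: str) -> List[str]:
    """Split the draft into sentences; very short fragments are ignored."""
    parts = re.split(r"(?<=[.!?])\s+", draft.strip())
    return [p.strip() for p in parts if len(p.strip()) >= 20]


def leaks_draft(prompt: str, draft: str) -> bool:
    """Heuristic guard: True if any draft sentence appears verbatim in the prompt."""
    lowered = prompt.lower()
    return any(s.lower() in lowered for s in draft_sentences(draft))


draft = "Ada Lovelace was born in 1816 in London. She wrote the first program."
clean = verification_prompt("In what year was Ada Lovelace born?")
coupled = clean + "\nDraft: " + draft

print(leaks_draft(clean, draft))    # False: the prompt holds only the question
print(leaks_draft(coupled, draft))  # True: the draft was appended to the prompt
```

Keeping the draft out of the function signature is the stronger safeguard. The string check is a secondary net for pipelines where prompts are assembled in several places.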

And the patterns that stand alongside it, or against it —

  • complements Self-Consistency ★★: Sample the same question multiple times at non-zero temperature and aggregate by majority or judge to mitigate hallucination.
  • composes-with Naive RAG ★★: Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
  • alternative-to Tool-Augmented Self-Correction: Self-correct LLM outputs by interactively critiquing them with external tools (search, code execution, calculator).
  • complements Hypothesis Tracking ·: Persist the agent's candidate provisional answers as a typed ledger of records carrying summary, confidence, status, and next-test, so guesses survive sessions and stay distinguishable from open questions.
