VII · Verification & Reflection · Emerging

Confidence-Checking Workflow

also known as Per-Part Confidence Annotation, Junior-Analyst Triage

Always ask the agent, for each part of its output, to state its confidence and identify which parts need human verification, like triaging a junior analyst's work.

Context

The agent produces analyses (financial, medical, research) whose parts vary in confidence. The user treats the output as homogeneous. Confident-sounding false claims (false-confidence-syndrome) receive the same trust as well-grounded conclusions. Errors slip through where the user lacks the expertise to spot them.

Problem

A homogeneous output hides per-part confidence variation. The user has no signal to apply expertise selectively. The agent has the information (it 'knows' where it is uncertain) but defaults to confident prose throughout.

Forces

  • Per-part confidence is awkward in narrative outputs.
  • Asking for confidence adds prompt complexity and output size.
  • Self-reported confidence is itself unreliable and may be poorly calibrated (false-confidence-syndrome).

Example

A financial agent produces an acquisition analysis. Output sections: revenue projection (confidence: HIGH, no flag), synergy estimate (confidence: MEDIUM, verify), regulatory risk (confidence: LOW, verify-required). The CFO spends 90% of review time on the LOW-confidence regulatory risk section, which is where a flaw is in fact found. Without the workflow, the same review time would have been spread uniformly across the sections and the regulatory flaw would likely have been missed.
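The arithmetic behind this example can be sketched as a weighted split of a fixed review budget. This is only an illustration: the function name `allocate_review_minutes`, the 60-minute budget, and the weights are assumptions chosen so that the LOW section receives 90% of the time, as in the example. They are not part of the pattern itself.

```python
# Hypothetical weights: how much review attention each confidence level attracts.
# The values are assumptions picked to reproduce the 90% split in the example.
REVIEW_WEIGHTS = {"HIGH": 0.5, "MEDIUM": 1.5, "LOW": 18.0}


def allocate_review_minutes(sections: dict[str, str], total_minutes: float) -> dict[str, float]:
    """Split a review budget across sections in proportion to their weights."""
    total_weight = sum(REVIEW_WEIGHTS[level] for level in sections.values())
    return {
        name: total_minutes * REVIEW_WEIGHTS[level] / total_weight
        for name, level in sections.items()
    }


sections = {
    "revenue projection": "HIGH",
    "synergy estimate": "MEDIUM",
    "regulatory risk": "LOW",
}
budget = allocate_review_minutes(sections, total_minutes=60)
uniform = 60 / len(sections)  # what each section would get without annotations

for name, minutes in budget.items():
    print(f"{name}: {minutes:.1f} min (uniform would be {uniform:.1f} min)")
```

With these assumed weights, the regulatory risk section gets 54 of the 60 minutes, against 20 minutes under a uniform split.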

Solution

Therefore:

Modify the agent's output template to require per-part annotations: each conclusion, fact, or recommendation is tagged with a confidence level (high/medium/low or numeric) and a 'verify' flag for the riskiest parts. The user interface surfaces these annotations prominently. Review time goes to the flagged parts rather than to full re-verification of the whole output. Pair with confidence-reporting, false-confidence-syndrome (the failure this addresses), and reflexive-metacognitive-agent.
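One way such an annotated output could be represented is sketched below. The names `Confidence`, `AnnotatedPart`, and `render_for_review` are hypothetical, and the ordering rule (flagged parts first, then lowest confidence first) is an assumption about what "surfacing prominently" might mean. The placeholder strings stand in for the agent's actual text.

```python
from dataclasses import dataclass
from enum import Enum


class Confidence(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class AnnotatedPart:
    """One conclusion, fact, or recommendation with its annotations."""
    title: str
    text: str
    confidence: Confidence
    verify: bool  # True when the part needs human verification


def render_for_review(parts: list[AnnotatedPart]) -> str:
    """Render parts with flagged items first, then lowest confidence first."""
    rank = {Confidence.LOW: 0, Confidence.MEDIUM: 1, Confidence.HIGH: 2}
    ordered = sorted(parts, key=lambda p: (not p.verify, rank[p.confidence]))
    lines = []
    for p in ordered:
        flag = "VERIFY" if p.verify else "no flag"
        lines.append(f"[{p.confidence.value.upper()} | {flag}] {p.title}: {p.text}")
    return "\n".join(lines)


# Placeholder content mirroring the acquisition example.
analysis = [
    AnnotatedPart("Revenue projection", "(agent text)", Confidence.HIGH, False),
    AnnotatedPart("Synergy estimate", "(agent text)", Confidence.MEDIUM, True),
    AnnotatedPart("Regulatory risk", "(agent text)", Confidence.LOW, True),
]
print(render_for_review(analysis))
```

Keeping the annotations as structured fields rather than inline prose makes it straightforward for a UI to sort, filter, or highlight the parts that need attention.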

What this pattern forbids. Analytical outputs must carry per-part confidence and verify flags; uniform-prose outputs are not accepted for downstream decisions.
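This rule could be enforced with a small gate in front of downstream consumers. The sketch below assumes the agent's output has already been parsed into a list of dictionaries with `confidence` and `verify` keys. That shape, the exception name, and the 0 to 1 range for numeric confidence are all assumptions made for illustration.

```python
ALLOWED_LEVELS = {"high", "medium", "low"}


class UnannotatedOutputError(ValueError):
    """Raised when an analytical output lacks per-part annotations."""


def _valid_confidence(value: object) -> bool:
    # Accept a named level, or a number in [0, 1] (assumed numeric scale).
    if isinstance(value, str):
        return value.lower() in ALLOWED_LEVELS
    if isinstance(value, bool):  # bool is an int subclass; reject it explicitly
        return False
    if isinstance(value, (int, float)):
        return 0.0 <= value <= 1.0
    return False


def check_annotations(parts: list[dict]) -> list[dict]:
    """Return parts unchanged if every part is annotated, else raise."""
    problems = []
    if not parts:
        problems.append("output contains no parts")
    for index, part in enumerate(parts):
        if not _valid_confidence(part.get("confidence")):
            problems.append(f"part {index}: missing or invalid confidence")
        if not isinstance(part.get("verify"), bool):
            problems.append(f"part {index}: missing verify flag")
    if problems:
        raise UnannotatedOutputError("; ".join(problems))
    return parts


annotated = [{"text": "(agent text)", "confidence": "low", "verify": True}]
uniform_prose = [{"text": "(agent text)"}]

check_annotations(annotated)  # passes through unchanged
try:
    check_annotations(uniform_prose)
except UnannotatedOutputError as err:
    print(f"rejected: {err}")
```

A gate like this checks only that annotations are present and well-formed. It cannot tell whether the stated confidence is accurate.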

And the patterns that stand alongside it, or against it —

  • complements · Confidence Reporting: Surface the agent's uncertainty about its answer alongside the answer itself.
  • alternative-to · False Confidence Syndrome: Anti-pattern in which the model produces incorrect answers with the same high confidence as correct ones, failing to vary its expressed certainty with its actual reliability. Oxford-documented for constraint-heavy prompts.
  • complements · Reflexive Metacognitive Agent: The agent maintains an explicit self-model of its own capabilities, confidence and limitations, and reasons over that model when accepting, refusing, or handing off tasks.
  • complements · Human-in-the-Loop ★★: Require explicit human approval at defined points before the agent performs an action.
  • complements · Human Reflection: A reflection loop that explicitly collects human feedback (not approval) on agent plans to improve them, distinct from approval gates where the human only says yes or no.
