Safety & Control

Calibrated Help-Gate via Conformal Prediction

Use conformal prediction to form a calibrated set of candidate actions and have the agent ask a human for help only when that set is not a singleton, giving a statistical task-completion guarantee.

Problem

Deciding when an agent should defer to a human is usually done with an uncalibrated confidence number, which gives no guarantee about how often the agent will be wrong when it proceeds. Set the bar too high and the human is flooded with needless questions; too low and the agent confidently acts on instructions it has misunderstood. The agent needs a principled, tunable rule for when to ask that comes with a real guarantee on task success.

Solution

Collect a calibration set of scored decisions for which the correct action is known, and pick a target success level. At run time the planner emits candidate next actions with scores; conformal prediction uses a score threshold fitted on the calibration set to turn those scores into a prediction set that contains the correct action with probability at least the chosen coverage level, provided run-time inputs are exchangeable with the calibration data. If the set contains exactly one action the agent acts autonomously; if it contains more than one, or none, the agent is uncertain and asks the human to choose. The coverage level tunes the trade-off between how often the agent asks and how often it acts wrongly, and the guarantee on the task-completion rate comes from calibration on held-out data rather than from the model's self-assessment.
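The gate described above can be sketched with split conformal prediction. This is a minimal illustration, not a reference implementation: it assumes planner scores lie in [0, 1], uses 1 - score as the nonconformity measure, and all function names (conformal_threshold, prediction_set, help_gate) and action labels are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def conformal_threshold(cal_true_scores: Sequence[float], epsilon: float) -> float:
    """Fit the nonconformity threshold from a calibration set.

    cal_true_scores: the planner's score for the *correct* action in each
    calibration example (assumed to lie in [0, 1]).
    epsilon: tolerated miss rate, so the target coverage is 1 - epsilon.
    """
    n = len(cal_true_scores)
    # Nonconformity: how poorly the planner scored the correct action.
    nonconformity = sorted(1.0 - s for s in cal_true_scores)
    # Finite-sample corrected rank used by split conformal prediction.
    k = math.ceil((n + 1) * (1.0 - epsilon))
    if k > n:
        # Too few calibration points for this coverage: admit every action.
        return float("inf")
    return nonconformity[k - 1]


def prediction_set(candidates: Dict[str, float], qhat: float) -> List[str]:
    """Keep every candidate whose nonconformity is within the threshold."""
    return [action for action, score in candidates.items() if 1.0 - score <= qhat]


def help_gate(candidates: Dict[str, float], qhat: float) -> Tuple[str, List[str]]:
    """Act only on a singleton set; otherwise defer to the human."""
    actions = prediction_set(candidates, qhat)
    if len(actions) == 1:
        return ("act", actions)
    # Several plausible actions, or none at all: ask for help.
    return ("ask", actions)


# Illustrative calibration scores (made up for this sketch).
cal_scores = [0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.9, 0.75, 0.88, 0.92]
qhat = conformal_threshold(cal_scores, epsilon=0.2)

clear_case = help_gate({"pick_up_cup": 0.9, "pick_up_bowl": 0.05}, qhat)
ambiguous_case = help_gate({"pick_up_cup": 0.8, "pick_up_bowl": 0.75}, qhat)
no_good_option = help_gate({"pick_up_cup": 0.4, "pick_up_bowl": 0.3}, qhat)
```

Lowering epsilon raises the threshold, which enlarges the prediction sets and makes the agent ask more often; raising it does the opposite. The coverage guarantee in this sketch is marginal (averaged over calibration and test draws) and applies per decision, so a multi-step task would need the calibration to be done at the level the guarantee is wanted.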

When to use

  • The agent must decide step-by-step whether to act or defer, and acting wrongly is costly.
  • A held-out calibration set and a target success level are available.
  • The planner can emit candidate actions with scores that conformal prediction can turn into a set.
