Safety & Control

Human-in-the-Loop

Require explicit human approval at defined points before the agent performs an action.

Problem

If the agent acts fully autonomously across all action classes, then any moment of model overconfidence becomes a real-world incident: a typo-squatted vendor gets paid, the wrong customer gets emailed, the production database loses a table. If the agent gates every action behind human approval, users get approval-fatigued, start clicking through prompts without reading them, and the gating stops protecting anyone. Without a way to single out the small set of action classes that genuinely warrant a pause, the team has to choose between unsafe autonomy and unusable friction.

Solution

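Identify the boundary: the small set of action classes whose consequences warrant a pause. When the agent proposes an action in one of those classes, pause the loop and surface the proposed action with enough context for the human to decide. Require an explicit approve or reject. Resume on approve; abort or replan on reject. Log the decision either way.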
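The steps above can be sketched as a small gate that sits between the agent and its tools. This is a minimal illustration, not a reference implementation: the names `Action`, `ApprovalGate`, `gated_kinds`, and `ask_human` are hypothetical, and the reviewer is modelled as a plain callback so the sketch stays independent of any particular UI or agent framework.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional, Set, Tuple


class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"


@dataclass
class Action:
    kind: str                 # action class, e.g. "pay_vendor" (hypothetical)
    summary: str              # context shown to the human reviewer
    run: Callable[[], str]    # the side effect, deferred until approved


@dataclass
class ApprovalGate:
    gated_kinds: Set[str]                      # the boundary: classes that pause
    ask_human: Callable[[Action], Decision]    # assumed reviewer callback
    log: List[Tuple[str, str]] = field(default_factory=list)

    def execute(self, action: Action) -> Optional[str]:
        # Actions outside the boundary run without interruption.
        if action.kind not in self.gated_kinds:
            return action.run()
        # Pause: hand the proposed action to the human and wait.
        decision = self.ask_human(action)
        # Log the decision whether it was approve or reject.
        self.log.append((action.kind, decision.value))
        if decision is Decision.APPROVE:
            return action.run()
        # Rejected: return None so the caller can abort or replan.
        return None
```

Keeping `gated_kinds` small is what addresses approval fatigue in this sketch: only actions inside the boundary ever reach the reviewer.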

When to use

  • Action consequences at a defined boundary are too costly to leave to the model alone.
  • A human reviewer is reachable within the latency budget the workflow allows.
  • Approve, reject, and resume semantics can be expressed cleanly in the agent loop.
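The last two conditions can be made concrete with a short sketch of waiting for a reviewer within a latency budget and mapping the outcome back onto the loop. The function names `wait_for_decision` and `next_step` are hypothetical, the reviewer's answer is assumed to arrive on a standard-library `queue.Queue`, and treating a timeout as an abort (failing closed) is an assumption of this sketch rather than something the pattern prescribes.

```python
import queue


def wait_for_decision(inbox: queue.Queue, budget_seconds: float) -> str:
    """Block until the reviewer answers or the latency budget runs out."""
    try:
        answer = inbox.get(timeout=budget_seconds)
    except queue.Empty:
        # No reviewer was reachable within the budget.
        return "timeout"
    # Anything other than an explicit approve is treated as a reject.
    return "approve" if answer == "approve" else "reject"


def next_step(decision: str) -> str:
    """Map a reviewer decision onto what the agent loop does next."""
    if decision == "approve":
        return "resume"    # carry out the proposed action
    if decision == "reject":
        return "replan"    # ask the agent for an alternative
    return "abort"         # timeout: fail closed (assumption)
```

If timeouts turn out to be common in practice, that suggests the second condition does not hold for the workflow and the boundary or the reviewer pool may need revisiting.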

