Mandatory Red-Flag Escalation
Maintain a deterministic set of high-risk triggers so that on any match the agent immediately aborts its workflow and hands off to a human, without weighing whether to escalate.
Problem
When a high-stakes signal appears, an agent left to its own judgement tends to keep doing what it was doing: it asks one more clarifying question, finishes the current step, or scores the situation against a soft threshold before deciding to escalate. Each of those extra turns is a chance to mishandle an emergency, and a model that treats escalation as one option among many will sometimes rationalise staying in the loop. The system needs a guarantee that a matched red flag stops normal handling immediately, not a tendency that usually does.
Solution
Keep red-flag triggers as a maintained, deterministic rule set — keyword and phrase matches, classifier thresholds, or structured-field conditions — evaluated against every turn before the normal workflow runs. The check sits outside the model's discretion: a match fires regardless of what the agent was planning. On a match the runtime short-circuits the remaining workflow, stops any further autonomous handling, and performs a warm handoff that passes the human operator the transcript and any structured context already collected. The model never votes on whether to escalate; its only role after a match is to relay the brief, fixed escalation message while control transfers. The trigger set is versioned and reviewed against the underlying protocol so it stays neither too noisy nor too narrow.
When to use
- A small set of inputs signal life-, safety-, or money-critical situations where continuing the normal workflow at all is unsafe.
- The high-risk signals can be expressed as deterministic, auditable triggers checked on every turn.
- A human or specialised path exists that must take over immediately on a match.
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.