VIII · Safety & Control · Emerging

Kill Switch

also known as Out-of-Band Stop, Emergency Halt, Killbit, Halt All Agents, Stop Every Running Agent

Provide an out-of-band control plane that halts running agent instances without a redeploy.

Context

A team runs production agents that the operator may suddenly need to stop — a PII leak was discovered, the agent is hammering a third-party API after a cease-and-desist, a runaway cost spike just tripped an alarm, or a mass-action error is unfolding across customer accounts. Stopping has to happen now, not at the end of the current step, and it has to apply to every running instance regardless of which tool it is in the middle of.

Problem

An in-band stop hook that the agent's own loop checks at the start of each iteration only works if the agent's loop is still alive and cooperating. If the model is wedged inside a long tool call, infinite-looping on a degenerate state, or running tools that ignore process signals, the in-band stop never fires. Killing the operating-system process is a brutal fallback that loses provenance and any chance to run compensating actions. Without a stop primitive outside the agent's own control flow, operator authority disappears the moment the agent stops checking in.

Forces

  • False trips lose user work.
  • Out-of-band signals must propagate to all agent surfaces (model calls, tools, sub-agents).
  • Compensating actions on halt are non-trivial.

Example

An autonomous trading-research agent is running a multi-hour backtest loop when ops notices it is hammering a third-party data API whose provider has just sent a cease-and-desist email. The in-band stop hook is checked only by the agent's own loop, and that loop is wedged on a long tool call, so the hook never fires. The team adds an out-of-band kill-switch: a signed revocation token in a shared store that the runtime, not the agent, checks before every step and tool call. Flip the token and every running instance halts within one step. The OS-kill fallback is reserved for true emergencies.
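The runtime-side check in this example can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the team's implementation: `KillSwitchStore`, `GuardedRuntime`, and `Halted` are hypothetical names, and an in-memory flag stands in for the shared store.

```python
import threading


class Halted(RuntimeError):
    """Raised by the runtime once the kill switch has been flipped."""


class KillSwitchStore:
    """In-memory stand-in for the shared store (assumption: a real
    deployment would read a flag service or key-value store)."""

    def __init__(self) -> None:
        self._revoked = threading.Event()

    def revoke(self) -> None:
        self._revoked.set()

    def is_revoked(self) -> bool:
        return self._revoked.is_set()


class GuardedRuntime:
    """Owns every model and tool call, so the agent cannot skip the check."""

    def __init__(self, store: KillSwitchStore) -> None:
        self._store = store

    def _check(self) -> None:
        if self._store.is_revoked():
            raise Halted("kill switch revoked; refusing further calls")

    def call_tool(self, tool, *args, **kwargs):
        self._check()  # performed by the runtime, not by the agent loop
        return tool(*args, **kwargs)

    def call_model(self, model, prompt):
        self._check()
        return model(prompt)


store = KillSwitchStore()
runtime = GuardedRuntime(store)

# Before revocation, calls pass through the gate unchanged.
first = runtime.call_tool(lambda symbol: f"quote:{symbol}", "ACME")

store.revoke()  # the operator flips the switch out of band

try:
    runtime.call_tool(lambda symbol: f"quote:{symbol}", "ACME")
    halted = False
except Halted:
    halted = True
```

In this sketch the gate only blocks calls that have not started yet. A tool call that is already wedged still needs a timeout or the OS-kill fallback.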

Solution

Therefore:

Keep a signed revocation token or feature flag in a shared store that the agent runtime cannot bypass, and check it on every step. On revocation the agent halts: no further model calls, no further tool calls, and in-flight effects are compensated where possible. Killing the OS process remains the fallback, but it loses provenance.
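One way to make the revocation token "signed" is an HMAC over its payload, verified by the runtime before it acts on the token. The sketch below assumes a shared secret and a JSON payload with hypothetical `halt` and `issued_at` fields; none of these names come from the pattern itself. It also assumes a token with a bad signature is ignored rather than treated as a halt, which guards against false trips. A deployment could equally choose to raise an alert in that case.

```python
import hashlib
import hmac
import json

# Assumption: demo key only. A real key would come from a secret manager.
SECRET = b"demo-only-secret"


def sign_revocation(halt: bool, issued_at: int, key: bytes = SECRET) -> dict:
    """Build a token whose payload is authenticated with HMAC-SHA256."""
    payload = json.dumps({"halt": halt, "issued_at": issued_at}, sort_keys=True)
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "mac": mac}


def is_valid(token: dict, key: bytes = SECRET) -> bool:
    """Recompute the MAC and compare in constant time."""
    expected = hmac.new(key, token["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), token["mac"].encode())


def should_halt(token, key: bytes = SECRET) -> bool:
    """Decide whether the runtime must halt, given the token read from the store."""
    if token is None:
        return False  # nothing published in the store
    if not is_valid(token, key):
        return False  # sketch policy: ignore unsigned or tampered tokens
    return bool(json.loads(token["payload"])["halt"])


token = sign_revocation(halt=True, issued_at=1_700_000_000)
forged = dict(token, mac="0" * 64)  # same payload, wrong signature
```

Here `should_halt(token)` is true for the genuine token and false for `forged`, so only a holder of the key can stop the fleet.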

What this pattern forbids. When the kill-switch fires, no further model or tool calls may proceed regardless of agent state.
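The halt behaviour and the prohibition above can be sketched together. The example assumes the runtime records an undo callable alongside each effectful tool call and, on halt, runs those callables most-recent-first. `HaltableRuntime`, the `undo` parameter, and the latched `_halted` flag are hypothetical choices for illustration, not part of the pattern's definition.

```python
class Halted(RuntimeError):
    """Raised for every call attempted after the kill switch fires."""


class HaltableRuntime:
    def __init__(self, is_revoked):
        self._is_revoked = is_revoked   # callable that reads the shared store
        self._halted = False            # assumption: latched, never cleared
        self._compensations = []        # undo callables for completed effects
        self.compensation_errors = []   # failures kept for later review

    def _halt(self) -> None:
        if self._halted:
            return
        self._halted = True
        while self._compensations:
            undo = self._compensations.pop()  # most recent effect first
            try:
                undo()
            except Exception as exc:  # keep compensating the rest
                self.compensation_errors.append(exc)

    def call_tool(self, tool, undo=None):
        if self._halted or self._is_revoked():
            self._halt()
            raise Halted("kill switch fired; no further calls allowed")
        result = tool()
        if undo is not None:
            self._compensations.append(undo)
        return result


log = []
flag = {"revoked": False}
runtime = HaltableRuntime(lambda: flag["revoked"])

runtime.call_tool(lambda: log.append("reserve A"), undo=lambda: log.append("release A"))
runtime.call_tool(lambda: log.append("reserve B"), undo=lambda: log.append("release B"))

flag["revoked"] = True  # operator flips the switch
try:
    runtime.call_tool(lambda: log.append("reserve C"))
except Halted:
    pass

flag["revoked"] = False  # clearing the flag does not revive this instance
try:
    runtime.call_tool(lambda: log.append("reserve D"))
    blocked = False
except Halted:
    blocked = True
```

After the run, `log` holds the two reservations followed by their releases in reverse order, and neither `reserve C` nor `reserve D` ever executes. Running compensations as plain callables owned by the runtime, rather than as agent tool calls, keeps them outside the path the kill switch has just closed.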

The smaller patterns that complete this one —

  • uses · Compensating Action ★★ · Pair every irreversible-looking agent action with a compensating action that can undo or counteract it.

And the patterns that stand alongside it, or against it —

  • complements · Stop Hook ★★ · Define an explicit programmatic predicate that decides when the agent's loop should terminate.
  • composes-with · Circuit Breaker ★★ · Stop calling a failing dependency for a cooldown period after error rates exceed a threshold.
  • complements · Rate Limiting ★★ · Cap the number of requests, tokens, or tool calls per user (or session) within a time window.
  • composes-with · Sandbox Escape Monitoring · Treat sandbox boundary violations as telemetry; alert on syscalls, network egress, or filesystem writes outside expected scope.
  • alternative-to · Unbounded Subagent Spawn · Anti-pattern: a supervisor or orchestrator spawns sub-agents that can themselves spawn sub-agents without a global cap.
  • complements · Simulate Before Actuate · Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier, automated or human, to green-light the simulated outcome before the real command is sent.
  • complements · Agent Middleware Chain · Wrap every model call, tool call, and memory access in a composable pre/execute/post interceptor pipeline so cross-cutting concerns attach without touching agent or orchestrator code.
  • complements · Autonomy Slider · Expose agent autonomy as a continuous adjustable parameter so the same codebase can span scripted assistant to fully autonomous worker without re-architecting.
  • complements · Composable Termination Conditions · Express agent stop criteria as small single-purpose conditions composed with AND/OR into one explicit termination contract instead of ad-hoc loop guards.
  • complements · Corrigible Off-Switch Incentive · Design the agent so being shut down or overridden by a human carries positive expected value, because the human's intervention is itself evidence the current objective is mis-specified.
  • complements · Interruptible Agent Execution · Treat pause, resume, and cancel as a first-class control surface on every long-running agent so users can halt expensive or off-track trajectories mid-task while state is preserved for resumption.
