Velocity-and-Magnitude Governor

also known as Velocity Governor, Magnitude Governor, Pre-Trade Velocity Control

Hard-code per-unit-time caps on the financial magnitude of agent actions, and on any deviation beyond a statistical threshold force a downgrade from human-on-the-loop to human-in-the-loop.

Context

An agent acts inside a market or money-moving system where each step carries financial weight: it places orders, sizes positions, moves funds, or commits spend. The dangerous variable is not how many calls the agent makes per second but how many dollars it commits per second. A loop bug, a mispriced signal, or a prompt injection can drive the agent to commit a large notional in a window too short for any human to notice, and a healthy call rate can still hide a runaway dollar rate.

Problem

Generic throttles bound the wrong quantity. Capping requests, tokens, or loop iterations leaves dollars-per-second unbounded, so an agent that stays well within its call budget can still place orders far larger or faster than its established baseline before anyone intervenes. Conversely a flat per-action cost ceiling ignores velocity: many small actions in a tight window aggregate into a large exposure that no single action trips. The system needs a control that bounds committed financial magnitude per unit time, recognises when the agent's volume or value departs from its normal envelope, and reacts proportionally rather than only by a full stop.

Forces

A throttle tuned for compute (calls, tokens, steps) does not bound exposure; the same call rate can hide a benign dollar rate or a catastrophic one.
A fixed dollar ceiling per action misses velocity, while a pure velocity counter misses one oversized action; a useful governor must bound both magnitude and rate together.
An out-of-band halt is too blunt for a deviation that is large but plausible, yet leaving a >3-sigma departure to run autonomously is too permissive; the response must be graduated.
Tight caps and an eager downgrade trigger stop runaways but also stall legitimate high-value bursts, so the baseline and threshold must be calibrated, not guessed.

When to use

Agent actions move money or take market positions, so the quantity worth bounding is committed financial magnitude per unit time rather than call or token count.
A stable baseline of normal volume and value exists, so a deviation threshold (for example three sigma) can be calibrated rather than guessed.
A graduated response is wanted: a large-but-plausible deviation should tighten human oversight rather than halt the system outright.

When not to use

Actions carry no financial weight, where bounding compute with rate limiting or a step budget is the right control.
No reliable baseline of normal activity exists, so a statistical-deviation trigger would fire arbitrarily.
The only acceptable response to a breach is a full stop, where a kill switch fits better than a graduated autonomy downgrade.

Example

A trading-desk agent is cleared to place orders on its own while an analyst watches the dashboard. Its strategy hits a feedback loop and starts firing buy orders ten times larger and far faster than its usual pattern. The governor caps each order at the per-second notional limit, rejects the ones over size, and because the burst is more than three sigma above the agent's baseline it downgrades the agent so the analyst must approve every further order before it reaches the market.

Diagram

flowchart TD A[Agent proposes money-moving action] --> G{Governor} G -->|over magnitude or velocity cap| R[Reject before execution] G -->|deviation > 3-sigma vs baseline| D[Downgrade: human-on-the-loop -> human-in-the-loop] D --> H[Hold for human approval] G -->|within caps and within baseline| E[Execute, human-on-the-loop]

Solution

Therefore:

Borrow the pre-trade control of high-frequency trading and place a governor in the action path that every money-moving step must clear. The governor maintains hard caps on financial magnitude per unit time across several horizons (notional per second, order or transaction size, cumulative position, transactions per window) and a rolling baseline of normal volume and value. Each pending action is checked against the caps; an action that would breach a cap is rejected before it executes. In parallel the governor scores how far the current volume or value departs from baseline, and when that deviation exceeds a statistical threshold (for example more than three sigma) it forces an autonomy downgrade: the agent moves from human-on-the-loop, where a human watches but rarely intervenes, to human-in-the-loop, where every further material action waits for affirmative human approval. The downgrade is graduated and automatic, distinct from a full halt, and the baseline plus threshold are calibrated from historical activity rather than assumed.

What it gives you

Bounds dollars-per-second, not just calls-per-second, so a runaway that stays within its compute budget still cannot commit an unbounded notional.
Couples magnitude and velocity, catching both one oversized action and many small actions that aggregate within a tight window.
Responds proportionally: a large-but-plausible deviation tightens human oversight rather than halting the system, preserving availability while removing the agent's unsupervised authority.

What it costs you

A baseline or threshold set too tight stalls legitimate high-value bursts and trains operators to wave approvals through.
A baseline learned from a quiet period under-bounds a volatile one, so the caps and sigma threshold need recalibration as conditions shift.
An attacker who can drift the baseline slowly (for example by ramping volume over days) can raise the ceiling before the harmful burst.

What this pattern forbids. An action whose committed financial magnitude or velocity would exceed a per-unit-time cap cannot execute; and once a deviation crosses the statistical threshold the agent must not perform further material actions autonomously, only with explicit human-in-the-loop approval.

And the patterns that stand alongside it, or against it —

alternative-toRate Limiting★★— Cap the number of requests, tokens, or tool calls per user (or session) within a time window.
complementsCost Gating★★— Block actions whose expected cost exceeds a threshold without explicit user (or operator) acknowledgement.
complementsAutonomy Slider★— Expose agent autonomy as a continuous adjustable parameter so the same codebase can span scripted assistant to fully autonomous worker without re-architecting.
complementsKill Switch★— Provide an out-of-band control plane to halt running agent instances without redeploy.