Crawl-Walk-Run Automation Gating

Separate what an agent can do from what it is allowed to do on its own. A system that could plausibly act gets to act only after the data earns it, one action type at a time.

Description

Roll an agent out in three stages, Crawl, Walk and Run, with a clear gate between each one. In Crawl the agent only suggests, and a person acts. In Walk the agent acts, but only where internal staff are on the receiving end and can fix mistakes. In Run the agent acts directly on outside customers. The stage is set per action type, not for the whole agent: the same agent can be at Run for safe read-only actions and at Crawl for refunds. To move up a stage, an action type must clear a published metric bar. If its numbers later drop below that bar, it moves back down automatically.
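
A minimal sketch of the per-action-type stage lookup described above, written in Python. The action type names, the two-way "internal" / "customer" audience split and the route function are hypothetical and chosen only for illustration. The pattern itself does not prescribe a data model.

```python
from enum import IntEnum


class Tier(IntEnum):
    CRAWL = 1  # agent suggests, a person acts
    WALK = 2   # agent acts where internal staff can fix mistakes
    RUN = 3    # agent acts directly on outside customers


# Hypothetical action types. The tier is stored per action type,
# not once for the whole agent.
TIERS = {
    "lookup_order_status": Tier.RUN,
    "reply_to_ticket": Tier.WALK,
    "refund_order": Tier.CRAWL,
}


def route(action_type: str, audience: str) -> str:
    """Return "execute" or "suggest" for a proposed action.

    audience is "internal" or "customer" (an assumed split).
    Unknown action types are treated as Crawl.
    """
    tier = TIERS.get(action_type, Tier.CRAWL)
    if tier is Tier.RUN:
        return "execute"
    if tier is Tier.WALK and audience == "internal":
        return "execute"
    return "suggest"
```

With this table, route("refund_order", "customer") falls back to a suggestion for a person to act on, while route("lookup_order_status", "customer") executes directly. Defaulting unknown action types to Crawl keeps a newly added action from acting on its own before anyone has set a tier for it.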

When to apply

Use this for any agent that could act on its own in ways customers feel, such as replying to tickets, refunding orders, sending outbound messages, or changing production resources. It fits best when a single bad action does real harm and there is no reliable way to undo it. Don't apply it to read-only or sandboxed agents, where a bad action causes no harm.

What it involves

  • Enumerate action types
  • Publish the metric bar per tier
  • Start every action type at Crawl
  • Promote one action type at a time
  • Watch for regression and auto-demote
  • Gate the advance to Run on the customer-outcome metric
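
The promotion and auto-demotion steps above can be sketched as a single review function for one action type, again in Python. The thresholds, sample counts and metric names in BARS are invented placeholders, not published values, and Bar, Observation and review are hypothetical names. The sketch assumes each tier above Crawl has one bar that an action type must clear to enter the tier and keep clearing to stay in it.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    CRAWL = 1
    WALK = 2
    RUN = 3


@dataclass(frozen=True)
class Bar:
    min_samples: int   # evidence needed before promotion is considered
    min_score: float   # metric value needed to enter and to stay


@dataclass(frozen=True)
class Observation:
    samples: int
    score: float


# Placeholder bars (assumptions). Crawl has no bar: everything starts there.
BARS = {
    Tier.WALK: Bar(min_samples=200, min_score=0.95),  # e.g. suggestion acceptance
    Tier.RUN: Bar(min_samples=500, min_score=0.98),   # e.g. customer-outcome metric
}


def review(current: Tier, observed: dict[Tier, Observation]) -> Tier:
    """Return the tier for one action type after one review cycle.

    observed maps a tier to the latest measurement on that tier's metric.
    """
    # Auto-demote: the bar for the current tier no longer holds.
    held, now = BARS.get(current), observed.get(current)
    if held is not None and now is not None and now.score < held.min_score:
        return Tier(current - 1)

    # Promote by at most one step, and only with enough samples.
    if current < Tier.RUN:
        target = Tier(current + 1)
        bar, obs = BARS[target], observed.get(target)
        if (obs is not None and obs.samples >= bar.min_samples
                and obs.score >= bar.min_score):
            return target

    return current
```

Checking demotion before promotion means a regressing action type cannot move up in the same review. Moving at most one tier per call keeps each promotion a separate, observable event.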
