VIII · Safety & Control · Emerging

Policy-as-Code Gate

also known as OPA Action Gate, Compiled Governance, Policy-as-Prompt, Rego-Gated Agent, External Policy Engine

Evaluate every proposed agent action against externally-managed machine-readable policies before dispatch, so compliance authorship lives outside the prompt and outside the agent code.

This pattern helps complete certain larger patterns —

  • used-by Rigor Relocation: Relocate verification rigor from the model loop to surrounding scaffolding (evals, judges, decision logs, policy gates) so failures are caught by the wrapper rather than the agent.

Context

A team runs an agent in a regulated or compliance-sensitive domain (banking, insurance, the public sector, critical infrastructure) where the set of permitted actions is determined by policy documents that compliance, legal, or security functions own and update. The agent has a non-trivial action surface (transfers, account changes, external API calls of varying risk), and the rules over that surface change more often than the agent code. The people who write the rules are not the same people who write the prompts or deploy the agent.

Problem

When the governance rules live inside the system prompt or are hard-coded in the agent, every policy change becomes a prompt edit followed by a redeploy, and the compliance officers responsible for the rules cannot read, audit, or change them without going through engineering. Natural-language rules embedded in the prompt also have no signed version, no machine-evaluable contract with the action that actually fired, and no independent audit trail an auditor can replay. Without an external, machine-readable policy surface, compliance and engineering are bound to the same release cycle and the rules become unauditable.

Forces

  • Compliance officers must own the rules, but they do not write prompts and do not deploy agent code.
  • Policies change faster than agent prompts and on a different release cadence than model weights.
  • Natural-language rules embedded in the prompt are not independently auditable and have no signed version.
  • A machine-evaluable policy engine must be deterministic and fast enough to sit on the hot path of every tool call.
  • Policy documents are often authored in prose; manually translating them to code is a bottleneck and a source of drift.

Example

A bank deploys an agent that can move money, open accounts, and call external KYC services. The compliance team writes its rules in Rego in a separately versioned policy repository, including jurisdiction-by-jurisdiction holds, sanctions checks, and threshold-based human-approval requirements. Before any tool call, the agent serialises the proposed action and sends it to an OPA sidecar. OPA returns allow with obligations (require dual approval, mask the customer name in the downstream call), and the agent honours those obligations on dispatch. When a regulator asks why a particular transfer was permitted, the audit log replays the action against the exact policy hash that was active at that moment.
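The obligation handling in this example can be sketched in a few lines. This is a minimal illustration, not the bank's implementation: `evaluate` is an in-process stand-in for the OPA sidecar (real rules would be written in Rego), and the rule ids, the 10,000 threshold, the obligation names, and `fake_transfer` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Decision:
    verdict: str                        # "allow" or "deny"
    rule_id: str                        # rule that produced the verdict
    obligations: tuple[str, ...] = ()   # conditions the caller must honour


def evaluate(action: dict[str, Any]) -> Decision:
    """Stand-in for the policy sidecar; real rules would live in Rego."""
    if action["context"].get("sanctions_hit"):
        return Decision("deny", "sanctions.block")
    obligations = ["mask_customer_name"]
    if action["args"]["amount"] >= 10_000:   # hypothetical approval threshold
        obligations.append("require_dual_approval")
    return Decision("allow", "transfers.allow", tuple(obligations))


def dispatch(action: dict[str, Any], tool: Callable[..., Any],
             approvals: int = 0) -> dict[str, Any]:
    """Call the tool only on allow, after honouring every obligation."""
    decision = evaluate(action)
    if decision.verdict != "allow":
        return {"status": "denied", "rule_id": decision.rule_id}
    if "require_dual_approval" in decision.obligations and approvals < 2:
        return {"status": "pending_approval", "rule_id": decision.rule_id}
    args = dict(action["args"])          # copy so the proposal stays unmodified
    if "mask_customer_name" in decision.obligations:
        args["customer_name"] = "***"
    return {"status": "dispatched", "result": tool(**args)}


def fake_transfer(amount: int, customer_name: str) -> dict[str, Any]:
    """Hypothetical downstream call that just echoes what it received."""
    return {"amount": amount, "customer_name": customer_name}
```

A small transfer is dispatched with the customer name masked; a transfer at or above the threshold stays pending until two approvals are recorded. A production gate would probably also treat any obligation it does not recognise as a deny.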

Solution

Therefore:

Maintain policies as code (OPA/Rego, Cedar, or equivalent) in a repository owned by compliance, optionally generated by a policy compiler that translates prose policy documents into the rule language. Before any tool dispatch, the agent emits a structured action proposal (tool, arguments, caller context, retrieved data fingerprints) to an external policy decision point. The engine returns allow, deny, or allow-with-obligations together with a policy hash and rule id. The agent dispatches the tool only on allow; on deny the agent surfaces the rule id to the user or escalates. Policies are versioned, signed, and ship through a separate pipeline from the agent. Evaluation results are logged with the policy hash so any decision can be re-checked against the exact rule version that fired.
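The link between a decision and the exact rule version that produced it can be illustrated as follows. This is a sketch under stated assumptions: the JSON rule format stands in for compiled Rego or Cedar, the in-memory `POLICY_STORE` and `AUDIT_LOG` stand in for a signed policy registry and a durable log, and every name here is hypothetical. Signing is omitted; only content hashing is shown.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any

# Hypothetical rule format standing in for compiled Rego or Cedar.
POLICY_V1 = [{"id": "transfers.under_limit", "tool": "transfer", "max_amount": 10_000}]
POLICY_V2 = [{"id": "transfers.under_limit", "tool": "transfer", "max_amount": 5_000}]

POLICY_STORE: dict[str, list[dict[str, Any]]] = {}   # policy hash -> rule set
AUDIT_LOG: list[dict[str, Any]] = []


def policy_hash(policy: list[dict[str, Any]]) -> str:
    """Hash a canonical JSON encoding so equal rule sets share one hash."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def publish(policy: list[dict[str, Any]]) -> str:
    """Register a policy version and return the hash that identifies it."""
    digest = policy_hash(policy)
    POLICY_STORE[digest] = policy
    return digest


def evaluate(digest: str, proposal: dict[str, Any]) -> dict[str, Any]:
    """Allow only when a rule matches; anything else is denied by default."""
    for rule in POLICY_STORE[digest]:
        if (rule["tool"] == proposal["tool"]
                and proposal["args"]["amount"] <= rule["max_amount"]):
            return {"verdict": "allow", "rule_id": rule["id"], "policy_hash": digest}
    return {"verdict": "deny", "rule_id": "default.deny", "policy_hash": digest}


def gate(digest: str, proposal: dict[str, Any]) -> dict[str, Any]:
    """Evaluate a proposal and log the decision with its policy hash."""
    decision = evaluate(digest, proposal)
    AUDIT_LOG.append({"proposal": proposal, "decision": decision})
    return decision


def replay(record: dict[str, Any]) -> bool:
    """Re-run a logged proposal against the policy version that decided it."""
    again = evaluate(record["decision"]["policy_hash"], record["proposal"])
    return again == record["decision"]
```

Because each log record carries the policy hash, `replay` reproduces the original verdict even after a stricter version such as `POLICY_V2` has become active. This only holds while evaluation stays deterministic, which is why the sketch keeps clocks, randomness, and network calls out of `evaluate`.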

What this pattern forbids. The LLM must not dispatch any governed tool call without first obtaining an allow verdict from the external policy engine, must not modify or paraphrase rule content at runtime, and must surface the rule id behind any deny rather than synthesising its own explanation.
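One way to make these prohibitions structural rather than a matter of prompt discipline is to hide the tools behind a dispatcher that is the only code path able to call them. The sketch below is illustrative: `GovernedDispatcher`, `PolicyDenied`, `demo_engine`, and the rule ids are hypothetical names, and the fail-closed handling of engine errors is an assumption added here, not something the pattern text prescribes.

```python
from __future__ import annotations

from typing import Any, Callable


class PolicyDenied(Exception):
    """Raised instead of dispatching; carries the engine's rule id verbatim."""

    def __init__(self, rule_id: str) -> None:
        super().__init__(f"denied by policy rule {rule_id}")
        self.rule_id = rule_id


class GovernedDispatcher:
    """Holds the governed tools; the model-facing code never sees them directly."""

    def __init__(self, engine: Callable[[dict[str, Any]], dict[str, Any]],
                 tools: dict[str, Callable[..., Any]]) -> None:
        self._engine = engine
        self._tools = dict(tools)

    def call(self, tool: str, args: dict[str, Any], context: dict[str, Any]) -> Any:
        proposal = {"tool": tool, "args": args, "context": context}
        try:
            decision = self._engine(proposal)
        except Exception:
            # Assumption: an unreachable engine is treated as a deny (fail closed).
            raise PolicyDenied("engine.unavailable") from None
        if decision.get("verdict") != "allow":
            # Surface the rule id as returned; no model-written explanation.
            raise PolicyDenied(decision.get("rule_id", "engine.no_rule_id"))
        return self._tools[tool](**args)


def demo_engine(proposal: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the external policy decision point."""
    if proposal["tool"] == "close_account":
        return {"verdict": "deny", "rule_id": "accounts.close.requires_human"}
    return {"verdict": "allow", "rule_id": "default.allow"}
```

The agent loop catches `PolicyDenied` and reports `rule_id` to the user or an escalation queue. Because rule content never passes through the model, there is nothing for it to paraphrase.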

The smaller patterns that complete this one —

  • generalises Policy-Gated Agent Action (KRITIS): Each agent action passes through a policy gate (NIS2, EU AI Act, BSI rules) and is tagged with Run ID + Model Digest + Policy Hash for WORM-audit reconstruction.

And the patterns that stand alongside it, or against it —

  • alternative-to Constitutional Charter: Define rules the agent reads every turn but cannot modify, encoding inviolable boundaries.
  • complements Input/Output Guardrails ★★: Validate inputs before they reach the model and outputs before they reach the user.
  • complements Human-in-the-Loop ★★: Require explicit human approval at defined points before the agent performs an action.
  • complements Refusal ★★: Explicitly refuse requests that fall outside the agent's scope, capability, or policy boundaries.
  • complements Visual Workflow Graph ★★: Express agentic logic as a visual graph of typed nodes connected on a canvas with Start and End nodes so non-coding stakeholders can read and edit the flow.
  • complements Typed Refusal Codes: Define a single source of truth for machine-readable refusal codes across all guard surfaces, so refusals can be triaged mechanically rather than by string-grepping ad-hoc human-readable messages.
  • complements LLM as Periphery ·: Invert the typical LLM-in-the-middle architecture: a deterministic state machine and event store form the core; the LLM is restricted to edge tasks (input interpretation and output synthesis only).
  • complements Simulate Before Actuate: Before issuing an irreversible action, run a deterministic simulation that computes pre-conditions, invariants, and expected deltas; require a verifier (automated or human) to green-light the simulated outcome before the real command is sent.
  • complements Hybrid Symbolic-Neural Routing: Per query, route between a symbolic path (rule engine, knowledge graph) and a neural path (LLM), using the LLM for interpretation and the symbolic layer for exact constraints.
  • complements Control-Flow Integrity: Treat the agent's planned step sequence as a trusted control-flow graph that tool outputs, retrieved content, and user-supplied data cannot redirect at runtime.
  • complements Stochastic-Deterministic Boundary (SDB): Formalize the seam between an LLM proposal and a system action as a four-part contract (proposer, verifier, commit step, reject signal) so the contract itself, not the agent's good intent, gates side-effects.
  • complements Supervisor-Plus-Gate: Supervisor controller that validates and gates LLM outputs against deterministic checks before they commit to side-effects.
  • complements Tool Over-Broad Scope: Anti-pattern: grant the agent tools scoped so broadly that a single hallucinated argument can escalate into a privilege incident.
  • complements Decision Context Maps: Before any consequential decision, require the agent to gather a declared set of contextual inputs (resource availability, schedules, downstream dependencies) into a 'context map' the decision must cite.
  • alternative-to Context Gap (Security): Agents faithfully follow explicit security rules but miss the broader implications: they log access correctly without flagging the unusual pattern a human expert would catch immediately.
  • complements Priority Matrix (Conflict Resolution): Pre-define how the agent must resolve specific classes of goal conflicts via a human-authored lookup table, transforming the agent from a decision-maker (where it fails on competing objectives) into a decision-implementer.
  • composes-with Agent Middleware Chain: Wrap every model call, tool call, and memory access in a composable pre/execute/post interceptor pipeline so cross-cutting concerns attach without touching agent or orchestrator code.
  • composes-with Multi-Principal Welfare Aggregation ·: When an agent serves multiple humans with conflicting preferences, declare the aggregation rule explicitly rather than letting it be implicit in the prompt or fine-tune.
  • composes-with Cost-Aware Action Delegation: Classify every agent action by risk/cost and route each tier to a different approval policy, bounding the autonomy surface per-action instead of by one global flag.
  • complements Agentic Golden Path: Constrain an agent to the platform's curated golden path of living, machine-readable standards and check for drift as it works, so its output is compliant by construction rather than corrected later.
