Spec-First Agent
also known as Specification-Driven Agent, Plan-as-Document
Drive the agent loop from a human-authored specification document rather than free-form prompts.
This pattern helps complete certain larger patterns:
- used-by Spec-Driven Loop ★: Run the same prompt against a fixed spec in a deterministic outer loop until the spec is satisfied.
Context
A team runs an agent on a task that is well-defined enough to write down — a recurring report, a bug-fix list, a migration plan, a multi-step automation. The team wants the agent's instructions to live in a file that humans can read, review, and edit alongside the code, rather than in a chat history or someone's head. Reviewers should be able to diff changes to the agent's intent the same way they diff changes to the source code.
Problem
Free-form prompts drift between sessions: the same engineer types subtly different instructions on different days and the agent's behaviour quietly changes. When the spec lives in one engineer's head, nobody else can review it, audit it, or take over when that engineer is away. Without a written target, there is no single source of truth for what "done" means, so the agent may declare success on partial work or keep going past where the team would have stopped. The team needs a written, version-controlled spec without giving up the agent's ability to update its own plan as it learns.
Forces
- Spec authoring is up-front work.
- The agent must update the spec when new findings invalidate it, yet uncontrolled spec mutation is dangerous.
- Spec format must be both human- and agent-readable.
Example
A small team has one engineer who knows the agent's behaviour by heart, but the spec lives in that engineer's head and has never been audited. The team writes PROMPT.md as the agent's spec; the agent reads it at each iteration and may update it under controlled conditions. New engineers read the markdown to understand intent, reviewers diff spec changes, and behaviour drift becomes visible because it shows up as a spec edit rather than a silent prompt change.
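As a rough illustration of how drift becomes visible, the sketch below renders a spec change as a unified diff using Python's standard `difflib`. In practice the team's version control system would produce this diff. The function name `spec_diff` and the sample spec lines are hypothetical, chosen only for the example.

```python
import difflib


def spec_diff(old: str, new: str, name: str = "PROMPT.md") -> str:
    """Return a unified diff between two versions of the spec text."""
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(lines)


# Hypothetical spec contents, before and after an edit.
before = "- Generate the weekly report\n- Email it to the team\n"
after = "- Generate the weekly report\n- Email it to the team and archive a copy\n"

print(spec_diff(before, after))
```

A change of intent appears as an explicit removed line and added line, which a reviewer can approve or reject like any other change.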
Solution
Therefore:
Write the specification as a markdown file (PROMPT.md, fix_plan.md, or similar). The agent reads the spec at each iteration, executes against it, and may update it under controlled conditions. The spec is the single source of truth for what 'done' means.
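The loop described above might be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes goals are written as markdown checkboxes (`- [ ]` and `- [x]`), that the only permitted spec update is ticking a completed goal, and that `execute` is a caller-supplied callable standing in for the agent. The names `parse_goals`, `mark_done`, and `run_loop` are hypothetical.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Assumed convention: each goal is a markdown checkbox line in the spec.
CHECKBOX = re.compile(r"^- \[( |x)\] (.+)$", re.MULTILINE)


@dataclass
class Goal:
    text: str
    done: bool


def parse_goals(spec_text: str) -> list[Goal]:
    """Extract checkbox goals from the spec text."""
    return [
        Goal(text=m.group(2).strip(), done=m.group(1) == "x")
        for m in CHECKBOX.finditer(spec_text)
    ]


def mark_done(spec_text: str, goal_text: str) -> str:
    """Controlled update: tick exactly one goal, change nothing else."""
    return spec_text.replace(f"- [ ] {goal_text}", f"- [x] {goal_text}", 1)


def run_loop(
    spec_path: Path,
    execute: Callable[[str], bool],
    max_iterations: int = 10,
) -> bool:
    """Re-read the spec each iteration; return True once every goal is ticked."""
    for _ in range(max_iterations):
        spec_text = spec_path.read_text(encoding="utf-8")
        pending = [g for g in parse_goals(spec_text) if not g.done]
        if not pending:
            return True  # the spec defines "done"
        goal = pending[0]
        if execute(goal.text):  # stand-in for the agent doing the work
            spec_path.write_text(mark_done(spec_text, goal.text), encoding="utf-8")
    remaining = parse_goals(spec_path.read_text(encoding="utf-8"))
    return all(g.done for g in remaining)
```

Because the file is re-read on every pass, a human edit to PROMPT.md between iterations takes effect immediately, and every change the loop makes is a one-line checkbox edit that shows up in a diff.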
What this pattern forbids. The agent acts only against goals named in the spec; out-of-scope work must be added to the spec first.
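One way to enforce this rule is a guard that rejects any goal not named in the spec. The sketch below assumes, purely for illustration, that goals are plain markdown bullet lines and that matching is an exact, case-insensitive text comparison. `spec_goals`, `require_in_scope`, and `OutOfScopeError` are hypothetical names.

```python
class OutOfScopeError(Exception):
    """Raised when the agent proposes work the spec does not name."""


def spec_goals(spec_text: str) -> set[str]:
    """Collect bullet lines from the spec as normalised goal strings."""
    goals = set()
    for line in spec_text.splitlines():
        line = line.strip()
        if line.startswith("- "):
            goals.add(line[2:].strip().lower())
    return goals


def require_in_scope(goal: str, spec_text: str) -> None:
    """Allow the goal only if the spec names it; otherwise refuse."""
    if goal.strip().lower() not in spec_goals(spec_text):
        raise OutOfScopeError(
            f"'{goal}' is not in the spec; add it to the spec before acting."
        )


# Hypothetical spec contents.
spec = "# Goals\n- Generate the weekly report\n- Email it to the team\n"

require_in_scope("Generate the weekly report", spec)  # passes silently
try:
    require_in_scope("Refactor the database layer", spec)
except OutOfScopeError as err:
    print(err)
```

Exact matching is deliberately strict: it forces out-of-scope work through a spec edit, which reviewers can then see and diff.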
And the patterns that stand alongside it, or against it:
- complements Agent Skills ★: Package author-time procedures (markdown + optional resources) the agent loads on demand for specific task types.
- complements SOP-Encoded Multi-Agent Workflow ★: Encode a human Standard Operating Procedure (roles, ordered phases, standardised hand-off artefacts) into a multi-agent pipeline so that agents communicate through structured documents rather than free-form chat.
- alternative-to Todo-List-Driven Autonomous Agent ★: Have the agent author a plan file (e.g. todo.md) early in the run, tick items as it completes them, and re-inject the remaining plan into context; the file is durable plan and working memory.
- alternative-to Automatic Workflow Search ·: Treat the agent's workflow (a graph of LLM-invoking nodes) as an artefact to search; use Monte Carlo Tree Search guided by an eval benchmark to discover the best workflow, then deploy it.
- complements Planner-Generator-Evaluator Harness ·: Decompose a long-running job into three role-isolated agents: a Planner emitting a feature list, a Generator working one chunk per fresh context, and an Evaluator grading against a rubric without seeing the Generator's trace.
- alternative-to Visual Workflow Graph ★★: Express agentic logic as a visual graph of typed nodes connected on a canvas with Start and End nodes so non-coding stakeholders can read and edit the flow.
- composes-with Pre-Flight Spec Authoring ★: Before any code is generated, author a multi-pillar spec and have the agent critique it for ambiguity and edge cases, so that the loop executes against a reviewed target rather than a fresh prompt.
- complements Rigor Relocation ★: Relocate verification rigor from the model loop to surrounding scaffolding (evals, judges, decision logs, policy gates) so failures are caught by the wrapper rather than the agent.