Todo-List-Driven Autonomous Agent
also known as todo.md Agent, Persistent Markdown Plan, Externalised Plan File
Have the agent author a plan file (e.g. todo.md) early in the run, tick items as it completes them, and re-inject the remaining plan into context; the file serves as both durable plan and working memory.
This pattern helps complete certain larger patterns —
- specialises Scratchpad ★★ — Give the agent a writable scratch space for intermediate notes that informs later turns but does not pollute the response.
Context
A team runs an agent on a long-horizon autonomous job — a multi-hour coding task, a deep research investigation, a complex data migration — inside a sandboxed virtual machine that gives it persistent file-system access and basic tools (shell, browser, file editor). The run may span hundreds of tool calls, more than any one model context window can comfortably hold. The team needs the agent's plan to survive context truncation and process restarts.
Problem
If the plan lives only in the model's context window, it drifts toward the middle of the window, where attention is weakest, and the model loses track of which items it has finished. When the context is truncated to fit, the plan is the first thing to disappear because the model has moved past it. If the run is paused, crashes, or is resumed in a fresh context, the agent has no durable record of which sub-tasks are done, so it starts over or skips items at random. Keeping the plan only in the model's head is incompatible with runs longer than a single window.
Forces
- Models attend most strongly to the end (and start) of the context window.
- File-system memory is durable; in-context memory is volatile.
- Re-injecting the full plan every turn is repetitive but combats attention drift.
- Markdown is human- and model-readable, and its checkbox syntax makes ticking items easy.
Example
A long autonomous coding run gets context-truncated halfway through and the agent forgets which sub-tasks are done. The team gives it a `todo.md` that it must author early in the run as a checklist; each turn it reads the file, works the next unticked item, updates the file, and re-injects the remaining plan into context. Now the run can resume cleanly after a context truncation or a process restart, because the durable plan and working memory live on disk, not in the window.
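The restart case in this example only works if the file on disk is intact after a crash. A minimal sketch of one way to get that, assuming a POSIX-style file system where a rename within one directory is atomic; `save_plan` and `load_or_create` are hypothetical helper names, not part of any agent framework:

```python
import os
import tempfile
from pathlib import Path


def save_plan(path: Path, text: str) -> None:
    """Write the plan to a temp file, then rename it over the old one.

    A crash mid-write leaves either the old plan or the new plan on disk,
    never a half-written file.
    """
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp, path)     # swap the new file into place
    except BaseException:
        os.unlink(tmp)            # clean up the temp file on failure
        raise


def load_or_create(path: Path, initial_plan: str) -> str:
    """Fresh run: author the plan. Restarted run: resume from what is on disk."""
    if path.exists():
        return path.read_text(encoding="utf-8")
    save_plan(path, initial_plan)
    return initial_plan
```

On a restart, `load_or_create` ignores the freshly proposed plan and returns the ticked state already on disk, which is what lets the agent pick up at the next unticked item instead of starting over.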
Solution
Therefore:
Early in the run, the agent writes its plan as a checklist file (todo.md) in its sandbox. Each turn, it reads the file, works the next unticked item, and updates the file (ticking the item, adding follow-ups, dropping dead ends). The unticked tail of the file is then re-injected into the prompt before the model's next turn. Because the file outlives any single context window, the plan survives truncation and restarts. Pair this with a sandboxed VM that gives the agent persistent storage and basic tools (browser, shell, file editor).
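The per-turn cycle can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the plan uses Markdown task-list syntax (`- [ ]` / `- [x]`), follow-ups are appended at the end of the file, and `do_work` is a hypothetical callback standing in for the model-plus-tools step, returning any new follow-up items.

```python
import re
from pathlib import Path

# Matches a task-list line such as "- [ ] run tests" or "- [x] clone repo".
ITEM = re.compile(r"^(\s*[-*] \[)([ xX])(\] )(.*)$")


def unticked(text: str) -> list[str]:
    """Descriptions of all unticked items, in file order."""
    return [m.group(4) for line in text.splitlines()
            if (m := ITEM.match(line)) and m.group(2) == " "]


def tick_first(text: str) -> str:
    """Tick the first unticked item and return the updated file text."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        m = ITEM.match(line)
        if m and m.group(2) == " ":
            lines[i] = f"{m.group(1)}x{m.group(3)}{m.group(4)}"
            break
    return "\n".join(lines) + "\n"


def remaining_plan(text: str) -> str:
    """The unticked tail, rendered for re-injection into the next prompt."""
    return "\n".join(f"- [ ] {item}" for item in unticked(text))


def run_turn(plan_path: Path, do_work) -> str:
    """One turn: read the file, work the next item, update it, return the tail."""
    text = plan_path.read_text(encoding="utf-8")
    todo = unticked(text)
    if not todo:
        return ""  # nothing left to do
    follow_ups = do_work(todo[0])  # hypothetical: runs the model and tools
    text = tick_first(text)
    text += "".join(f"- [ ] {f}\n" for f in follow_ups)
    plan_path.write_text(text, encoding="utf-8")
    return remaining_plan(text)
```

The string returned by `run_turn` is what the caller would place near the end of the next prompt, where the Forces section says attention is strongest; ticked items stay in the file as a record but are not re-injected.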
What this pattern forbids. The agent may not advance past an unticked item without recording the action in the plan file; arbitrary in-context-only plans are forbidden.
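One strict reading of this rule can be enforced mechanically before each action. The sketch below is an assumption about how a harness might do it, not something the pattern prescribes; `PlanViolation` and `check_advance` are hypothetical names.

```python
import re

# Matches a task-list line and captures its tick state and description.
ITEM = re.compile(r"^\s*[-*] \[([ xX])\] (.*)$")


class PlanViolation(Exception):
    """Raised when the agent tries to act outside the recorded plan."""


def check_advance(plan_text: str, intended: str) -> None:
    """Allow work only on the first unticked item recorded in the plan file."""
    open_items = [m.group(2).strip()
                  for line in plan_text.splitlines()
                  if (m := ITEM.match(line)) and m.group(1) == " "]
    if intended not in open_items:
        # The step exists only in the model's context, or is already ticked.
        raise PlanViolation(f"{intended!r} is not an open item in the plan file")
    if intended != open_items[0]:
        # Working on it would silently skip an earlier unticked item.
        raise PlanViolation(
            f"{intended!r} would skip unticked item {open_items[0]!r}")
```

Under this reading, an agent that wants to skip or abandon an item must first edit the file (tick it, or drop it as a dead end); once that change is recorded, the check passes for the next item.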
The smaller patterns that complete this one —
- uses Context Window Packing ★★ — Choose what fits in the context window each turn given a fixed token budget.
- uses Sandbox Isolation ★★ — Run agent-emitted code or actions in a contained environment with restricted filesystem, network, and process privileges.
And the patterns that stand alongside it, or against it —
- alternative-to Spec-First Agent ★ — Drive the agent loop from a human-authored specification document rather than free-form prompts.
- complements Agent Resumption ★★ — Persist agent execution state so a long-running run survives restarts, deploys, or user disconnects.
- complements Append-Only Thought Stream ★ — Make the agent's thought log append-only so the agent cannot rewrite its own history.
- complements Affect-Coupled Plan Lifecycle · — Wire small bounded affect bumps to plan-step lifecycle events and accumulate age-bucketed stale-pain on untouched plans so plans gain felt stakes without hard deadlines.
- alternative-to Commitment Tracking · — Extract stated intents from each agent turn into a structured ledger with open / followed-through / expired status, making the gap between promise and follow-through visible and auditable.
- complements Pre-Flight Spec Authoring ★ — Before any code is generated, author a multi-pillar spec and have the agent critique it for ambiguity and edge cases, so that the loop executes against a reviewed target rather than a fresh prompt.