VII · Verification & Reflection · Experimental

Darwin-Gödel Self-Rewrite

also known as DGM, Darwin-Gödel Machine, Archive-Sampled Self-Mutation, Stepping-Stone Self-Rewrite

An agent rewrites its own source code, archives every successful variant, and samples mutation parents from the archive rather than the latest version, using archive diversity as stepping-stones to escape local optima.

Context

A research team builds an agent that can read and rewrite parts of its own implementation, such as its system prompt, its tool definitions, the scaffolding around its main loop, or the code that implements it. The team has a clear way to measure whether one version of the agent is better than another: a benchmark, a task suite, or an automated self-evaluation that returns a score per variant. The point of the project is to let the agent improve itself over many generations without human-in-the-loop edits.

Problem

When the agent always mutates the latest accepted version (greedy self-rewrite), it climbs whatever local hill it started on and stops. The move that would unlock a higher ridge is several mutations away from anything that currently scores well, so a strictly score-maximising selection rule will never reach it. Throwing away the variants that scored worse destroys the very diversity that would have been the bridge to a better region of the search space. The agent gets stuck in a local optimum, and without some way of preserving and revisiting worse-scoring stepping-stones it has no path out short of a manual reset.
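The trap can be reproduced on a toy problem. The sketch below is an illustration only: the one-dimensional `LANDSCAPE`, the ±1 `mutate` step, and the uniform parent sampling are all assumptions invented for this example, not part of any real agent. Each index stands for a variant and each value for its benchmark score; a low-scoring valley separates the local peak (score 3) from the global one (score 6).

```python
import random

# Hypothetical 1-D design space: index = variant, value = its benchmark score.
# Local peak at index 2 (score 3), valley at index 4, global peak at index 9.
LANDSCAPE = [1, 2, 3, 2, 1, 2, 3, 4, 5, 6]


def mutate(pos: int, rng: random.Random) -> int:
    """Toy mutation: step one position left or right."""
    return pos + rng.choice((-1, 1))


def viable(pos: int) -> bool:
    """Toy viability gate: the variant must exist in the design space."""
    return 0 <= pos < len(LANDSCAPE)


def greedy(start: int, generations: int, rng: random.Random) -> int:
    """Always mutate the latest accepted variant; accept only improvements."""
    current = start
    for _ in range(generations):
        child = mutate(current, rng)
        if viable(child) and LANDSCAPE[child] > LANDSCAPE[current]:
            current = child
    return LANDSCAPE[current]


def archive_sampled(start: int, generations: int, rng: random.Random) -> int:
    """Keep every viable variant; sample parents from the whole archive."""
    archive = [start]
    for _ in range(generations):
        parent = rng.choice(archive)  # not necessarily the best-scoring one
        child = mutate(parent, rng)
        if viable(child) and child not in archive:
            archive.append(child)  # archived even if it scores worse
    return max(LANDSCAPE[p] for p in archive)
```

On this landscape `greedy` can never exceed a score of 3, because both neighbours of index 2 score lower. `archive_sampled` keeps the valley variants at indices 3 and 4 and can walk through them to reach the score of 6.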

Forces

  • Greedy ascent from the latest variant converges to local optima quickly.
  • Useful stepping-stone variants often score worse short-term than the current best.
  • Throwing away history makes those stepping-stones permanently unreachable.
  • Self-modification needs a safety gate so each variant is at least viable before it enters the archive.
  • Archive growth must be bounded or sampling becomes diffuse and useless.

Example

A research agent rewrites its own coding scaffolding to maximise a benchmark score. The greedy version stalls at a plateau after twenty generations. After the team switches to an archive-sampled scheme, a worse-scoring variant from generation six becomes the parent for generation twenty-two; its odd tool-handling structure happens to combine well with a mutation that the greedy line never reached, and the score jumps. The archive held that stepping-stone for sixteen generations before it paid off.

Solution

Therefore:

The agent maintains a versioned archive of self-modifications. Each generation: (1) sample a parent variant from the archive using a diversity-aware policy (not strictly the current best); (2) propose a code or prompt mutation of that parent; (3) run the mutated variant through a viability gate (it compiles, passes safety checks, and runs end-to-end on a smoke test) and discard it if it fails; (4) score the surviving variant on the objective; (5) add it to the archive with its score and lineage. Selection from the archive is the key move: it lets a low-scoring but novel variant become the parent of a future high-scoring variant. The archive is bounded by a retention policy that favours diversity over raw score, so stepping-stones are preserved.
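One generation of this loop can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the `Variant` record, the `sample_parent` weighting (score divided by one plus the number of children), and the `mutate`, `is_viable`, and `evaluate` callables are hypothetical names introduced here, and scores are assumed to be non-negative.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Variant:
    vid: int                # unique id within the archive
    source: str             # the agent's code or prompt, as text
    score: float            # objective score, assumed >= 0
    parent: Optional[int]   # lineage: id of the parent, None for the seed
    children: int = 0       # number of archived direct descendants


def sample_parent(archive: List[Variant], rng: random.Random) -> Variant:
    """Hypothetical diversity-aware policy: weight grows with score but
    shrinks with the number of children, so under-explored variants
    (including low-scoring ones) keep a real chance of being picked."""
    weights = [(0.1 + v.score) / (1 + v.children) for v in archive]
    return rng.choices(archive, weights=weights, k=1)[0]


def run_generation(
    archive: List[Variant],
    mutate: Callable[[str], str],
    is_viable: Callable[[str], bool],
    evaluate: Callable[[str], float],
    rng: random.Random,
) -> Optional[Variant]:
    parent = sample_parent(archive, rng)      # (1) sample from the archive
    candidate = mutate(parent.source)         # (2) propose a mutation
    if not is_viable(candidate):              # (3) viability gate
        return None                           #     failed: never archived
    score = evaluate(candidate)               # (4) score on the objective
    child = Variant(
        vid=max(v.vid for v in archive) + 1,
        source=candidate,
        score=score,
        parent=parent.vid,
    )
    parent.children += 1
    archive.append(child)                     # (5) archive with lineage
    return child
```

In a real system `mutate` would call the agent itself and `evaluate` would run the benchmark; here they are left as parameters so the control flow stays visible. The child is archived whether or not it beats its parent, which is what keeps stepping-stones reachable.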

What this pattern forbids. No variant may enter the archive without passing the viability gate (it compiles, passes safety checks, and passes the smoke test). The agent must not sample or mutate a parent from outside the archive. The archive must keep score and lineage for every variant and must not be silently pruned by score alone.
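One way to bound the archive without pruning by score alone is to group variants into niches and keep the best of each niche before filling the remaining slots. The sketch below is an assumption-laden illustration, loosely in the spirit of quality-diversity methods rather than anything this pattern prescribes: the dictionary layout of an entry, the `niche_of` descriptor, and the `prune` function are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Hashable, List, Tuple

Entry = Dict[str, Any]  # assumed keys: "id", "score", "parent" (lineage)


def prune(
    archive: List[Entry],
    capacity: int,
    niche_of: Callable[[Entry], Hashable],
) -> Tuple[List[Entry], List[Entry]]:
    """Return (kept, evicted). Evictions are returned, never dropped
    silently, so the caller can log their scores and lineage."""
    if len(archive) <= capacity:
        return list(archive), []

    # Group entries by niche and rank each niche by score, best first.
    niches: Dict[Hashable, List[Entry]] = defaultdict(list)
    for entry in archive:
        niches[niche_of(entry)].append(entry)
    for members in niches.values():
        members.sort(key=lambda e: e["score"], reverse=True)

    # Round-robin: the best of every niche, then the second best, and so on.
    kept: List[Entry] = []
    rank = 0
    while len(kept) < capacity:
        layer = [m[rank] for m in niches.values() if len(m) > rank]
        layer.sort(key=lambda e: e["score"], reverse=True)
        kept.extend(layer[: capacity - len(kept)])
        rank += 1

    kept_ids = {e["id"] for e in kept}
    evicted = [e for e in archive if e["id"] not in kept_ids]
    return kept, evicted
```

With three high-scoring entries in one niche and one low-scoring entry in each of two other niches, a capacity of three keeps one entry per niche, where pruning by score alone would have kept only the first niche. How to define a niche (for example, by which tools a variant uses) is a design choice left open here.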

And the patterns that stand alongside it, or against it —

  • alternative-to · Self-Refine ★★ · Iterate generate → feedback (same model) → refine until a stop criterion fires, with no separate critic model.
  • alternative-to · Reflexion · Have the agent write linguistic lessons from past failures and consult them in future episodes.
  • complements · Self-Modification Diff Gate · Gate the agent's edits to its own code or rules through a separate critic persona that reviews the diff before it lands.
  • complements · Evaluator-Optimizer ★★ · One LLM generates; another evaluates and feeds back; loop until criteria are met.
