Verification & Reflection

Darwin-Gödel Self-Rewrite

An agent rewrites its own source code, archives every successful variant, and samples mutation parents from the archive rather than the latest version, using archive diversity as stepping-stones to escape local optima.

Problem

When the agent always mutates the latest accepted version (greedy self-rewrite), it climbs whatever local hill it started on and stops. The move that would unlock a higher ridge is several mutations away from anything that currently scores well, so a strictly score-maximising selection rule will never reach it. Throwing away the variants that scored worse destroys the very diversity that would have been the bridge to a better region of the search space. The agent gets stuck in a local optimum, and without some way of preserving and revisiting worse-scoring stepping-stones it has no path out short of a manual reset.

Solution

The agent maintains a versioned archive of self-modifications. Each generation: (1) sample a parent variant from the archive using a diversity-aware policy (not strictly the current best); (2) propose a code or prompt mutation; (3) run the mutated variant through a viability gate (compiles, passes safety checks, runs end-to-end on a smoke test); (4) score it on the objective; (5) if viable, add it to the archive with its score and lineage. Selection from the archive is the key move — it lets a low-scoring but novel variant become the parent of a future high-scoring variant. The archive is bounded by a retention policy that favours diversity over raw score so stepping-stones are preserved.

When to use

  • The agent can rewrite its own implementation (code, prompt, scaffolding) safely.
  • A clear objective score is available per variant.
  • Greedy self-rewrite has empirically plateaued.

Open the full interactive page

Diagram, neighbourhood map, code examples, related patterns and full provenance.

Related