III · Tool Use & Environment · Emerging

Skill Library

also known as Tool-Creating Agent, Meta-Tool Use, Self-Authored Tools

Let the agent grow its own toolkit by writing reusable skills that subsequent runs can call.

Context

A team operates a long-running agent that handles recurring task shapes — weekly competitor reports, periodic data cleans, repeating customer-onboarding workflows. The same scrape-clean-summarise pipeline gets re-derived from first principles every run, and the runtime supports loading new code modules without restarting the agent.

Problem

Without a place to crystallise repeated work into reusable artefacts, every run pays the full cost of working the routine out again, including the cost of the model's wrong turns along the way. The team also has no way to review or remove a routine, because the only place it ever lived was the model's working memory for that session.

Forces

  • New skills can be wrong or unsafe.
  • The library must be loadable without restart in a long-running agent.
  • Skill discovery (which skill applies?) is itself a retrieval problem.
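
The third force, skill discovery as retrieval, can be sketched as follows. This is a minimal illustration, assuming each skill carries a short text description and that plain token overlap (Jaccard similarity) is good enough to rank candidates; a production agent would more likely use embeddings. `SkillEntry`, `find_skills`, `INDEX`, and the `min_score` threshold are hypothetical names chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillEntry:
    """Hypothetical index record: one line of metadata per stored skill."""
    name: str
    description: str


def _tokens(text: str) -> set[str]:
    # Lowercase alphanumeric tokens; underscores in skill names act as separators.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def find_skills(task: str, index: list[SkillEntry],
                min_score: float = 0.2) -> list[tuple[str, float]]:
    """Rank skills by Jaccard overlap between the task text and skill metadata."""
    task_tokens = _tokens(task)
    scored = []
    for entry in index:
        entry_tokens = _tokens(entry.name + " " + entry.description)
        union = task_tokens | entry_tokens
        score = len(task_tokens & entry_tokens) / len(union) if union else 0.0
        if score >= min_score:  # assumed cut-off; below it, derive from scratch
            scored.append((entry.name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative index entries, not taken from any real library.
INDEX = [
    SkillEntry("weekly_competitor_report",
               "scrape competitor pages, clean data, summarise weekly report"),
    SkillEntry("customer_onboarding",
               "run the customer onboarding workflow checklist"),
]
```

An empty result is a useful signal in its own right: it tells the agent that no stored skill applies and the task has to be worked out afresh.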

Example

An agent that fetches similar reports every week keeps re-deriving the same scrape-clean-summarise pipeline from scratch. The team gives it a `skills/` directory: when the agent finishes a recurring task, it can write a small reusable module (a critic gates the addition), and subsequent runs import and call it directly. Over a few months the agent crystallises a library of named skills for the domain, and recurring tasks complete in a fraction of the original turns.

Solution

Therefore:

Give the agent a directory (often `skills/*.py` or `skills/*.md`) where it can write new modules. A loader (`importlib` in Python, dynamic `import()` in JavaScript) makes them callable without restarting the agent. A critic gates every addition. Old skills are versioned, never silently overwritten.
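
The moving parts named above (directory, loader, critic, versioning) might fit together roughly like this. It is a sketch under stated assumptions, not a reference implementation: the `SkillLibrary` class, the `<name>_v<N>.py` file naming, the `run` entry point, and `toy_critic` are all hypothetical, and the toy critic only checks syntax where a real one would be a separate reviewer (model or human).

```python
import importlib.util
import tempfile
from pathlib import Path
from types import ModuleType
from typing import Callable, Optional

Critic = Callable[[str, str], bool]  # (skill name, source) -> approved?


def toy_critic(name: str, source: str) -> bool:
    """Stand-in critic: accepts only source that compiles and defines run()."""
    try:
        compile(source, name, "exec")
    except SyntaxError:
        return False
    return "def run(" in source


class SkillLibrary:
    """Hypothetical skill store: one file per version, named <name>_v<N>.py."""

    def __init__(self, root: Path, critic: Critic) -> None:
        self.root = root
        self.critic = critic
        self.root.mkdir(parents=True, exist_ok=True)

    def _versions(self, name: str) -> list[int]:
        found = []
        for path in self.root.glob(f"{name}_v*.py"):
            suffix = path.stem[len(name) + 2:]  # text after "<name>_v"
            if suffix.isdigit():
                found.append(int(suffix))
        return sorted(found)

    def add(self, name: str, source: str) -> Path:
        if not name.isidentifier():
            raise ValueError(f"invalid skill name: {name!r}")
        if not self.critic(name, source):
            raise PermissionError(f"critic rejected skill {name!r}")
        versions = self._versions(name)
        version = versions[-1] + 1 if versions else 1  # never overwrite
        path = self.root / f"{name}_v{version}.py"
        path.write_text(source, encoding="utf-8")
        return path

    def load(self, name: str, version: Optional[int] = None) -> ModuleType:
        versions = self._versions(name)
        chosen = versions[-1] if versions and version is None else version
        if chosen not in versions:
            raise KeyError(f"no such skill/version: {name!r} v{version}")
        path = self.root / f"{name}_v{chosen}.py"
        spec = importlib.util.spec_from_file_location(f"skill_{name}_v{chosen}", path)
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs skill code in-process, no restart
        return module


def demo() -> tuple[list[str], list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        lib = SkillLibrary(Path(tmp) / "skills", toy_critic)
        lib.add("clean_rows", "def run(rows):\n    return [r.strip() for r in rows]\n")
        lib.add("clean_rows", "def run(rows):\n    return [r.strip().lower() for r in rows]\n")
        latest = lib.load("clean_rows").run([" A "])
        first = lib.load("clean_rows", version=1).run([" A "])
        return latest, first
```

Because `load` executes the skill in the agent's own process, this sketch offers no isolation; the critic is the only safeguard, which is why the pattern pairs naturally with a sandboxed runtime.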

What this pattern forbids. New skills enter the library only after passing the critic; they cannot mutate existing skills without quorum.
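
The two prohibitions can be expressed as guards on the write path. The sketch below is illustrative only and rests on an assumption the text does not spell out: that "quorum" means at least a fixed number of distinct reviewers approving a change. `GuardedLibrary`, `add`, `revise`, and the default `quorum=2` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class GuardedLibrary:
    # Assumed meaning of quorum: distinct approvals needed to change a skill.
    quorum: int = 2
    # In-memory stand-in for the skills directory: name -> version history.
    skills: dict[str, list[str]] = field(default_factory=dict)

    def add(self, name: str, source: str, critic_approved: bool) -> int:
        """New skills enter only after passing the critic."""
        if name in self.skills:
            raise PermissionError(f"{name!r} exists; changes must go through revise()")
        if not critic_approved:
            raise PermissionError(f"critic rejected new skill {name!r}")
        self.skills[name] = [source]
        return 1

    def revise(self, name: str, source: str, approvers: set[str]) -> int:
        """Existing skills change only with quorum, and old versions are kept."""
        if name not in self.skills:
            raise KeyError(name)
        if len(approvers) < self.quorum:
            raise PermissionError(
                f"need {self.quorum} approvals to change {name!r}, got {len(approvers)}"
            )
        self.skills[name].append(source)  # append, never overwrite
        return len(self.skills[name])
```

Keeping `add` and `revise` as separate entry points means a new skill can never silently replace an existing one, even if the critic approves it.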

The smaller patterns that complete this one —

  • uses · Self-Modification Diff Gate · Gate the agent's edits to its own code or rules through a separate critic persona that reviews the diff before it lands.

And the patterns that stand alongside it, or against it —

  • composes-with · Code Execution ★★ · Let the model emit code, run it in a sandbox, and treat the run as the answer instead of trusting the model to compute in its head.
  • complements · Exploration vs Exploitation · Balance taking the best-known action (exploit) with trying alternatives that might be better (explore).
  • alternative-to · Agent Skills · Package author-time procedures (markdown + optional resources) the agent loads on demand for specific task types.
  • complements · App Exploration Phase · Before deploying an agent against an opaque app, have it explore (or watch a human demonstrate) the app, generating a per-element documentation knowledge base; at deployment, retrieve element docs to ground actions.
  • complements · WebAssembly Skill Runtime · Package each agent skill as a WebAssembly module with a capability manifest, and run it inside a Wasm runtime that enforces those capabilities, so untrusted skills cannot weaken the host's sandbox.
  • complements · Tool/Agent Registry · Maintain a single queryable catalogue of both available tools and available agents, with metadata (capability, cost, latency, quality) the agent can use to pick the right one for a task.
