Augmented LLM
Build the foundational agent block as an LLM augmented with retrieval, tools, and memory that the model actively chooses to use, rather than a bare-model call.
Problem
A bare large language model call cannot look up fresh facts, change state in any external system, or remember anything between turns. If each higher-level pattern wires up retrieval, tool calling, and memory in its own ad-hoc way, the building blocks stop being interoperable: a routing layer cannot drop in a worker that was built against a different memory shape, and observability has to be re-implemented per integration.
Solution
Wire the model with three capabilities and expose each via a model-driven interface: (1) retrieval queries the model can issue against external corpora; (2) tool calls the model can emit and whose results stream back; (3) memory the model can read from and write to across turns. The model — not the surrounding code — decides which augmentation to invoke at each step. Other workflow patterns (prompt-chaining, routing, orchestrator-workers, etc.) compose instances of this block, not bare model calls.
When to use
- You are building any agent system and need a consistent building block.
- The model should decide when to retrieve, call tools, or use memory — not surrounding code.
- Higher-level workflows (chaining, routing, orchestration) need a uniform unit to compose.
Open the full interactive page →
Diagram, neighbourhood map, code examples, related patterns and full provenance.