Recursive Language Model
Treat an over-long prompt as an environment the model navigates by code, letting it partition and recursively call itself over snippets, so it answers over inputs far larger than its context window.
Problem
Truncation and naive chunking drop information the answer may depend on, and even when a long input fits, model accuracy falls as the prompt grows. Fixed map-reduce scaffolds impose one decomposition the model cannot adapt: they split the input the same way regardless of the question and lose cross-chunk structure. Compaction and summarization throw away detail before the model has decided what matters. The team needs the model itself to decide how to break the input down and to look only at the parts each sub-question needs.
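The cross-chunk loss described above can be seen in a toy case. The sketch below is illustrative only: the document, the `needle` string, and the 100-character chunk size are all made-up assumptions, chosen so that one fact straddles a chunk boundary.

```python
def fixed_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive chunks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical input: filler text with one fact placed across the
# boundary between the first and second 100-character chunks.
needle = "launch code is 7421"
document = "x" * 90 + needle + "y" * 91

chunks = fixed_chunks(document, 100)

# The fact is present in the whole document...
found_in_whole = needle in document
# ...but no single chunk contains it intact, so a per-chunk worker
# that only sees its own chunk cannot recover it.
found_in_any_chunk = any(needle in chunk for chunk in chunks)
```

Overlapping windows reduce this particular failure, but the split is still fixed before the question is known, which is the limitation the pattern addresses.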
Solution
Place the long input in an environment the model can manipulate programmatically (for example, a variable in a code interpreter) instead of pasting it into the prompt. The root model writes code to peek at, search, and partition the input, and spawns recursive calls to itself or a smaller sub-model over the snippets it selects, combining their results. Because the model decides at runtime how to grep, slice, and recurse, the decomposition adapts to the question, and only the relevant snippets ever enter any single call. This lets the model handle inputs orders of magnitude larger than its context window at a cost comparable to long-context scaffolds.
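A minimal sketch of the control flow, under stated assumptions: `llm` is a hypothetical callable standing in for any model client, and the grep-then-halve policy is hard-coded here only for illustration. In the pattern itself the root model would write that navigation code at runtime. The names `rlm_answer`, `keyword`, and `max_chars` are likewise invented for this example.

```python
from typing import Callable

# Hypothetical model interface: prompt string in, completion string out.
LLM = Callable[[str], str]


def rlm_answer(query: str, context: str, llm: LLM, keyword: str,
               max_chars: int = 200, depth: int = 0, max_depth: int = 6) -> str:
    """Answer `query` over `context`, passing at most `max_chars` of
    context to any single model call."""
    # Base case: the snippet fits the toy budget (or the depth guard
    # fired, in which case the snippet is truncated to the budget).
    if len(context) <= max_chars or depth >= max_depth:
        return llm(f"Context:\n{context[:max_chars]}\n\nQuestion: {query}")

    lines = context.splitlines()

    # "grep" step: keep only lines that mention the keyword.
    hits = [ln for ln in lines if keyword.lower() in ln.lower()]
    if hits and len(hits) < len(lines):
        return rlm_answer(query, "\n".join(hits), llm, keyword,
                          max_chars, depth + 1, max_depth)

    # Grep did not narrow the input: split in half and recurse on each part.
    mid = len(lines) // 2
    parts = ["\n".join(lines[:mid]), "\n".join(lines[mid:])]
    partials = [rlm_answer(query, part, llm, keyword,
                           max_chars, depth + 1, max_depth)
                for part in parts if part]

    # Combine the sub-answers with one more model call.
    return llm("Partial answers:\n" + "\n".join(partials)
               + f"\n\nQuestion: {query}\nCombine them into one answer.")


# Usage with a stub model so the sketch runs without any API.
calls: list[str] = []


def stub_llm(prompt: str) -> str:
    calls.append(prompt)
    return "disk full on node-7" if "disk full" in prompt else "unknown"


# Made-up log: 1000 routine lines plus one error line in the middle.
log_lines = [f"log entry {i}: ok" for i in range(1000)]
log_lines.insert(500, "ERROR: disk full on node-7")
long_input = "\n".join(log_lines)

answer = rlm_answer("What failed?", long_input, stub_llm, keyword="error")
```

In this run the full log stays in the `long_input` variable and never enters a prompt; only the single matching line reaches the stub model.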
When to use
- The input is larger than the context window or large enough to degrade accuracy.
- The right decomposition depends on the question and cannot be fixed in advance.
- A code or REPL environment is available to hold and manipulate the input.
- Handling huge inputs at a cost comparable to long-context scaffolds is worth the added latency and complexity.