Retrieval & RAG

Vectorless Reasoning-Based Retrieval

Retrieve by having the model reason its way down a document's own table-of-contents tree to the relevant sections, instead of embedding chunks and ranking them by vector similarity.

Problem

Vector similarity is a proxy for relevance, and on long professional documents the proxy breaks down: the passage that repeats the query's words is often not the passage that answers it, while the passage that does answer it shares little surface vocabulary. Fixed-size chunking compounds the mismatch by severing the structure the document relied on to make sense — a number is separated from the line item it belongs to, a clause from the term it defines. The retrieved context is also opaque: the system returns vector hits with no account of why this span and not another, so an analyst cannot audit the retrieval and the generator inherits whatever the embedding happened to rank highest.
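The severing effect of fixed-size chunking can be seen in a few lines. This is a minimal sketch, not a measurement: the statement text, the helper name `fixed_size_chunks`, and the 16-character window are all made up for illustration, and real chunkers work on tokens rather than characters.

```python
def fixed_size_chunks(text: str, size: int) -> list[str]:
    """Split text into consecutive windows of `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


# Hypothetical three-line income statement (illustrative figures only).
statement = "Total revenue: 4,200\nCost of goods sold: 1,900\nGross profit: 2,300"

chunks = fixed_size_chunks(statement, 16)

# The first window ends in the middle of the amount, so no single chunk
# holds both the label "Total revenue" and its value "4,200".
for chunk in chunks:
    print(repr(chunk))
```

Nothing is lost overall, since the chunks concatenate back to the original text, but the line item and its number now live in different retrieval units.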

Solution

At index time, parse the document into a tree that mirrors its natural structure — parts, sections, subsections — and write a short summary at each node, keeping the leaf text intact rather than splitting it into fixed-size chunks. No embeddings are computed and no vector store is built. At query time, present the model with the tree as a table of contents and have it judge which branch is most likely to hold the answer, descend into that node, and repeat — a tree search in which the model, not a similarity score, decides each step. The walk ends at the leaf sections the model judges relevant, and retrieval returns those sections together with their page and section identifiers, so every result is traceable to a named location in the source. Compose with a generator that reads the returned sections, and with citation-attribution since the page and section references are already in hand.
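The index and the walk described above can be sketched as follows. This is an illustrative outline under stated assumptions, not a reference implementation: `Node`, `Judge`, `retrieve`, and `toy_judge` are hypothetical names, the sample report is invented, and the judge is injected as a plain function so the example runs without a model. In a real system the judge would show the model the candidate titles and summaries as a table of contents and parse its choice; the word-overlap stand-in used here is only a placeholder for that reasoning step.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """One entry in the document tree: a part, section, or subsection."""
    section_id: str
    title: str
    summary: str
    page: int
    text: str = ""  # leaf text is kept intact, never re-chunked
    children: list["Node"] = field(default_factory=list)


# A judge receives the query and the children of the current node and
# returns the children worth descending into (assumed to wrap an LLM call).
Judge = Callable[[str, list[Node]], list[Node]]


def retrieve(node: Node, query: str, judge: Judge) -> list[Node]:
    """Walk down the tree, letting the judge pick branches, and return leaves."""
    if not node.children:
        return [node]
    hits: list[Node] = []
    for child in judge(query, node.children):
        hits.extend(retrieve(child, query, judge))
    return hits


def toy_judge(query: str, candidates: list[Node]) -> list[Node]:
    """Placeholder for the model's judgment: pick the child sharing most words."""
    words = set(query.lower().split())

    def overlap(n: Node) -> int:
        return len(words & set(f"{n.title} {n.summary}".lower().split()))

    return [max(candidates, key=overlap)]


# Hypothetical two-level report; summaries would be written at index time.
report = Node("0", "Annual report", "Whole document", 1, children=[
    Node("I", "Business", "Company overview and risk factors", 3),
    Node("II", "Financial statements",
         "Income statement balance sheet and notes for the year", 40, children=[
             Node("II.1", "Income statement", "Revenue, costs and net income", 41,
                  text="(full section text)"),
             Node("II.2", "Balance sheet", "Assets, liabilities and equity", 43,
                  text="(full section text)"),
         ]),
])

hits = retrieve(report, "What was net income for the year?", toy_judge)
for hit in hits:
    print(hit.section_id, hit.title, "page", hit.page)
```

Because each returned node carries its own `section_id` and `page`, the caller can pass both the text and the location to the generator without a separate lookup. Letting the judge return more than one child turns the single-path walk into a wider search at the cost of more model calls.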

When to use

Documents are long and carry a clear, reliable hierarchy of parts, sections, and subsections worth navigating.
The domain is one where vocabulary overlap misleads similarity search — finance, law, regulatory, technical manuals.
Retrieval must be auditable, with each result pointing to a named page and section.
Keeping spans intact — tables, clauses, definitions — matters more than embedding-window economy.
