Retrieval & RAG

Table-Augmented Generation

Answer a natural-language question over a database in three stages — synthesise an executable query, run it against the data layer with model calls embedded in execution, then generate the answer from the result.

Problem

A question such as 'which of last quarter's outage incidents were caused by a vendor misconfiguration' cannot be expressed as pure relational algebra, because deciding whether a free-text postmortem describes a vendor misconfiguration is a semantic judgement, not a column comparison. Pushing the whole table into a prompt does not scale, and a single SQL query cannot reason over the prose. The system needs to combine exact, scalable computation over the rows with per-row semantic reasoning, without loading the table into context or flattening the question into a lookup.

Solution

Decompose answering into three stages over a database. In query synthesis the model translates the natural-language question into an executable query — typically SQL extended with calls back to a language model, exposed as user-defined functions or semantic operators — so a clause like a relevance filter or a free-text classification becomes part of the plan rather than something done afterward in a prompt. In query execution the data engine runs that query: exact relational work (joins, filters, aggregation) stays in the engine where it scales, and the embedded model calls are evaluated per row or per group only where semantic judgement is required, so reasoning is pushed into the data layer instead of pulling the table into context. In answer generation the model reads the compact, computed result set and produces the response grounded in those rows. Text2SQL is the special case where synthesis emits pure relational algebra and execution needs no model calls; retrieval is the special case where the query is a point lookup of a few records — both fall out of the same paradigm.

When to use

Questions over structured data need semantic judgement on the rows that relational algebra cannot express, such as classifying or ranking by free-text content.
The table is too large to dump into context, so exact computation must stay in the data engine and the model must be invoked only where judgement is required.
An execution layer is available that can embed model calls into query execution (semantic operators or user-defined functions).

Open the full interactive page →

Diagram, neighbourhood map, code examples, related patterns and full provenance.

Problem

Solution

When to use

Related