Semantic-Layer Query Guardrail

also known as Semantic-Layer SQL Guardrail, Metric-Layer Query Routing, Vetted-Metric Query Guardrail

Route natural-language data questions through a curated semantic layer so the model selects and parameterises vetted metrics and dimensions instead of free-authoring raw SQL against production data.

Context

An agent answers natural-language questions over a production data warehouse — revenue last quarter, active users by region, churn for a cohort. Free-form text-to-SQL is the obvious mechanism, but in practice it rarely survives contact with a real schema: column names are cryptic, business terms are organisation-specific, and the same word means different things to different teams. A separate semantic layer — a metric store such as a dbt Semantic Layer, LookML, or Cube model — already encodes the vetted definitions of each metric and dimension and can compile a metric request into correct SQL on its own.

Problem

Natural language is ambiguous and business terminology is domain-specific, so the model does not know what a particular organisation means by revenue or active user, and a text-to-SQL system that lets the model author raw SQL will silently pick the wrong column, the wrong join, or the wrong filter and return a confidently incorrect number. Letting the model write arbitrary SQL against production also widens the action surface to every table the credential can reach. The system needs the model's language understanding to map the question onto a metric, without granting it the authority to define what that metric is or to author the query that computes it.

Forces

Free-form text-to-SQL covers any question the schema can express, but a single misread column or join yields a wrong answer that looks authoritative and is hard to catch.
Business definitions such as revenue or active user are organisation-specific and change over time; encoding them once in a shared layer is correct, while re-deriving them in each generated query is fragile.
Arbitrary model-authored SQL against production widens the blast radius to every table the credential can read; a fixed catalogue of metrics narrows it.
A semantic layer must be modelled and maintained up front, which is real work and cannot answer a question whose metric nobody has defined yet.

When to use

An agent answers natural-language questions over a production warehouse where business terms such as revenue or active user have organisation-specific definitions the model cannot infer.
A semantic or metric layer (dbt Semantic Layer, Cube, LookML) already encodes vetted metric and dimension definitions, or one is worth building.
Wrong answers from a mis-authored query are costly and consistency of definitions across consumers matters more than answering every possible ad-hoc question.

When not to use

The questions are too varied or exploratory to pre-model, so a fixed metric catalogue would refuse more questions than it answers.
No semantic layer exists and the schema is too small or fast-moving to justify modelling one.
The question genuinely needs per-row semantic judgement over free text that no pre-defined metric captures, where table-augmented generation fits better.

Example

A business-intelligence agent is asked 'what was active-user revenue in EMEA last quarter?'. Instead of writing SQL, it maps the question onto two pre-defined metrics — active_users and revenue — sets the region dimension to EMEA and the time grain to the prior quarter, and emits a structured metric request. The dbt Semantic Layer validates the request, compiles it to the correct SQL using the team's vetted definitions of active user and revenue, runs it, and returns the figures. When someone later asks for a metric nobody has modelled, the agent reports that the metric is undefined rather than improvising a query.

Diagram

flowchart TD Q[Natural-language data question] --> A[Agent maps question to vetted metric + dimensions + params] A --> R[Structured metric request] R --> V{Metric modelled<br/>in semantic layer?} V -- no --> X[Refuse or escalate to extend layer] V -- yes --> C[Semantic layer compiles request to SQL deterministically] C --> E[Execute against warehouse] E --> O[Result returned to agent] A -. forbidden .-> S[Raw SQL authored by model]

Solution

Therefore:

Model the metrics, dimensions, and business definitions once in a semantic layer and expose querying it as the agent's only data tool. The agent's job shrinks from authoring SQL to selecting: it maps the natural-language question onto one or more pre-defined metrics, picks the dimensions to group by, and fills in time grain and filter parameters, emitting a structured metric request rather than a SQL string. The semantic layer validates that request against its schema and deterministically compiles it to SQL — resolving the canonical column, join path, and definition behind each metric — then executes it and returns the result. Because the definitions live in the layer, every consumer computes revenue the same way; because the agent cannot author raw SQL, it cannot reach a table outside the modelled catalogue or invent a definition. A question whose metric is not yet modelled is refused or escalated to extend the layer rather than answered by improvised SQL.

What it gives you

Every answer uses the organisation's vetted definition of each metric, so the same question returns the same number regardless of which agent or consumer asked it.
The model cannot reach a table outside the modelled catalogue or invent a business definition, so the data action surface is bounded to vetted metrics.
SQL compilation is deterministic and owned by the layer, so a correct metric selection cannot be undermined by a model-authored join or filter error.
Metric definitions are maintained in one place, so a change to how revenue is computed propagates to every agent answer without re-prompting.

What it costs you

A question whose metric has not been modelled cannot be answered until someone extends the semantic layer, so coverage lags ad-hoc text-to-SQL.
The semantic layer is upfront modelling and ongoing maintenance work that a small or fast-moving schema may not justify.
Selection is not free of error: the model can still map a question onto the wrong vetted metric or the wrong grouping, returning a precise answer to the wrong question.

What this pattern forbids. The agent may not author or execute raw SQL against production data; it may only select from and parameterise metrics and dimensions defined in the semantic layer, and a question whose metric is not modelled must be refused or escalated rather than answered by improvised SQL.

And the patterns that stand alongside it, or against it —

alternative-toTable-Augmented Generation·— Answer a natural-language question over a database in three stages — synthesise an executable query, run it against the data layer with model calls embedded in execution, then generate the answer from the result.
complementsCanonical-Entity Grounding★— Require the agent to resolve every business identifier it uses — SKU, account, supplier, customer — through an authoritative lookup against the system of record, rather than emitting the identifier from the model's parametric memory.
complementsRisk-Tiered Action Autonomy★— Set an agent's permitted action class by the financial materiality of the action, letting it read and draft freely while requiring a different human principal to release material postings, payments, or filings.
complementsTool Discovery★— Let the agent discover available tools at runtime rather than hardcoding the tool list at agent build time.