Citation Attribution

Also known as: Source Attribution, Answer-to-Source Binding, Span-Level Citations

Track and surface, alongside a RAG-grounded answer, which retrieved chunks supported which claims, so the binding between answer span and source survives all the way to the user.

Context

A team is shipping a retrieval-augmented system in a compliance, research, or customer-support setting where the user must be able to trace any claim in the answer back to the specific evidence that supports it. Unsupported claims are not an acceptable failure mode; the user needs to click from a sentence in the answer to the exact passage in a source document, and the team needs to be able to defend that link to an auditor.

Problem

Just asking the model to 'include citations' is not enough. Citations that the model writes freely are ungrounded — they look real but may point to documents that were never retrieved or quote text that does not appear in the source. The binding from a span of the answer to a span of evidence has to be created by the retrieval pipeline and carried through generation and delivery; otherwise the citations cannot be trusted, and the whole audit story collapses.

Forces

The chunk-to-claim binding can be at document, chunk, or span level; finer granularity is more useful but harder.
Models given retrieved context may still fabricate citations to documents that were not retrieved.
Span-level alignment requires the model to emit either citation markers or structured outputs that the runtime resolves.
Aggregating citations from multiple chunks behind one claim is common — single-source attribution is too narrow.
This pattern is distinct from citation-streaming: that pattern is the delivery shape, while this one is the binding itself.

Example

A legal-research assistant retrieves case excerpts and must produce an analysis where every claim cites the source case. The team assigns each retrieved chunk a stable `chunk_id` and prompts the model to emit a structured output: a list of claims, each with `text` and `supporting_chunk_ids`. A validator rejects any `chunk_id` not in this turn's retrieval registry. The UI renders each claim with footnote-style links to the cited cases. When the model is uncertain it returns fewer claims rather than fabricating citations; the citation-attribution binding is what the auditor checks.
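The validator in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the team's actual implementation: the `Chunk` and `Claim` classes, the `source` field, the function names, and the sample data are all hypothetical. It assumes the model's structured output has already been parsed into `Claim` objects, and it treats a claim that cites nothing as unbound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str  # stable id assigned at retrieval time
    source: str    # human-readable source label (hypothetical field)
    text: str


@dataclass
class Claim:
    text: str
    supporting_chunk_ids: list[str]


def build_registry(chunks: list[Chunk]) -> dict[str, Chunk]:
    """Per-turn registry: only chunks retrieved for this query are citable."""
    return {chunk.chunk_id: chunk for chunk in chunks}


def validate_claims(
    claims: list[Claim], registry: dict[str, Chunk]
) -> tuple[list[Claim], list[Claim]]:
    """Split claims into (accepted, rejected).

    A claim is rejected if it cites no chunk, or cites any id that is
    not in this turn's registry.
    """
    accepted: list[Claim] = []
    rejected: list[Claim] = []
    for claim in claims:
        ids = claim.supporting_chunk_ids
        if ids and all(chunk_id in registry for chunk_id in ids):
            accepted.append(claim)
        else:
            rejected.append(claim)
    return accepted, rejected


# Placeholder data, for illustration only.
registry = build_registry([
    Chunk("c1", "Case A", "excerpt one"),
    Chunk("c2", "Case B", "excerpt two"),
])
claims = [
    Claim("Supported by two retrieved excerpts.", ["c1", "c2"]),
    Claim("Cites a chunk that was never retrieved.", ["c9"]),
    Claim("Cites nothing at all.", []),
]
accepted, rejected = validate_claims(claims, registry)
```

Here only the first claim is accepted. Whether a rejected claim is dropped silently, triggers a retry, or causes the whole answer to be refused is a product decision that this sketch leaves open.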

Diagram

flowchart TD
    Q[Query] --> R[Retriever]
    R --> Reg[(Source registry<br/>chunk_id -> chunk)]
    Reg --> Gen[Generator<br/>emits claims with chunk_ids]
    Gen --> V{Validator}
    V -->|all ids known| Ans[Answer with bound citations]
    V -->|unknown id| Drop[Drop claim / refuse]
    Ans --> UI[UI renders span-to-source links]

Solution

Therefore:

During retrieval, assign each chunk a stable source-id and keep a registry of which ids were retrieved for this turn. During generation, either (a) prompt the model to emit citation markers (`[src-id]`) at the chosen granularity, then resolve and validate them against the registry, refusing any id that was not retrieved; or (b) use a structured-output schema that has a `claims` array with `text` and `supporting_chunk_ids` fields, and validate those ids against the same registry. At delivery, attach the resolved source records to the answer so the UI can render the binding. Pair with citation-streaming (delivery) and naive-rag or contextual-retrieval (the upstream retrieval); contrast with hallucinated-citations (the anti-pattern that ignores binding).
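Option (a) can be sketched as a small resolver that runs after generation. This is an illustration under assumptions rather than a reference implementation: the marker syntax (ids made of letters, digits, `_` and `-` inside square brackets), the `UnknownSourceError` class, the `resolve_markers` function, and the shape of the returned payload are all hypothetical. It also assumes the answer uses square brackets only for citation markers.

```python
import re

# Assumed marker syntax: [src-id], where the id uses letters, digits, "_" or "-".
MARKER = re.compile(r"\[([A-Za-z0-9_-]+)\]")


class UnknownSourceError(Exception):
    """Raised when the answer cites an id that was not retrieved this turn."""

    def __init__(self, unknown_ids: list[str]):
        super().__init__(f"citations to unretrieved sources: {unknown_ids}")
        self.unknown_ids = unknown_ids


def resolve_markers(answer: str, registry: dict[str, dict]) -> dict:
    """Validate every marker against the registry and attach source records.

    Returns the answer, the character offsets of each marker, and the
    resolved source records in order of first citation.
    """
    spans: list[dict] = []
    cited: list[str] = []
    unknown: list[str] = []
    for match in MARKER.finditer(answer):
        source_id = match.group(1)
        if source_id not in registry:
            unknown.append(source_id)
            continue
        spans.append(
            {"source_id": source_id, "start": match.start(), "end": match.end()}
        )
        if source_id not in cited:
            cited.append(source_id)
    if unknown:
        # Refuse rather than deliver an answer with an unverifiable citation.
        raise UnknownSourceError(unknown)
    return {
        "answer": answer,
        "spans": spans,
        "sources": [registry[source_id] for source_id in cited],
    }
```

The returned `spans` give the UI the offsets it needs to turn each marker into a link, and `sources` carries the resolved records alongside the answer. Raising on an unknown id is one way to refuse; dropping the offending sentence would be another.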

What this pattern forbids. No claim may reach the user unless it is bound to at least one retrieved-source id from this turn's retrieval registry, and any citation to an id not in the registry must be rejected before delivery.

The smaller patterns that complete this one —

uses: Naive RAG ★★ - Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
uses: Contextual Retrieval ★ - Prepend a short LLM-generated description to each chunk before embedding so the chunk carries its situating context.

And the patterns that stand alongside it, or against it —

complements: Citation Streaming ★★ - Stream citations alongside generated text so the UI can render source links in place as content appears.
alternative-to: Hallucinated Citations ✕ - Anti-pattern: let the model emit citations as free text and trust them.
complements: Structured Output ★★ - Constrain the model's output to conform to a JSON Schema (or similar typed shape).
complements: Vectorless Reasoning-Based Retrieval · - Retrieve by having the model reason its way down a document's own table-of-contents tree to the relevant sections, instead of embedding chunks and ranking them by vector similarity.
complements: Canonical-Entity Grounding ★ - Require the agent to resolve every business identifier it uses (SKU, account, supplier, customer) through an authoritative lookup against the system of record, rather than emitting the identifier from the model's parametric memory.
complements: Verify-Before-Cite Resolution Gate ★ - After generation, resolve every cited authority against an external ground-truth registry and strip or block any citation that does not exist before the answer reaches the reader.

Provenance

Source: patterns/citation-attribution.md on GitHub · commit 7965435
Added to catalog: 2026-05-20
Last updated: 2026-05-21
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.