IV · Retrieval & RAG · Mature ★★

Citation Attribution

also known as Source Attribution, Answer-to-Source Binding, Span-Level Citations

Track and surface, alongside a RAG-grounded answer, which retrieved chunks supported which claims, so the binding between answer span and source survives all the way to the user.

Context

A team is shipping a retrieval-augmented system in a compliance, research, or customer-support setting where the user must be able to trace any claim in the answer back to the specific evidence that supports it. Unsupported claims are not an acceptable failure mode; the user needs to click from a sentence in the answer to the exact passage in a source document, and the team needs to be able to defend that link to an auditor.

Problem

Just asking the model to 'include citations' is not enough. Citations that the model writes freely are ungrounded — they look real but may point to documents that were never retrieved or quote text that does not appear in the source. The binding from a span of the answer to a span of evidence has to be created by the retrieval pipeline and carried through generation and delivery; otherwise the citations cannot be trusted, and the whole audit story collapses.

Forces

  • The chunk-to-claim binding can be at document, chunk, or span level; finer granularity is more useful but harder.
  • Models given retrieved context may still fabricate citations to documents that were not retrieved.
  • Span-level alignment requires the model to emit either citation markers or structured outputs that the runtime resolves.
  • Aggregating citations from multiple chunks behind one claim is common — single-source attribution is too narrow.
  • This pattern is distinct from citation-streaming: that pattern is the delivery shape, while this one is the binding itself.
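The granularity and aggregation forces above can be made concrete with a small data structure. The following is a minimal sketch, not a prescribed schema: the names `SourceSpan`, `Binding`, and `at_level`, the character-offset convention, and the sample ids are all assumptions made for illustration. It records the binding at span level, allows several sources behind one claim, and can be coarsened to chunk or document level when the finer detail is not needed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceSpan:
    """A span of evidence inside one retrieved chunk (hypothetical shape)."""
    doc_id: str
    chunk_id: str
    start: int  # character offset into the chunk text (assumed convention)
    end: int


@dataclass
class Binding:
    """Links one span of the answer to one or more evidence spans."""
    answer_start: int
    answer_end: int
    sources: list[SourceSpan] = field(default_factory=list)

    def at_level(self, level: str) -> set[str]:
        """Coarsen the span-level binding to 'chunk' or 'document' ids."""
        if level == "document":
            return {s.doc_id for s in self.sources}
        if level == "chunk":
            return {s.chunk_id for s in self.sources}
        raise ValueError(f"unknown level: {level}")


# One claim supported by three spans drawn from two documents (made-up ids).
binding = Binding(
    answer_start=0,
    answer_end=42,
    sources=[
        SourceSpan("case-A", "case-A#3", 120, 260),
        SourceSpan("case-A", "case-A#7", 0, 95),
        SourceSpan("case-B", "case-B#1", 40, 180),
    ],
)
```

Storing the finest granularity and deriving the coarser views keeps one record per claim while still letting a UI show document-level footnotes.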

Example

A legal-research assistant retrieves case excerpts and must produce an analysis where every claim cites the source case. The team assigns each retrieved chunk a stable `chunk_id` and prompts the model to emit a structured output: a list of claims, each with `text` and `supporting_chunk_ids`. A validator rejects any `chunk_id` not in this turn's retrieval registry. The UI renders each claim with footnote-style links to the cited cases. When the model is uncertain it returns fewer claims rather than fabricating citations; the citation-attribution binding is what the auditor checks.
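The validator in this example might look roughly like the sketch below. It assumes the model returns JSON with a top-level `claims` array whose items carry `text` and `supporting_chunk_ids`, as described above; the function name `validate_claims`, the choice to raise on the first violation, and the sample ids are assumptions for illustration rather than the team's actual code.

```python
import json


def validate_claims(raw_output: str, registry: set[str]) -> list[dict]:
    """Parse the structured output and return its claims if all are well bound.

    Raises ValueError when a claim cites nothing, or cites a chunk_id that is
    not in this turn's retrieval registry.
    """
    payload = json.loads(raw_output)
    claims = payload["claims"]
    for claim in claims:
        ids = claim["supporting_chunk_ids"]
        if not ids:
            raise ValueError(f"unbound claim: {claim['text']!r}")
        unknown = [cid for cid in ids if cid not in registry]
        if unknown:
            raise ValueError(f"chunk ids not retrieved this turn: {unknown}")
    return claims


# Hypothetical registry and model outputs, for illustration only.
registry = {"case-A#3", "case-B#1"}
good = json.dumps(
    {"claims": [{"text": "Claim one.", "supporting_chunk_ids": ["case-A#3", "case-B#1"]}]}
)
bad = json.dumps(
    {"claims": [{"text": "Claim two.", "supporting_chunk_ids": ["case-Z#9"]}]}
)

validated = validate_claims(good, registry)
```

Calling `validate_claims(bad, registry)` raises `ValueError`, because `case-Z#9` was never retrieved in this turn. A caller could instead drop the offending claim or ask the model to regenerate; which response is right depends on the product.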

Solution

Therefore:

During retrieval, assign each chunk a stable source-id and keep a registry of which ids were retrieved for this turn. During generation, either (a) prompt the model to emit citation markers (`[src-id]`) at the chosen granularity, then resolve and validate them against the registry, refusing any id that was not retrieved; or (b) use a structured-output schema that has a `claims` array with `text` and `supporting_chunk_ids` fields. At delivery, attach the resolved source records to the answer so the UI can render the binding. Pair with citation-streaming (delivery), naive-rag / contextual-retrieval (the upstream retrieval), and hallucinated-citations (the anti-pattern that ignores binding).
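Option (a) can be sketched as a small resolver that runs between generation and delivery. This is a hedged illustration under stated assumptions: the marker syntax is taken to be a bracketed id such as `[src-1]`, the registry is assumed to be a dict from source-id to a source record, and `resolve_markers` and the record fields are hypothetical names.

```python
import re

# Assumed marker syntax: a bracketed id made of letters, digits, and _ . # -
MARKER = re.compile(r"\[([A-Za-z0-9_.#-]+)\]")


def resolve_markers(answer: str, registry: dict[str, dict]) -> dict:
    """Resolve [src-id] markers against the per-turn retrieval registry.

    Returns the answer together with the source records it cites, in order of
    first appearance. Raises ValueError on any id not retrieved this turn.
    """
    cited: list[str] = []
    for match in MARKER.finditer(answer):
        src_id = match.group(1)
        if src_id not in registry:
            raise ValueError(f"citation to unretrieved source: {src_id}")
        if src_id not in cited:
            cited.append(src_id)
    return {"answer": answer, "sources": [registry[s] for s in cited]}


# Hypothetical per-turn registry built during retrieval.
registry = {
    "src-1": {"id": "src-1", "title": "Document one"},
    "src-2": {"id": "src-2", "title": "Document two"},
}
delivered = resolve_markers(
    "First point [src-2]. Second point [src-1][src-2].", registry
)
```

Here `delivered["sources"]` holds the records for `src-2` and then `src-1`, which is what the UI needs to render the links. An answer containing `[src-9]` would be refused, since that id is not in the registry.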

What this pattern forbids. No claim may reach the user unless it is bound to at least one source id from this turn's retrieval registry, and any citation to an id outside the registry must be rejected before delivery.

The smaller patterns that complete this one —

  • uses · Naive RAG ★★ · Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
  • uses · Contextual Retrieval · Prepend a short LLM-generated description to each chunk before embedding so the chunk carries its situating context.

And the patterns that stand alongside it, or against it —

  • complements · Citation Streaming ★★ · Stream citations alongside generated text so the UI can render source links in place as content appears.
  • alternative-to · Hallucinated Citations · Anti-pattern: let the model emit citations as free text and trust them.
  • complements · Structured Output ★★ · Constrain the model's output to conform to a JSON Schema (or similar typed shape).
