XIV · Anti-Patterns · Anti-pattern

Naive-RAG-First

also known as RAG-By-Default, Vector-Store-First

Anti-pattern: reach for naive RAG before checking whether the knowledge actually needs retrieval.

Context

A team is starting a new knowledge-grounded agent — a customer-support bot, an internal Q&A assistant, a docs helper — and the field's reference architectures push retrieval-augmented generation (RAG, where the system embeds documents into a vector store and looks up passages by semantic similarity) as the default move. The team builds the vector index before checking where the answer-bearing knowledge actually lives. Often the real source is a database, an internal API, a search service, or a small set of stable documents that would fit in the system prompt.

Problem

When the knowledge lives in a structured store, semantic retrieval over embeddings is the wrong shape: the agent gets approximate, stale passages where a typed SQL query or a single API call would return an exact, fresh answer. The team pays embedding pipeline cost, vector store cost, and re-indexing cost on every update, and quality drops compared to the simpler design because retrieval is solving the wrong problem. Naive RAG also adds an entire failure surface — chunking, embedding drift, recall holes — that a typed tool call simply does not have.
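To make the contrast concrete, here is a minimal sketch of a typed lookup against a structured store. The `tickets` table, its columns, and the `get_ticket_status` helper are hypothetical, and an in-memory SQLite database stands in for whatever store the team actually runs. The point is only that a parameterized query reads the current row, so there is no index to rebuild when the data changes.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TicketStatus:
    ticket_id: int
    status: str
    updated_at: str


def get_ticket_status(conn: sqlite3.Connection, ticket_id: int) -> Optional[TicketStatus]:
    """Typed tool: one parameterized query against the live store."""
    row = conn.execute(
        "SELECT id, status, updated_at FROM tickets WHERE id = ?",
        (ticket_id,),
    ).fetchone()
    return TicketStatus(*row) if row else None


# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, updated_at TEXT)"
)
conn.execute("INSERT INTO tickets VALUES (4711, 'open', '2024-01-01T09:00:00')")

before = get_ticket_status(conn, 4711)

# The record changes; the next call sees it without any re-embedding step.
conn.execute(
    "UPDATE tickets SET status = 'resolved', updated_at = '2024-01-02T10:30:00' "
    "WHERE id = 4711"
)
after = get_ticket_status(conn, 4711)
missing = get_ticket_status(conn, 1)  # unknown id yields None, not a near match
```

An embedded snapshot of the same record would keep returning the old status until the pipeline re-indexed it, and an unknown ticket id would still surface the nearest-looking passage rather than an explicit miss.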

Forces

  • RAG is on every reference architecture.
  • Vector stores feel like a moat.
  • Tool use is sometimes harder to build than RAG.

Example

A team's first move on a new internal Q&A bot is to spin up a vector index over the company wiki. After three weeks they discover that 80 percent of questions are about live ticket status, which lives in their helpdesk database, and a vector search over stale wiki pages cannot answer them. They name the failure naive-rag-first, tear out the index for those queries, and route them to a typed helpdesk tool call. RAG stays only for the genuine free-text knowledge questions where the wiki is authoritative.
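The routing the team ends up with might look roughly like the sketch below. Everything here is hypothetical: the `HELPDESK` dictionary stands in for the helpdesk database, `wiki_rag` is a placeholder for the retrieval path that remains, and the regular expression is a crude stand-in for the model's own tool selection, which in a real agent would come from a tool schema rather than pattern matching.

```python
import re
from typing import Dict

# Matches "ticket 1042" or "ticket #1042" (hypothetical id format).
TICKET_RE = re.compile(r"\bticket\s*#?(\d+)\b", re.IGNORECASE)

# Hypothetical stand-in for the live helpdesk database.
HELPDESK: Dict[int, str] = {1042: "waiting on customer", 1043: "resolved"}


def helpdesk_status(ticket_id: int) -> str:
    """Typed tool call: exact lookup by ticket id."""
    status = HELPDESK.get(ticket_id)
    if status is None:
        return f"Ticket {ticket_id} not found."
    return f"Ticket {ticket_id} is {status}."


def wiki_rag(question: str) -> str:
    """Placeholder for the retrieval path kept for free-text wiki questions."""
    return f"[wiki retrieval] {question}"


def answer(question: str) -> str:
    """Route ticket-status questions to the tool, everything else to retrieval."""
    match = TICKET_RE.search(question)
    if match:
        return helpdesk_status(int(match.group(1)))
    return wiki_rag(question)
```

With this in place, `answer("What's the status of ticket #1042?")` goes to the helpdesk lookup, while `answer("How do I request VPN access?")` falls through to the wiki path.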

Solution

Therefore:

Don't reach for RAG first. Check whether the knowledge lives in a tool (database, API, search service), a scoped system prompt, or a small inlined document. Only adopt RAG when those genuinely do not work. See tool-use for the typed-call alternative and naive-rag for the cases where retrieval does fit.
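One way to read this check is as an ordered decision, sketched below. The `KnowledgeSource` fields, the `choose_grounding` function, and the 8,000-token inline budget are all illustrative assumptions rather than part of the pattern; a real team would substitute its own criteria and limits.

```python
from dataclasses import dataclass
from enum import Enum


class Grounding(Enum):
    TOOL = "tool"      # typed call to a database, API, or search service
    INLINE = "inline"  # scoped system prompt or small inlined document
    RAG = "rag"        # retrieval over an embedded index


@dataclass(frozen=True)
class KnowledgeSource:
    name: str
    has_queryable_interface: bool  # a database, API, or search service exists
    is_stable: bool                # content changes rarely
    approx_tokens: int             # rough size if pasted into the prompt


def choose_grounding(src: KnowledgeSource, inline_budget_tokens: int = 8000) -> Grounding:
    """Check the cheaper options in order; fall back to RAG only at the end."""
    if src.has_queryable_interface:
        return Grounding.TOOL
    if src.is_stable and src.approx_tokens <= inline_budget_tokens:
        return Grounding.INLINE
    return Grounding.RAG


# Hypothetical sources, for illustration only.
tickets = KnowledgeSource("helpdesk tickets", True, False, 50_000_000)
policy = KnowledgeSource("refund policy", False, True, 2_000)
wiki = KnowledgeSource("company wiki", False, False, 5_000_000)

choices = [choose_grounding(s) for s in (tickets, policy, wiki)]
```

The ordering is the design choice: retrieval is reached only after the tool and inline options have been ruled out.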

What this pattern forbids. By definition, this anti-pattern imposes no useful constraint; the missing constraint is the failure mode.

And the patterns that stand alongside it, or against it —

  • conflicts-with: Naive RAG (★★). Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
  • alternative-to: Tool Use (★★). Let the LLM produce typed calls against an external toolkit instead of producing free-form text the surrounding system has to parse.
  • alternative-to: Synthetic Filesystem Overlay (·). Project heterogeneous enterprise data sources into a single Unix-like tree exposed through filesystem primitives so the agent reuses path semantics it already knows instead of learning a bespoke API per source.
  • complements: Memory Poisoning. Anti-pattern: write to agent long-term memory (vector store, knowledge graph, episodic log) from any surface the agent reads, with no provenance check.
  • complements: Over-Search and Under-Search. Anti-pattern: let an agentic RAG system miscalibrate when to retrieve, so it either re-retrieves information already in context or skips retrieval when its parametric knowledge is stale.
