Naive-RAG-First

also known as RAG-By-Default, Vector-Store-First

Anti-pattern: reach for naive RAG before checking whether the knowledge actually needs retrieval.

Context

A team is starting a new knowledge-grounded agent — a customer-support bot, an internal Q&A assistant, a docs helper — and the field's reference architectures push retrieval-augmented generation (RAG, where the system embeds documents into a vector store and looks up passages by semantic similarity) as the default move. The team builds the vector index before checking where the answer-bearing knowledge actually lives. Often the real source is a database, an internal API, a search service, or a small set of stable documents that would fit in the system prompt.

Problem

When the knowledge lives in a structured store, semantic retrieval over embeddings is the wrong shape: the agent gets approximate, stale passages where a typed SQL query or a single API call would return an exact, fresh answer. The team pays for the embedding pipeline and the vector store, pays again to re-index on every update, and still sees lower answer quality than the simpler design would give, because retrieval is solving the wrong problem. Naive RAG also adds an entire failure surface (chunking, embedding drift, recall holes) that a typed tool call simply does not have.
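A minimal sketch of the typed alternative, assuming the knowledge is a ticket status held in a relational table. The `tickets` schema, the `make_demo_db` helper, and the `get_ticket_status` function are hypothetical names chosen for illustration; an in-memory SQLite database stands in for the real store.

```python
import sqlite3
from typing import Optional


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory helpdesk table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO tickets (id, status) VALUES (?, ?)",
        [(101, "open"), (102, "closed")],
    )
    conn.commit()
    return conn


def get_ticket_status(conn: sqlite3.Connection, ticket_id: int) -> Optional[str]:
    """Typed tool: exact lookup by primary key, or None if no such ticket."""
    row = conn.execute(
        "SELECT status FROM tickets WHERE id = ?", (ticket_id,)
    ).fetchone()
    return row[0] if row else None


conn = make_demo_db()
print(get_ticket_status(conn, 101))  # open

# An update is visible on the very next call; there is nothing to re-index.
conn.execute("UPDATE tickets SET status = 'resolved' WHERE id = 101")
print(get_ticket_status(conn, 101))  # resolved
```

The lookup either returns the current value or reports that the ticket does not exist. There is no chunking step, no embedding model, and no similarity threshold that could go wrong.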

Forces

RAG is on every reference architecture.
Vector stores feel like a moat.
Tool use is sometimes harder to build than RAG.

Example

A team's first move on a new internal Q&A bot is to spin up a vector index over the company wiki. After three weeks they discover that 80 percent of questions are about live ticket status, which lives in their helpdesk database, and that a vector search over stale wiki pages cannot answer them. They name the failure naive-rag-first, tear out the index for those queries, and route them to a typed helpdesk tool call. RAG stays only for the genuine free-text knowledge questions where the wiki is authoritative.
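The routing in this example might look roughly like the sketch below. Everything here is an assumption made for illustration: the `HELPDESK` dictionary stands in for the helpdesk database, `wiki_rag` is a stub for the retrieval path, and a regular expression stands in for the routing decision that, in a real agent, the model would usually make by choosing a tool.

```python
import re
from typing import Dict

# Matches phrases such as "ticket 4512" or "ticket #4512" (illustrative only).
TICKET_RE = re.compile(r"\bticket\s*#?(\d+)\b", re.IGNORECASE)

# Stand-in for the live helpdesk database (hypothetical data).
HELPDESK: Dict[int, str] = {4512: "waiting on customer"}


def helpdesk_status(ticket_id: int) -> str:
    """Typed tool call: exact lookup of a ticket's current status."""
    return HELPDESK.get(ticket_id, "unknown ticket")


def wiki_rag(question: str) -> str:
    """Stub for the retrieval path kept for free-text wiki questions."""
    return f"[wiki retrieval for: {question}]"


def answer(question: str) -> str:
    """Send ticket-status questions to the tool, everything else to RAG."""
    match = TICKET_RE.search(question)
    if match:
        ticket_id = int(match.group(1))
        return f"Ticket {ticket_id}: {helpdesk_status(ticket_id)}"
    return wiki_rag(question)


print(answer("What is the status of ticket #4512?"))
print(answer("How do I request a laptop?"))
```

The first question never touches the index. The second still goes through retrieval, because the wiki is the authoritative source for it.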

Diagram

flowchart TD
    Q[New knowledge need] --> X{Where does it live?}
    X -->|tool / DB / API| T[Use tool-use]
    X -->|small + stable| P[Inline in system prompt]
    X -->|truly external + large| R[Use naive-rag]
    X -.skipped check.-> AP[Anti-pattern: RAG by default]
    AP -.causes.-> Bloat[Index sprawl & latency]

Solution

Therefore:

Don't reach for RAG first. Check whether the knowledge lives in a tool (database, API, search service), a scoped system prompt, or a small inlined document. Only adopt RAG when those genuinely do not work. See tool-use for the typed-call alternative and naive-rag for the cases where retrieval does fit.

What this pattern forbids. Avoiding it imposes an ordering rule: a retrieval pipeline must not be built before checking whether the knowledge lives in a tool, a database query, a scoped prompt, or a small inlined document.
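The ordering rule can be written down as a small decision function. This is a sketch under stated assumptions, not a prescribed implementation: the `KnowledgeSource` fields, the `choose_strategy` name, and the 4,000-token inline budget are all hypothetical and would need tuning for a given model and prompt.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    TOOL_USE = "tool-use"
    INLINE_PROMPT = "inline in system prompt"
    NAIVE_RAG = "naive-rag"


@dataclass(frozen=True)
class KnowledgeSource:
    name: str
    has_typed_interface: bool  # reachable via a DB query, API, or search service
    approx_tokens: int         # rough size if pasted into a prompt
    is_stable: bool            # changes rarely enough to inline


# Assumed budget for inlined documents; not a value from the catalog.
INLINE_TOKEN_BUDGET = 4_000


def choose_strategy(
    src: KnowledgeSource, inline_budget: int = INLINE_TOKEN_BUDGET
) -> Strategy:
    """Apply the checks in order; RAG is the fallback, never the first move."""
    if src.has_typed_interface:
        return Strategy.TOOL_USE
    if src.is_stable and src.approx_tokens <= inline_budget:
        return Strategy.INLINE_PROMPT
    return Strategy.NAIVE_RAG


sources = [
    KnowledgeSource("helpdesk tickets", True, 50_000_000, False),
    KnowledgeSource("refund policy", False, 800, True),
    KnowledgeSource("company wiki", False, 2_000_000, False),
]
for src in sources:
    print(f"{src.name}: {choose_strategy(src).value}")
```

Because the retrieval branch is reachable only after the other two checks fail, a pipeline built from this function cannot skip the check that the anti-pattern skips.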

The patterns that counter or replace it:

conflicts-with Naive RAG ★★ - Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
alternative-to Tool Use ★★ - Let the LLM produce typed calls against an external toolkit instead of producing free-form text the surrounding system has to parse.
alternative-to Synthetic Filesystem Overlay · - Project heterogeneous enterprise data sources into a single Unix-like tree exposed through filesystem primitives so the agent reuses path semantics it already knows instead of learning a bespoke API per source.
complements Memory Poisoning ✕ - Anti-pattern: write to agent long-term memory (vector store, knowledge graph, episodic log) from any surface the agent reads, with no provenance check.
complements Over-Search and Under-Search ✕ - Anti-pattern: let an agentic RAG system miscalibrate when to retrieve, so it either re-retrieves information already in context or skips retrieval when its parametric knowledge is stale.

References

Retrieval-Augmented Generation for Large Language Models: A Survey (paper)

Provenance

Source: patterns/naive-rag-first.md on GitHub · commit 4fa1213
Added to catalog: 2026-04-30
Last updated: 2026-05-21
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.