LlamaIndex
RAG-first: retrieval primitives, query engines, and evaluation.
8 patterns supported.
Patterns this framework supports natively
- Agentic RAG — Replace static retrieve-then-generate with autonomous agents that plan, choose sources, retrieve iteratively, reflect, and re-query.
- Naive RAG — Condition the generator on top-k chunks retrieved from an external dense index so knowledge lives outside parameters.
- Hybrid Search — Combine sparse lexical retrieval (BM25) with dense vector retrieval and fuse the results.
- Cross-Encoder Reranking — After cheap bi-encoder or BM25 retrieval, rescore top-N candidates with a cross-encoder that jointly attends over (query, candidate).
- Contextual Retrieval — Prepend a short LLM-generated description to each chunk before embedding so the chunk carries its situating context.
- HyDE — Have the LLM write a hypothetical answer document, embed it, and use it as the retrieval query.
- Citation Streaming — Stream citations alongside generated text so the UI can render source links in place as content appears.
- Eval Harness — Run a held-out dataset against agent versions to detect regressions and measure improvement.
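
The "fuse the results" step of Hybrid Search in the list above can be illustrated with a small, library-free sketch. It uses reciprocal rank fusion (RRF), which is one common fusion method and is an assumption here; the list does not say which fusion strategy is used. The function name `reciprocal_rank_fusion`, the document IDs, and the constant `k = 60` are illustrative and are not LlamaIndex API.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1); the scores are summed across lists.
    k = 60 is a conventional default, assumed here for illustration.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID
    # so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a sparse (BM25) and a dense retriever.
sparse_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
print(fused)
```

Because RRF works on ranks rather than raw scores, the BM25 and vector similarity scores never need to be normalized onto a common scale. Documents that appear in both lists (`doc_a`, `doc_c`) rise above those found by only one retriever.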