Retrieval & RAG

Hybrid Search

Combine sparse lexical retrieval (BM25) with dense vector retrieval and fuse the results.

Problem

Dense vector retrieval handles paraphrase and semantic similarity but misses queries that hinge on an exact identifier the embedding has flattened away. Sparse keyword retrieval (BM25 and similar lexical methods) handles exact terms but misses paraphrased queries whose vocabulary does not overlap with the source text. Choosing either method alone sacrifices recall on the query shape the other one handles, and no downstream re-ranking stage can rescue a chunk that was never retrieved in the first place.

Solution

Index the corpus twice: a BM25 index for sparse lexical retrieval and a dense embedding index for semantic retrieval. At query time, retrieve the top-k from each and fuse the two result lists with Reciprocal Rank Fusion (RRF) or a score-normalised weighted aggregation. Pass the fused top-N forward, typically into a reranker. Do not combine raw scores directly: BM25 and dense similarity scores live on incompatible scales, so fuse on ranks or normalise each retriever's scores first.
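
The fusion step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the RRF constant k=60 (a commonly used default), the min-max normalisation, and the alpha weight are all assumptions chosen for the example, and the document IDs and scores are made up. It assumes each retriever has already returned its top-k hits.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=10):
    """Fuse best-first lists of document IDs using only their ranks.

    k is a smoothing constant (60 is an assumed, commonly used default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(sparse_scores, dense_scores, alpha=0.5, top_n=10):
    """Score-normalised aggregation; alpha is the weight on the dense side."""
    sparse_n, dense_n = min_max(sparse_scores), min_max(dense_scores)
    combined = {
        doc_id: alpha * dense_n.get(doc_id, 0.0)
        + (1.0 - alpha) * sparse_n.get(doc_id, 0.0)
        for doc_id in set(sparse_n) | set(dense_n)
    }
    fused = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical top-3 hits from each retriever, best first.
bm25_hits = ["d3", "d1", "d7"]
dense_hits = ["d1", "d5", "d3"]
print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
# → ['d1', 'd3', 'd5', 'd7']
```

RRF needs only rank positions, so it sidesteps the scale mismatch entirely. The weighted variant keeps score magnitudes but depends on the normalisation holding up across queries, and alpha usually has to be tuned on held-out queries.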

When to use

  • Queries mix semantic intent with rare tokens (codes, IDs, proper nouns) that embeddings miss.
  • The corpus is heterogeneous enough that one retriever loses recall on part of it.
  • Latency budget tolerates two retrievers plus a fusion step.
