Retrieval & RAG

HippoRAG

Build an LLM-extracted schemaless knowledge graph from the corpus and run Personalized PageRank seeded on the query's key concepts so multi-hop retrieval completes in a single pass.

Problem

Single-query dense retrieval lands in one embedding neighbourhood and cannot follow entity-mediated chains across documents. Iterative agentic retrieval reaches the answer but costs an LLM call per hop, and the agent has no global view of the graph that connects passages. Community-summary approaches such as GraphRAG handle global queries via map-reduce over pre-built summaries, but their cost and latency are dominated by the summary build, and they do not naturally surface a tight path between two concrete entities.

Solution

Offline, prompt an LLM to extract (subject, predicate, object) triples from each passage and store the resulting schemaless graph alongside per-node passage pointers; this graph is the artificial hippocampal index. At query time, extract the query's key concepts (also via an LLM), use the matching graph nodes as the seed set for Personalized PageRank (PPR), let PPR propagate relevance along entity-mediated edges, and surface the top passages by aggregated PPR mass. Pass the surfaced passages forward to the generator, optionally through a reranker.
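The query-time half of this can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the reference implementation. The triples and the seed set are hard-coded stand-ins for the two LLM extraction steps. The names `TRIPLES`, `build_index`, `personalized_pagerank` and `rank_passages` are hypothetical. The damping value, the undirected edges and the uniform seed weights are simplifications chosen for the example.

```python
from collections import defaultdict

# Assumption: stand-in for the offline LLM extraction output,
# one (passage_id, subject, predicate, object) tuple per triple.
TRIPLES = [
    ("p1", "Alice Chen", "works at", "Acme Labs"),
    ("p2", "Acme Labs", "located in", "Lyon"),
    ("p3", "Bob Ray", "works at", "Zenith Corp"),
    ("p4", "Zenith Corp", "located in", "Oslo"),
]


def build_index(triples):
    """Build an undirected entity graph plus per-node passage pointers."""
    adj = defaultdict(set)
    node_passages = defaultdict(set)
    for pid, subj, _pred, obj in triples:
        adj[subj].add(obj)
        adj[obj].add(subj)
        node_passages[subj].add(pid)
        node_passages[obj].add(pid)
    return adj, node_passages


def personalized_pagerank(adj, seeds, damping=0.5, iterations=50):
    """Power iteration where the restart mass always returns to the seeds."""
    seeds = [s for s in seeds if s in adj]  # ignore concepts missing from the graph
    if not seeds:
        raise ValueError("no query concept matched a graph node")
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in adj}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in adj}
        for node, neighbours in adj.items():
            share = damping * rank[node] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        rank = nxt
    return rank


def rank_passages(rank, node_passages):
    """Score each passage by the summed PPR mass of the nodes it mentions."""
    scores = defaultdict(float)
    for node, pids in node_passages.items():
        for pid in pids:
            scores[pid] += rank[node]
    return sorted(scores, key=scores.get, reverse=True)


adj, node_passages = build_index(TRIPLES)
# Assumption: an LLM extracted this concept from
# "In which city does Alice Chen work?"
rank = personalized_pagerank(adj, seeds={"Alice Chen"})
ranked = rank_passages(rank, node_passages)
print(ranked[:2])
```

In this toy graph the seed mass flows from "Alice Chen" through "Acme Labs" to "Lyon", so both passages on that two-hop chain outrank the unrelated ones after a single PPR run, with no per-hop LLM call. A production index would also need entity normalisation and a way to match query concepts to nodes that are not exact string matches, neither of which is shown here.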

When to use

Queries require multi-hop reasoning across entity-mediated paths in the corpus.
An iterative retrieve-then-reason loop is too expensive or too slow in production.
The corpus has stable enough entities that LLM extraction yields a useful graph.
Single-pass latency is required (multi-hop without per-hop LLM calls).
