Full-Code · Orchestration Frameworks · active

HippoRAG

Type: full-code · Vendor: OSU NLP Group · Language: Python · License: Apache-2.0 · Status: active · Status in practice: experimental


Hippocampus-inspired RAG framework that builds a knowledge graph from documents and uses Personalized PageRank for multi-hop retrieval, replacing naive top-k vector search.

Description. HippoRAG is a research RAG framework from the OSU NLP Group that draws on hippocampal indexing theory: documents are decomposed into entity-relation triples that form a knowledge graph, and retrieval runs Personalized PageRank (PPR) from question entities across that graph. The design targets multi-hop (associative) questions and sense-making over large corpora, where naive dense retrieval tends to fail. HippoRAG 2 (the current implementation) keeps the same PPR core and adds deeper passage integration and more effective online use of an LLM. It is distributed as a Python library and reference implementation accompanying the NeurIPS 2024 HippoRAG paper and the ICML 2025 HippoRAG 2 paper.

Agent loop shape. Two-phase pipeline. Offline indexing: documents are passed through an LLM that extracts OpenIE-style entity-relation triples, which are merged into a persistent knowledge graph (KG) with vector embeddings on nodes. Online retrieval: at query time the system extracts entities from the question, seeds Personalized PageRank from those entity nodes over the KG, and returns the passages associated with the highest-ranked nodes to an LLM reader (the rag_qa step in the Python API).
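The two phases above can be illustrated with a toy, dependency-free sketch. This is not HippoRAG's implementation: the triples, passage ids, function names, damping value, and the rule of scoring a passage by summing its nodes' scores are all assumptions made for illustration, and the LLM extraction step is replaced by hand-written triples.

```python
from collections import defaultdict

# Hypothetical OpenIE-style triples (subject, relation, object), each tagged
# with the id of the passage it was extracted from. In a real pipeline an LLM
# would produce these during offline indexing.
TRIPLES = [
    ("Alice", "works at", "Acme Labs", "p1"),
    ("Acme Labs", "is located in", "Columbus", "p2"),
    ("Bob", "lives in", "Seattle", "p3"),
]


def build_graph(triples):
    """Offline phase: merge triples into an undirected entity graph."""
    neighbors = defaultdict(set)      # entity -> adjacent entities
    node_passages = defaultdict(set)  # entity -> passages that mention it
    for subj, _relation, obj, passage_id in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
        node_passages[subj].add(passage_id)
        node_passages[obj].add(passage_id)
    return neighbors, node_passages


def personalized_pagerank(neighbors, seeds, damping=0.5, iterations=50):
    """Online phase: power iteration that restarts only at the seed nodes."""
    nodes = list(neighbors)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(restart)
    for _ in range(iterations):
        # Every step, (1 - damping) of the mass jumps back to the seeds.
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            # The remaining mass is split evenly among a node's neighbors.
            share = damping * scores[n] / len(neighbors[n])
            for m in neighbors[n]:
                updated[m] += share
        scores = updated
    return scores


def rank_passages(scores, node_passages):
    """Score each passage by summing the PPR scores of the entities in it."""
    totals = defaultdict(float)
    for node, passage_ids in node_passages.items():
        for passage_id in passage_ids:
            totals[passage_id] += scores[node]
    return sorted(totals, key=totals.get, reverse=True)


neighbors, node_passages = build_graph(TRIPLES)
scores = personalized_pagerank(neighbors, seeds={"Alice"})
ranking = rank_passages(scores, node_passages)
```

Seeding at "Alice" for a question such as "In which city does Alice work?" lets probability mass flow through "Acme Labs" to "Columbus", so passage p2 receives a nonzero score even though it never mentions Alice, while the disconnected passage p3 scores zero. That graph hop is what a plain top-k similarity search over passages lacks.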

Primary use cases

multi-hop (associative) question answering over a document corpus
knowledge-graph-augmented retrieval pipelines
sense-making over long / interconnected contexts
research baselines comparing graph retrieval to dense top-k and to other graph-RAG systems (GraphRAG, RAPTOR, LightRAG)

Key concepts

Hippocampal indexing theory → hippocampus-rag (docs) — Cognitive-neuroscience theory that the hippocampus stores sparse pointers (indices) into neocortical patterns, used as the design metaphor for HippoRAG's KG + retrieval split.
OpenIE knowledge graph → graphrag (docs) — Documents are decomposed by an LLM into entity-relation triples that are merged into a persistent knowledge graph with vector embeddings on nodes (the offline indexing phase).
Personalized PageRank retrieval → hippocampus-rag (docs) — At query time, entities extracted from the question seed Personalized PageRank over the KG; top-scoring nodes' associated passages are returned to the reader LLM.
Neocortex / hippocampus analogy → hippocampus-rag (docs) — The framework's design metaphor: the LLM plays the neocortex (pattern completion / language) and the KG + PPR plays the hippocampus (associative pointer index).
Associative / factual / sense-making memory (docs) — HippoRAG 2 explicitly targets three memory regimes: associative (multi-hop), factual, and sense-making (integrating large complex contexts), reporting gains over baseline RAG on each.
HippoRAG Python API (docs) — Library surface: HippoRAG(save_dir, llm_model_name, embedding_model_name).index(docs), .retrieve(queries), .rag_qa(queries) — combined or separate retrieval-and-QA calls.
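The library surface listed above might be exercised roughly as follows. This is a hedged sketch rather than a verified recipe: the import path `hipporag`, the keyword-argument spelling, the model names, and the sample documents are assumptions, and running it would require the library to be installed and credentials for the chosen LLM. The wrapper function `run_hipporag_demo` is hypothetical and exists only to keep the import lazy.

```python
def run_hipporag_demo(save_dir="outputs"):
    """Index a few passages, then retrieve and answer; returns both results."""
    # Imported inside the function so this file loads without the library.
    from hipporag import HippoRAG  # assumed import path

    # Hypothetical corpus: the answer requires linking two passages.
    docs = [
        "Alice works at Acme Labs.",
        "Acme Labs is located in Columbus.",
        "Bob lives in Seattle.",
    ]

    rag = HippoRAG(
        save_dir=save_dir,                         # where the index persists
        llm_model_name="gpt-4o-mini",              # placeholder model name
        embedding_model_name="nvidia/NV-Embed-v2",  # placeholder model name
    )

    # Offline phase: triple extraction and knowledge-graph construction.
    rag.index(docs=docs)

    queries = ["In which city does Alice work?"]

    # Retrieval only, e.g. to inspect which passages the graph walk surfaces.
    retrieved = rag.retrieve(queries=queries)

    # Combined retrieval and answer generation.
    answers = rag.rag_qa(queries=queries)
    return retrieved, answers
```

Keeping retrieve and rag_qa as separate calls makes it possible to evaluate retrieval quality on its own before paying for answer generation.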

Patterns this full-code implements —

References

HippoRAG — Homepage
doc

Provenance

Last analyzed: 2026-05-24
Last updated: 2026-06-17
Verification status: verified