Full-Code · Orchestration Frameworks · active

RAGFlow

Type: full-code  ·  Vendor: InfiniFlow  ·  Language: Python, TypeScript  ·  License: Apache-2.0  ·  Status: active  ·  Status in practice: mature

Open-source RAG engine that pairs DeepDoc, a layout-aware document-understanding parser, with an agentic, graph-orchestrated workflow runtime, MCP support, and extensive citation traceability.

Description. RAGFlow (78k+ GitHub stars, Apache-2.0) is InfiniFlow's open-source RAG engine. Its defining feature is DeepDoc — a layout-aware, template-driven document parser that handles PDF, Word, Excel, slides, scanned copies, structured data, and web pages with format-specific chunking strategies rather than naive splitting. From v0.8 onward RAGFlow became agentic: a graph-based workflow editor lets users compose retrieval + reasoning + tool stages, with MCP support, GraphRAG, and (from late 2025) memory for AI agents. Citations are first-class and traceable to source chunks with visualisations.

Agent loop shape. Document ingestion runs DeepDoc parsing → template-based chunking → embedding → indexing (with optional GraphRAG knowledge-graph construction). At query time, a no-code graph workflow defines the retrieval, rerank, reasoning, and tool stages. Each query enters the graph at the start node and traverses it with branch/loop semantics. MCP exposes retrieval to external agents.
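The query-time traversal described above can be sketched as a tiny graph executor. This is an illustrative toy, not RAGFlow's runtime or API: `Node`, `run_graph`, the stage functions, and the two-entry `CORPUS` are all hypothetical names invented for the example. It shows a query entering at a start node, branching on whether retrieval found anything, and looping back through a query-rewrite stage once before answering.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]

# Hypothetical two-chunk corpus standing in for an index.
CORPUS = {
    "c1": "DeepDoc parses PDF layout before chunking.",
    "c2": "Citations point back to source chunks.",
}

@dataclass
class Node:
    name: str
    run: Callable[[State], State]              # stage logic
    route: Callable[[State], Optional[str]]    # next node name, or None to stop

def run_graph(nodes: Dict[str, Node], start: str, state: State,
              max_steps: int = 20) -> State:
    """Walk the graph from `start`, recording visited nodes in state['trace']."""
    current: Optional[str] = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("step budget exceeded; possible unbounded loop")
        node = nodes[current]
        state = node.run(state)
        state.setdefault("trace", []).append(node.name)
        current = node.route(state)
        steps += 1
    return state

def start_stage(state: State) -> State:
    return {**state, "attempts": 0}

def retrieve(state: State) -> State:
    # Toy retrieval: exact phrase match against each chunk.
    q = state["query"].lower()
    hits = [cid for cid, text in CORPUS.items() if q in text.lower()]
    return {**state, "hits": hits, "attempts": state["attempts"] + 1}

def rewrite(state: State) -> State:
    # Toy query rewrite: broaden by dropping the last word.
    return {**state, "query": " ".join(state["query"].split()[:-1])}

def answer(state: State) -> State:
    return {**state, "citations": state["hits"]}

def after_retrieve(state: State) -> str:
    # Branch: answer if we have hits; otherwise loop through rewrite once.
    can_retry = state["attempts"] < 2 and len(state["query"].split()) > 1
    return "rewrite" if not state["hits"] and can_retry else "answer"

NODES = {
    "start": Node("start", start_stage, lambda s: "retrieve"),
    "retrieve": Node("retrieve", retrieve, after_retrieve),
    "rewrite": Node("rewrite", rewrite, lambda s: "retrieve"),
    "answer": Node("answer", answer, lambda s: None),
}

if __name__ == "__main__":
    result = run_graph(NODES, "start", {"query": "source chunks provenance"})
    print(result["trace"])      # ['start', 'retrieve', 'rewrite', 'retrieve', 'answer']
    print(result["citations"])  # ['c2']
```

The `max_steps` budget is a design choice for the sketch: once a graph allows loops, some bound is needed so a routing bug cannot spin forever.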

Primary use cases

  • enterprise document QA over complex format mixes (PDF, Office, scans, structured data)
  • agentic RAG with no-code workflow composition over retrieval + reasoning + tools
  • high-fidelity citation rendering with chunk-level provenance to source
  • MCP-server integration so external agents can call RAGFlow's retrieval as tools

Key concepts

  • DeepDoc: layout-aware, template-driven parser for complex document formats.
  • Graph workflow editor (pattern: orchestrator-workers): no-code graph, with branches and loops, for composing RAG + agent stages.
  • Template-based chunking: per-format chunking templates instead of naive token splits.
  • Citation rendering (pattern: citation-attribution): traceable chunk-level citations attached to answers.
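As a rough illustration of the last two concepts, the sketch below dispatches to a per-format splitter instead of a fixed-size split, and keeps a document id and chunk index on every chunk so an answer can cite its source. All names here (`Chunk`, `TEMPLATES`, `chunk_document`, `cite`, and the template keys "paper" and "table") are hypothetical and chosen for the example; they are not RAGFlow's API or its actual template set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Chunk:
    doc_id: str   # which document the text came from
    index: int    # position of the chunk within that document
    text: str

def chunk_by_paragraph(text: str) -> List[str]:
    """Prose template: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_table_rows(text: str) -> List[str]:
    """Table template: one chunk per row, each repeating the header row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header, *rows = lines
    return [f"{header}\n{row}" for row in rows]

def chunk_fixed(text: str, size: int = 40) -> List[str]:
    """Naive fallback: fixed-width character windows."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Hypothetical template registry keyed by a format label.
TEMPLATES: Dict[str, Callable[[str], List[str]]] = {
    "paper": chunk_by_paragraph,
    "table": chunk_table_rows,
}

def chunk_document(doc_id: str, text: str, template: str) -> List[Chunk]:
    """Split with the named template, falling back to the naive splitter."""
    splitter = TEMPLATES.get(template, chunk_fixed)
    return [Chunk(doc_id, i, piece) for i, piece in enumerate(splitter(text))]

def cite(answer: str, chunks: List[Chunk]) -> str:
    """Append chunk-level references of the form [doc_id#index]."""
    refs = ", ".join(f"[{c.doc_id}#{c.index}]" for c in chunks)
    return f"{answer} {refs}"

if __name__ == "__main__":
    chunks = chunk_document("prices.csv", "item,price\npen,2\nink,5", "table")
    print(cite("Ink costs 5.", [chunks[1]]))  # Ink costs 5. [prices.csv#1]
```

Repeating the header in each row chunk is the point of the table template in this sketch: a fixed-width split could separate "ink,5" from the column names that give it meaning.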
