Framework · Workflow Engines

Pathway

Pathway is a Python and Rust framework that builds streaming data and retrieval-augmented generation (RAG) pipelines on a differential-dataflow engine, keeping vector indexes synchronised with their sources as data changes.

Description

Pathway runs the same pipeline code over batch and streaming data, with a Rust engine that recomputes results incrementally as inputs change. Its LLM extension provides parsers, embedders, splitters, and an in-memory real-time vector index, plus document-store components that re-index automatically on new data. Teams use it to build retrieval-augmented generation pipelines whose indexes stay current without a separate batch re-indexing job. The framework integrates with LlamaIndex and LangChain.
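The idea of running one pipeline over both batch and streaming input can be illustrated with a small sketch. It is plain Python and does not use Pathway's API; the `IncrementalSum` class and the `(key, value, diff)` change format are assumptions chosen for illustration, loosely modelled on how differential dataflow represents inserts and retractions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# A change is (key, value, diff): diff=+1 inserts a row, diff=-1 retracts it.
# This format is an illustrative assumption, not Pathway's wire format.
Change = Tuple[str, int, int]


class IncrementalSum:
    """Maintains per-key sums by applying deltas instead of rescanning all rows."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = defaultdict(int)
        self.counts: Dict[str, int] = defaultdict(int)

    def apply(self, changes: Iterable[Change]) -> None:
        for key, value, diff in changes:
            self.totals[key] += value * diff
            self.counts[key] += diff
            # Drop the key once every row that contributed to it is retracted.
            if self.counts[key] == 0:
                del self.totals[key]
                del self.counts[key]

    def result(self) -> Dict[str, int]:
        return dict(self.totals)


rows = [("a", 3, +1), ("b", 5, +1), ("a", 4, +1), ("b", 5, -1)]

# Batch mode: all changes arrive at once.
batch = IncrementalSum()
batch.apply(rows)

# Streaming mode: the same logic receives one change at a time.
stream = IncrementalSum()
for change in rows:
    stream.apply([change])
```

Both modes end with the same result because the operator only ever sees deltas; whether they arrive in one batch or as a stream does not change the logic.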

Solution

Pathway is a data-pipeline runtime rather than an agent loop. Connectors ingest documents and change events; the differential-dataflow engine parses, chunks, embeds, and indexes them incrementally; and a vector or document store serves retrieval queries. A question-answering layer retrieves the top documents for a query and passes them to an LLM, with the index kept in sync as sources change.
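The ingest, chunk, embed, index, and retrieve flow described above can be sketched in a few lines. This is a toy model in plain Python, not Pathway's API: `LiveIndex`, `chunk`, `embed`, and `cosine` are hypothetical names, and the bag-of-words "embedding" stands in for a real embedding model. The point it shows is that a change event touches only the affected document, while queries always see the current state.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def chunk(text: str, size: int = 8) -> List[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class LiveIndex:
    """Hypothetical in-memory index kept in sync by per-document change events."""

    def __init__(self) -> None:
        # doc_id -> list of (chunk_text, vector)
        self.entries: Dict[str, List[Tuple[str, Counter]]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # Only the changed document is re-chunked and re-embedded.
        self.entries[doc_id] = [(c, embed(c)) for c in chunk(text)]

    def delete(self, doc_id: str) -> None:
        self.entries.pop(doc_id, None)

    def query(self, question: str, k: int = 2) -> List[Tuple[str, str]]:
        """Return up to k (doc_id, chunk_text) pairs with non-zero similarity."""
        q = embed(question)
        scored = [
            (cosine(q, vec), doc_id, text)
            for doc_id, chunks in self.entries.items()
            for text, vec in chunks
        ]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [(doc_id, text) for score, doc_id, text in scored[:k] if score > 0]


index = LiveIndex()
index.upsert("pricing.md", "The basic plan costs ten dollars per month")
index.upsert("faq.md", "Refunds are processed within five days")
before = index.query("how much does the basic plan cost", k=1)

# A source document changes; the next query reflects it with no batch re-index.
index.upsert("pricing.md", "The basic plan costs twelve dollars per month")
after = index.query("how much does the basic plan cost", k=1)
```

In a question-answering layer, the chunks returned by `query` would be placed into the LLM prompt as context; that step is omitted here.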

Primary use cases

  • real-time retrieval-augmented generation pipelines
  • live document indexing and vector search
  • incremental ETL over streaming and batch data
  • continuously synchronised knowledge bases for agents
