Pathway
Pathway is a Python framework with a Rust engine for building streaming data and RAG pipelines on differential dataflow, keeping vector indexes synchronised with their sources as data changes.
Description
Pathway runs the same pipeline code over batch and streaming data, with a Rust engine that recomputes results incrementally as inputs change. Its LLM extension provides parsers, embedders, splitters, and an in-memory real-time vector index, plus document-store components that re-index automatically on new data. Teams use it to build retrieval-augmented generation pipelines whose indexes stay current without a separate batch re-indexing job. The framework integrates with LlamaIndex and LangChain.
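The idea of incremental recomputation can be illustrated with a minimal sketch in plain Python. It does not use Pathway's API: `apply_deltas` and `batch_count` are hypothetical helper names, and the `(word, diff)` change format is an assumption chosen to mimic how a differential-dataflow engine represents insertions (+1) and retractions (-1). The point is only that applying changes one at a time reaches the same result as recomputing from scratch over the final data.

```python
from collections import Counter


def batch_count(words):
    """Batch mode: compute word counts from the complete input at once."""
    return Counter(words)


def apply_deltas(counts, deltas):
    """Streaming mode: update existing counts from (word, diff) changes.

    diff is +1 for an inserted word and -1 for a retracted one. Only the
    keys named in the deltas are touched; nothing else is recomputed.
    """
    for word, diff in deltas:
        counts[word] += diff
        if counts[word] == 0:
            del counts[word]  # drop keys whose count has been fully retracted
    return counts


# Two waves of changes arrive over time.
state = Counter()
apply_deltas(state, [("rag", 1), ("index", 1), ("rag", 1)])
apply_deltas(state, [("index", -1), ("stream", 1)])

# The incrementally maintained state matches a batch run over the final data.
final_input = ["rag", "rag", "stream"]
same_result = state == batch_count(final_input)
```

A real engine generalises this from a single counter to whole dataflow graphs (joins, group-bys, indexes), but the contract is the same: outputs are updated from input changes rather than rebuilt.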
Solution
Pathway is a data-pipeline runtime rather than an agent loop. Connectors ingest documents and change events; the differential-dataflow engine parses, chunks, embeds, and indexes them incrementally; and a vector or document store serves retrieval queries. A question-answering layer retrieves the top documents for a query and passes them to an LLM, with the index kept in sync as sources change.
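The flow described above (change events in, chunk and embed, index, retrieve, hand context to an LLM) can be sketched in plain Python. This is a toy illustration under stated assumptions, not Pathway's API: `LiveIndex`, `embed`, `chunk`, `build_prompt`, and the event dictionary shape are all hypothetical, the embedder is a bag-of-words stand-in for a real embedding model, and no LLM is actually called.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedder (assumption): unit-normalised bag-of-words vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def chunk(text, size=8):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class LiveIndex:
    """In-memory index kept in sync with per-document change events."""

    def __init__(self):
        self.vectors = {}  # (doc_id, chunk_no) -> (chunk_text, vector)

    def apply(self, event):
        """Apply {"op": "upsert" | "delete", "id": ..., "text": ...}."""
        doc_id = event["id"]
        # Retract the old chunks of this document only; others are untouched.
        for key in [k for k in self.vectors if k[0] == doc_id]:
            del self.vectors[key]
        if event["op"] == "upsert":
            for n, piece in enumerate(chunk(event["text"])):
                self.vectors[(doc_id, n)] = (piece, embed(piece))

    def retrieve(self, query, k=2):
        """Return up to k (key, chunk_text) pairs ranked by cosine similarity."""
        q = embed(query)
        scored = []
        for key, (piece, vec) in self.vectors.items():
            score = sum(w * vec.get(tok, 0.0) for tok, w in q.items())
            scored.append((score, key, piece))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(key, piece) for score, key, piece in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Assemble the text a question-answering layer would send to an LLM."""
    context = "\n".join(piece for _, piece in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


index = LiveIndex()
index.apply({"op": "upsert", "id": "a", "text": "Pathway keeps vector indexes in sync"})
index.apply({"op": "upsert", "id": "b", "text": "Bananas are yellow"})
hits_before = index.retrieve("vector indexes", k=1)
prompt = build_prompt("What stays in sync?", hits_before)

# The source document changes: its old chunks are retracted, new ones indexed.
index.apply({"op": "upsert", "id": "a", "text": "Cooking pasta takes ten minutes"})
hits_after = index.retrieve("vector indexes", k=1)
```

After the second upsert for document `a`, the stale chunk no longer matches the query, which is the behaviour the paragraph above describes: retrieval always reflects the current state of the sources, with no separate re-indexing step.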
Primary use cases
- real-time retrieval-augmented generation pipelines
- live document indexing and vector search
- incremental ETL over streaming and batch data
- continuously synchronised knowledge bases for agents