Streamkap
Type: low-code · Vendor: Streamkap · Language: proprietary · License: proprietary · Status: active · Status in practice: emerging · First released: 2022
Streamkap is a managed platform that streams change-data-capture events from databases into destinations such as vector stores in real time so downstream embeddings stay in sync with the source.
Description. Streamkap connects databases, applications, and warehouses through change-data-capture and event streaming, moving data from source to destination with sub-50ms latency. For AI workloads it streams each insert, update, and delete from the source into a vector database, triggering an embedding update so retrieval data does not go stale. It processes only the rows that changed rather than re-embedding the whole dataset in batch.
Agent loop shape. Streamkap is a streaming-pipeline service, not an agent loop. It captures change events from a source database, optionally transforms them in flight, and delivers them to a sink. When the sink is a vector store, each captured change drives an embedding update, so the index served to a downstream agent or RAG system reflects the source as it changes.
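As an illustration of that shape, here is a minimal sketch of how a sink might apply captured changes to a vector index. It is not Streamkap's API: `ChangeEvent`, `toy_embed`, and `apply_change` are hypothetical names, the index is a plain dictionary, and the embedding function is a deterministic stand-in for a real model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ChangeEvent:
    """Hypothetical shape of one captured change (not Streamkap's schema)."""
    op: str               # "insert", "update", or "delete"
    key: str              # primary key of the source row
    text: Optional[str]   # row content to embed; None for deletes


def toy_embed(text: str) -> List[float]:
    # Stand-in for a real embedding model: a tiny deterministic vector.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def apply_change(index: Dict[str, List[float]], event: ChangeEvent) -> None:
    """Apply one change event so the index mirrors the source row."""
    if event.op == "delete":
        index.pop(event.key, None)          # drop the vector for a deleted row
    elif event.op in ("insert", "update"):
        index[event.key] = toy_embed(event.text or "")  # upsert the new vector
    else:
        raise ValueError(f"unknown op: {event.op}")


# Assumed sample stream: two inserts, one update, one delete.
index: Dict[str, List[float]] = {}
events = [
    ChangeEvent("insert", "1", "red shoes"),
    ChangeEvent("insert", "2", "blue hat"),
    ChangeEvent("update", "1", "red running shoes"),
    ChangeEvent("delete", "2", None),
]
for e in events:
    apply_change(index, e)
```

After the stream is replayed, the index holds only row `1`, embedded from its latest text, which is the property a downstream RAG system relies on.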
Primary use cases
- real-time CDC pipelines from databases to vector stores
- keeping embeddings and RAG context current with source data
- streaming ETL into warehouses and applications
- incremental embedding updates for recommendation and agent context
Key concepts
- CDC engine → cdc-vector-sync (docs) — Captures every INSERT, UPDATE, and DELETE from the source database's transaction log so downstream consumers see each change as an event rather than re-scanning the table.
- Embedding service → streaming-feature-pipeline (docs) — A pipeline stage that consumes CDC events and decides whether to generate a new embedding, so embedding cost is proportional to the rate of change rather than dataset size.
- Sources and destinations → pipes-and-filters (docs) — Streamkap connects a variety of sources to destinations and streams data between them with sub-second latency; pipelines are configured rather than coded.
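The embedding-service concept above says the stage decides whether a change needs a new embedding, but not how. One plausible approach, sketched here purely as an assumption, is to hash only the fields that feed the embedding and skip events where that hash is unchanged. `EmbeddingGate`, `content_hash`, and the field names are hypothetical.

```python
import hashlib
from typing import Dict, Iterable


def content_hash(row: dict, embedded_fields: Iterable[str]) -> str:
    """Hash only the fields that feed the embedding (assumed design)."""
    joined = "\x1f".join(str(row.get(f, "")) for f in embedded_fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class EmbeddingGate:
    """Tracks the last embedded content per key and counts embedding calls."""

    def __init__(self, embedded_fields: Iterable[str]) -> None:
        self.embedded_fields = tuple(embedded_fields)
        self.seen: Dict[str, str] = {}
        self.embed_calls = 0

    def should_embed(self, key: str, row: dict) -> bool:
        h = content_hash(row, self.embedded_fields)
        if self.seen.get(key) == h:
            return False            # embedded content unchanged: skip the model call
        self.seen[key] = h
        self.embed_calls += 1       # a real service would call the model here
        return True


gate = EmbeddingGate(["title", "description"])
results = [
    gate.should_embed("1", {"title": "Shoes", "description": "Red", "stock": 5}),
    gate.should_embed("1", {"title": "Shoes", "description": "Red", "stock": 4}),
    gate.should_embed("1", {"title": "Shoes", "description": "Crimson", "stock": 4}),
]
```

In this sketch the stock-only update is skipped, so three change events cost two embedding calls. That is one way embedding cost could track the rate of meaningful change rather than dataset size.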
Patterns this low-code platform implements
- ★★CDC-Driven Vector Sync
Uses change-data-capture so the source database stays the single writer; every insert/update/delete is streamed in real time and triggers an embedding update that keeps the vector store in sync.
- ★Streaming Feature Pipeline
Rather than re-embedding the full dataset on a batch schedule, Streamkap streams each source change as it happens and processes only what actually changed, building the embedding feature incrementall…
- ★★Pipes and Filters
The vector-sync path is a staged stream pipeline - source DB to a CDC engine to Kafka to an embedding service to the vector DB - where each stage consumes the previous stage's events and emits to the…
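The staged path described for Pipes and Filters can be sketched with Python generators, where each stage consumes the previous stage's output. This is an illustrative model only: the stage functions are hypothetical, a `deque` stands in for Kafka, a dictionary stands in for the vector DB, and the embedding is a toy one-dimensional vector.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, List, Tuple

Change = Tuple[str, str, str]          # (op, key, text); assumed event shape
Record = Tuple[str, str, List[float]]  # (op, key, vector)


def cdc_stage(log: Iterable[Change]) -> Iterator[Change]:
    """Filter 1: read changes from a (simulated) transaction log."""
    for change in log:
        yield change


def broker_stage(changes: Iterable[Change]) -> Iterator[Change]:
    """Filter 2: in-memory buffer standing in for a durable topic."""
    buffer: deque = deque()
    for change in changes:
        buffer.append(change)
        yield buffer.popleft()


def embed_stage(changes: Iterable[Change]) -> Iterator[Record]:
    """Filter 3: attach a toy embedding; deletes carry an empty vector."""
    for op, key, text in changes:
        vector = [] if op == "delete" else [float(len(text))]
        yield op, key, vector


def sink_stage(records: Iterable[Record], store: Dict[str, List[float]]) -> None:
    """Sink: upsert or delete in the (simulated) vector store."""
    for op, key, vector in records:
        if op == "delete":
            store.pop(key, None)
        else:
            store[key] = vector


log = [
    ("insert", "a", "hello"),
    ("update", "a", "hello world"),
    ("insert", "b", "bye"),
    ("delete", "b", ""),
]
store: Dict[str, List[float]] = {}
sink_stage(embed_stage(broker_stage(cdc_stage(log))), store)
```

Because each stage only sees an iterator of events, any one of them can be swapped out without touching the others, which is the main appeal of the pattern.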