Structure & Data

FTI LLM Pipeline Split

Decompose an LLM/RAG system into three independently deployable pipelines (feature, training, inference) that communicate only via a feature store and a model registry.

Problem

A monolithic LLM application makes every change touch every team. Re-embedding the corpus requires a deploy that the inference path also has to take. Changing the SFT recipe ties retraining to the inference release cycle. Serving SLOs are held hostage by data-pipeline failures. Without a clean decomposition along the feature/training/inference (F/T/I) axes, teams step on each other and versioning becomes incoherent: it is no longer clear which embeddings, model weights, and serving code belong together.

Solution

Define three pipelines:

Feature pipeline: ingests raw documents, cleans, chunks, and embeds them, then writes the results to the feature store (typically a vector DB plus a document store).
Training pipeline: reads features from the store, fine-tunes (SFT, DPO), and writes models to the model registry.
Inference pipeline: reads from the feature store at request time, loads the model from the registry, and generates.

The pipelines communicate only through these two integration surfaces, the feature store and the model registry; no direct code or service calls cross pipeline boundaries. Each pipeline deploys on its own cadence.
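The split can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `FeatureStore` and `ModelRegistry` are hypothetical in-memory stand-ins for a real vector DB and registry, `embed` is a toy placeholder for an embedding model, and the "training" step only records metadata instead of fine-tuning. What it shows is the shape of the contract: each pipeline function touches only the two shared surfaces and never calls another pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureStore:
    """Hypothetical in-memory stand-in for a vector DB plus document store."""
    rows: dict[str, tuple[str, list[float]]] = field(default_factory=dict)

    def write(self, chunk_id: str, text: str, embedding: list[float]) -> None:
        self.rows[chunk_id] = (text, embedding)

    def read_all(self) -> list[tuple[str, list[float]]]:
        return list(self.rows.values())


@dataclass
class ModelRegistry:
    """Hypothetical in-memory stand-in for a model registry."""
    versions: dict[str, dict] = field(default_factory=dict)

    def register(self, version: str, artifact: dict) -> None:
        self.versions[version] = artifact

    def load(self, version: str) -> dict:
        return self.versions[version]


def embed(text: str) -> list[float]:
    # Toy embedding (vowel counts); a placeholder for a real embedding model.
    return [float(text.lower().count(v)) for v in "aeiou"]


def feature_pipeline(raw_docs: dict[str, str], store: FeatureStore,
                     chunk_size: int = 40) -> None:
    """Ingest, clean, chunk, embed, and write to the feature store."""
    for doc_id, doc in raw_docs.items():
        cleaned = " ".join(doc.split())  # collapse runs of whitespace
        for start in range(0, len(cleaned), chunk_size):
            chunk = cleaned[start:start + chunk_size]
            store.write(f"{doc_id}:{start // chunk_size}", chunk, embed(chunk))


def training_pipeline(store: FeatureStore, registry: ModelRegistry,
                      version: str) -> None:
    """Read features, 'train', and write the result to the registry."""
    rows = store.read_all()
    # Assumption: a real pipeline would run SFT/DPO here; this only records
    # how many chunks the model version was built from.
    registry.register(version, {"trained_on_chunks": len(rows)})


def inference_pipeline(query: str, store: FeatureStore,
                       registry: ModelRegistry, version: str) -> str:
    """Load a model version, retrieve the nearest chunk, and 'generate'."""
    model = registry.load(version)
    q = embed(query)
    # Nearest chunk by squared Euclidean distance to the query embedding.
    best_text, _ = min(
        store.read_all(),
        key=lambda row: sum((a - b) ** 2 for a, b in zip(q, row[1])),
    )
    return f"[model {version}, {model['trained_on_chunks']} chunks] {best_text}"


if __name__ == "__main__":
    store, registry = FeatureStore(), ModelRegistry()
    feature_pipeline({"doc1": "The feature store holds embeddings."}, store)
    training_pipeline(store, registry, "v1")
    print(inference_pipeline("embeddings", store, registry, "v1"))
```

Because the three functions share only `store` and `registry`, any one of them could be moved to its own service and release schedule without changing the other two, provided the stored row format and the registry artifact format stay compatible.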

When to use

Feature, training, and inference have materially different cadences and ownership.
MLOps tooling (feature store, model registry) is available or worth standing up.
Independent deploys have real value to the organisation.
