Methodology · LLM-App Engineering

FTI Pipeline Architecture

Split a machine-learning or LLM system into three separate pipelines, joined only by a feature store and a model registry, so each one can scale, be swapped out, and be owned on its own.

Description

The FTI architecture divides any machine-learning or LLM system into three pipelines: Feature, Training, and Inference. They connect only through a feature store and a model registry. Each pipeline runs on its own schedule, has its own dependencies, and scales independently. The pipelines hand work to each other through versioned artefacts, never through direct calls: the feature pipeline writes features to the feature store, the training pipeline reads those features and publishes models to the model registry, and the inference pipeline reads from both. Because of this decoupling, each pipeline can use the tools that fit it, for example Spark for features, GPU jobs for training, and a low-latency serving stack for inference. It also lets different teams own different pipelines.
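
As a rough illustration of the hand-off through versioned artefacts, the sketch below wires three pipeline functions together using only two stores. Everything in it is hypothetical: `VersionedStore` is an in-memory stand-in for a real feature store or model registry, and the names `ctr_features` and `ctr_model` and the threshold "model" are invented for the example.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any, Dict, List, Tuple


@dataclass
class VersionedStore:
    """Hypothetical in-memory stand-in for a feature store or a model registry."""

    _items: Dict[Tuple[str, int], Any] = field(default_factory=dict)
    _latest: Dict[str, int] = field(default_factory=dict)

    def put(self, name: str, artefact: Any) -> int:
        # Every write creates a new immutable version; nothing is overwritten.
        version = self._latest.get(name, 0) + 1
        self._items[(name, version)] = artefact
        self._latest[name] = version
        return version

    def get(self, name: str, version: int) -> Any:
        return self._items[(name, version)]


def feature_pipeline(raw: List[dict], feature_store: VersionedStore) -> int:
    """Turn raw events into features and publish them as a new version."""
    rows = [
        {"x": r["clicks"] / max(r["views"], 1), "y": r["label"]}
        for r in raw
    ]
    return feature_store.put("ctr_features", rows)


def training_pipeline(
    feature_store: VersionedStore,
    model_registry: VersionedStore,
    feature_version: int,
) -> int:
    """Read one pinned feature version and publish a model version."""
    rows = feature_store.get("ctr_features", feature_version)
    positives = [r["x"] for r in rows if r["y"] == 1]
    negatives = [r["x"] for r in rows if r["y"] == 0]
    # Toy "model": a threshold halfway between the two class means.
    model = {
        "threshold": (mean(positives) + mean(negatives)) / 2,
        "feature_version": feature_version,  # lineage back to the features
    }
    return model_registry.put("ctr_model", model)


def inference_pipeline(
    model_registry: VersionedStore, model_version: int, x: float
) -> int:
    """Load a pinned model version and score one feature value."""
    model = model_registry.get("ctr_model", model_version)
    return int(x >= model["threshold"])
```

None of the three functions calls another. Each one only reads or writes a named, versioned artefact, so in principle each could run as a separate job on its own schedule.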

When to apply

Use this when you design any production machine-learning or LLM system that has to run beyond a notebook, such as recommendation, search, classification, retrieval-augmented generation, or another LLM application. It pays off most when different teams own different parts, when features and training need to scale at different rates, or when serving has stricter latency requirements than training. Do not apply it to a single-team prototype that fits in one script. The three-way split is just overhead until you hit at least two of the three problems it solves: separate scale, separate ownership, and separate lifecycles.

What it involves

Name the three pipelines
Define the feature contract
Build the feature pipeline
Build the training pipeline
Build the inference pipeline
Operate each pipeline independently
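
The "feature contract" step could be sketched as follows. This is only one possible reading of the step, since the page does not spell out what the contract contains. `FeatureContract`, `write_features`, `is_compatible`, and the `user_features` columns are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FeatureContract:
    """Hypothetical contract: a named, versioned set of typed columns."""

    name: str
    version: int
    columns: Dict[str, type]

    @property
    def key(self) -> Tuple[str, int]:
        return (self.name, self.version)

    def validate(self, row: dict) -> None:
        # Reject rows that lack a column or carry the wrong type.
        missing = set(self.columns) - set(row)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        for col, expected in self.columns.items():
            if not isinstance(row[col], expected):
                raise TypeError(
                    f"{col}: expected {expected.__name__}, "
                    f"got {type(row[col]).__name__}"
                )


def write_features(rows: List[dict], contract: FeatureContract) -> dict:
    """Feature-pipeline side: validate every row before publishing."""
    for row in rows:
        contract.validate(row)
    return {"contract": contract.key, "rows": rows}


def is_compatible(model_meta: dict, contract: FeatureContract) -> bool:
    """Inference side: serve only if the model was trained on this contract."""
    return model_meta["contract"] == contract.key
```

Under this assumption the contract is the one thing all three pipelines share. The feature pipeline enforces it on write, the training pipeline records it alongside the model, and the inference pipeline checks it before serving.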
