Methodology · LLM-App Engineering · proven · verified

FTI Pipeline Architecture

Also known as: feature-training-inference architecture, three-pipeline ML architecture

Applies to: llm-app, rag-system, agent

Tags: fti, pipeline, feature-store, model-registry

Split any machine-learning or LLM system into three separate pipelines: Feature, Training, and Inference. They connect only through a feature store and a model registry. Each pipeline runs on its own schedule, has its own dependencies, and scales on its own. They hand work to each other through versioned artefacts, never through direct calls. This lets each pipeline use the tools that fit it. You can use Spark for features, GPU jobs for training, and fast serving for inference. It also lets different teams own different pipelines.
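A minimal sketch of the shape described above, assuming in-memory stand-ins for the two stores. `FeatureStore`, `ModelRegistry`, the feature names, and the threshold "model" are all hypothetical and chosen only for illustration. The point is that the three pipeline functions never call each other; they only read and write the stores.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Model = Callable[[Features], float]


@dataclass
class FeatureStore:
    """Hypothetical in-memory stand-in for a real feature store."""
    rows: Dict[str, Features] = field(default_factory=dict)

    def write(self, entity_id: str, features: Features) -> None:
        self.rows[entity_id] = dict(features)

    def read(self, entity_id: str) -> Features:
        return self.rows[entity_id]

    def all_rows(self) -> List[Features]:
        return list(self.rows.values())


@dataclass
class ModelRegistry:
    """Hypothetical in-memory stand-in for a real model registry."""
    models: Dict[int, Model] = field(default_factory=dict)

    def publish(self, model: Model) -> int:
        version = len(self.models) + 1
        self.models[version] = model
        return version

    def load(self, version: int) -> Model:
        return self.models[version]


def feature_pipeline(raw_events: List[Tuple[str, float]], store: FeatureStore) -> None:
    """Aggregate raw (user, minutes) events into per-user features."""
    per_user: Dict[str, List[float]] = {}
    for user, minutes in raw_events:
        per_user.setdefault(user, []).append(minutes)
    for user, values in per_user.items():
        store.write(user, {
            "watch_count": float(len(values)),
            "avg_minutes": sum(values) / len(values),
        })


def training_pipeline(store: FeatureStore, registry: ModelRegistry) -> int:
    """Fit a trivial threshold 'model' from stored features and publish it."""
    averages = [row["avg_minutes"] for row in store.all_rows()]
    threshold = sum(averages) / len(averages)

    def model(features: Features) -> float:
        return 1.0 if features["avg_minutes"] >= threshold else 0.0

    return registry.publish(model)


def inference_pipeline(user: str, store: FeatureStore,
                       registry: ModelRegistry, version: int) -> float:
    """Serve one prediction using only the registry and the feature store."""
    return registry.load(version)(store.read(user))


store, registry = FeatureStore(), ModelRegistry()
feature_pipeline([("alice", 30.0), ("alice", 50.0), ("bob", 10.0)], store)
version = training_pipeline(store, registry)
prediction = inference_pipeline("alice", store, registry, version)
```

In a real system each function would be its own deployable with its own runtime and schedule; the single script here only makes the data flow visible.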

Methodology process overview

graph LR
  raw[Raw data sources] --> feat[Feature pipeline]
  feat --> fs[(Feature store)]
  fs --> train[Training pipeline]
  train --> reg[(Model registry)]
  reg --> inf[Inference pipeline]
  fs --> inf
  inf --> pred[Predictions to users]
  schema[Feature schema contract] -.-> fs
  modelmeta[Model metadata contract] -.-> reg
  monitor[Independent monitoring per pipeline] -.-> feat
  monitor -.-> train
  monitor -.-> inf

Intent. Split a machine-learning or LLM system into three separate pipelines, joined only by a feature store and a model registry, so each one can scale, be swapped out, and be owned on its own.

When to apply. Use this when you design any real machine-learning or LLM system that has to run beyond a notebook, such as recommendation, search, retrieval-augmented generation, an LLM application, or classification. It pays off most when different teams own different parts, when features and training need to scale at different rates, or when serving must be faster than training. Do not apply it to a single-team prototype that fits in one script. The three-way split is just overhead until you hit at least two of the three problems it solves: separate scale, separate ownership, and separate lifecycles.
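The two-of-three rule above can be written down as a small check. This is only a sketch of the stated heuristic; the function name and parameter names are hypothetical.

```python
def should_split(separate_scale: bool,
                 separate_ownership: bool,
                 separate_lifecycles: bool) -> bool:
    """Return True when at least two of the three problems are present."""
    return sum([separate_scale, separate_ownership, separate_lifecycles]) >= 2


# A single-team prototype with only a scaling concern stays in one script.
prototype = should_split(True, False, False)
# Separate owners and separate lifecycles justify the split.
multi_team = should_split(False, True, True)
```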

Example scenario

A streaming-service recommendations team rebuilt their single monolithic recommender into the three-pipeline shape. The trigger was an incident: a feature-engineering bug quietly degraded both training and inference at the same time for nine hours.

The feature pipeline became a Spark job running every fifteen minutes on the Kafka event stream. It wrote user-watch-history and content-embedding features to Feast for offline use and Redis for online use. Training became a separate Kubeflow pipeline running nightly on a GPU pool. It read from Feast offline and wrote model artefacts plus eval scores to MLflow. Inference was a low-latency Go service that read online features from Redis and loaded the model tagged production from MLflow.

The payoff came three months later when the feature pipeline needed a Spark version upgrade. The platform team upgraded Spark without touching training or inference, because the feature-store schema was the only contract. At the same time the training team switched from XGBoost to a transformer-based ranker. They just published a new model to the registry with the same input schema. Inference picked it up through tag promotion, with zero code change.

The lesson the team kept: the split turned what used to be cross-team coordination meetings into two contract reviews per quarter. The cost was the up-front feature-store and registry work. The saving was every change after that.
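The tag-promotion step in this scenario can be illustrated with a small sketch. `TaggedRegistry` is a hypothetical in-memory class, not MLflow's API, and the two lambdas are stand-ins for the old and new rankers. It shows why the serving code does not change: it resolves a tag at request time and never names a concrete version.

```python
from typing import Callable, Dict

Ranker = Callable[[float], float]


class TaggedRegistry:
    """Hypothetical registry: immutable versions plus movable tags."""

    def __init__(self) -> None:
        self._versions: Dict[int, Ranker] = {}
        self._tags: Dict[str, int] = {}

    def publish(self, model: Ranker) -> int:
        version = len(self._versions) + 1
        self._versions[version] = model
        return version

    def promote(self, tag: str, version: int) -> None:
        if version not in self._versions:
            raise KeyError(f"unknown model version {version}")
        self._tags[tag] = version

    def load(self, tag: str) -> Ranker:
        return self._versions[self._tags[tag]]


def serve(registry: TaggedRegistry, watch_minutes: float) -> float:
    """Inference code: knows only the tag name, never a version number."""
    return registry.load("production")(watch_minutes)


registry = TaggedRegistry()
v1 = registry.publish(lambda minutes: minutes * 0.5)  # stand-in for the old ranker
registry.promote("production", v1)
before = serve(registry, 10.0)

v2 = registry.publish(lambda minutes: minutes * 2.0)  # stand-in for the new ranker
still_old = serve(registry, 10.0)  # publishing alone changes nothing in serving

registry.promote("production", v2)
after = serve(registry, 10.0)  # same serve() code, new model
```

Publishing and promoting are deliberately separate calls, so a new model can sit in the registry and be evaluated before any traffic reaches it.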

Inputs

Raw data sources — Your source-of-truth data, such as events, documents, telemetry, and transactional databases, that the feature pipeline will transform.
Feature store choice — The feature store you picked, either offline plus online or a single store. It becomes the contract between the feature pipeline and the pipelines that read from it.
Model registry choice — The registry you picked, such as Comet, MLflow, or SageMaker. It holds versioned trained models with their metadata and lineage.

Outputs

Feature pipeline — One or more standalone jobs that read raw data, compute features, and write them to the feature store on a set schedule.
Training pipeline — Jobs that read features and labels from the feature store, train, evaluate, and publish to the model registry.
Inference pipeline — One or more services that load a registered model and serve predictions. They read any features needed at request time from the feature store.

Steps (6)

Name the three pipelines
Write down feature, training, and inference as three separate components. Each gets its own owner, its own schedule, and its own failure modes. Commit to this before any code lands.
Uses: FTI LLM Pipeline Split, Pipeline Triad Pattern
Define the feature contract
Write the feature schema in the feature store. Producers and consumers agree on this contract only, not on the pipelines on either side. Treat any schema change as a versioned event.
Uses: Streaming Feature Pipeline
Build the feature pipeline
Read the raw sources, compute features, and write them to the feature store. Pick the runtime that fits, such as Spark, Flink, or a change-data-capture stream. Do not tie it to the training pipeline's runtime.
Build the training pipeline
Read features and labels from the store. Train, evaluate, and publish the model to the registry with versioned metadata: data version, code version, and test scores. The registry entry is the only thing the inference pipeline sees.
Build the inference pipeline
Load the model from the registry by version tag. Read request-time features from the feature store, usually the online one. Serve predictions. Promote new model versions through the registry, not by redeploying code.
Operate each pipeline independently
Each pipeline scales, schedules, alerts, and retries on its own. Feature lag, a training failure, and slow inference are three separate incidents with three separate owners, not one vague 'the ML system is broken' incident.
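The two contracts from the steps above can be sketched as plain data records. Everything here is an assumption for illustration: the class names, the column names, and the placeholder metadata values are hypothetical, and a real feature store or registry would supply its own schema and metadata mechanisms. The sketch validates rows against a versioned feature schema on write, and lets the inference side check that a registered model was trained against the schema version it is about to read.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FeatureSchema:
    """Hypothetical feature contract: a named, versioned set of typed columns."""
    name: str
    version: int
    columns: Mapping[str, type]

    def validate(self, row: Mapping[str, object]) -> None:
        missing = set(self.columns) - set(row)
        extra = set(row) - set(self.columns)
        if missing or extra:
            raise ValueError(
                f"schema mismatch: missing={sorted(missing)} extra={sorted(extra)}"
            )
        for key, expected in self.columns.items():
            if not isinstance(row[key], expected):
                raise TypeError(f"{key} should be {expected.__name__}")


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical registry entry carrying the metadata named in the steps."""
    model_version: int
    feature_schema: str
    feature_schema_version: int
    data_version: str
    code_version: str
    test_scores: Mapping[str, float]


def check_compatible(entry: ModelEntry, schema: FeatureSchema) -> bool:
    """True when the model was trained against exactly this schema version."""
    return (entry.feature_schema, entry.feature_schema_version) == (
        schema.name,
        schema.version,
    )


schema = FeatureSchema(
    name="user_watch_history",
    version=1,
    columns={"watch_count": int, "avg_minutes": float},
)
schema.validate({"watch_count": 2, "avg_minutes": 40.0})  # passes silently

entry = ModelEntry(
    model_version=1,
    feature_schema="user_watch_history",
    feature_schema_version=1,
    data_version="placeholder-data-version",  # illustrative value only
    code_version="placeholder-commit",        # illustrative value only
    test_scores={"placeholder_metric": 0.0},  # illustrative value only
)
compatible = check_compatible(entry, schema)
```

An exact version match is the strictest possible rule; a team could relax it, for example to accept additive schema changes, as long as the rule itself is part of the contract.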

Principles

Three pipelines, two contracts: the feature store and the model registry. The contracts are the only coupling.
Each pipeline picks its own runtime. Couple pipelines through code instead of artefacts and you lose the whole split.
Schema changes to a feature or a model are versioned events, not silent edits.
Independence is the point: separate scale, separate ownership, separate failure surfaces.
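One way to make "versioned events, not silent edits" concrete is to classify every schema change before it ships and bump a version accordingly. The sketch below assumes a convention that is not stated in the source: removing or retyping a column is breaking and bumps the major version, while adding a column is additive and bumps the minor version. The function names and the (major, minor) tuple are hypothetical.

```python
from typing import Dict, Tuple

Schema = Dict[str, str]  # column name -> type name


def classify_change(old: Schema, new: Schema) -> str:
    """Classify a schema edit as 'breaking', 'additive', or 'unchanged'."""
    removed = old.keys() - new.keys()
    retyped = {k for k in old.keys() & new.keys() if old[k] != new[k]}
    added = new.keys() - old.keys()
    if removed or retyped:
        return "breaking"
    if added:
        return "additive"
    return "unchanged"


def next_version(version: Tuple[int, int], change: str) -> Tuple[int, int]:
    """Bump (major, minor) according to the assumed convention."""
    major, minor = version
    if change == "breaking":
        return (major + 1, 0)
    if change == "additive":
        return (major, minor + 1)
    return version


old = {"watch_count": "int", "avg_minutes": "float"}
added_column = {**old, "last_genre": "str"}
retyped_column = {"watch_count": "float", "avg_minutes": "float"}

additive_version = next_version((1, 0), classify_change(old, added_column))
breaking_version = next_version((1, 0), classify_change(old, retyped_column))
```

A check like this could run in the feature pipeline's CI so that a breaking change cannot reach the store without a new major version that consumers opt into.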
