Methodology · LLM-App Engineering · proven · verified

FTI Pipeline Architecture

Also known as: feature-training-inference architecture, three-pipeline ML architecture

Applies to: llm-app, rag-system, agent

Tags: fti, pipeline, feature-store, model-registry

Split any machine-learning or LLM system into three separate pipelines: Feature, Training, and Inference. They connect only through a feature store and a model registry. Each pipeline runs on its own schedule, has its own dependencies, and scales on its own. They hand work to each other through versioned artefacts, never through direct calls. This lets each pipeline use the tools that fit it: for example, Spark for features, GPU jobs for training, and a low-latency serving stack for inference. It also lets different teams own different pipelines.

Intent. Split a machine-learning or LLM system into three separate pipelines, joined only by a feature store and a model registry, so each one can scale, be swapped out, and be owned on its own.

When to apply. Use this when you design any production machine-learning or LLM system that has to run beyond a notebook, such as recommendation, search, retrieval-augmented generation, an LLM application, or classification. It pays off most when different teams own different parts, when features and training need to scale at different rates, or when serving has tighter latency requirements than training. Do not apply it to a single-team prototype that fits in one script. The three-way split is just overhead until you hit at least two of the three problems it solves: separate scale, separate ownership, and separate lifecycles.

Inputs

  • Raw data sources: Your source-of-truth data, such as events, documents, telemetry, and transactional databases, that the feature pipeline will transform.
  • Feature store choice: The feature store you picked, either offline plus online or a single store. It becomes the contract between the feature pipeline and the pipelines that read from it.
  • Model registry choice: The registry you picked, such as Comet, MLflow, or SageMaker. It holds versioned trained models with their metadata and lineage.

Outputs

  • Feature pipeline: One or more standalone jobs that read raw data, compute features, and write them to the feature store on a set schedule.
  • Training pipeline: Jobs that read features and labels from the feature store, train, evaluate, and publish to the model registry.
  • Inference pipeline: One or more services that load a registered model and serve predictions. They read any features needed at request time from the feature store.
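One way to make the three outputs concrete is to declare each pipeline as its own unit with an owner, runtime, and schedule, and to check that none of them depends on another pipeline directly. The sketch below is illustrative only: `PipelineSpec`, the team names, the runtimes, and the cron strings are all assumptions, not part of any real orchestrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical descriptor for one independently operated pipeline."""
    name: str
    owner: str            # assumed team name, for illustration
    runtime: str          # each pipeline may pick a different runtime
    schedule: str         # cron expression, or "always-on" for a service
    reads: frozenset      # what the pipeline reads from
    writes: frozenset     # what the pipeline writes to


PIPELINES = [
    PipelineSpec("feature", "data-platform", "spark", "0 * * * *",
                 reads=frozenset({"raw_sources"}),
                 writes=frozenset({"feature_store"})),
    PipelineSpec("training", "ml-research", "gpu-batch", "0 3 * * 0",
                 reads=frozenset({"feature_store"}),
                 writes=frozenset({"model_registry"})),
    PipelineSpec("inference", "serving", "online-service", "always-on",
                 reads=frozenset({"feature_store", "model_registry"}),
                 writes=frozenset()),
]


def coupling_violations(pipelines):
    """Return (pipeline, dependency) pairs where a pipeline names another pipeline."""
    names = {p.name for p in pipelines}
    return [
        (p.name, dep)
        for p in pipelines
        for dep in sorted(p.reads | p.writes)
        if dep in names
    ]


def owner_of(pipeline_name, pipelines=PIPELINES):
    """Look up who is paged when a given pipeline misbehaves."""
    for p in pipelines:
        if p.name == pipeline_name:
            return p.owner
    raise KeyError(pipeline_name)
```

With these declarations, `coupling_violations(PIPELINES)` returns an empty list, because every dependency is either a raw source or one of the two contracts. A spec that listed `"training"` under `reads` would be reported as a violation.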

Steps (6)

  1. Name the three pipelines

    Write down feature, training, and inference as three separate components. Each gets its own owner, its own schedule, and its own failure modes. Commit to this before any code lands.

    uses: FTI LLM Pipeline Split, Pipeline Triad Pattern

  2. Define the feature contract

    Write the feature schema in the feature store. Producers and consumers agree on this contract only, not on the pipelines on either side. Treat any schema change as a versioned event.

    uses: Streaming Feature Pipeline

  3. Build the feature pipeline

    Read the raw sources, compute features, and write them to the feature store. Pick the runtime that fits, such as Spark, Flink, or a change-data-capture stream. Do not tie it to the training pipeline's runtime.

  4. Build the training pipeline

    Read features and labels from the store. Train, evaluate, and publish the model to the registry with versioned metadata: data version, code version, and test scores. The registry entry is the only thing the inference pipeline sees.

  5. Build the inference pipeline

    Load the model from the registry by version tag. Read request-time features from the feature store, usually the online one. Serve predictions. Promote new model versions through the registry, not by redeploying code.

  6. Operate each pipeline independently

    Each pipeline scales, schedules, alerts, and retries on its own. Feature lag, a training failure, and slow inference are three separate incidents with three separate owners. They are not one vague 'the ML system is broken'.
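Steps 2 to 5 can be sketched end to end with in-memory stand-ins. Everything below is an assumption made for illustration: `FeatureStore` and `ModelRegistry` are toy classes rather than the API of any real product, the "model" is a single threshold, and the label is stored alongside the features to keep the example short. What the sketch does show is that the three functions never call each other; they share only the store and the registry.

```python
from statistics import mean


class FeatureStore:
    """In-memory stand-in for a feature store (hypothetical)."""

    def __init__(self):
        self._rows = {}  # (schema_version, entity_id) -> feature dict

    def write(self, schema_version, entity_id, features):
        self._rows[(schema_version, entity_id)] = dict(features)

    def read(self, schema_version, entity_id):
        return self._rows[(schema_version, entity_id)]

    def read_all(self, schema_version):
        return [f for (v, _), f in self._rows.items() if v == schema_version]


class ModelRegistry:
    """In-memory stand-in for a model registry (hypothetical)."""

    def __init__(self):
        self._versions = {}  # version tag -> (model, metadata)
        self._aliases = {}   # alias such as "production" -> version tag

    def publish(self, version, model, metadata):
        if version in self._versions:
            raise ValueError(f"model version {version} is already published")
        self._versions[version] = (model, metadata)

    def promote(self, alias, version):
        if version not in self._versions:
            raise KeyError(version)
        self._aliases[alias] = version

    def load(self, alias):
        return self._versions[self._aliases[alias]]


FEATURE_SCHEMA_VERSION = "v1"  # step 2: the contract both sides agree on


def feature_pipeline(raw_events, store):
    """Step 3: raw events in, per-user features (and the label) out."""
    by_user = {}
    for event in raw_events:
        by_user.setdefault(event["user"], []).append(event)
    for user, events in by_user.items():
        store.write(FEATURE_SCHEMA_VERSION, user, {
            "avg_amount": mean(e["amount"] for e in events),
            "is_vip": any(e["vip"] for e in events),
        })


def training_pipeline(store, registry, version, code_version):
    """Step 4: read features and labels, fit, evaluate, publish."""
    rows = store.read_all(FEATURE_SCHEMA_VERSION)
    vip = [r["avg_amount"] for r in rows if r["is_vip"]]
    other = [r["avg_amount"] for r in rows if not r["is_vip"]]
    model = {"threshold": (mean(vip) + mean(other)) / 2}  # toy "training"
    # Scored on the training rows only, to keep the sketch short.
    correct = sum((r["avg_amount"] > model["threshold"]) == r["is_vip"] for r in rows)
    registry.publish(version, model, {
        "data_version": FEATURE_SCHEMA_VERSION,
        "code_version": code_version,
        "accuracy": correct / len(rows),
    })


def predict(store, registry, user, alias="production"):
    """Step 5: load the promoted model and look up request-time features."""
    model, _metadata = registry.load(alias)
    features = store.read(FEATURE_SCHEMA_VERSION, user)
    return features["avg_amount"] > model["threshold"]


store, registry = FeatureStore(), ModelRegistry()
raw = [
    {"user": "a", "amount": 10.0, "vip": False},
    {"user": "a", "amount": 20.0, "vip": False},
    {"user": "b", "amount": 100.0, "vip": True},
    {"user": "c", "amount": 80.0, "vip": True},
    {"user": "d", "amount": 30.0, "vip": False},
]
feature_pipeline(raw, store)
training_pipeline(store, registry, version="1.0.0", code_version="abc123")
registry.promote("production", "1.0.0")  # promotion is a registry change, not a redeploy
print(predict(store, registry, "b"), predict(store, registry, "a"))
```

Rolling out a retrained model in this sketch means publishing a new version and calling `promote` again. The `predict` code does not change, which is the behaviour step 5 asks for.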

Principles

  • Three pipelines, two contracts: the feature store and the model registry. The contracts are the only coupling.
  • Each pipeline picks its own runtime. Couple pipelines through code instead of artefacts and you lose the whole split.
  • Schema changes to a feature or a model are versioned events, not silent edits.
  • Independence is the point: separate scale, separate ownership, separate failure surfaces.
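The principle that schema changes are versioned events can be enforced with a small guard. The following is a minimal sketch under stated assumptions: `FeatureSchema` and `SchemaLog` are hypothetical helpers, not part of any feature-store product. The guard accepts a new version or an identical re-registration and rejects a changed field list published under an existing version number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSchema:
    """Hypothetical schema record: a version plus (name, type) pairs."""
    version: int
    fields: tuple


class SchemaLog:
    """Append-only record of published feature schemas (illustrative only)."""

    def __init__(self):
        self._by_version = {}

    def register(self, schema):
        existing = self._by_version.get(schema.version)
        if existing is not None:
            if existing != schema:
                # Same version number, different fields: a silent edit.
                raise ValueError(
                    f"schema v{schema.version} is already published with "
                    "different fields; publish a new version instead"
                )
            return existing  # identical re-registration is harmless
        self._by_version[schema.version] = schema
        return schema

    def latest(self):
        return self._by_version[max(self._by_version)]


log = SchemaLog()
v1 = FeatureSchema(1, (("avg_amount", "float"),))
log.register(v1)

# Adding a field is published as version 2, leaving version 1 readable.
v2 = FeatureSchema(2, (("avg_amount", "float"), ("n_orders", "int")))
log.register(v2)
```

Consumers pinned to version 1 keep working after version 2 lands, and can move over on their own schedule.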

Known failure modes (2)

Related patterns (4)

Related compositions (2)

Related methodologies (2)

Sources (2)

Provenance

  • Added to catalog:
  • Last updated:
  • Verification status: verified