Agent Middleware Chain

also known as Agent Interceptor Pipeline, Pre/Post Middleware

Wrap every model call, tool call, and memory access in a composable pre/execute/post interceptor pipeline so cross-cutting concerns attach without touching agent or orchestrator code.

Context

An agent runtime accumulates cross-cutting concerns: structured logging of every model call, rate-limit enforcement on third-party APIs, PII redaction on inputs and outputs, guardrail evaluation, latency metrics, and an approval gate that may pause a call. Each concern needs to fire on the same set of touchpoints (model calls, tool calls, memory reads and writes) without reimplementing the wiring itself.

Problem

If each concern is implemented as a wrapper at the agent or orchestrator layer, the runtime accretes a deep stack of decorators, the order is implicit, and adding or removing a concern requires editing agent code. Worse, concerns differ in shape — some need to see the request before the call, some need to mutate the response, some need to catch errors. Without a uniform middleware surface, each concern carries its own ad-hoc hook code and the cross-cutting layer is no longer composable or testable in isolation.

Forces

Pre-execution interceptors (request modification, validation) need the request; post-execution interceptors (response logging, redaction) need the response; error handlers need the exception.
Ordering matters — guardrails before logging, redaction before persistence.
Middleware must compose at runtime so a team can add or remove a concern by configuration.
Each middleware must remain testable in isolation against a synthetic call.
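The isolation requirement in the last force can be sketched as follows. This is a minimal illustration, not a reference implementation: the `RedactEmails` class, the dict-shaped request and response, the `EMAIL` regex, and `synthetic_call` are all hypothetical; only the three hook names come from the pattern itself. The middleware is exercised by calling its hooks directly, with no agent, model, or chain present.

```python
import re

# Hypothetical, deliberately simple email matcher for the sketch.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class RedactEmails:
    """Illustrative middleware: masks email addresses on the way in and out."""

    def process_request(self, request: dict):
        request["prompt"] = EMAIL.sub("[EMAIL]", request["prompt"])
        return None  # assumption: None means "continue down the chain"

    def process_response(self, request: dict, response: dict) -> dict:
        response["text"] = EMAIL.sub("[EMAIL]", response["text"])
        return response

    def process_error(self, request: dict, error: Exception) -> None:
        pass  # nothing to redact on failure in this sketch


def synthetic_call(request: dict) -> dict:
    """Stands in for the model: echoes the prompt and leaks a new address."""
    return {"text": "echo: " + request["prompt"] + " cc bob@example.org"}


# Drive the hooks by hand, in the order a chain would.
mw = RedactEmails()
request = {"prompt": "mail alice@example.com today"}
mw.process_request(request)
response = mw.process_response(request, synthetic_call(request))

print(request["prompt"])  # mail [EMAIL] today
print(response["text"])   # echo: mail [EMAIL] today cc [EMAIL]
```

Because the middleware depends only on the request and response shapes, the same test runs unchanged whether the real call behind it is a model, a tool, or a memory store.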

Example

An agent runtime mounts five middlewares in order: rate-limit, PII-redact-in, guardrail-eval, metrics, approval-gate. Every model and tool call flows forward through the chain on the request, then back through it in reverse order on the response. Adding a new compliance log later is a single registration in the chain config; no agent code is touched.
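One way the configuration-driven mounting in this example might look is sketched below. Everything here is an assumption made for illustration: the `REGISTRY` dict, the `register` decorator, the `build_chain` helper, the config names, and the choice to append the compliance log at the end of the order. The middleware classes are empty stand-ins; real ones would implement the three hooks.

```python
from typing import Callable, Dict, List

# Hypothetical registry: maps a config name to a middleware factory.
REGISTRY: Dict[str, Callable[[], object]] = {}


def register(name: str):
    """Class decorator that makes a middleware available to chain configs."""
    def decorator(cls):
        REGISTRY[name] = cls
        return cls
    return decorator


class NamedMiddleware:
    """Stand-in base; hooks are omitted to keep the sketch short."""
    def __repr__(self) -> str:
        return type(self).__name__


@register("rate-limit")
class RateLimit(NamedMiddleware): pass

@register("pii-redact-in")
class PiiRedactIn(NamedMiddleware): pass

@register("guardrail-eval")
class GuardrailEval(NamedMiddleware): pass

@register("metrics")
class Metrics(NamedMiddleware): pass

@register("approval-gate")
class ApprovalGate(NamedMiddleware): pass


def build_chain(config: List[str]) -> List[object]:
    """Instantiate middlewares in exactly the order the config lists them."""
    unknown = [name for name in config if name not in REGISTRY]
    if unknown:
        raise KeyError(f"unregistered middleware: {unknown}")
    return [REGISTRY[name]() for name in config]


# The order is explicit and reviewable in one place.
CHAIN_CONFIG = ["rate-limit", "pii-redact-in", "guardrail-eval",
                "metrics", "approval-gate"]
chain = build_chain(CHAIN_CONFIG)

# Later: a new concern is one class plus one config entry.
@register("compliance-log")
class ComplianceLog(NamedMiddleware): pass

chain_v2 = build_chain(CHAIN_CONFIG + ["compliance-log"])
print(chain_v2)
```

Failing loudly on an unregistered name keeps a typo in the config from silently dropping a guardrail.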

Diagram

flowchart LR
    Req[Request] --> M1[M1.process_request] --> M2[M2.process_request] --> M3[M3.process_request] --> Call[Underlying call]
    Call --> R3[M3.process_response] --> R2[M2.process_response] --> R1[M1.process_response] --> Resp[Response]
    Call -.error.-> E3[M3.process_error] -.-> E2[M2.process_error] -.-> E1[M1.process_error] -.-> Err

Solution

Therefore:

Define a BaseMiddleware with three hooks: process_request (called before the underlying call, may modify the request or short-circuit), process_response (called after the call, may mutate the response), and process_error (called when the call raises an exception). A MiddlewareChain runs the chain forward through process_request, invokes the underlying call, then runs the chain in reverse through process_response; if the call raises, it runs the chain in reverse through process_error instead. Mount the chain at the runtime layer so that every model call, tool call, and memory access flows through it. Cross-cutting concerns are then registered, not coded into agents.
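A minimal sketch of this contract follows. The class and hook names come from the text above; the rest is assumed for illustration: requests and responses are plain dicts, a hook modifies the request in place, returning a non-None value from process_request short-circuits the call, and the `execute` method name and the `Trace` middleware are hypothetical. Only middlewares whose process_request completed see the response or the error.

```python
from __future__ import annotations

from typing import Callable, Optional


class BaseMiddleware:
    """Default hooks are pass-throughs; subclasses override what they need."""

    def process_request(self, request: dict) -> Optional[dict]:
        # Assumption: return None to continue, or a response to short-circuit.
        return None

    def process_response(self, request: dict, response: dict) -> dict:
        return response

    def process_error(self, request: dict, error: Exception) -> None:
        pass


class MiddlewareChain:
    def __init__(self, middlewares: list[BaseMiddleware]):
        self.middlewares = list(middlewares)

    def execute(self, request: dict, call: Callable[[dict], dict]) -> dict:
        entered: list[BaseMiddleware] = []  # middlewares that saw the request
        response = None
        try:
            for mw in self.middlewares:  # forward pass
                early = mw.process_request(request)
                if early is not None:  # short-circuit: skip the real call
                    response = early
                    break
                entered.append(mw)
            if response is None:
                response = call(request)
        except Exception as exc:
            # Covers failures in the call and in any process_request hook.
            for mw in reversed(entered):  # reverse pass on error
                mw.process_error(request, exc)
            raise
        for mw in reversed(entered):  # reverse pass on success
            response = mw.process_response(request, response)
        return response


class Trace(BaseMiddleware):
    """Hypothetical middleware that records which hooks fired, in order."""

    def __init__(self, name: str, log: list[str]):
        self.name, self.log = name, log

    def process_request(self, request):
        self.log.append(f"{self.name}.request")
        return None

    def process_response(self, request, response):
        self.log.append(f"{self.name}.response")
        return response

    def process_error(self, request, error):
        self.log.append(f"{self.name}.error")


log: list[str] = []
chain = MiddlewareChain([Trace("M1", log), Trace("M2", log), Trace("M3", log)])
result = chain.execute({"prompt": "hi"}, lambda req: {"text": req["prompt"].upper()})

print(result)  # {'text': 'HI'}
print(log)     # M1, M2, M3 requests, then M3, M2, M1 responses
```

Re-raising after the error pass is a design choice in this sketch: the caller still sees the failure, while each middleware gets a chance to log or release resources. A runtime that wants middleware to recover from errors would need a different return contract for process_error.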

What this pattern forbids. Cross-cutting concerns may not be coded directly into agent or orchestrator logic; they must register through the middleware contract so order is explicit and the chain is reviewable.

The smaller patterns that complete this one —

uses Input/Output Guardrails ★★: Validate inputs before they reach the model and outputs before they reach the user.
uses PII Redaction ★★: Detect and remove personally identifiable information from inputs to and outputs from the model.
uses Rate Limiting ★★: Cap the number of requests, tokens, or tool calls per user (or session) within a time window.

And the patterns that stand alongside it, or against it —

complements Decision Log ★★: Persist the agent's reasoning trace alongside its actions so post-hoc review can explain why.
complements Kill Switch ★: Provide an out-of-band control plane to halt running agent instances without redeploy.
composes-with Policy-as-Code Gate ★: Evaluate every proposed agent action against externally-managed machine-readable policies before dispatch, so compliance authorship lives outside the prompt and outside the agent code.

Used in recipes

Agent Runtime Cross-Cutting (core)

Provenance

Source: patterns/agent-middleware-chain.md on GitHub · commit 135ae3c
Added to catalog: 2026-05-23
Last updated: 2026-05-23
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.