XIV · Anti-Patterns

Agent Bullwhip Effect

also known as Self-Induced Demand Amplification, Policy-Induced Order Variability

Anti-pattern: distributed supply-chain or replenishment agents, each optimising locally, amplify order variability through their own decision policy, so a local demand spike triggers synchronised chain-wide reordering and supplier stockouts that propagate backward.

Context

Multiple agents run a supply chain or replenishment network. Each is responsible for one node (a store, a warehouse, a supplier) and reorders to optimise its own stock against the demand it observes. The agents act in parallel, often on the same demand signal. Classical supply-chain theory already describes how ordering policies can amplify demand variability as it travels upstream: the bullwhip effect.

Problem

When each agent optimises its own node against a demand spike, their reorders synchronise: a small bump at the stores becomes a large coordinated order upstream, which causes supplier stockouts that ripple backward through the network. The amplification does not come from any agent failing or hallucinating — each is doing its job correctly — it comes from the agents' collective decision policy reacting to the same signal at once. The more agents and the tighter their coupling to demand, the larger the swing, and the variability the network creates is the agents' own, not the customers'.
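The mechanism can be made concrete with a toy model. The sketch below is illustrative only: the `reactive_orders` policy (order what was observed plus a multiple of the latest change), the `gain` parameter, and all quantities are assumptions chosen for the example, not a policy described in this entry.

```python
from typing import List


def reactive_orders(demand: List[float], gain: float) -> List[float]:
    """Order the observed demand plus `gain` times its latest change.

    Hypothetical stand-in for a node that chases its incoming signal.
    """
    orders: List[float] = []
    prev = demand[0]
    for d in demand:
        orders.append(max(0.0, d + gain * (d - prev)))
        prev = d
    return orders


def supplier_view(n_stores: int, gain: float,
                  base: float = 100.0, bump: float = 10.0) -> List[float]:
    """Orders reaching the supplier when every store sees the same bump."""
    store_demand = [base, base, base + bump, base, base]
    store_orders = reactive_orders(store_demand, gain)
    # All stores react in the same period, so the depot sees their sum.
    depot_demand = [n_stores * o for o in store_orders]
    # The depot applies the same reactive policy to the combined signal.
    return reactive_orders(depot_demand, gain)


n_stores = 10
customer_excess = n_stores * 10.0   # extra units customers actually bought
supplier_orders = supplier_view(n_stores, gain=1.0)
supplier_excess = max(supplier_orders) - n_stores * 100.0
print(customer_excess, supplier_excess)
```

With these made-up numbers, a 100-unit bump in customer purchases reaches the supplier as a 400-unit order peak, followed by a dip below baseline. No node misbehaves; the swing comes from two tiers applying the same reactive rule to a synchronised signal.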

Forces

  • Each agent optimising its own node locally is individually correct, yet the aggregate of those local optima amplifies the shared signal.
  • Reacting quickly to a demand spike is good for one node but, done by all nodes at once, manufactures a coordinated surge upstream.
  • The amplified variability is generated by the agents' policy, distinct from the variability inherited from real customer demand.
  • Damping the reaction reduces the swing but slows each node's response to genuine demand changes.
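The last force, the trade-off between damping and responsiveness, can be sketched with exponential smoothing. This is one possible damping technique, assumed here for illustration; the `alpha` values and the two demand series are invented.

```python
from typing import List


def smoothed(signal: List[float], alpha: float) -> List[float]:
    """Exponential smoothing: alpha=1 follows the signal, small alpha damps it."""
    level = signal[0]
    out: List[float] = []
    for x in signal:
        level = alpha * x + (1 - alpha) * level
        out.append(level)
    return out


spike = [100.0, 100.0, 150.0, 100.0, 100.0, 100.0]  # transient bump
step = [100.0, 100.0, 150.0, 150.0, 150.0, 150.0]   # genuine lasting change

for alpha in (1.0, 0.5, 0.25):
    peak_on_spike = max(smoothed(spike, alpha))   # lower is better damped
    two_periods_in = smoothed(step, alpha)[4]     # closer to 150 is more responsive
    print(alpha, peak_on_spike, two_periods_in)
```

Lower `alpha` shrinks the reaction to the transient spike, but the same setting leaves the node further from the new level two periods after a real shift. Choosing the value is the trade-off this force describes.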

Example

A grocery chain runs a replenishment agent per store and per depot. A short promotion spikes demand for one product at many stores at once; every store-agent reorders aggressively, the depot-agents see the combined surge and over-order from the supplier, and the supplier stocks out. When the promotion ends, the inflated orders unwind into overstock. The bump in customer demand was brief and modest; most of the swing was the agents' own.

Solution

Therefore:

Recognise that a network of locally-optimising agents can amplify the very signal it reacts to, and design against it at the system level rather than per node. Add explicit demand-signal dampening so a spike at one node does not translate into a full synchronised reorder upstream, and coordinate or stagger the agents' ordering so they do not all react in lockstep. Measure and separate the variability the agents' policy introduces from the variability inherited from real customer demand, and tune the policy to minimise the former. The control lives in the collective ordering policy and its damping, not in any single agent's local optimisation.
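A minimal sketch of these three measures, under stated assumptions: damping is modelled as exponential smoothing (`alpha`), staggering as store `i` reacting `i % slots` periods late, and policy-induced variability as the ratio of order variance to customer-demand variance, a common way to quantify the bullwhip effect. The function names, parameters, and demand figures are hypothetical.

```python
from statistics import pvariance
from typing import List


def damped_orders(demand: List[float], alpha: float, gain: float) -> List[float]:
    """Smooth demand first, then react to changes in the smoothed level."""
    level = prev = demand[0]
    orders: List[float] = []
    for d in demand:
        level = alpha * d + (1 - alpha) * level
        orders.append(max(0.0, level + gain * (level - prev)))
        prev = level
    return orders


def delay(series: List[float], k: int) -> List[float]:
    """Shift a series k periods later, padding the front with its first value."""
    return [series[0]] * k + series[:len(series) - k]


def network_orders(demand: List[float], n_stores: int,
                   alpha: float, gain: float, slots: int) -> List[float]:
    """Sum store orders; store i reacts i % slots periods late (the stagger)."""
    per_store = damped_orders(demand, alpha, gain)
    total = [0.0] * len(demand)
    for i in range(n_stores):
        for t, order in enumerate(delay(per_store, i % slots)):
            total[t] += order
    return total


def bullwhip_ratio(orders: List[float], demand: List[float]) -> float:
    """Order variance over demand variance; above 1 means the policy adds swing."""
    return pvariance(orders) / pvariance(demand)


demand = [100.0, 100.0, 110.0, 100.0, 100.0, 100.0, 100.0, 100.0]
n_stores = 4
customer_total = [n_stores * d for d in demand]

# Lockstep, undamped: every store reacts fully in the same period.
naive = network_orders(demand, n_stores, alpha=1.0, gain=1.0, slots=1)
# Damped and staggered: smoothed signal, reactions spread over four slots.
mitigated = network_orders(demand, n_stores, alpha=0.5, gain=1.0, slots=4)

naive_ratio = bullwhip_ratio(naive, customer_total)
mitigated_ratio = bullwhip_ratio(mitigated, customer_total)
print(round(naive_ratio, 2), round(mitigated_ratio, 2))
```

Tracking this ratio per tier separates what the policy adds from what customers caused. In this toy run the damped, staggered network falls below 1, which also means it responds later to the bump; how far to push `alpha` and `slots` is the tuning decision described above.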

What this pattern forbids. Reordering policy must not be left to per-node local optimisation alone; the demand signal is dampened and ordering is coordinated or staggered so the network cannot amplify its own variability, and policy-induced variability is tracked separately from real customer demand.

The patterns that counter or replace it —

  • complements Cascading Agent Failures. Anti-pattern: build a multi-agent system where one agent's failure or hallucination propagates as input to peers, until the whole system has drifted.
  • complements Compound Error Degradation. Anti-pattern: deploy a long-horizon agent without modelling that per-step accuracy multiplies across the trajectory.
  • complements Multi-Agent on Sequential Workloads. Anti-pattern: split a fundamentally sequential workload across multiple agents, degrading accuracy by 39–70% with no parallelization benefit.
