XIV · Anti-Patterns · Anti-pattern

Lost in the Middle (Positional Bias)

also known as Long-Context Positional Bias, U-Curve Attention

LLM accuracy on retrieving information from long contexts drops sharply when relevant content sits in the middle of the prompt rather than at the start or end.

Context

A team puts a long context in front of the model (RAG with many chunks, long documents, multi-turn conversation history). Quality on retrieval-style queries depends on where the relevant content sits in the prompt. The team doesn't know about the positional bias and is surprised when middle-of-prompt content gets ignored.

Problem

The model exhibits a U-shaped attention curve: content at the start (primacy) and end (recency) of the prompt is retrieved well; content in the middle is retrieved poorly. The team packs RAG chunks in descending order of relevance score, so mid-ranked chunks that still carry the answer land in the middle of the prompt, and the model misses them. This differs from context-fragmentation, which is about holding several constraints simultaneously: the failure here is positional, not relational.
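One way to check whether a given model shows this curve is a position sweep: plant a single known fact at each position among filler chunks and record whether the answer comes back. The sketch below is a minimal harness under stated assumptions. `position_sweep`, `ask_model`, and `edge_only_stub` are hypothetical names, and the stub is a stand-in that only echoes the edges of the prompt, not a real model.

```python
from typing import Callable, Dict, List


def position_sweep(
    needle: str,
    fillers: List[str],
    question: str,
    expected: str,
    ask_model: Callable[[str], str],
) -> Dict[int, bool]:
    """Place `needle` at every chunk position and record whether `expected` is in the answer."""
    results: Dict[int, bool] = {}
    for pos in range(len(fillers) + 1):
        # Insert the needle so that `pos` fillers precede it.
        chunks = fillers[:pos] + [needle] + fillers[pos:]
        prompt = "\n\n".join(chunks) + "\n\nQuestion: " + question
        answer = ask_model(prompt)
        results[pos] = expected.lower() in answer.lower()
    return results


def edge_only_stub(prompt: str) -> str:
    """Hypothetical stand-in for a model: echoes the first two chunks,
    the last two chunks, and the question, and ignores the middle."""
    parts = prompt.split("\n\n")
    return " ".join(parts[:2] + parts[-3:])
```

Swapping `edge_only_stub` for a call to a real model and plotting the per-position hit rate over many needles would show whether the middle positions dip for that model and context length.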

Forces

  • Positional bias is a property of the attention architecture; prompt wording alone cannot remove it.
  • Reordering content so the most relevant material sits at the ends adds a preprocessing step.
  • Some content (instructions, for example) must stay in a known position and cannot be reordered freely.
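The third force can be respected by reordering only the retrieved chunks while instructions and the question keep fixed slots. A minimal sketch, assuming instructions go first and the question goes last (the text above only requires a known position, so this layout is an assumption); `build_prompt` is a hypothetical helper.

```python
from typing import List


def build_prompt(instructions: str, chunks: List[str], question: str) -> str:
    """Assemble a prompt whose instruction and question slots never move."""
    # Chunks are labelled with the position they hold in the prompt,
    # which makes position-related misses easier to spot in logs.
    context = "\n\n".join(
        f"[chunk {i + 1}]\n{text}" for i, text in enumerate(chunks)
    )
    return f"{instructions}\n\n{context}\n\nQuestion: {question}"
```

Whatever ordering is applied to `chunks` beforehand, the instruction and question positions stay the same.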

Example

A RAG-based research agent retrieves 20 chunks for each query and packs them into the prompt in order of relevance score. Queries that should be answered from chunks 7-12 (the middle) fail; queries answered from chunks 1-3 or 17-20 succeed. The team initially blames retrieval. That diagnosis is wrong: the retrieval was right, but the model did not attend to the middle chunks. Fix: keep only the highest-relevance chunks, reorder them so they land at the start and end of the prompt, and drop the rest.
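The fix can be sketched as follows: keep the top-ranked chunks, then alternate them between the front and the back of the list so the weakest survivors end up in the middle. This is an illustrative sketch, not the team's actual code; `reorder_for_edges` is a hypothetical name and the default `keep=6` is an arbitrary assumption.

```python
from typing import List, Tuple

ScoredChunk = Tuple[str, float]  # (chunk text, relevance score)


def reorder_for_edges(chunks: List[ScoredChunk], keep: int = 6) -> List[ScoredChunk]:
    """Keep the `keep` best chunks and place the strongest at the prompt's edges."""
    ranked = sorted(chunks, key=lambda c: c[1], reverse=True)[:keep]
    front: List[ScoredChunk] = []
    back: List[ScoredChunk] = []
    for rank, chunk in enumerate(ranked):
        # Even ranks (0, 2, 4, ...) fill the start; odd ranks fill the end.
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]
```

With six chunks kept, the output order by rank is 1, 3, 5, 6, 4, 2: the best chunk opens the context, the second best closes it, and the lowest-ranked survivors sit in the middle.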

Solution

Therefore:

Acknowledge the bias as architectural and mitigate it rather than trying to prompt it away. Pair with: landmark-attention (architectural mitigation, requires model support), information-chunking-memory (preprocessing mitigation), context-window-packing (positional design), context-window-dumb-zone (related utilization limit).

What this pattern forbids. Nothing useful: the pattern imposes no constraint of its own. The constraint that is missing is positional-quality awareness in prompt design.

And the patterns that stand alongside it, or against it —

  • alternative-to: Landmark Attention · Long-context attention mechanism placing sparse landmark tokens across very long inputs so the model jumps directly to relevant sections via landmark lookup rather than scanning linearly.
  • alternative-to: Information Chunking for Agent Memory ★★ · Structure inputs into digestible topical segments (chunks) before feeding to short-term memory rather than throwing the full input at the model; reduces overload and increases accuracy (~40% improvement observed in customer-service deployment).
  • alternative-to: Context Window Packing ★★ · Choose what fits in the context window each turn given a fixed token budget.
  • complements: Context Window Dumb-Zone Cap · Hold context-window utilization below a working threshold (~40%) to keep the model out of the 'dumb zone' where it begins ignoring earlier instructions and hallucinating.
  • complements: Context Fragmentation · Anti-pattern: the LLM cannot hold multiple interconnected constraints in mind simultaneously the way human working memory can; it processes each constraint locally and loses the cross-constraint view.
