Episodic Summaries
Compress past episodes into summaries that preserve gist while shedding token cost.
Problem
Without some form of compaction, the agent has only two bad options. It can let the context grow unboundedly until it overflows the window, at which point the call fails or the most recent state is silently dropped. Or it can apply a sliding-window strategy that truncates the oldest content, which lets important early facts (the original task, an early decision the agent made, a constraint the user stated up front) fall off the back even though the agent still needs them. The team needs a way to summarise older history into compact episodes that retain the load-bearing facts while shedding the verbatim noise.
Solution
On a schedule, or when a size threshold is crossed, summarise blocks of recent thoughts or conversation into compact representations. Store the summaries in a higher tier and archive the originals. Reads consult the summaries first and fetch the originals only on demand.
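A minimal sketch of this flow in Python, under stated assumptions: the class and field names (`EpisodicMemory`, `max_live_tokens`, `keep_recent`) are hypothetical, the `summarise` callable stands in for whatever model call produces the summary, and tokens are approximated by whitespace splitting rather than a real tokenizer. It shows threshold-triggered compaction, a summary tier, an archive of originals, and a read path that returns summaries first.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EpisodeSummary:
    episode_id: int  # key into the archive of originals
    text: str        # compact gist of the episode


@dataclass
class EpisodicMemory:
    # Hypothetical summariser; in practice this would be a model call.
    summarise: Callable[[List[str]], str]
    max_live_tokens: int = 50  # assumed threshold that triggers compaction
    keep_recent: int = 2       # newest messages that stay verbatim
    live: List[str] = field(default_factory=list)
    summaries: List[EpisodeSummary] = field(default_factory=list)
    archive: Dict[int, List[str]] = field(default_factory=dict)

    @staticmethod
    def count_tokens(text: str) -> int:
        # Crude whitespace proxy, not a real tokenizer.
        return len(text.split())

    def append(self, message: str) -> None:
        """Add a message and compact if the live tier exceeds the threshold."""
        self.live.append(message)
        if sum(self.count_tokens(m) for m in self.live) > self.max_live_tokens:
            self._compact()

    def _compact(self) -> None:
        """Summarise everything except the newest messages into one episode."""
        cut = len(self.live) - self.keep_recent
        if cut <= 0:
            return
        block, self.live = self.live[:cut], self.live[cut:]
        episode_id = len(self.summaries)
        self.archive[episode_id] = block  # originals are kept, not discarded
        self.summaries.append(EpisodeSummary(episode_id, self.summarise(block)))

    def context(self) -> List[str]:
        """Read path: summaries first, then the verbatim recent messages."""
        return [s.text for s in self.summaries] + self.live

    def original(self, episode_id: int) -> List[str]:
        """On-demand access to the archived originals behind a summary."""
        return self.archive[episode_id]
```

A caller would append each new message with `append`, build the prompt from `context()`, and call `original(episode_id)` only when a summary turns out to be too lossy for the question at hand. Compacting at a threshold rather than on every turn keeps the number of summarisation calls low; a schedule-based trigger would replace the token check in `append`.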
When to use
- Conversation or thought history grows unboundedly without compaction.
- Summaries can preserve gist while shedding token cost meaningfully.
- Summarised tiers are consulted first with originals available on demand.
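One way to make the second condition checkable is sketched below, with assumptions labelled: `worth_storing`, `must_keep`, and the `max_ratio` default of 0.5 are hypothetical names and values, token counts use a whitespace proxy, and fact preservation is approximated by case-insensitive substring matching, which is far weaker than a real gist check.

```python
from typing import Iterable, List


def token_count(text: str) -> int:
    # Crude whitespace proxy, not a real tokenizer.
    return len(text.split())


def worth_storing(
    originals: List[str],
    summary: str,
    must_keep: Iterable[str],
    max_ratio: float = 0.5,  # assumed cut-off, tune per application
) -> bool:
    """Accept a summary only if it is much shorter and keeps the named facts."""
    original_tokens = sum(token_count(m) for m in originals)
    if original_tokens == 0:
        return False  # nothing to compress
    ratio = token_count(summary) / original_tokens
    lowered = summary.lower()
    keeps_facts = all(fact.lower() in lowered for fact in must_keep)
    return ratio <= max_ratio and keeps_facts
```

If the check fails, the originals can simply stay in the live tier until a better summary is produced, so a poor summary never replaces the facts it was meant to carry.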
Related
- Five-Tier Memory Cascade
- Reflexion
- Context Window Packing
- Short-Term Thread Memory
- Self-Archaeology
- Salience Attention Mechanism
- Dream Consolidation Cycle
- Cluster-Capped Insight Store
- Sleep-Time Compute
- Episodic Memory
- Procedural Memory
- Agentic Memory
- Context Window Dumb-Zone Cap
- Information Chunking for Agent Memory