V · Memory · Emerging

Context Compaction

also known as Conversation Summarisation Checkpoint, 压实 (Chinese for "compaction"), Rolling Context Digest

When the context window nears its limit, replace the older conversation span with a model-written digest that preserves decisions, commitments, and active constraints while discarding noise, so the agent keeps running without losing the thread.

Context

A long-running agent accumulates turns — tool calls, raw observations, intermediate reasoning — until the conversation approaches the model's context-window limit. The agent is mid-task and cannot simply stop, but it also cannot fit the full history into the next request. Most of the older turns are process noise: superseded plans, large tool dumps, abandoned branches. The decisions and conclusions those turns produced still matter.

Problem

A fixed context window caps how much history an agent can carry, but a long task generates more history than fits. Truncating the oldest turns blindly drops the decisions and commitments the agent still depends on; keeping everything overflows the window or inflates cost and latency on every subsequent call. The agent needs to shed token volume without shedding the conclusions that volume produced.

Forces

  • Context windows are bounded; long-horizon tasks are not.
  • Older turns are mostly process noise, but the decisions buried inside them are load-bearing.
  • Summarising too early discards detail still in use; summarising too late risks overflow mid-step.
  • A lossy digest can drop a constraint the agent will then silently violate.
  • Re-summarising on every turn is expensive; summarising rarely lets the window fill and overflow.
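The tension between compacting too early and too late can be made concrete as a trigger check. The following is a minimal sketch, not a prescribed implementation: the function name `should_compact`, the `reserve_tokens` headroom for the upcoming step, and the 0.8 default are illustrative assumptions.

```python
def should_compact(used_tokens: int, window_tokens: int,
                   reserve_tokens: int = 0, threshold: float = 0.8) -> bool:
    """Decide whether to run a compaction pass before the next model call.

    reserve_tokens is headroom set aside for the upcoming step (a hypothetical
    parameter), so a large tool result is less likely to overflow mid-step.
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    utilisation = (used_tokens + reserve_tokens) / window_tokens
    return utilisation >= threshold
```

With a 10,000-token window, `should_compact(7000, 10000)` is false, while reserving 1,500 tokens for an expected tool dump tips the same call to true. A higher threshold compacts less often at the price of less headroom.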

Example

A coding agent works through a multi-hour refactor, accumulating dozens of file reads, test runs, and diffs. As the conversation nears the model's context limit, the runtime summarises the earliest two-thirds of the session into a digest — migrated the auth module to the new API, agreed not to touch the billing tests, three files still outstanding — and drops the raw file dumps. The agent continues from the digest plus its last few turns, never losing the agreed constraint even though the original messages are gone.

Solution

Therefore:

Track context-window utilisation. When it crosses a threshold (for example 80% of the window), run a compaction pass: feed the older span of the conversation to the model with an instruction to produce a dense digest that preserves goals, decisions, open commitments, and any constraints the agent must still honour, while discarding raw tool output, superseded plans, and dead-end reasoning. Replace that span in the working context with the digest, keep the most recent turns verbatim so local continuity survives, and resume. Pin content that must never be compacted away — the original task statement and hard constraints — outside the compactable region. Anthropic ships this as automatic compaction in Claude Code and the Agent SDK; the Chinese context-engineering literature names it 压实 (compaction).
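The pass described above might be sketched as follows. This is an illustration under stated assumptions, not the behaviour of any particular runtime: `Message`, `estimate_tokens`, `compact`, and the `summarise` callable are hypothetical names, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and `summarise` stands in for the model call that writes the digest.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    content: str
    pinned: bool = False  # pinned messages sit outside the compactable region


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; swap in a real tokenizer.
    return max(1, len(text) // 4)


def compact(messages: List[Message],
            summarise: Callable[[List[Message]], str],
            window_tokens: int,
            threshold: float = 0.8,
            keep_recent: int = 4) -> List[Message]:
    """Replace the older unpinned span with a digest once utilisation crosses threshold."""
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")
    used = sum(estimate_tokens(m.content) for m in messages)
    if used < threshold * window_tokens:
        return messages  # below the threshold: nothing to do

    pinned = [m for m in messages if m.pinned]
    rest = [m for m in messages if not m.pinned]
    split = len(rest) - keep_recent
    if split <= 0:
        return messages  # nothing older than the verbatim tail

    older, recent = rest[:split], rest[split:]
    digest = Message("system",
                     "Digest of earlier conversation:\n" + summarise(older))
    # Simplification: pinned messages are moved to the front of the context.
    return pinned + [digest] + recent
```

In use, `summarise` would wrap a model request whose instruction asks for goals, decisions, open commitments, and active constraints. Keeping it as a parameter lets the pass be tested with a stub before any model is involved.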

What this pattern forbids. The agent must not shrink older context by blind truncation; reduction has to go through a summarisation pass that is instructed to preserve decisions, open commitments, and active constraints. Pinned content — the task statement and hard constraints — must be excluded from the compactable region and never summarised away.
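One way to enforce the pinning rule is a check that runs after every compaction pass and rejects any result in which a pinned message no longer appears verbatim. The sketch below is illustrative only; `Message`, `missing_pins`, and `check_compaction` are hypothetical names rather than part of any SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Message:
    role: str
    content: str
    pinned: bool = False


def missing_pins(before: List[Message], after: List[Message]) -> List[Message]:
    """Return pinned messages from `before` that are not present verbatim in `after`."""
    surviving = {(m.role, m.content) for m in after if m.pinned}
    return [m for m in before
            if m.pinned and (m.role, m.content) not in surviving]


def check_compaction(before: List[Message], after: List[Message]) -> List[Message]:
    """Accept a compacted context only if every pinned message survived."""
    lost = missing_pins(before, after)
    if lost:
        raise ValueError(f"compaction dropped {len(lost)} pinned message(s)")
    return after
```

A digest that merely paraphrases a pinned constraint fails this check by design, since the rule is that pinned content is excluded from summarisation rather than summarised well.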

And the patterns that stand alongside it, or against it —

  • complements · Context Window Packing ★★ · Choose what fits in the context window each turn given a fixed token budget.
  • complements · Sleep-Time Compute · During idle or downtime, run the model offline against the user's standing context to pre-compute dense summaries and likely future answers, so test-time latency and cost drop when the user actually asks.
  • complements · Tool-Result Eviction · Once a tool's raw output has been consumed, replace it in the live context window with a short marker of what was done, reclaiming tokens without losing that the call happened.
