Salience Attention Mechanism

also known as Salience Scoring, Attention Selection, Top-K Memory Attention

Score every candidate memory item with a weighted salience function so each tick attends to a small, relevant top-k subset rather than re-reading all memory.

This pattern helps complete certain larger patterns —

used-by Preoccupation Tracking ★: Maintain a small set of mid-term, affect-tagged concerns that persist across days and surface in every prompt, distinct from the single-item working focus and from long-term insights.
used-by Mode-Adaptive Cadence ★: Vary the agent's loop interval based on current salience so the agent thinks faster when something is happening and slower when nothing is, instead of running on a fixed cron.

Context

A long-running agent's memory store grows past what can fit into a single call's context. The agent has accumulated thoughts, summaries, insights, and observations over hours or days, and on every tick only a small, currently relevant slice of that store should drive the next step.

Problem

Without an explicit notion of salience, the agent is left with two bad strategies. Dumping all of memory into context blows the token budget and gives the model no signal about what matters now, so the relevant items are buried in noise. Taking only the most recent items breaks continuity and misses older items that have become relevant again, for example because something surprising appeared in the current context. The agent needs a way to score every candidate memory by its salience to the current moment and to surface only the top-scoring items into context.

Forces

Recency, novelty, goal-relevance, and prediction error all matter, and they trade off.
Re-reading all memory each tick is unaffordable at scale.
Pure recency loses long-tail relevance; pure relevance loses temporal grounding.
Without a fatigue term, rumination loops keep rewarding the same items over and over.

Example

A long-running personal agent has months of memory; dumping it all into context is impossible and grabbing the most recent items misses the user's recurring goals. The team scores each candidate memory with a weighted sum of novelty, goal-relevance, recency, prediction-error, and a fatigue penalty. Each tick attends to top-k items only. Surprising long-tail facts rise above last-hour chatter when they actually matter, and token usage per tick stays flat as memory grows.
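The arithmetic behind "long-tail facts rise above last-hour chatter" can be sketched with made-up numbers. The weights and feature values below are assumptions chosen purely for illustration; they do not come from the catalog or from any real deployment.

```python
# Hypothetical weights and feature values, chosen only to illustrate the arithmetic.
WEIGHTS = {"novelty": 0.2, "goal": 0.3, "recency": 0.2, "prederr": 0.3, "fatigue": 0.4}


def salience(f):
    """Weighted sum of the four positive signals minus the fatigue penalty."""
    return (
        WEIGHTS["novelty"] * f["novelty"]
        + WEIGHTS["goal"] * f["goal"]
        + WEIGHTS["recency"] * f["recency"]
        + WEIGHTS["prederr"] * f["prederr"]
        - WEIGHTS["fatigue"] * f["fatigue"]
    )


# A months-old fact tied to a recurring goal that contradicts a current expectation.
old_fact = {"novelty": 0.6, "goal": 0.9, "recency": 0.05, "prederr": 0.8, "fatigue": 0.0}
# Last-hour chatter: very recent, already attended to several times, little else.
chatter = {"novelty": 0.2, "goal": 0.1, "recency": 1.0, "prederr": 0.0, "fatigue": 0.5}

old_score = salience(old_fact)
chatter_score = salience(chatter)
print(f"old fact: {old_score:.2f}, chatter: {chatter_score:.2f}")
```

With these assumed values the old fact scores about 0.64 and the chatter about 0.07, so the old fact would enter the working set even though its recency signal is close to zero.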

Diagram

flowchart TD
    Mem[(Candidate memory items)] --> Sc[Salience score:<br/>α·novelty + β·goal +<br/>γ·recency + δ·prederr − ε·fatigue]
    Sc --> Top[Top-k]
    Top --> WS[Working set]
    WS --> Tick[Next tick]
    Tick -.reflection.-> Cfg[Tunable weights]
    Cfg --> Sc

Solution

Therefore:

Score each candidate memory item `m` with a weighted sum: `alpha * novelty(m) + beta * goal_relevance(m) + gamma * recency(m) + delta * prediction_error(m) - epsilon * fatigue(m)`. Select the top-k items as the working set for the next tick. Persist the weights in a tunable config so a reflection pass can adjust them. The fatigue term penalises items that have already been attended to many times in the recent window, which breaks rumination loops.
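A minimal Python sketch of the scoring and selection step follows. It assumes each signal is already available as a number in [0, 1]. The `Item` and `Weights` classes, the default weight values, the JSON config format, and the definition of fatigue as the share of recent attention slots taken by an item are all illustrative assumptions, not part of the pattern text.

```python
import heapq
import json
from dataclasses import dataclass, asdict


@dataclass
class Weights:
    # Hypothetical defaults; a reflection pass would be expected to tune these.
    alpha: float = 0.2    # novelty
    beta: float = 0.3     # goal relevance
    gamma: float = 0.2    # recency
    delta: float = 0.3    # prediction error
    epsilon: float = 0.4  # fatigue penalty


@dataclass
class Item:
    id: str
    novelty: float
    goal_relevance: float
    recency: float
    prediction_error: float


def fatigue(item_id, recent_attended, window):
    """Share of the last `window` attention slots taken by this item (assumed definition)."""
    if window <= 0:
        return 0.0
    return recent_attended[-window:].count(item_id) / window


def score(item, w, recent_attended, window=10):
    """Weighted sum from the pattern: four positive signals minus the fatigue penalty."""
    return (
        w.alpha * item.novelty
        + w.beta * item.goal_relevance
        + w.gamma * item.recency
        + w.delta * item.prediction_error
        - w.epsilon * fatigue(item.id, recent_attended, window)
    )


def top_k(items, w, recent_attended, k, window=10):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=lambda m: score(m, w, recent_attended, window))


def save_weights(w, path):
    """Persist the weights as JSON so a later pass can adjust them."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(asdict(w), fh, indent=2)


def load_weights(path):
    with open(path, encoding="utf-8") as fh:
        return Weights(**json.load(fh))


items = [
    Item("a", 0.5, 0.8, 0.3, 0.6),
    Item("b", 0.5, 0.8, 0.3, 0.6),
    Item("c", 0.1, 0.1, 0.9, 0.0),
]
history = ["a", "a", "a", "a"]  # "a" was attended to on each of the last four ticks
working_set = top_k(items, Weights(), history, k=2, window=4)
print([m.id for m in working_set])
```

Items "a" and "b" have identical signals, but "a" has filled the whole recent window, so its fatigue penalty pushes it below both "b" and "c" and it drops out of the working set for this tick.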

What this pattern forbids. The agent cannot read its full memory store at every tick; salience scoring is mandatory and the top-k cap is enforced by the retrieval layer, not left to the model.
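One way to enforce the cap in the retrieval layer, sketched under assumptions: `SalienceRetriever`, `working_set`, `hard_cap`, and `requested_k` are hypothetical names, and the score function is passed in rather than prescribed. The point of the sketch is only that the caller can ask for fewer items than the cap, never more.

```python
import heapq


class SalienceRetriever:
    """Hypothetical retrieval layer: the only path from the memory store to the prompt."""

    def __init__(self, store, score_fn, hard_cap):
        if hard_cap < 1:
            raise ValueError("hard_cap must be at least 1")
        self._store = store        # private by convention; callers do not iterate it
        self._score_fn = score_fn  # any callable mapping an item to a salience score
        self._hard_cap = hard_cap

    def working_set(self, requested_k=None):
        """Return at most `hard_cap` items, best first, whatever the caller requests."""
        if requested_k is None:
            k = self._hard_cap
        else:
            # Clamp the request into the range [0, hard_cap].
            k = max(0, min(requested_k, self._hard_cap))
        return heapq.nlargest(k, self._store, key=self._score_fn)


# Illustrative store of 100 items with a precomputed salience value.
store = [{"id": i, "salience": i / 100} for i in range(100)]
retriever = SalienceRetriever(store, score_fn=lambda m: m["salience"], hard_cap=5)

selected = retriever.working_set(requested_k=50)  # the request for 50 is clamped to 5
print(len(selected), [m["id"] for m in selected])
```

Because the clamp lives inside `working_set`, a prompt that talks the model into asking for more memory still receives at most `hard_cap` items.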

And the patterns that stand alongside it, or against it —

complements Episodic Summaries ★★: Compress past episodes into summaries that preserve gist while shedding token cost.
complements Vector Memory ★★: Store memories as embeddings in a vector index and retrieve the most semantically similar items at query time.
composes-with Five-Tier Memory Cascade ·: Stage agent memory across sensory, working, short-term, episodic, and long-term tiers with explicit promotion and decay between them.
alternative-to Context Window Packing ★★: Choose what fits in the context window each turn given a fixed token budget.
complements Multi-Axis Promotion Scoring ★: Gate which short-term thoughts qualify for promotion to long-term insights by a weighted multi-axis score where consolidation events count more than raw frequency.
complements Self-Corpus Vocabulary ·: Mine a small bounded vocabulary from the agent's own writing and cache it as the conceptual axis for scoring new thoughts, so relevance reflects the agent's actual frame rather than a generic embedding space.
complements Episodic Memory ★★: Record past events as time-stamped first-person experiences the agent can recall later, separately from extracted facts (semantic) and learned how-to (procedural).

Used in frameworks

Sparrot
first-class · 75 patterns · Domain Agents · experimental
Per-tick salience scoring directs attention toward what matters most rather than processing everything uniformly.

Provenance

Source: patterns/salience-attention-mechanism.md on GitHub · commit 06ef38e
Added to catalog: 2026-05-02
Last updated: 2026-05-22
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.