Test-Time Memorization (Titans)

also known as Inference-Time Memory, Titans Memory Module

Memory module that learns at inference time by incorporating recent inputs into its parameters during the session rather than relying solely on pre-trained weights.

Context

A long-running agent task generates new information that should influence later decisions in the same task, but that information arrives after training has finished. Standard models either lose it at session end (no learning) or need expensive retraining cycles to incorporate it.

Problem

Pre-trained-only models can't learn within a session. Retraining is too slow and expensive to do per-session. RAG retrieves but doesn't internalize. The agent needs a way to memorize within a session that's faster than retraining but more integrated than retrieval.

Forces

Test-time training adds inference-time compute cost.
Memory module design affects what's memorizable and at what fidelity.
Concurrency: multiple sessions writing to the same module would interfere with one another.

Example

A research-agent session processes 200 papers over 6 hours. With a standard model, the content of early papers fades by paper 150. With Titans test-time memorization, each processed paper updates the memory module, so by paper 150 the model can still recall patterns from paper 5 without RAG retrieval. End-of-session synthesis can then draw on the whole session rather than only the most recent papers.

Diagram

flowchart TD
    Step1[Step 1: input] --> Model[Model + Memory Module]
    Model --> Update1[Update memory module params]
    Update1 --> Step2[Step 2: input]
    Step2 --> Model
    Model --> Output[Output benefits from in-session memory]
    Session[End of session] --> Reset[Reset memory module state]

Solution

Therefore:

Use the Titans architecture (Behrouz et al., 2024). A neural memory module sits alongside the main model; during a session, inputs trigger updates to the module's parameters (gradient steps at inference time). Later steps in the same session benefit from this in-session learning. Module state is per-session and ephemeral. Pair with episodic-memory, agentic-memory, landmark-attention, and agent-resumption.
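A minimal sketch of an inference-time gradient update, assuming a toy linear associative memory in pure Python rather than the deep memory module the paper describes. The names `LinearMemory`, `read`, `write`, `lr`, `momentum`, and `decay` are hypothetical and chosen only for illustration. The shape of the update (a momentum term over the gradient of a key-to-value reconstruction loss, plus a decay term that forgets old content) loosely follows the Titans formulation; the fixed scalars here stand in for gates that the paper makes data-dependent.

```python
class LinearMemory:
    """Toy associative memory: the stored value for a key is W @ key.

    Illustrative only; not the Titans reference implementation.
    """

    def __init__(self, dim, lr=0.5, momentum=0.0, decay=0.0):
        self.dim = dim
        self.lr = lr              # step size for each inference-time update
        self.momentum = momentum  # how much of the previous update carries over
        self.decay = decay        # fraction of old memory forgotten per write
        self.W = [[0.0] * dim for _ in range(dim)]
        self.S = [[0.0] * dim for _ in range(dim)]  # momentum buffer

    def read(self, key):
        """Recall the value currently associated with `key`."""
        return [sum(w * k for w, k in zip(row, key)) for row in self.W]

    def write(self, key, value):
        """Take one gradient step on 0.5 * ||W @ key - value||^2.

        Returns the squared error measured before the update, which acts
        as a rough "surprise" signal for this input.
        """
        err = [p - v for p, v in zip(self.read(key), value)]
        for i in range(self.dim):
            for j in range(self.dim):
                grad = err[i] * key[j]
                self.S[i][j] = self.momentum * self.S[i][j] - self.lr * grad
                self.W[i][j] = (1.0 - self.decay) * self.W[i][j] + self.S[i][j]
        return sum(e * e for e in err)


mem = LinearMemory(dim=3, lr=1.0)
key, value = [1.0, 0.0, 0.0], [1.0, 2.0, 3.0]
first_surprise = mem.write(key, value)   # large: the memory has not seen this yet
second_surprise = mem.write(key, value)  # zero: the association is already stored
recalled = mem.read(key)
```

With a unit-norm key and `lr=1.0`, a single step stores the association exactly, which is why the second write reports no surprise. Realistic inputs would need smaller steps and non-zero `momentum` and `decay` to trade off retention against interference between keys.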

What this pattern forbids. Memory module parameter updates may not persist beyond session end without explicit promotion to long-term memory (LTM). By default, no in-session learned state may bleed from one session into another.
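One way to enforce this constraint is to give every session its own copy of the memory state and to make promotion a separate, deliberate call. The sketch below assumes a hypothetical `SessionMemoryStore` with `open`, `promote`, and `close` methods, and represents the memory state as a plain dict; none of these names come from the Titans paper.

```python
import copy


class SessionMemoryStore:
    """Keeps one isolated memory state per session (hypothetical helper)."""

    def __init__(self, make_initial_state):
        # Factory that returns a fresh state, so sessions never share objects.
        self._make_initial_state = make_initial_state
        self._sessions = {}
        self.long_term = {}  # only explicit promotions are stored here

    def open(self, session_id):
        """Start a session with a fresh, unshared memory state."""
        if session_id in self._sessions:
            raise ValueError(f"session {session_id!r} is already open")
        self._sessions[session_id] = self._make_initial_state()
        return self._sessions[session_id]

    def promote(self, session_id, name):
        """Explicitly copy a session's learned state into long-term memory."""
        self.long_term[name] = copy.deepcopy(self._sessions[session_id])

    def close(self, session_id):
        """End the session and discard its state (the reset step)."""
        self._sessions.pop(session_id, None)


store = SessionMemoryStore(lambda: {"weights": [0.0, 0.0]})
session_a = store.open("a")
session_b = store.open("b")
session_a["weights"][0] = 1.0        # in-session learning in session "a"
untouched = session_b["weights"]     # session "b" does not see it
store.promote("a", "paper-patterns") # deliberate promotion to LTM
store.close("a")                     # everything else in "a" is dropped
```

The deep copy in `promote` means later in-session updates cannot silently alter what was promoted, and `close` leaves nothing behind for the next session to inherit.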

And the patterns that stand alongside it, or against it:

complements Episodic Memory (★★): Record past events as time-stamped first-person experiences the agent can recall later, separately from extracted facts (semantic) and learned how-to (procedural).
complements Agentic Memory (★): Expose memory management as first-class tool actions (ADD, UPDATE, DELETE, RETRIEVE, SUMMARY, FILTER) the LLM chooses at every step, trained end-to-end so short-term and long-term memory live under one learned policy.
complements Landmark Attention (·): Long-context attention mechanism placing sparse landmark tokens across very long inputs so the model jumps directly to relevant sections via landmark lookup rather than scanning linearly.
complements Agent Resumption (★★): Persist agent execution state so a long-running run survives restarts, deploys, or user disconnects.
complements Large Reasoning Model (LRM) Paradigm (★): Route reasoning-heavy tasks to a reasoning-tuned model that trades inference time for deliberation, rather than to a fast LLM that exhibits premature closure.

References

Behrouz, Zhong, and Mirrokni (2024). Titans: Learning to Memorize at Test Time (paper).

Provenance

Source: patterns/test-time-memorization.md on GitHub · commit 4002557
Added to catalog: 2026-05-23
Last updated: 2026-05-23
Contribute: open an issue or PR at github.com/agentpatternscatalog/patterns.