V · Memory · Experimental

Test-Time Memorization (Titans)

also known as Inference-Time Memory, Titans Memory Module

Memory module that learns at inference time by incorporating recent inputs into its parameters during the session rather than relying solely on pre-trained weights.

Context

A long-running agent task generates new information that should influence later decisions in the same task, but that information arrives after training. Standard models either lose it at session end (no learning) or require expensive retraining cycles to incorporate it.

Problem

Models that rely only on pre-trained weights can't learn within a session. Retraining is too slow and expensive to do per session. Retrieval-augmented generation (RAG) retrieves stored text but doesn't internalize it. The agent needs a way to memorize within a session that is faster than retraining and more integrated than retrieval.

Forces

  • Test-time training adds inference-time compute cost.
  • Memory module design affects what's memorizable and at what fidelity.
  • Concurrency: multiple sessions writing to the same module would interfere with one another.

Example

A research-agent session processes 200 papers over 6 hours. With a standard model, the content of early papers has faded by paper 150. With Titans-style test-time memorization, each processed paper updates the memory module, so by paper 150 the model can still recall patterns from paper 5 without RAG retrieval. The end-of-session synthesis can then draw on the whole run rather than only the most recent papers.

Solution

Therefore:

The Titans architecture (Behrouz et al., 2024) places a neural memory module alongside the main model. During a session, inputs trigger updates to the module's parameters (gradient steps taken at inference time), so later steps in the same session benefit from what was learned earlier. Module state is per-session and ephemeral. Pair with episodic-memory, agentic-memory, landmark-attention, agent-resumption.
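The in-session update can be sketched in a few lines. The toy below is not the Titans implementation: it assumes a linear associative memory, a squared-error loss, and hypothetical names (`LinearMemory`, `lr`, `momentum`, `decay`) chosen for illustration. It shows only the core idea that a write is a gradient step on the module's own parameters and a read is a forward pass.

```python
from __future__ import annotations


class LinearMemory:
    """Toy linear associative memory: a matrix M mapping keys to values.

    Illustrative assumption only; a real neural memory module is nonlinear.
    """

    def __init__(self, dim_in: int, dim_out: int, lr: float = 0.5,
                 momentum: float = 0.0, decay: float = 0.0) -> None:
        self.dim_in = dim_in
        self.dim_out = dim_out
        self.lr = lr              # step size of the inference-time gradient step
        self.momentum = momentum  # how much of the previous update carries over
        self.decay = decay        # fraction of the old weights forgotten per write
        self.weights = [[0.0] * dim_in for _ in range(dim_out)]
        self.velocity = [[0.0] * dim_in for _ in range(dim_out)]

    def read(self, key: list[float]) -> list[float]:
        """Forward pass: return M @ key."""
        return [sum(w * k for w, k in zip(row, key)) for row in self.weights]

    def write(self, key: list[float], value: list[float]) -> float:
        """Take one gradient step on 0.5 * ||M @ key - value||^2.

        Returns the loss measured before the update.
        """
        error = [p - v for p, v in zip(self.read(key), value)]
        for i in range(self.dim_out):
            for j in range(self.dim_in):
                grad = error[i] * key[j]
                self.velocity[i][j] = (self.momentum * self.velocity[i][j]
                                       - self.lr * grad)
                self.weights[i][j] = ((1.0 - self.decay) * self.weights[i][j]
                                      + self.velocity[i][j])
        return 0.5 * sum(e * e for e in error)


memory = LinearMemory(dim_in=2, dim_out=2, lr=1.0)
first_loss = memory.write([1.0, 0.0], [2.0, 3.0])   # the pair is new to the memory
second_loss = memory.write([1.0, 0.0], [2.0, 3.0])  # the pair is already stored
recalled = memory.read([1.0, 0.0])
```

With a unit-norm key, `lr=1.0`, and no momentum or decay, a single write is enough for this linear toy to reproduce the stored value. The loss returned by `write` acts as a rough measure of how new an input is to the memory.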

What this pattern forbids. Updates to the memory module's parameters must not persist beyond session end unless they are explicitly promoted to long-term memory (LTM); by default, no in-session learned state may bleed across sessions.
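One way to enforce this constraint is to scope module state to a session identifier and make promotion a separate, deliberate call. The sketch below is an assumption about how that could look, not part of Titans: `SessionMemoryRegistry` and its method names are hypothetical, and a plain `dict` stands in for the module's parameters.

```python
import copy


class SessionMemoryRegistry:
    """Holds one ephemeral memory state per session (hypothetical helper)."""

    def __init__(self, make_state):
        self._make_state = make_state  # factory returning a fresh, empty state
        self._sessions = {}            # session_id -> in-session state
        self._long_term = []           # snapshots that were explicitly promoted

    def open(self, session_id):
        """Start a session with fresh state; never reuse another session's."""
        if session_id in self._sessions:
            raise ValueError(f"session {session_id!r} is already open")
        self._sessions[session_id] = self._make_state()
        return self._sessions[session_id]

    def get(self, session_id):
        """Return the state of an open session (KeyError if not open)."""
        return self._sessions[session_id]

    def promote(self, session_id, note):
        """Copy the session's current state into long-term storage."""
        snapshot = copy.deepcopy(self._sessions[session_id])
        self._long_term.append(
            {"source": session_id, "note": note, "state": snapshot}
        )

    def close(self, session_id):
        """Discard in-session state; only promoted snapshots survive."""
        self._sessions.pop(session_id, None)

    def long_term(self):
        """Return the list of promoted snapshots."""
        return list(self._long_term)


registry = SessionMemoryRegistry(dict)
session_a = registry.open("session-a")
session_b = registry.open("session-b")
session_a["paper-5"] = "pattern learned in session a"  # invisible to session-b
registry.promote("session-a", note="reviewed and kept")
registry.close("session-a")
```

Keeping one state object per session also addresses the concurrency concern: sessions never write to a shared module, so their updates cannot interfere.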

And the patterns that stand alongside it, or against it —

  • complements Episodic Memory (★★): Record past events as time-stamped first-person experiences the agent can recall later, separately from extracted facts (semantic) and learned how-to (procedural).
  • complements Agentic Memory: Expose memory management as first-class tool actions (ADD, UPDATE, DELETE, RETRIEVE, SUMMARY, FILTER) the LLM chooses at every step, trained end-to-end so short-term and long-term memory live under one learned policy.
  • complements Landmark Attention (·): Long-context attention mechanism placing sparse landmark tokens across very long inputs so the model jumps directly to relevant sections via landmark lookup rather than scanning linearly.
  • complements Agent Resumption (★★): Persist agent execution state so a long-running run survives restarts, deploys, or user disconnects.
  • complements Large Reasoning Model (LRM) Paradigm: Route reasoning-heavy tasks to a reasoning-tuned model that trades inference time for deliberation, rather than to a fast LLM that exhibits premature closure.
