Episodic Memory

also known as Event Memory, Experience Store, Memory Stream

Record past events as time-stamped first-person experiences the agent can recall later, separately from extracted facts (semantic) and learned how-to (procedural).

Context

An agent needs to remember what happened — when, in what order, with what context and outcome. This is the autobiographical layer: a record that yesterday the user asked about X, the agent answered Y, the user pushed back, and the two converged on Z. Whether the events are conversations, tool calls, observations, or internal reasoning steps, the function is the same: preserve the temporal-experiential structure of past interactions so the agent can reflect, learn, and surface relevant prior episodes.
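A minimal sketch of what one such record might look like, assuming a plain in-memory list as the store. The `Episode` class, its field names, and the `in_order` helper are illustrative assumptions, not part of any particular framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Episode:
    """One event, kept as an event (all field names are hypothetical)."""
    timestamp: datetime                          # when it happened
    actor: str                                   # e.g. "user", "agent", "tool"
    kind: str                                    # e.g. "message", "tool_call", "observation"
    content: str                                 # what happened
    outcome: str = ""                            # what it led to, if known
    context: dict = field(default_factory=dict)  # session id, ticket id, etc.


def in_order(episodes: list[Episode]) -> list[Episode]:
    """Return episodes chronologically, restoring the sequence of events."""
    return sorted(episodes, key=lambda e: e.timestamp)


# The X / Y / Z exchange described above, deliberately stored out of order.
t0 = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)  # arbitrary illustrative start time
episodes = [
    Episode(t0 + timedelta(minutes=2), "user", "message", "pushed back on Y"),
    Episode(t0, "user", "message", "asked about X"),
    Episode(t0 + timedelta(minutes=3), "agent", "message", "proposed Z",
            outcome="converged on Z"),
    Episode(t0 + timedelta(minutes=1), "agent", "message", "answered Y"),
]
timeline = [e.content for e in in_order(episodes)]
```

Freezing the dataclass is one way to signal that an episode, once written, is not edited in place; a real store would enforce that at the storage layer.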

Problem

If the agent has only a fact store, it can answer 'what is true' but not 'what happened' — it loses the ability to learn from specific past interactions, to surface relevant prior episodes by recency or salience, or to reflect on its own behaviour. If the agent collapses every interaction into facts at write-time, it destroys the causal chain — the user said this, then the agent did that, then it broke — that makes debugging and reflection possible. The CoALA framework names episodic memory as a distinct long-term type for this reason: the agent needs a layer that preserves events as events, with their temporal structure intact.

Forces

Episodic stores grow without bound over time, so they need compaction, paging, or salience-based pruning.
Retrieval by similarity alone misses temporal queries ('what did I do yesterday') and recency-sensitive queries.
Raw episode replay is too noisy for prompt context; episodes need salience scoring, summarisation, or reflection passes before they are useful.
Privacy and tenant isolation: episodes contain user content and must respect session and user boundaries.
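Three of these forces (temporal queries, tenant isolation, and bounded growth) can be sketched together. This is a rough illustration under stated assumptions: the store is a plain list, `importance` is a score in [0, 1] assigned at write time, and the names `episodes_between` and `prune` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Episode:
    user_id: str          # tenant boundary (assumed to be a per-user key)
    timestamp: datetime
    text: str
    importance: float     # assumed salience score in [0, 1], set at write time


def episodes_between(store, user_id, start, end):
    """Temporal query: filter by tenant first, then by the half-open window [start, end)."""
    hits = (e for e in store if e.user_id == user_id and start <= e.timestamp < end)
    return sorted(hits, key=lambda e: e.timestamp)


def prune(store, max_items):
    """Salience-based pruning: keep the most important episodes, newest first on ties."""
    keep = sorted(store, key=lambda e: (e.importance, e.timestamp), reverse=True)[:max_items]
    return sorted(keep, key=lambda e: e.timestamp)


store = [
    Episode("alice", datetime(2025, 6, 9, 10, 0), "reviewed PR", 0.4),
    Episode("alice", datetime(2025, 6, 9, 15, 0), "deployed hotfix", 0.9),
    Episode("bob", datetime(2025, 6, 9, 11, 0), "rotated keys", 0.8),
    Episode("alice", datetime(2025, 6, 10, 9, 0), "standup notes", 0.1),
]

# "What did I do yesterday?" for alice, with "today" starting at midnight on 2025-06-10.
today = datetime(2025, 6, 10)
yesterday = episodes_between(store, "alice", today - timedelta(days=1), today)
```

The time-window query needs no embedding at all, which is the point of the second force: similarity search alone cannot answer it.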

Example

A coding agent has worked with a developer across hundreds of tickets over six months. The developer later asks the agent to explain how the team ended up with the weird workaround in the auth module. A pure semantic store would return facts like (auth-module, uses-workaround, true) — useless. An episodic store returns the actual sequence: on 2026-02-14 the developer flagged a CVE, on 2026-02-15 the agent proposed a fix, the proposed fix broke a downstream test, on 2026-02-16 they agreed on a workaround instead with a TODO. The agent can now answer the why. The episodic store also feeds a weekly reflection pass that consolidates 'workaround in auth-module' into a semantic fact and 'CVE-flag → propose fix → test → workaround-with-TODO' into a procedural template.
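The auth-module story can be sketched as data. The `Episode` shape, the `topic` tag, and the `explain` helper are assumptions made for illustration; the example gives no date for the failing test, so that entry carries `None` and is ordered by its position in the log instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Episode:
    seq: int              # position in the log (hypothetical ordering key)
    day: Optional[date]   # None where the example above gives no date
    topic: str
    what: str


LOG = [
    Episode(1, date(2026, 2, 14), "auth-module", "developer flagged a CVE"),
    Episode(2, date(2026, 2, 15), "auth-module", "agent proposed a fix"),
    Episode(3, None, "auth-module", "proposed fix broke a downstream test"),
    Episode(4, date(2026, 2, 16), "auth-module", "agreed on a workaround with a TODO"),
    Episode(5, None, "billing", "unrelated ticket"),  # filler to show topic filtering
]


def explain(log, topic):
    """Return the ordered chain of events for a topic, which a fact triple cannot express."""
    steps = sorted((e for e in log if e.topic == topic), key=lambda e: e.seq)
    return [f"{e.day.isoformat() if e.day else 'then'}: {e.what}" for e in steps]


story = explain(LOG, "auth-module")
```

A semantic store would hold only the end state; `story` holds the four steps in order, which is what lets the agent answer the why.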

Diagram

flowchart TD
    Obs[Observation / interaction] --> Stamp[Stamp with time + importance]
    Stamp --> Ep[(Episodic memory)]
    Ep -.substrate.-> V[Vector store]
    Ep -.substrate.-> Log[Append-only log]
    Ep -.substrate.-> J[Structured journal]
    Q[Query: what happened around X?] --> Ret[Retrieve by recency + similarity + salience]
    Ret --> Ep
    Ret --> Top[Top-k relevant episodes]
    Top --> Ctx[Prepend to context]
    Ep --> Refl[Reflection / consolidation pass]
    Refl --> Sem[Semantic memory: extracted facts]
    Refl --> Proc[Procedural memory: learned recipes]
    Refl --> Sum[Episodic summaries: compacted tier]

Solution

Therefore:

Keep a dedicated store of time-stamped episodes and query them as events. Park et al.'s Generative Agents memory stream (2023) is the canonical implementation: every observation is logged with a timestamp and an importance score; retrieval combines recency, relevance, and importance; a periodic reflection pass derives higher-level insights from clusters of recent episodes. LangMem's episodic channel stores past interactions for few-shot retrieval and procedure distillation. Substrate is orthogonal to function: a vector store ([[vector-memory]]), an append-only log ([[append-only-thought-stream]]), or a structured journal can each back episodic memory. Compaction is typically delegated to [[episodic-summaries]]; consolidation into facts feeds [[semantic-memory]]; consolidation into skills feeds [[procedural-memory]].
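The retrieval step can be sketched as a weighted sum of the three signals. This is loosely modelled on the memory-stream idea rather than a reproduction of it: the decay rate, the equal weights, the two-dimensional toy embeddings, and all function names are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    text: str
    age_hours: float               # time since the episode was recorded
    importance: float              # assumed normalised to [0, 1]
    embedding: tuple[float, ...]   # toy vector; a real system would use a model


def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def score(ep, query_emb, decay=0.99, w_rec=1.0, w_rel=1.0, w_imp=1.0):
    """Combine recency (exponential decay), relevance (cosine), and importance."""
    recency = decay ** ep.age_hours
    relevance = cosine(ep.embedding, query_emb)
    return w_rec * recency + w_rel * relevance + w_imp * ep.importance


def retrieve(episodes, query_emb, k=2, **weights):
    """Return the top-k episodes by combined score."""
    ranked = sorted(episodes, key=lambda e: score(e, query_emb, **weights), reverse=True)
    return ranked[:k]


episodes = [
    Episode("CVE flagged in auth module", 48.0, 0.9, (1.0, 0.0)),
    Episode("lunch order", 1.0, 0.1, (0.0, 1.0)),
    Episode("auth fix broke a test", 24.0, 0.7, (0.9, 0.1)),
]
top = retrieve(episodes, query_emb=(1.0, 0.0), k=2)
```

With these toy numbers the very recent but irrelevant "lunch order" episode is outranked by the two older auth episodes, which is the behaviour that recency-only or similarity-only retrieval would not give on its own.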

What this pattern forbids. Collapsing every interaction into facts at write-time. Episodes keep their identity (timestamp, context, outcome) and are queried as events; extraction into facts or skills is a separate, downstream step.
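One way to picture the separation, as a sketch only: the write path appends the episode unchanged, and a distinct consolidation function reads the log later without modifying it. `EpisodicStore`, `record`, `events`, and `consolidate` are hypothetical names, and the "latest outcome per context" rule is an assumed, deliberately simple extraction policy.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Episode:
    timestamp: datetime
    context: str
    outcome: str


class EpisodicStore:
    """Append-only holder for episodes (in-memory stand-in for a real substrate)."""

    def __init__(self):
        self._log = []

    def record(self, episode):
        """Write path: store the event as-is; no fact extraction happens here."""
        self._log.append(episode)

    def events(self):
        """Read-only view of everything recorded so far."""
        return tuple(self._log)


def consolidate(store):
    """Downstream step: derive facts from episodes while leaving the episodes intact."""
    facts = {}
    for e in sorted(store.events(), key=lambda e: e.timestamp):
        facts[e.context] = e.outcome  # later episodes overwrite earlier ones
    return facts


store = EpisodicStore()
store.record(Episode(datetime(2026, 2, 15), "auth-module", "proposed fix"))
store.record(Episode(datetime(2026, 2, 16), "auth-module", "workaround with TODO"))
facts = consolidate(store)
```

After consolidation the store still holds both episodes, so the history behind the derived fact remains queryable.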

The smaller patterns that complete this one —

uses Vector Memory ★★ — Store memories as embeddings in a vector index and retrieve the most semantically similar items at query time.
uses Append-Only Thought Stream ★ — Make the agent's thought log append-only so the agent cannot rewrite its own history.
uses Episodic Summaries ★★ — Compress past episodes into summaries that preserve gist while shedding token cost.

And the patterns that stand alongside it, or against it —

complements Semantic Memory ★ — Maintain a dedicated store of what the agent holds to be true about the user and the world, separate from event records (episodic) and learned how-to (procedural).
complements Procedural Memory ★ — Maintain a third agent memory type alongside episodic (past events) and semantic (facts): procedural memory captures *learned how-to* — reusable skills, workflows, and self-rewritten system instructions that map situations directly to actions.
complements Salience Attention Mechanism ★ — Score every candidate memory item with a weighted salience function so each tick attends to a small, relevant top-k subset rather than re-reading all memory.
complements Hippocampal Rehearsal · — Lift archived memory items back into short-term tiers when something re-attends to them.
composes-with Agentic Memory ★ — Expose memory management as first-class tool actions (ADD, UPDATE, DELETE, RETRIEVE, SUMMARY, FILTER) the LLM chooses at every step, trained end-to-end so short-term and long-term memory live under one learned policy.
complements Memory-Type Storage Specialization ★★ — Use different storage technologies optimized per memory type — fast in-memory stores (Redis-class) for episodic, vector databases (Pinecone/Weaviate) for semantic, relational or workflow engines for procedural — instead of one general store for everything.
complements Three Layers of Agentic AI Memory ★ — Architect agent memory as three integrated concentric layers — Short-Term Memory (outer), Long-Term Memory (middle), Feedback Loops (core) — operating together as a unit rather than as separable optional components.
complements Test-Time Memorization (Titans) · — Memory module that learns at inference time by incorporating recent inputs into its parameters during the session rather than relying solely on pre-trained weights.