Semantic Memory

also known as Fact Memory, Agent Knowledge Store, Knowledge Memory

Maintain a dedicated store of what the agent holds to be true about the user and the world, separate from event records (episodic) and learned how-to (procedural).

This pattern helps complete certain larger patterns —

specialises Cross-Session Memory ★★: Persist user-specific facts, preferences, and prior context across all sessions, threads, and devices.

Context

An agent operates across many sessions and accumulates durable knowledge: who the user is, what they prefer, what is definitionally true about the domain, what conclusions have settled. This knowledge needs to survive across sessions, be retrievable when relevant, and stay separate from the raw event history that produced it. The team is choosing how this fact layer is represented and queried independently of any single storage technology.

Problem

Without a dedicated semantic store, every fact the agent 'knows' either lives in a static system prompt (frozen, unable to grow with experience) or is re-derived from raw episodes on every turn (slow, lossy, and prone to drift between runs). Mixing facts with raw events also confuses retrieval: 'user prefers dark mode' gets stored as 'on 2026-03-12 the user said: I prefer dark mode' and surfaces only when a query happens to resemble the wording of that utterance, not as a stable assertion. The CoALA framework names semantic memory as a distinct long-term type for exactly this reason: the agent needs a layer that holds *what is true*, separately from *what happened* and *how to act*.

Forces

Substrate is a separate choice from function: vector index, knowledge graph, JSON profile, or text can all back semantic memory, with different retrieval and update characteristics.
Facts decay: yesterday's truth ('user is on Pacific time') becomes today's fiction, so invalidation and recency must be explicit.
Conflict resolution: two contradicting assertions must be resolved at write time or read time, not papered over.
Provenance matters: extracted facts can be wrong; the agent must record whether a fact came from the user, was inferred, or was imported, and what episode produced it.
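The forces above suggest a minimal shape for a stored fact. The sketch below is one possible encoding, not a prescribed schema: the `Assertion` fields, the `Source` trust ranking, and the `resolve` and `is_stale` helpers are hypothetical names introduced for illustration, and the episode ids and dates are made up. It resolves contradictions at write time by preferring the more trusted source, then the more recent observation, and it treats decay as an explicit age check.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import IntEnum
from typing import Optional


class Source(IntEnum):
    """Where a fact came from; higher values are trusted more (an assumed ranking)."""
    INFERRED = 1
    IMPORTED = 2
    USER_STATED = 3


@dataclass(frozen=True)
class Assertion:
    entity: str
    attribute: str
    value: str
    source: Source
    episode_id: str    # provenance: the episode that produced the fact
    observed_on: date  # recency: when the fact was last confirmed


def resolve(existing: Optional[Assertion], incoming: Assertion) -> Assertion:
    """Write-time conflict resolution: pick one winner per (entity, attribute)."""
    if existing is None:
        return incoming
    if existing.value == incoming.value:
        # No contradiction: keep whichever record was observed more recently.
        return incoming if incoming.observed_on >= existing.observed_on else existing
    # Contradiction: the more trusted source wins; ties go to the newer observation.
    if incoming.source != existing.source:
        return incoming if incoming.source > existing.source else existing
    return incoming if incoming.observed_on >= existing.observed_on else existing


def is_stale(fact: Assertion, today: date, max_age: timedelta) -> bool:
    """Decay check: a fact not reconfirmed within max_age should be re-verified."""
    return today - fact.observed_on > max_age


# Illustrative data only: episode ids and dates are invented for the demo.
old = Assertion("user", "timezone", "America/Los_Angeles",
                Source.INFERRED, "ep-17", date(2025, 11, 2))
new = Assertion("user", "timezone", "Europe/Berlin",
                Source.USER_STATED, "ep-212", date(2026, 4, 12))

winner = resolve(old, new)
print(winner.value, winner.episode_id)
print(is_stale(old, date(2026, 4, 12), timedelta(days=90)))
```

Ranking sources above recency is a design choice, not a rule: a system that expects users to move or change preferences often might instead let a recent inference trigger a confirmation question rather than lose silently to an old user statement.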

Example

A long-running personal assistant has logged hundreds of conversations with one user. Buried in those logs are durable facts: the user's timezone, their preferred language, their dietary restrictions, the names of their kids, their employer. Treating all of this as episodic recall is wasteful: every time the agent needs the timezone, it would have to semantically retrieve old messages, parse out a timezone claim, and trust whichever match came up first. The team instead adds a semantic-memory layer: a small extraction step writes assertions like (user, timezone, 'Europe/Berlin', source-episode-id, 2026-04-12) into a profile store. Retrieval at decision time is now a direct lookup, the episode that produced the fact is still recoverable via provenance, and invalidating the timezone when the user moves is one write.
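The profile store in this example can be sketched as a small keyed structure. This is a minimal illustration under stated assumptions, not the team's actual implementation: `ProfileStore`, `Fact`, and their method names are hypothetical, the store lives in memory, and the second timezone and its episode id are invented to show the move.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class Fact:
    value: str
    episode_id: str                       # provenance back to the source episode
    recorded_on: date
    invalidated_on: Optional[date] = None


class ProfileStore:
    """Hypothetical in-memory profile keyed by (entity, attribute)."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], List[Fact]] = {}

    def assert_fact(self, entity: str, attribute: str, value: str,
                    episode_id: str, on: date) -> None:
        """Record a new value; a currently valid value is invalidated first."""
        self.invalidate(entity, attribute, on)
        self._facts.setdefault((entity, attribute), []).append(
            Fact(value, episode_id, on))

    def invalidate(self, entity: str, attribute: str, on: date) -> None:
        """Mark the current value as no longer true, keeping it in history."""
        history = self._facts.get((entity, attribute), [])
        if history and history[-1].invalidated_on is None:
            history[-1].invalidated_on = on

    def lookup(self, entity: str, attribute: str) -> Optional[Fact]:
        """Direct key lookup of the current fact; no similarity search."""
        history = self._facts.get((entity, attribute), [])
        if history and history[-1].invalidated_on is None:
            return history[-1]
        return None

    def history(self, entity: str, attribute: str) -> List[Fact]:
        """All values ever held for the key, oldest first."""
        return list(self._facts.get((entity, attribute), []))


store = ProfileStore()
store.assert_fact("user", "timezone", "Europe/Berlin",
                  "source-episode-id", date(2026, 4, 12))

# The user moves: one write replaces the current value (values invented for the demo).
store.assert_fact("user", "timezone", "America/New_York",
                  "ep-move", date(2026, 9, 1))

current = store.lookup("user", "timezone")
print(current.value, current.episode_id)
print([f.value for f in store.history("user", "timezone")])
```

Keeping invalidated values in a history list, rather than overwriting them, is what lets the agent later answer "where did the user live before?" without going back to the raw episodes.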

Diagram

flowchart TD
    Ev[Episode / interaction] --> Ext[Extractor]
    Ext --> A[Assertion: entity, attribute, value, provenance]
    A --> Sem[(Semantic memory)]
    Sem -.substrate.-> V[Vector store]
    Sem -.substrate.-> KG[Knowledge graph]
    Sem -.substrate.-> P[JSON profile]
    Q[Decision: what does the agent know about X?] --> Lookup[Entity/attribute lookup]
    Lookup --> Sem
    Lookup --> Out[Fact + provenance]
    Out --> Ctx[Prepend to context]
    Sem --> Inv[Invalidate on contradiction]

Solution

Therefore:

The CoALA framework (Sumers et al. 2023) names semantic memory as one of three long-term memory types alongside episodic and procedural, defined by function rather than storage. Implementations vary by substrate: LangMem's semantic channel uses profile (single JSON document) or collection (many documents) stores; knowledge-graph implementations (cognee, Zep) store assertions as typed triples; vector stores can back it when retrieval is by similarity over fact text. The function is the same regardless: extract durable assertions from interactions, store them with entity/attribute keys and provenance, retrieve them when the situation calls for 'what does the agent know about X'. Refer to [[vector-memory]] and [[knowledge-graph-memory]] as substrate options.
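One way to picture "defined by function rather than storage" is a single interface with interchangeable backends. The sketch below is an illustration only: `SemanticMemory`, `JsonProfileBackend`, `TripleBackend`, and `what_does_agent_know` are hypothetical names, and neither backend reproduces the API of LangMem, cognee, Zep, or any vector store. The stored language value and episode id are invented.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class SemanticMemory(Protocol):
    """The function: store and retrieve assertions by entity/attribute key."""

    def put(self, entity: str, attribute: str, value: str, provenance: str) -> None: ...

    def get(self, entity: str, attribute: str) -> Optional[Tuple[str, str]]: ...


class JsonProfileBackend:
    """Substrate 1: one nested, JSON-like document per entity."""

    def __init__(self) -> None:
        self.doc: Dict[str, Dict[str, Dict[str, str]]] = {}

    def put(self, entity: str, attribute: str, value: str, provenance: str) -> None:
        self.doc.setdefault(entity, {})[attribute] = {
            "value": value, "provenance": provenance}

    def get(self, entity: str, attribute: str) -> Optional[Tuple[str, str]]:
        entry = self.doc.get(entity, {}).get(attribute)
        return (entry["value"], entry["provenance"]) if entry else None


class TripleBackend:
    """Substrate 2: (subject, predicate, object, provenance) rows, graph-style."""

    def __init__(self) -> None:
        self.triples: List[Tuple[str, str, str, str]] = []

    def put(self, entity: str, attribute: str, value: str, provenance: str) -> None:
        # Replace any existing triple with the same subject and predicate.
        self.triples = [t for t in self.triples if (t[0], t[1]) != (entity, attribute)]
        self.triples.append((entity, attribute, value, provenance))

    def get(self, entity: str, attribute: str) -> Optional[Tuple[str, str]]:
        for subject, predicate, obj, provenance in self.triples:
            if (subject, predicate) == (entity, attribute):
                return (obj, provenance)
        return None


def what_does_agent_know(memory: SemanticMemory, entity: str, attribute: str) -> str:
    """Caller code depends on the function only, never on the substrate."""
    hit = memory.get(entity, attribute)
    if hit is None:
        return f"No stored fact for {entity}.{attribute}"
    value, provenance = hit
    return f"{entity}.{attribute} = {value} (from {provenance})"


results = []
for backend in (JsonProfileBackend(), TripleBackend()):
    backend.put("user", "preferred_language", "German", "episode-42")
    results.append(what_does_agent_know(backend, "user", "preferred_language"))
print(results)
```

Both backends return the same answer to the same question, so the choice between them can be made on retrieval and update characteristics alone.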

What this pattern forbids. Treating raw event records as facts. The semantic layer stores assertions about *what is true* and the episodic layer stores happenings; assertions are written by an explicit extraction or assertion step, not by appending raw events.
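The separation can be made concrete by giving the two layers different record types and a single, explicit path between them. The sketch below is a toy under loud assumptions: `Episode`, `Assertion`, `extract`, and `record` are hypothetical names, and the regular expression merely stands in for whatever extractor (often an LLM call) a real system would use.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Episode:
    """What happened: a raw, time-stamped event stored verbatim."""
    episode_id: str
    timestamp: str
    text: str


@dataclass(frozen=True)
class Assertion:
    """What is true: a keyed claim with provenance back to its episode."""
    entity: str
    attribute: str
    value: str
    episode_id: str


# Toy pattern standing in for a real extractor (an assumption for the demo).
PREFERENCE = re.compile(r"\bI prefer (?P<value>[\w ]+?) mode\b", re.IGNORECASE)


def extract(episode: Episode) -> Optional[Assertion]:
    """Explicit extraction step: turn an utterance into an assertion, or nothing."""
    match = PREFERENCE.search(episode.text)
    if match is None:
        return None
    return Assertion("user", "ui_theme", match.group("value").lower(),
                     episode.episode_id)


episodic_log: List[Episode] = []
semantic_store: Dict[Tuple[str, str], Assertion] = {}


def record(episode: Episode) -> None:
    """Every event reaches the episodic log; only extracted assertions reach the store."""
    episodic_log.append(episode)
    assertion = extract(episode)
    if assertion is not None:
        semantic_store[(assertion.entity, assertion.attribute)] = assertion


record(Episode("ep-1", "2026-03-12", "I prefer dark mode"))
record(Episode("ep-2", "2026-03-12", "What's the weather tomorrow?"))

fact = semantic_store[("user", "ui_theme")]
print(len(episodic_log), len(semantic_store))
print(fact.value, fact.episode_id)
```

Because `semantic_store` only accepts `Assertion` values, a raw `Episode` cannot be appended to it by accident; the type boundary enforces what the pattern forbids.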

The smaller patterns that complete this one —

uses Vector Memory ★★: Store memories as embeddings in a vector index and retrieve the most semantically similar items at query time.
uses Knowledge Graph Memory ★: Persist agent memory as entities and relations in a structured graph so symbolic queries (path, neighbour, type) become possible.

And the patterns that stand alongside it, or against it —

complements Episodic Memory ★★: Record past events as time-stamped first-person experiences the agent can recall later, separately from extracted facts (semantic) and learned how-to (procedural).
complements Procedural Memory ★: Maintain a third agent memory type alongside episodic (past events) and semantic (facts). Procedural memory captures *learned how-to*: reusable skills, workflows, and self-rewritten system instructions that map situations directly to actions.
complements Self-Corpus Vocabulary ·: Mine a small bounded vocabulary from the agent's own writing and cache it as the conceptual axis for scoring new thoughts, so relevance reflects the agent's actual frame rather than a generic embedding space.
composes-with Agentic Memory ★: Expose memory management as first-class tool actions (ADD, UPDATE, DELETE, RETRIEVE, SUMMARY, FILTER) the LLM chooses at every step, trained end-to-end so short-term and long-term memory live under one learned policy.
complements World-Model Graph Memory ★: Memory store structured as a typed entity-relation graph used as the agent's authoritative world model for planning, not only for retrieval.
complements Memory-Type Storage Specialization ★★: Use different storage technologies optimized per memory type, such as fast in-memory stores (Redis-class) for episodic, vector databases (Pinecone/Weaviate) for semantic, and relational or workflow engines for procedural, instead of one general store for everything.
complements Three Layers of Agentic AI Memory ★: Architect agent memory as three integrated concentric layers, Short-Term Memory (outer), Long-Term Memory (middle), and Feedback Loops (core), operating together as a unit rather than as separable optional components.