Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces

📅 2025-11-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) face dual challenges in long-context reasoning: limited context windows and degraded long-range performance. Existing retrieval-augmented generation (RAG) approaches—relying on semantic retrieval or knowledge graphs—prioritize factual recall but fail to capture narrative structures wherein entities evolve temporally and spatially across events. To address this, we propose the Generative Semantic Workspace (GSW), a neuroscience-inspired memory framework that introduces episodic memory mechanisms into RAG for the first time. GSW employs Operators to construct spatiotemporally anchored intermediate semantic representations and a Reconciler to dynamically enforce logical, temporal, and spatial consistency within a generative working memory space. Evaluated on the EpBench benchmark, GSW achieves a 20% accuracy gain over state-of-the-art RAG methods while reducing required context tokens by 51%, significantly lowering inference overhead.

Technology Category

Application Category

📝 Abstract
Large Language Models (LLMs) face fundamental challenges in long-context reasoning: many documents exceed their finite context windows, while performance on texts that do fit degrades with sequence length, necessitating their augmentation with external memory frameworks. Current solutions, which have evolved from retrieval using semantic embeddings to more sophisticated structured knowledge graphs representations for improved sense-making and associativity, are tailored for fact-based retrieval and fail to build the space-time-anchored narrative representations required for tracking entities through episodic events. To bridge this gap, we propose the extbf{Generative Semantic Workspace} (GSW), a neuro-inspired generative memory framework that builds structured, interpretable representations of evolving situations, enabling LLMs to reason over evolving roles, actions, and spatiotemporal contexts. Our framework comprises an extit{Operator}, which maps incoming observations to intermediate semantic structures, and a extit{Reconciler}, which integrates these into a persistent workspace that enforces temporal, spatial, and logical coherence. On the Episodic Memory Benchmark (EpBench) cite{huet_episodic_2025} comprising corpora ranging from 100k to 1M tokens in length, GSW outperforms existing RAG based baselines by up to extbf{20%}. Furthermore, GSW is highly efficient, reducing query-time context tokens by extbf{51%} compared to the next most token-efficient baseline, reducing inference time costs considerably. More broadly, GSW offers a concrete blueprint for endowing LLMs with human-like episodic memory, paving the way for more capable agents that can reason over long horizons.
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with long-context reasoning and finite context windows
Existing memory frameworks fail to track entities in episodic events
Proposed framework builds structured representations for evolving situations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Semantic Workspace builds structured episodic representations
Operator and Reconciler components ensure spatiotemporal logical coherence
Framework reduces context tokens by 51% improving efficiency
🔎 Similar Papers
No similar papers found.
S
Shreyas Rajesh
University of California, Los Angeles
P
Pavan Holur
University of California, Los Angeles
Chenda Duan
Chenda Duan
Ph.D Student, University of California, Los Angeles
AI for ScienceMultimodal LLMsAutonomous Agents
D
David Chong
University of California, Los Angeles
V
V. Roychowdhury
University of California, Los Angeles