🤖 AI Summary
This work addresses the challenge that large language models struggle to memorize and retrieve sparsely distributed evidence in long-horizon tasks: explicit memory mechanisms are prone to interference from lengthy contexts, while implicit memory lacks interpretability. To overcome this, the authors propose LatentGraphMem, a framework that implicitly encodes graph-structured memory into a latent space for efficiency and stability, while exposing a task-driven subgraph retrieval interface that explicitly returns compact symbolic subgraphs for reasoning and human verification. By integrating implicit storage with explicit retrieval, the method enables parameter-efficient adaptation through supervised training with a frozen reasoner and scales gracefully to larger models. Experiments show that LatentGraphMem significantly outperforms both explicit graph-based and implicit memory baselines across multiple model scales and long-horizon benchmarks, achieving efficient, interpretable, and scalable memory augmentation.
📝 Abstract
Long-horizon applications increasingly require large language models (LLMs) to answer queries when relevant evidence is sparse and dispersed across very long contexts. Existing memory systems largely follow two paradigms: explicit structured memories offer interpretability but often become brittle under long-context overload, while latent memory mechanisms are efficient and stable yet difficult to inspect. We propose LatentGraphMem, a memory framework that combines implicit graph memory with explicit subgraph retrieval. LatentGraphMem stores a graph-structured memory in latent space for stability and efficiency, and exposes a task-specific subgraph retrieval interface that returns a compact symbolic subgraph under a fixed budget for downstream reasoning and human inspection. During training, an explicit graph view is materialized to interface with a frozen reasoner for question-answering supervision. At inference time, retrieval is performed in latent space and only the retrieved subgraph is externalized. Experiments on long-horizon benchmarks across multiple model scales show that LatentGraphMem consistently outperforms representative explicit-graph and latent-memory baselines, while enabling parameter-efficient adaptation and flexible scaling to larger reasoners without introducing large symbolic artifacts.
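The core inference-time contract described above (retrieval scored in latent space, with only a budget-capped symbolic subgraph externalized for the reasoner) can be illustrated with a minimal sketch. All names here (`LatentMemory`, `retrieve_subgraph`, the toy 2-d embeddings) are hypothetical illustrations, not the authors' actual API:

```python
# Hypothetical sketch of the inference flow: nodes are scored against the
# query in latent space, and only edges among the top-`budget` nodes are
# externalized as a compact symbolic subgraph.
from dataclasses import dataclass, field

@dataclass
class LatentMemory:
    # node name -> latent embedding (toy 2-d vectors for illustration)
    nodes: dict = field(default_factory=dict)
    # (head, relation, tail) triples defining the graph structure
    edges: list = field(default_factory=list)

    def retrieve_subgraph(self, query_vec, budget):
        """Score nodes by dot product with the query, keep the top `budget`
        nodes, and return only the symbolic edges among those nodes."""
        def score(vec):
            return sum(a * b for a, b in zip(query_vec, vec))
        top = sorted(self.nodes, key=lambda n: score(self.nodes[n]),
                     reverse=True)[:budget]
        keep = set(top)
        return [(h, r, t) for h, r, t in self.edges
                if h in keep and t in keep]

mem = LatentMemory(
    nodes={"alice": (0.9, 0.1), "bob": (0.8, 0.2), "zurich": (0.1, 0.9)},
    edges=[("alice", "knows", "bob"), ("bob", "lives_in", "zurich")],
)
# A person-like query latent with a budget of 2 nodes: only the edge
# among the retrieved nodes survives externalization.
subgraph = mem.retrieve_subgraph((1.0, 0.0), budget=2)
print(subgraph)  # → [('alice', 'knows', 'bob')]
```

The fixed budget is what keeps the externalized artifact small: the full graph stays latent, and only the retrieved slice ever becomes symbolic.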