🤖 AI Summary
This work addresses the challenge that large language models struggle to memorize and retrieve sparsely distributed evidence in long-horizon tasks: explicit memory mechanisms are prone to interference from lengthy contexts, while implicit memory lacks interpretability. To overcome this, the authors propose LatentGraphMem, a framework that implicitly encodes graph-structured memory into a latent space for efficiency and stability, while exposing a task-driven subgraph retrieval interface that explicitly returns compact symbolic subgraphs for reasoning and human verification. By integrating implicit storage with explicit retrieval, the method enables parameter-efficient adaptation through supervised training with a frozen reasoner and scales gracefully to larger models. Experiments show that LatentGraphMem significantly outperforms both explicit graph-based and implicit memory baselines across multiple model scales and long-horizon benchmarks, achieving efficient, interpretable, and scalable memory augmentation.
📝 Abstract
Long-horizon applications increasingly require large language models (LLMs) to answer queries when relevant evidence is sparse and dispersed across very long contexts. Existing memory systems largely follow two paradigms: explicit structured memories offer interpretability but often become brittle under long-context overload, while latent memory mechanisms are efficient and stable yet difficult to inspect. We propose LatentGraphMem, a memory framework that combines implicit graph memory with explicit subgraph retrieval. LatentGraphMem stores a graph-structured memory in latent space for stability and efficiency, and exposes a task-specific subgraph retrieval interface that returns a compact symbolic subgraph under a fixed budget for downstream reasoning and human inspection. During training, an explicit graph view is materialized to interface with a frozen reasoner for question-answering supervision. At inference time, retrieval is performed in latent space and only the retrieved subgraph is externalized. Experiments on long-horizon benchmarks across multiple model scales show that LatentGraphMem consistently outperforms representative explicit-graph and latent-memory baselines, while enabling parameter-efficient adaptation and flexible scaling to larger reasoners without introducing large symbolic artifacts.
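The core inference-time contract described above (retrieval scored in latent space, with only a budget-capped symbolic subgraph externalized for the reasoner) can be illustrated with a minimal sketch. All names here (`LatentMemory`, `retrieve_subgraph`, the toy 2-d embeddings) are hypothetical illustrations, not the authors' actual API:

```python
# Hypothetical sketch of the inference flow: nodes are scored against the
# query in latent space, and only edges among the top-`budget` nodes are
# externalized as a compact symbolic subgraph.
from dataclasses import dataclass, field

@dataclass
class LatentMemory:
    # node name -> latent embedding (toy 2-d vectors for illustration)
    nodes: dict = field(default_factory=dict)
    # (head, relation, tail) triples defining the graph structure
    edges: list = field(default_factory=list)

    def retrieve_subgraph(self, query_vec, budget):
        """Score nodes by dot product with the query, keep the top `budget`
        nodes, and return only the symbolic edges among those nodes."""
        def score(vec):
            return sum(a * b for a, b in zip(query_vec, vec))
        top = sorted(self.nodes, key=lambda n: score(self.nodes[n]),
                     reverse=True)[:budget]
        keep = set(top)
        return [(h, r, t) for h, r, t in self.edges
                if h in keep and t in keep]

mem = LatentMemory(
    nodes={"alice": (0.9, 0.1), "bob": (0.8, 0.2), "zurich": (0.1, 0.9)},
    edges=[("alice", "knows", "bob"), ("bob", "lives_in", "zurich")],
)
# A person-like query latent with a budget of 2 nodes: only the edge
# among the retrieved nodes survives externalization.
subgraph = mem.retrieve_subgraph((1.0, 0.0), budget=2)
print(subgraph)  # → [('alice', 'knows', 'bob')]
```

The fixed budget is what keeps the externalized artifact small: the full graph stays latent, and only the retrieved slice ever becomes symbolic.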