SimpleMem: Efficient Lifelong Memory for LLM Agents

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
This work addresses the inefficient use of historical experience by large language model (LLM) agents during prolonged, complex interactions, where redundant memories and excessive reasoning overhead often degrade performance. To tackle this, the authors propose a semantically lossless, high-efficiency memory framework that increases memory density while preserving information fidelity through a three-stage mechanism: structured compression, recursive integration, and query-aware adaptive retrieval. Key innovations include entropy-aware filtering, multi-perspective indexed memory units, asynchronous recursive abstraction, and a dynamic retrieval strategy driven by query complexity. Experimental results show that the proposed approach achieves an average F1 improvement of 26.4% on benchmark tasks and reduces inference-time token consumption by up to 30×, substantially outperforming existing methods.

📝 Abstract
To support long-term interaction in complex environments, LLM agents require memory systems that manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, an efficient memory framework based on semantic lossless compression. We propose a three-stage pipeline designed to maximize information density and token utilization: (1) Semantic Structured Compression, which distills unstructured interactions into compact, multi-view indexed memory units; (2) Online Semantic Synthesis, an intra-session process that instantly integrates related context into unified abstract representations to eliminate redundancy; and (3) Intent-Aware Retrieval Planning, which infers search intent to dynamically determine retrieval scope and construct precise context efficiently. Experiments on benchmark datasets show that our method consistently outperforms baseline approaches in accuracy, retrieval efficiency, and inference cost, achieving an average F1 improvement of 26.4% on LoCoMo while reducing inference-time token consumption by up to 30-fold, demonstrating a superior balance between performance and efficiency. Code is available at https://github.com/aiming-lab/SimpleMem.
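The three stages above can be sketched as a single pipeline. This is a minimal illustrative sketch, not the authors' implementation: in the paper each stage is LLM-driven, whereas here simple keyword extraction and a fixed overlap threshold stand in for the LLM calls so the control flow is runnable. All names (`SimpleMemSketch`, `MemoryUnit`, the merge threshold, the top-k scope rule) are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryUnit:
    """A compressed memory unit indexed by a multi-view key set (hypothetical schema)."""
    summary: str
    keywords: set = field(default_factory=set)


class SimpleMemSketch:
    """Toy three-stage pipeline: compress -> synthesize -> intent-aware retrieve."""

    def __init__(self):
        self.units = []

    def compress(self, interaction: str) -> MemoryUnit:
        # Stage 1 (Semantic Structured Compression): distill raw text into a
        # compact unit; keyword extraction stands in for LLM distillation.
        keywords = {w.lower().strip(".,?") for w in interaction.split() if len(w) > 4}
        unit = MemoryUnit(summary=interaction[:80], keywords=keywords)
        self.units.append(unit)
        self._synthesize(unit)
        return unit

    def _synthesize(self, new_unit: MemoryUnit) -> None:
        # Stage 2 (Online Semantic Synthesis): immediately merge the new unit
        # with strongly overlapping existing units to eliminate redundancy.
        for old in list(self.units):
            if old is new_unit:
                continue
            if len(old.keywords & new_unit.keywords) >= 3:  # arbitrary threshold
                new_unit.keywords |= old.keywords
                new_unit.summary = f"{old.summary} | {new_unit.summary}"
                self.units.remove(old)

    def retrieve(self, query: str, complex_query: bool = False) -> list:
        # Stage 3 (Intent-Aware Retrieval Planning): widen the retrieval scope
        # (top-k) only when the query is judged complex, keeping context small.
        q = {w.lower().strip(".,?") for w in query.split() if len(w) > 4}
        scored = sorted(self.units, key=lambda u: len(u.keywords & q), reverse=True)
        k = 5 if complex_query else 1
        return [u.summary for u in scored[:k] if u.keywords & q]
```

The design point the sketch tries to convey is that consolidation happens at write time (so redundancy never accumulates) and retrieval cost is scaled to query complexity at read time, which is where the paper's token savings come from.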
Problem

Research questions and friction points this paper is trying to address.

lifelong memory
LLM agents
memory efficiency
token cost
historical experience management
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic Compression
Memory Consolidation
Adaptive Retrieval
LLM Agents
Token Efficiency