MemFly: On-the-Fly Memory Optimization via Information Bottleneck

📅 2026-02-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the fundamental tension between information compression and precise retrieval in long-term memory for large language models by proposing the first online memory evolution framework grounded in information bottleneck theory. The framework employs a gradient-free optimizer to dynamically compress memory during inference, minimizing redundant entropy while maximizing task relevance, and organizes memory into a hierarchical structure. To support complex multi-hop queries, it introduces a hybrid retrieval mechanism that integrates semantic, symbolic, and topological pathways, enabling iterative refinement of retrieved content. Experimental results demonstrate that the proposed approach significantly outperforms state-of-the-art baselines in memory coherence, response fidelity, and task accuracy.
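The paper's exact loss is not reproduced on this page, but the compression/relevance tradeoff the summary describes is conventionally expressed via the classical information bottleneck objective (Tishby et al.), given here only as background context:

$$
\min_{p(m \mid x)} \; I(M; X) \;-\; \beta \, I(M; Y)
$$

where $X$ is the raw interaction history, $M$ the compressed memory representation, $Y$ the downstream task signal, and $\beta$ balances compression (low $I(M;X)$) against relevance (high $I(M;Y)$). MemFly's entropy-based formulation and gradient-free optimizer may instantiate this differently.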

📝 Abstract
Long-term memory enables large language model agents to tackle complex tasks through historical interactions. However, existing frameworks encounter a fundamental dilemma between compressing redundant information efficiently and maintaining precise retrieval for downstream tasks. To bridge this gap, we propose MemFly, a framework grounded in information bottleneck principles that facilitates on-the-fly memory evolution for LLMs. Our approach minimizes compression entropy while maximizing relevance entropy via a gradient-free optimizer, constructing a stratified memory structure for efficient storage. To fully leverage MemFly, we develop a hybrid retrieval mechanism that seamlessly integrates semantic, symbolic, and topological pathways, incorporating iterative refinement to handle complex multi-hop queries. Comprehensive experiments demonstrate that MemFly substantially outperforms state-of-the-art baselines in memory coherence, response fidelity, and accuracy.
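To make the hybrid retrieval idea concrete, below is a minimal, purely illustrative sketch of fusing semantic, symbolic, and topological scores with iterative refinement for multi-hop queries. Every function name, scoring rule, and weight here is an assumption for illustration (e.g., word overlap stands in for embedding similarity); none of it is MemFly's actual API.

```python
# Hypothetical sketch: fuse three retrieval pathways and refine iteratively.
# Each memory item is a dict: {"id", "text", "entities", "links"}.

def semantic_score(query: str, item: dict) -> float:
    # Stand-in for embedding similarity: query/text word-overlap ratio.
    q, t = set(query.lower().split()), set(item["text"].lower().split())
    return len(q & t) / max(len(q), 1)

def symbolic_score(query: str, item: dict) -> float:
    # Stand-in for exact symbolic matches (entities, dates, identifiers).
    return 1.0 if any(e in query for e in item.get("entities", [])) else 0.0

def topological_score(item: dict, selected: list) -> float:
    # Stand-in for graph proximity: reward items linked to already-selected ones,
    # which is what enables multi-hop refinement across rounds.
    sel_ids = {s["id"] for s in selected}
    return 1.0 if sel_ids & set(item.get("links", [])) else 0.0

def hybrid_retrieve(query, memory, k=1, rounds=2, w=(0.5, 0.3, 0.2)):
    selected = []
    for _ in range(rounds):
        chosen = {s["id"] for s in selected}
        pool = [m for m in memory if m["id"] not in chosen]
        pool.sort(
            key=lambda m: (w[0] * semantic_score(query, m)
                           + w[1] * symbolic_score(query, m)
                           + w[2] * topological_score(m, selected)),
            reverse=True,
        )
        selected.extend(pool[:k])  # refine: later rounds see earlier picks
    return selected

memory = [
    {"id": 1, "text": "alice moved to paris", "entities": ["alice"], "links": [2]},
    {"id": 2, "text": "paris apartment lease signed", "entities": [], "links": [1]},
    {"id": 3, "text": "weather report sunny", "entities": [], "links": []},
]
result = hybrid_retrieve("where does alice live", memory, k=1, rounds=2)
print([m["id"] for m in result])  # → [1, 2]
```

Note how item 2 has no lexical or symbolic overlap with the query and is only surfaced in the second round through its graph link to item 1, mimicking the multi-hop behavior the abstract attributes to iterative refinement.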
Problem

Research questions and friction points this paper is trying to address.

long-term memory
information compression
retrieval accuracy
memory optimization
large language models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Information Bottleneck
On-the-Fly Memory Optimization
Gradient-Free Optimization
Hybrid Retrieval
Memory Coherence