AI Summary
This work addresses the challenge in long-term personalized dialogue systems arising from the tension between infinite user interactions and limited context windows, which often leads to memory noise accumulation and persona inconsistency. To mitigate this, the authors propose the Inside Out framework, featuring a controllably expandable PersonaTree structure for global user modeling that balances memory compression with consistency preservation. A lightweight MemListener model is introduced, leveraging process-reward reinforcement learning to dynamically perform structured memory operations (ADD, UPDATE, DELETE, or NO_OP). Additionally, a dual-mode response generation mechanism selectively accesses the tree-structured memory as needed. Experimental results demonstrate that PersonaTree outperforms baseline approaches such as full-context concatenation in suppressing noise and maintaining persona consistency, while the compact MemListener achieves memory decision-making performance comparable to or exceeding that of large language models like DeepSeek-R1-0528 and Gemini-3-Pro.
Abstract
Existing long-term personalized dialogue systems struggle to reconcile unbounded interaction streams with finite context constraints, often succumbing to memory noise accumulation, reasoning degradation, and persona inconsistency. To address these challenges, this paper proposes Inside Out, a framework that uses a globally maintained PersonaTree as the carrier of long-term user profiling. By constraining the trunk with an initial schema and updating only the branches and leaves, PersonaTree grows in a controlled way, compressing memory while preserving consistency. Moreover, we train a lightweight MemListener via reinforcement learning with process-based rewards to produce structured, executable, and interpretable {ADD, UPDATE, DELETE, NO_OP} operations, thereby supporting the dynamic evolution of the personalized tree. During response generation, PersonaTree is used directly to enhance outputs in latency-sensitive scenarios; when users require more detail, the agentic mode is triggered to retrieve additional details on demand under the constraints of the PersonaTree. Experiments show that PersonaTree outperforms full-text concatenation and various personalized memory systems in suppressing contextual noise and maintaining persona consistency. Notably, the small MemListener model achieves memory-operation decision performance comparable to, or even surpassing, powerful reasoning models such as DeepSeek-R1-0528 and Gemini-3-Pro.
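To make the tree-and-operations idea concrete, here is a minimal Python sketch of a schema-constrained tree that accepts {ADD, UPDATE, DELETE, NO_OP} operations. It is an illustration under assumptions, not the authors' implementation: the class name `PersonaTree` is borrowed from the abstract, but the `Node` layout, the `apply` and `render` methods, the path-based addressing, and the example schema categories are all hypothetical. The abstract does not specify how MemListener encodes its operations or how the tree is serialized for the response model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of the tree; leaves carry a value, inner nodes carry children."""
    value: Optional[str] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


class PersonaTree:
    """Hypothetical sketch: the trunk (top-level keys) is fixed by a schema,
    while branches and leaves below it may be added, updated, or deleted."""

    OPS = {"ADD", "UPDATE", "DELETE", "NO_OP"}

    def __init__(self, schema: List[str]):
        # Assumption: the "initial schema" is a flat list of trunk categories.
        self.root = Node(children={name: Node() for name in schema})

    def apply(self, op: str, path: List[str], value: Optional[str] = None) -> bool:
        """Apply one memory operation; return True if the tree changed."""
        if op not in self.OPS:
            raise ValueError(f"unknown operation: {op}")
        if op == "NO_OP":
            return False
        if len(path) < 2:
            raise ValueError("path must name a trunk category and a node below it")
        if path[0] not in self.root.children:
            raise KeyError(f"trunk is fixed by the schema: {path[0]!r}")

        # Walk to the parent of the target, creating branches only for ADD.
        parent = self.root
        for key in path[:-1]:
            if key not in parent.children:
                if op != "ADD":
                    return False
                parent.children[key] = Node()
            parent = parent.children[key]

        leaf = path[-1]
        if op == "ADD":
            if leaf in parent.children:
                return False  # already present; an UPDATE would be needed
            parent.children[leaf] = Node(value=value)
            return True
        if leaf not in parent.children:
            return False  # nothing to UPDATE or DELETE
        if op == "UPDATE":
            parent.children[leaf].value = value
        else:  # DELETE
            del parent.children[leaf]
        return True

    def render(self) -> List[str]:
        """Flatten the tree into 'path: value' lines, e.g. for a prompt."""
        lines: List[str] = []

        def walk(node: Node, prefix: List[str]) -> None:
            for key, child in node.children.items():
                here = prefix + [key]
                if child.value is not None:
                    lines.append("/".join(here) + ": " + child.value)
                walk(child, here)

        walk(self.root, [])
        return lines
```

In this sketch, a new fact is stored with `tree.apply("ADD", ["preferences", "food", "spice"], "likes spicy food")`, and a later contradiction is resolved with an `UPDATE` on the same path rather than by appending a second entry. This is one plausible way a tree can compress memory without accumulating conflicting statements. Rejecting any path outside the schema mirrors the fixed trunk described in the abstract.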