How Does Personalized Memory Shape LLM Behavior? Benchmarking Rational Preference Utilization in Personalized Assistants

📅 2026-01-23
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the problem of irrational personalization in large language models, where the incorporation of personalized memory often introduces irrelevant information, thereby distorting user intent understanding and degrading user experience. To tackle this issue, the study presents the first systematic characterization of the phenomenon and proposes RP-Reasoner, a novel approach that models memory utilization as a pragmatic reasoning process to enable selective integration of personalized information. Furthermore, the authors introduce RPEval, the first benchmark specifically designed for evaluating rational preference utilization, comprising a personalized intent inference dataset and a multi-granularity evaluation protocol. Experimental results demonstrate that RP-Reasoner significantly outperforms strong baselines and successfully rectifies 80% of irrational personalization cases in a large-scale commercial assistant.

📝 Abstract
Large language model (LLM)-powered assistants have recently integrated memory mechanisms that record user preferences, leading to more personalized and user-aligned responses. However, irrelevant personalized memories are often introduced into the context, interfering with the LLM's intent understanding. To comprehensively investigate the dual effects of personalization, we develop RPEval, a benchmark comprising a personalized intent reasoning dataset and a multi-granularity evaluation protocol. RPEval reveals the widespread phenomenon of irrational personalization in existing LLMs and, through error pattern analysis, illustrates its negative impact on user experience. Finally, we introduce RP-Reasoner, which treats memory utilization as a pragmatic reasoning process, enabling the selective integration of personalized information. Experimental results demonstrate that our method significantly outperforms carefully designed baselines on RPEval, and resolves 80% of the bad cases observed in a large-scale commercial personalized assistant, highlighting the potential of pragmatic reasoning to mitigate irrational personalization. Our benchmark is publicly available at https://github.com/XueyangFeng/RPEval.
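The abstract describes treating memory utilization as a selective process: only preferences relevant to the current query should be injected into the LLM's context. The following is a minimal, hypothetical sketch of such selective memory integration (not the paper's actual RP-Reasoner implementation); the term-overlap scoring, threshold, and data layout are all illustrative assumptions.

```python
def select_memories(query_terms, memories, threshold=0.5):
    """Keep only memories whose topical overlap with the query is high enough.

    This is an illustrative relevance filter: real systems would use a
    learned reasoner or embedding similarity rather than raw term overlap.
    """
    q = set(query_terms)
    selected = []
    for mem in memories:
        terms = set(mem["terms"])
        # Fraction of this memory's topic terms that also appear in the query.
        overlap = len(q & terms) / len(terms) if terms else 0.0
        if overlap >= threshold:
            selected.append(mem["text"])
    return selected

# Example: a food-related query should pull in the dietary preference
# but leave the music preference out of the context.
memories = [
    {"text": "prefers vegetarian recipes", "terms": ["food", "recipe", "diet"]},
    {"text": "likes jazz music", "terms": ["music", "jazz"]},
]
print(select_memories(["recipe", "food", "dinner"], memories))
```

With the example above, only the dietary memory passes the threshold, mirroring the paper's goal of excluding irrelevant personalization from the context.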
Problem

Research questions and friction points this paper is trying to address.

personalized memory
irrational personalization
intent understanding
large language models
user preferences
Innovation

Methods, ideas, or system contributions that make the work stand out.

personalized memory
rational preference utilization
pragmatic reasoning
LLM personalization
benchmarking
Xueyang Feng
Renmin University of China
LLM
Weinan Gan
Huawei Noah's Ark Lab
Large Language Model · Generative IR · Agent
Xu Chen
Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China
Quanyu Dai
Huawei Technologies Ltd., Shenzhen, China
Yong Liu
Huawei, NTU, I2R
Recommender Systems · Data Mining · Machine Learning