PersonaDual: Balancing Personalization and Objectivity via Adaptive Reasoning

📅 2026-01-13
🤖 AI Summary
This work addresses a tension in large language models between personalizing responses and maintaining factual accuracy: personalized information can compromise correctness, especially when it is misaligned with the question. The authors propose PersonaDual, a framework that supports both general-purpose objective reasoning and personalized reasoning within a single model. PersonaDual is first trained with supervised fine-tuning (SFT) to learn the two reasoning patterns, then optimized with DualGRPO, the authors' proposed reinforcement learning method, to improve adaptive selection between the two modes. Experiments on objective and personalized benchmarks show that PersonaDual reduces interference between the two kinds of task, achieves near interference-free performance, and makes better use of helpful personalization signals to improve objective problem-solving.

📝 Abstract
As users increasingly expect LLMs to align with their preferences, personalized information becomes valuable. However, personalized information can be a double-edged sword: it can improve interaction but may compromise objectivity and factual correctness, especially when it is misaligned with the question. To alleviate this problem, we propose PersonaDual, a framework that supports both general-purpose objective reasoning and personalized reasoning in a single model, and adaptively switches modes based on context. PersonaDual is first trained with SFT to learn two reasoning patterns, and then further optimized via reinforcement learning with our proposed DualGRPO to improve mode selection. Experiments on objective and personalized benchmarks show that PersonaDual preserves the benefits of personalization while reducing interference, achieving near interference-free performance and better leveraging helpful personalized signals to improve objective problem-solving.
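
The abstract does not describe how DualGRPO works, so the sketch below is not the authors' method. It only illustrates the group-relative advantage normalization used in standard GRPO, which the name suggests DualGRPO builds on: rewards for several responses sampled from the same prompt are normalized against that group's mean and standard deviation. The function name, the `eps` parameter, and the example rewards are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its sampling group.

    This is the standard GRPO-style advantage, shown as an assumption about
    the general family of methods, not as the DualGRPO algorithm itself.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps avoids division by zero when all rewards in the group are equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: rewards for four responses sampled for one prompt,
# e.g. some produced in an objective mode and some in a personalized mode.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above the group average receive a positive advantage and those below receive a negative one, so a policy update would favor whichever reasoning mode tended to earn higher reward for that prompt.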
Problem

Research questions and friction points this paper is trying to address.

personalization
objectivity
large language models
reasoning
factual correctness
Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive reasoning
personalization-objectivity balance
DualGRPO
mode switching
large language models
Xiaoyou Liu
Fudan University
Xinyi Mou
Fudan University
NLP, Large Language Models, Social Simulation
Shengbin Yue
Fudan University
Liang Wang
Fudan University
Yuqing Wang
OPPO
Qiexiang Wang
OPPO
Tianrui Qin
OPPO
Agentic AI, Deep Learning, LLM Security
Wangchunshu Zhou
OPPO & M-A-P
artificial general intelligence, language agents, large language models, natural language processing
Zhongyu Wei
Fudan University, Shanghai Innovation Institute