LoRe: Personalizing LLMs via Low-Rank Reward Modeling

📅 2025-04-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional RLHF struggles to capture diverse user preferences with a single reward function. To address this, the paper proposes LoRe, a low-rank reward modeling framework that explicitly models individual preferences as linear combinations of shared low-dimensional basis reward functions and user-specific weights, bypassing explicit user clustering. Coupling low-rank matrix decomposition with this separation of shared and user-specific structure enables few-shot generalization across users. Experiments on multiple preference datasets show that LoRe significantly improves reward prediction accuracy for unseen users (+8.2% on average), achieves efficient few-shot personalization, and retains strong generalization alongside computational efficiency. The paper's stated key contribution is introducing low-rank subspace representation into reward modeling, establishing a scalable and generalizable paradigm for personalized alignment of large language models.
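The low-rank structure described above, where each user's reward is a weighted combination of shared basis reward functions, can be sketched as follows. This is a minimal illustration only: the dimensions, the random basis, and the linear feature map are assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n_users = 16, 4, 3  # feature dim, number of shared basis functions, users (illustrative)

# Shared low-rank basis: each row is one basis reward function over d features.
B = rng.normal(size=(k, d))

# User-specific weights: each user's reward is a linear combination of the k bases,
# so the full user-by-feature reward matrix W @ B has rank at most k.
W = rng.normal(size=(n_users, k))

def reward(user: int, features: np.ndarray) -> float:
    """r_u(x) = w_u @ (B @ phi(x)): low-rank factorization of per-user rewards."""
    return float(W[user] @ (B @ features))

x = rng.normal(size=d)               # stand-in for features of one response
scores = [reward(u, x) for u in range(n_users)]
```

Because the basis `B` is shared, adapting to a new user means estimating only `k` weights rather than a full reward model, which is what makes few-shot personalization cheap.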

📝 Abstract
Personalizing large language models (LLMs) to accommodate diverse user preferences is essential for enhancing alignment and user satisfaction. Traditional reinforcement learning from human feedback (RLHF) approaches often rely on monolithic value representations, limiting their ability to adapt to individual preferences. We introduce a novel framework that leverages low-rank preference modeling to efficiently learn and generalize user-specific reward functions. By representing reward functions in a low-dimensional subspace and modeling individual preferences as weighted combinations of shared basis functions, our approach avoids rigid user categorization while enabling scalability and few-shot adaptation. We validate our method on multiple preference datasets, demonstrating superior generalization to unseen users and improved accuracy in preference prediction tasks.
Problem

Research questions and friction points this paper is trying to address.

Personalizing LLMs for diverse user preferences
Overcoming monolithic value limitations in RLHF
Efficiently learning user-specific reward functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-rank preference modeling for personalization
Shared basis functions for reward representation
Few-shot adaptation to unseen user preferences
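The few-shot adaptation idea above can be illustrated with a minimal Bradley-Terry fit of only the k user weights over a frozen shared basis. Everything here is a synthetic stand-in under assumed names and dimensions; it is a sketch of the general technique, not the paper's training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 16, 4                        # illustrative dimensions
B = rng.normal(size=(k, d))         # "pretrained" shared basis, kept frozen
w_true = rng.normal(size=k)         # hidden weights of a new, unseen user

# Simulate a handful of pairwise preferences: x_a is labeled preferred over x_b
# whenever the user's true low-rank reward ranks it higher.
pairs = []
for _ in range(32):
    xa, xb = rng.normal(size=d), rng.normal(size=d)
    if w_true @ (B @ xa) < w_true @ (B @ xb):
        xa, xb = xb, xa
    pairs.append((xa, xb))

# Few-shot fit: only the k user weights are learned; the basis B stays fixed.
w = np.zeros(k)
lr = 0.5
for _ in range(200):
    grad = np.zeros(k)
    for xa, xb in pairs:
        diff = B @ xa - B @ xb                    # basis-level score difference
        p = 1.0 / (1.0 + np.exp(-(w @ diff)))     # Bradley-Terry preference prob.
        grad += (p - 1.0) * diff                  # gradient of the negative log-likelihood
    w -= lr * grad / len(pairs)

# Agreement of the adapted weights with the user's labels
correct = sum((w @ (B @ xa)) > (w @ (B @ xb)) for xa, xb in pairs)
acc = correct / len(pairs)
```

Fitting `k` parameters instead of a full reward model is what allows adaptation from a small number of comparisons while still generalizing through the shared basis.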