LoRe: Personalizing LLMs via Low-Rank Reward Modeling

📅 2025-04-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Traditional RLHF struggles to capture diverse user preferences with a single reward function. To address this, the paper proposes LoRe, a low-rank reward modeling framework that explicitly models individual preferences as linear combinations of shared low-dimensional basis reward functions and user-specific weights, bypassing explicit user clustering. Coupling low-rank matrix decomposition with this separation of shared and user-specific structure enables few-shot generalization across users. Experiments on multiple preference datasets show that LoRe significantly improves reward prediction accuracy for unseen users (+8.2% on average), achieves efficient few-shot personalization, and retains strong generalization alongside computational efficiency. The paper's stated key contribution is introducing low-rank subspace representation into reward modeling, establishing a scalable and generalizable paradigm for personalized alignment of large language models.
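The low-rank structure described above, where each user's reward is a weighted combination of shared basis reward functions, can be sketched as follows. This is a minimal illustration only: the dimensions, the random basis, and the linear feature map are assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, n_users = 16, 4, 3  # feature dim, number of shared basis functions, users (illustrative)

# Shared low-rank basis: each row is one basis reward function over d features.
B = rng.normal(size=(k, d))

# User-specific weights: each user's reward is a linear combination of the k bases,
# so the full user-by-feature reward matrix W @ B has rank at most k.
W = rng.normal(size=(n_users, k))

def reward(user: int, features: np.ndarray) -> float:
    """r_u(x) = w_u @ (B @ phi(x)): low-rank factorization of per-user rewards."""
    return float(W[user] @ (B @ features))

x = rng.normal(size=d)               # stand-in for features of one response
scores = [reward(u, x) for u in range(n_users)]
```

Because the basis `B` is shared, adapting to a new user means estimating only `k` weights rather than a full reward model, which is what makes few-shot personalization cheap.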

📝 Abstract
Personalizing large language models (LLMs) to accommodate diverse user preferences is essential for enhancing alignment and user satisfaction. Traditional reinforcement learning from human feedback (RLHF) approaches often rely on monolithic value representations, limiting their ability to adapt to individual preferences. We introduce a novel framework that leverages low-rank preference modeling to efficiently learn and generalize user-specific reward functions. By representing reward functions in a low-dimensional subspace and modeling individual preferences as weighted combinations of shared basis functions, our approach avoids rigid user categorization while enabling scalability and few-shot adaptation. We validate our method on multiple preference datasets, demonstrating superior generalization to unseen users and improved accuracy in preference prediction tasks.
Problem

Research questions and friction points this paper is trying to address.

Personalizing LLMs for diverse user preferences
Overcoming monolithic value limitations in RLHF
Efficiently learning user-specific reward functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Low-rank preference modeling for personalization
Shared basis functions for reward representation
Few-shot adaptation to unseen user preferences
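The few-shot adaptation idea above can be illustrated with a minimal Bradley-Terry fit of only the k user weights over a frozen shared basis. Everything here is a synthetic stand-in under assumed names and dimensions; it is a sketch of the general technique, not the paper's training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 16, 4                        # illustrative dimensions
B = rng.normal(size=(k, d))         # "pretrained" shared basis, kept frozen
w_true = rng.normal(size=k)         # hidden weights of a new, unseen user

# Simulate a handful of pairwise preferences: x_a is labeled preferred over x_b
# whenever the user's true low-rank reward ranks it higher.
pairs = []
for _ in range(32):
    xa, xb = rng.normal(size=d), rng.normal(size=d)
    if w_true @ (B @ xa) < w_true @ (B @ xb):
        xa, xb = xb, xa
    pairs.append((xa, xb))

# Few-shot fit: only the k user weights are learned; the basis B stays fixed.
w = np.zeros(k)
lr = 0.5
for _ in range(200):
    grad = np.zeros(k)
    for xa, xb in pairs:
        diff = B @ xa - B @ xb                    # basis-level score difference
        p = 1.0 / (1.0 + np.exp(-(w @ diff)))     # Bradley-Terry preference prob.
        grad += (p - 1.0) * diff                  # gradient of the negative log-likelihood
    w -= lr * grad / len(pairs)

# Agreement of the adapted weights with the user's labels
correct = sum((w @ (B @ xa)) > (w @ (B @ xb)) for xa, xb in pairs)
acc = correct / len(pairs)
```

Fitting `k` parameters instead of a full reward model is what allows adaptation from a small number of comparisons while still generalizing through the shared basis.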