🤖 AI Summary
Existing LLM personalization methods rely heavily on a user's own historical data and neglect differences between users. Methods that do model such differences extract features along fixed dimensions using quick, intuitive (System-1) inference, which yields coarse-grained and narrow user modeling. To address these limitations, we propose the Difference-aware Reasoning Personalization (DRP) framework. DRP uses inference scaling to autonomously discover a variable set of difference feature dimensions, applies slow, deliberate (System-2) reasoning to build a structured, fine-grained user representation, and synthesizes behavioral history into explicit difference descriptions. Evaluated on personalized review generation, DRP significantly outperforms state-of-the-art baselines across diversity, relevance, and personalization metrics. The results indicate that difference-driven slow reasoning improves generation quality and support explicitly modeling user heterogeneity through deliberate, reflective inference.
📝 Abstract
Large Language Models (LLMs) are increasingly integrated into users' daily lives, driving a growing demand for personalized outputs. Prior work has primarily leveraged a user's own history, often overlooking inter-user differences that are critical for effective personalization. While recent methods have attempted to model such differences, their feature extraction processes typically rely on fixed dimensions and quick, intuitive inference (System-1 thinking), limiting both the coverage and granularity of captured user differences. To address these limitations, we propose Difference-aware Reasoning Personalization (DRP), a framework that reconstructs the difference extraction mechanism by leveraging inference scaling to enhance LLM personalization. DRP autonomously identifies relevant difference feature dimensions and generates structured definitions and descriptions, enabling slow, deliberate reasoning (System-2 thinking) over user differences. Experiments on personalized review generation demonstrate that DRP consistently outperforms baseline methods across multiple metrics.
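To make the idea of variable-dimension, structured difference features more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify data formats, prompts, or model calls, so the `DifferenceFeature` record, the `render_profile` and `build_review_prompt` helpers, and the prompt wording are all hypothetical. The sketch only illustrates that each feature carries a freely chosen dimension name, a definition, and a description, and that the number of features is not fixed in advance.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DifferenceFeature:
    """One difference feature (hypothetical structure, not from the paper)."""
    dimension: str    # dimension name, chosen freely rather than from a fixed list
    definition: str   # what the dimension means
    description: str  # how the target user differs from other users on it


def render_profile(features: List[DifferenceFeature]) -> str:
    """Render any number of difference features as a numbered text block."""
    blocks = []
    for index, feature in enumerate(features, start=1):
        blocks.append(
            f"{index}. {feature.dimension}\n"
            f"   Definition: {feature.definition}\n"
            f"   Difference: {feature.description}"
        )
    return "\n".join(blocks)


def build_review_prompt(item_title: str, features: List[DifferenceFeature]) -> str:
    """Assemble an illustrative generation prompt; the wording is an assumption."""
    return (
        f"Write a review of '{item_title}' in this user's style.\n"
        "How this user differs from other users:\n"
        f"{render_profile(features)}"
    )
```

In a pipeline of this shape, the list of `DifferenceFeature` records would be produced by a reasoning model comparing the target user with other users; here it is left as plain input so the sketch stays self-contained.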