🤖 AI Summary
Existing LLM personalization methods rely heavily on human feedback or interaction logs, which scale poorly and struggle to capture deeper user attributes such as values, personality traits, and cultural orientations. To address this, we propose GRAVITY, a framework that integrates Hofstede’s cultural dimensions, Schwartz’s theory of basic human values, the World Values Survey, and the Five-Factor (Big Five) model of personality to automatically generate synthetic, profile-grounded preference pairs. This data supports both supervised fine-tuning and prompt optimization while substantially reducing dependence on human annotation. GRAVITY also improves cross-cultural adaptability through theoretically grounded, culturally aware preference modeling. In an evaluation on book descriptions for 400 Amazon users spanning four cultures (USA, Brazil, Japan, India), GRAVITY achieves over 4% higher preference gains than the baselines, and user studies show that its outputs are preferred more than 86% of the time.
📝 Abstract
Personalization in LLMs often relies on costly human feedback or interaction logs, limiting scalability and neglecting deeper user attributes. To reduce the reliance on human annotations, we introduce GRAVITY (Generative Response with Aligned Values, Interests, and Traits of You), a framework for generating synthetic, profile-grounded preference data that captures users' interests, values, beliefs, and personality traits. By integrating demographic, cultural, and psychological frameworks (including Hofstede's cultural dimensions, Schwartz's basic values, the World Values Survey, and Big Five OCEAN traits), GRAVITY synthesizes preference pairs to guide personalized content generation. We evaluate GRAVITY on book descriptions for 400 Amazon users, comparing it to prompt-based conditioning, standard fine-tuning, and naive synthetic pair generation. Profile-grounded synthetic data consistently improves generation, especially across multiple cultures (USA, Brazil, Japan, India), achieving over 4% higher preference gains than the baselines, with user studies showing that GRAVITY outputs are preferred more than 86% of the time. Our results show that scenario-grounded synthetic data can capture richer user variation, reduce reliance on costly annotation, and produce more engaging, user-centered content, offering a scalable path for LLM personalization.