Can Language Models Reason about Individualistic Human Values and Preferences?

📅 2024-10-04

🏛️ arXiv.org

📈 Citations: 12

✨ Influential: 1

🤖 AI Summary

Problem: Existing value-alignment approaches rely on demographic group labels, which risks perpetuating stereotypes and obscuring individual variation in human values and preferences. Method: The authors propose individualistic alignment, a shift toward fine-grained reasoning about personal values, and introduce IndieValueCatalog, a benchmark for individualistic value reasoning derived from the World Values Survey. Models must predict an individual’s judgments in novel cases based solely on a sample of that person’s self-reported value statements. The proposed Value Inequity Index (σInequity) quantifies systematic model bias, and a fine-grained consistency evaluation protocol is established. IndieValueReasoner models are developed via supervised fine-tuning and prompt engineering. Contribution/Results: Empirical evaluation shows that state-of-the-art language models achieve only 55–65% accuracy on individual-level prediction; demographic features fail to substitute for personalized value representations; and significant systematic biases persist across regions and value dimensions.

📝 Abstract

Recent calls for pluralistic alignment emphasize that AI systems should address the diverse needs of all people. Yet, efforts in this space often require sorting people into fixed buckets of pre-specified diversity-defining dimensions (e.g., demographics), risking smoothing out individualistic variations or even stereotyping. To achieve an authentic representation of diversity that respects individuality, we propose individualistic alignment. While individualistic alignment can take various forms, we introduce IndieValueCatalog, a dataset transformed from the influential World Values Survey (WVS), to study language models (LMs) on the specific challenge of individualistic value reasoning. Given a sample of an individual's value-expressing statements, models are tasked with predicting this person's value judgments in novel cases. With IndieValueCatalog, we reveal critical limitations in frontier LMs, which achieve only 55% to 65% accuracy in predicting individualistic values. Moreover, our results highlight that a precise description of individualistic values cannot be approximated with demographic information alone. We also identify a partiality of LMs in reasoning about global individualistic values, as measured by our proposed Value Inequity Index (σInequity). Finally, we train a series of IndieValueReasoners to reveal new patterns and dynamics in global human values.
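
To make the evaluation setup concrete, the following is a minimal sketch of how accuracy on held-out value judgments and a group-level disparity score might be computed. It is not the authors' implementation. The function names, the record layout, and in particular the choice to measure disparity as the population standard deviation of per-group accuracy are assumptions made for illustration; the abstract does not give the formula for σInequity.

```python
from collections import defaultdict
from statistics import pstdev


def accuracy(predictions, labels):
    """Fraction of held-out value judgments that were predicted correctly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one held-out judgment")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def per_group_accuracy(records):
    """Accuracy per group from (group, prediction, label) triples.

    The grouping key is hypothetical (e.g. a region or an age band).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, prediction, label in records:
        totals[group] += 1
        hits[group] += int(prediction == label)
    return {group: hits[group] / totals[group] for group in totals}


def inequity_spread(group_accuracy):
    """ASSUMPTION: disparity taken as the population standard deviation
    of per-group accuracy. The paper's exact definition may differ."""
    return pstdev(list(group_accuracy.values()))


# Toy records: (group, model prediction, the individual's actual answer).
records = [
    ("A", "agree", "agree"),
    ("A", "disagree", "agree"),
    ("B", "agree", "agree"),
    ("B", "agree", "agree"),
]
by_group = per_group_accuracy(records)
spread = inequity_spread(by_group)
```

Under this assumed definition, a model that predicts every group equally well has a spread of zero, and the spread grows as accuracy diverges between groups.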

Problem

Research questions and friction points this paper is trying to address.

Study LMs' ability to predict individualistic human values

Reveal limitations of LMs in value reasoning accuracy

Assess whether demographic information alone can describe an individual's values precisely

Innovation

Methods, ideas, or system contributions that make the work stand out.

Propose individualistic alignment for authentic diversity representation

Introduce the IndieValueCatalog dataset, derived from the World Values Survey, for individualistic value reasoning

Train IndieValueReasoners to analyze global human values

🔎 Similar Papers

A Survey on Moral Foundation Theory and Pre-Trained Language Models: Current Advances and Challenges