The Pluralistic Moral Gap: Understanding Judgment and Value Differences between Humans and Large Language Models

📅 2025-07-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study identifies systematic misalignment between large language models (LLMs) and human moral judgment: LLMs align with humans only in high-consensus scenarios and exhibit severe deficits in value diversity. To address this, we introduce a benchmark dataset of 1,618 realistic moral dilemmas and propose Dynamic Moral Profiling (DMP)—the first method to employ Dirichlet sampling for conditional generation grounded in empirically observed human value distributions. DMP integrates distributional alignment evaluation, a 60-dimensional value taxonomy, and free-text rationale extraction to enable multi-granular moral modeling. Experiments demonstrate that DMP improves alignment between model and human moral judgment distributions by 64.3%, significantly enhancing both the breadth and diversity of value coverage in model outputs. This work establishes a scalable, interpretable paradigm for value alignment in foundation models.
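The core of Dynamic Moral Profiling, as summarized above, is sampling a value profile from a Dirichlet distribution fit to empirically observed human value counts, then conditioning the model's generation on that profile. The paper's exact conditioning mechanism is not given here, so the sketch below is an assumption: it uses a truncated illustrative value list (hypothetical names, not the paper's 60-dimension taxonomy) and renders the sampled profile as a plain-text instruction.

```python
import numpy as np

# Hypothetical subset of a value taxonomy; the paper uses 60 dimensions.
VALUES = ["honesty", "loyalty", "fairness", "care", "autonomy"]

def sample_value_profile(counts, rng=None):
    """Sample one value-weight profile from a Dirichlet whose
    concentration parameters are empirical human value counts
    (add-one smoothing so zero-count values stay possible)."""
    rng = rng or np.random.default_rng(0)
    alpha = np.asarray(counts, dtype=float) + 1.0
    weights = rng.dirichlet(alpha)  # non-negative, sums to 1
    return dict(zip(VALUES, weights))

def profile_to_prompt(profile, top_k=3):
    """Render the sampled profile as a conditioning instruction
    (one possible way to ground generation in the profile)."""
    top = sorted(profile.items(), key=lambda kv: -kv[1])[:top_k]
    names = ", ".join(f"{v} ({w:.2f})" for v, w in top)
    return f"When judging the dilemma, weigh these values: {names}."
```

Resampling a fresh profile per dilemma is what makes the method pluralistic: different draws emphasize different values, mirroring the spread of human rationales rather than a single averaged stance.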

📝 Abstract
People increasingly rely on Large Language Models (LLMs) for moral advice, which may influence humans' decisions. Yet, little is known about how closely LLMs align with human moral judgments. To address this, we introduce the Moral Dilemma Dataset, a benchmark of 1,618 real-world moral dilemmas paired with a distribution of human moral judgments consisting of a binary evaluation and a free-text rationale. We treat this problem as a pluralistic distributional alignment task, comparing the distributions of LLM and human judgments across dilemmas. We find that models reproduce human judgments only under high consensus; alignment deteriorates sharply when human disagreement increases. In parallel, using a 60-value taxonomy built from 3,783 value expressions extracted from rationales, we show that LLMs rely on a narrower set of moral values than humans. These findings reveal a pluralistic moral gap: a mismatch in both the distribution and diversity of values expressed. To close this gap, we introduce Dynamic Moral Profiling (DMP), a Dirichlet-based sampling method that conditions model outputs on human-derived value profiles. DMP improves alignment by 64.3% and enhances value diversity, offering a step toward more pluralistic and human-aligned moral guidance from LLMs.
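The abstract frames evaluation as a pluralistic distributional alignment task: for each dilemma, compare the distribution of model judgments against the distribution of human judgments. The specific divergence measure is not stated here, so as an illustrative assumption the sketch below uses Jensen-Shannon divergence over the binary (yes/no) judgment distributions.

```python
from collections import Counter
from math import log2

def judgment_distribution(judgments):
    """Empirical distribution over binary judgments ('yes'/'no')."""
    n = len(judgments)
    counts = Counter(judgments)
    return {k: counts.get(k, 0) / n for k in ("yes", "no")}

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions
    on the same support: 0 = identical, 1 = maximally different."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0) + q.get(k, 0)) for k in keys}
    def kl(a, b):
        return sum(a[k] * log2(a[k] / b[k]) for k in keys if a.get(k, 0) > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A distributional metric like this captures the paper's key finding: on high-consensus dilemmas the human distribution is nearly degenerate and easy to match, while under disagreement a model that collapses to one answer incurs a large divergence even if its single answer matches the human majority.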
Problem

Research questions and friction points this paper is trying to address.

Assess alignment between LLM and human moral judgments
Identify narrower value diversity in LLMs versus humans
Propose Dynamic Moral Profiling to bridge moral gap
Innovation

Methods, ideas, or system contributions that make the work stand out.

Moral Dilemma Dataset with human judgments
Dirichlet-based Dynamic Moral Profiling method
Improves alignment and value diversity