The Staircase of Ethics: Probing LLM Value Priorities through Multi-Step Induction to Complex Moral Dilemmas

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Existing ethical evaluations of large language models (LLMs) rely predominantly on single-step, static judgments and so fail to capture how LLMs adjust their value priorities as a moral dilemma evolves. Method: We introduce Multi-step Moral Dilemmas (MMDs), the first dataset of its kind, containing 3,302 five-stage dilemmas, and propose a multi-step inductive moral evaluation paradigm that combines fine-grained value annotation, pairwise value comparison, and cross-model consistency analysis. Contribution/Results: Evaluating nine mainstream LLMs reveals significant shifts in moral judgments across steps. Care ranks highest overall, but fairness can overtake it in certain contexts, which points to strong context dependence and reversals in value priority. The work argues for a shift from static to dynamic ethical assessment of LLMs.

📝 Abstract
Ethical decision-making is a critical aspect of human judgment, and the growing use of LLMs in decision-support systems necessitates a rigorous evaluation of their moral reasoning capabilities. However, existing assessments primarily rely on single-step evaluations, failing to capture how models adapt to evolving ethical challenges. Addressing this gap, we introduce the Multi-step Moral Dilemmas (MMDs), the first dataset specifically constructed to evaluate the evolving moral judgments of LLMs across 3,302 five-stage dilemmas. This framework enables a fine-grained, dynamic analysis of how LLMs adjust their moral reasoning across escalating dilemmas. Our evaluation of nine widely used LLMs reveals that their value preferences shift significantly as dilemmas progress, indicating that models recalibrate moral judgments based on scenario complexity. Furthermore, pairwise value comparisons demonstrate that while LLMs often prioritize the value of care, this value can sometimes be superseded by fairness in certain contexts, highlighting the dynamic and context-dependent nature of LLM ethical reasoning. Our findings call for a shift toward dynamic, context-aware evaluation paradigms, paving the way for more human-aligned and value-sensitive development of LLMs.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLM moral reasoning in multi-step dilemmas
Assessing dynamic value shifts in LLM ethical judgments
Analyzing context-dependent value priorities in LLM decision-making
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-step Moral Dilemmas dataset for LLMs
Dynamic analysis of evolving moral judgments
Context-aware evaluation of LLM value priorities
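
The pairwise value comparison mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the record layout `(step, chosen_value, rejected_value)`, the function names, the toy records, and the use of "loyalty" as a third value are all assumptions made for illustration. The sketch computes, for each dilemma step, how often each value is preferred when it is in conflict with another, and then reports the top value per step.

```python
from collections import defaultdict

def pairwise_win_rates(records):
    """Return {step: {value: win_rate}} from (step, chosen, rejected) records.

    A value's win rate at a step is the share of its pairwise conflicts
    at that step in which the model chose it.
    """
    wins = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(lambda: defaultdict(int))
    for step, chosen, rejected in records:
        wins[step][chosen] += 1
        totals[step][chosen] += 1
        totals[step][rejected] += 1
    return {
        step: {value: wins[step][value] / count for value, count in seen.items()}
        for step, seen in totals.items()
    }

def top_value_per_step(rates):
    """Return {step: value with the highest win rate at that step}."""
    return {step: max(by_value, key=by_value.get) for step, by_value in rates.items()}

# Hypothetical toy records, NOT data from the paper.
records = [
    (1, "care", "fairness"),
    (1, "care", "loyalty"),
    (1, "fairness", "loyalty"),
    (5, "fairness", "care"),
    (5, "fairness", "loyalty"),
    (5, "care", "loyalty"),
]

rates = pairwise_win_rates(records)
leaders = top_value_per_step(rates)
print(leaders)
```

In this toy input the leading value differs between step 1 and step 5, which is the kind of priority reversal the summary describes. Real use would need the dataset's actual annotation scheme and one set of records per evaluated model.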

Authors

Ya Wu
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Qiang Sheng
Chinese Academy of Sciences
fake news detection, fact checking, LLM safety
Danding Wang
Institute of Computing Technology, Chinese Academy of Sciences
Explainable AI, Media Forensics, Human-Computer Interaction
Guang Yang
Zhongguancun Laboratory
Yifan Sun
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Zhengjia Wang
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Yuyan Bu
Media Synthesis and Forensics Lab, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Juan Cao
Professor of Mathematics, Xiamen University
Computer Aided Geometric Design, Computer Graphics