LLM Ethics Benchmark: A Three-Dimensional Assessment System for Evaluating Moral Reasoning in Large Language Models

📅 2025-05-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing LLM evaluation methods for moral reasoning lack precision and fail to characterize model behavior in complex ethical decision-making, leading to accountability gaps in societal deployment. This paper introduces the first three-dimensional moral reasoning evaluation framework, systematically assessing models along three orthogonal dimensions: adherence to foundational ethical principles, reasoning robustness, and cross-scenario value consistency. We propose a novel decoupled evaluation paradigm integrating multi-round adversarial prompting, cross-cultural scenario sampling, quantitative consistency metrics, and robustness stress testing—enabling fine-grained weakness localization and interpretable attribution. Benchmarking across 12 mainstream LLMs, we uncover, for the first time, a pervasive “principle–practice gap” wherein models articulate ethical principles correctly but fail to apply them consistently in practice. We publicly release our dataset and toolchain, which have already been adopted by five AI ethics research laboratories.
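The summary's "quantitative consistency metrics" for cross-scenario value consistency could be realized in many ways; below is a minimal illustrative sketch, assuming the model's stance on an ethical value is elicited as a score in [0, 1] across several paraphrased scenarios. The function name and normalization scheme are hypothetical, not the authors' published implementation.

```python
# Hypothetical sketch of a cross-scenario value-consistency metric in the
# spirit of the paper's "quantitative consistency metrics". The scoring
# scheme is an illustrative assumption.
from statistics import pstdev

def consistency_score(stances: list[float]) -> float:
    """Score in [0, 1]: 1.0 means the model takes an identical stance
    across all paraphrased scenarios probing the same ethical value;
    lower values indicate stance drift between rephrasings."""
    if len(stances) < 2:
        return 1.0  # a single observation cannot be inconsistent
    # The population std-dev of values on a 0-1 scale is at most 0.5,
    # so dividing by 0.5 normalizes the penalty into [0, 1].
    return 1.0 - min(pstdev(stances) / 0.5, 1.0)

# A model that answers identically across rephrasings scores 1.0;
# one that flips between endorsing and rejecting a value scores 0.0.
print(consistency_score([0.9, 0.9, 0.9]))        # → 1.0
print(consistency_score([1.0, 0.0, 1.0, 0.0]))   # → 0.0
```

A metric of this shape makes the reported "principle–practice gap" measurable: a model can score well on stating a principle yet exhibit low consistency when the same principle is probed under paraphrase.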

📝 Abstract
This study establishes a novel framework for systematically evaluating the moral reasoning capabilities of large language models (LLMs) as they increasingly integrate into critical societal domains. Current assessment methodologies lack the precision needed to evaluate nuanced ethical decision-making in AI systems, creating significant accountability gaps. Our framework addresses this challenge by quantifying alignment with human ethical standards through three dimensions: foundational moral principles, reasoning robustness, and value consistency across diverse scenarios. This approach enables precise identification of ethical strengths and weaknesses in LLMs, facilitating targeted improvements and stronger alignment with societal values. To promote transparency and collaborative advancement in ethical AI development, we are publicly releasing both our benchmark datasets and evaluation codebase at https://github.com/The-Responsible-AI-Initiative/LLM_Ethics_Benchmark.git.
Problem

Research questions and friction points this paper is trying to address.

Evaluating moral reasoning in large language models
Addressing gaps in AI ethical decision-making assessments
Quantifying alignment with human ethical standards
Innovation

Methods, ideas, or system contributions that make the work stand out.

Three-dimensional framework for moral reasoning evaluation
Quantifies alignment with human ethical standards
Publicly releases benchmark datasets and codebase