🤖 AI Summary
This study investigates systematic discrepancies in moral value cognition between mainstream English large language models (LLMs) and Western English-speaking populations, motivated by the susceptibility of direct prompting to response biases. Method: we propose a low-level cognitive-representation approach based on word associations, embedding Moral Foundations Theory (MFT) seed terms into human and LLM association networks via a graph propagation algorithm, enabling quantitative, cross-population comparison. Contribution/Results: using a large-scale dataset of LLM-generated word associations, we find that while LLMs broadly reproduce the human five-dimensional moral structure, they show significant attenuation of the "Authority/Subversion" and "Purity/Degradation" foundations and markedly more homogeneous association patterns. This work establishes a cognitively grounded, semantic-graph-based paradigm for interpretable evaluation of moral representations and empirically demonstrates latent moral misalignment in LLMs.
📝 Abstract
As the impact of large language models (LLMs) increases, understanding the moral values they reflect becomes ever more important. Assessing these values via direct prompting is challenging due to the potential leakage of human norms into model training data and the models' sensitivity to prompt formulation. Instead, we propose to use word associations, which have been shown to reflect moral reasoning in humans, as low-level underlying representations from which to obtain a more robust picture of LLMs' moral reasoning. We study moral differences in associations from Western English-speaking communities and LLMs trained predominantly on English data. First, we create a large dataset of LLM-generated word associations, modeled on an existing dataset of human word associations. Next, we propose a novel method to propagate moral values, based on seed words derived from Moral Foundations Theory, through the human- and LLM-generated association graphs. Finally, we compare the resulting moral conceptualizations, highlighting detailed but systematic differences between the moral values that emerge from English speakers' and LLMs' associations.
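The seed-based propagation step can be sketched as follows. Everything here is an illustrative assumption rather than the paper's exact algorithm: the toy association graph, the single seed word, and the personalized-PageRank-style update rule are all made up for demonstration.

```python
# Sketch of propagating a moral seed through a word-association graph.
# The graph, weights, and update rule are illustrative assumptions,
# not the paper's actual data or method.

def propagate(graph, seeds, alpha=0.85, iters=50):
    """Spread seed mass through a weighted association graph.

    Each iteration, a word keeps (1 - alpha) of its seed score and
    receives alpha times the score flowing in from its neighbors,
    weighted by association strength (outgoing weights sum to 1).
    """
    scores = {w: seeds.get(w, 0.0) for w in graph}
    for _ in range(iters):
        scores = {
            w: (1 - alpha) * seeds.get(w, 0.0)
               + alpha * sum(scores[u] * nbrs.get(w, 0.0)
                             for u, nbrs in graph.items())
            for w in graph
        }
    return scores

# Toy cue -> {response: normalized strength} graph (made-up numbers).
graph = {
    "care": {"help": 0.6, "harm": 0.4},
    "help": {"care": 0.5, "kind": 0.5},
    "harm": {"hurt": 0.7, "care": 0.3},
    "hurt": {"harm": 0.8, "pain": 0.2},
    "kind": {"help": 1.0},
    "pain": {"hurt": 1.0},
}

# Seed the virtue pole of the Care/Harm foundation and rank all words
# by how strongly the seed's mass reaches them.
care_scores = propagate(graph, {"care": 1.0})
ranking = sorted(care_scores, key=care_scores.get, reverse=True)
print(ranking)
```

Run on the human graph and the LLM graph separately, per-foundation scores of this kind give each word a comparable moral profile in both networks, which is the basis for the cross-population comparison described above.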