Beyond English: Evaluating Automated Measurement of Moral Foundations in Non-English Discourse with a Chinese Case Study

📅 2025-02-04

📈 Citations: 0

✨ Influential: 0

career value

185K/year

🤖 AI Summary

This study addresses the dominance of English resources and insufficient cross-lingual adaptation in Moral Foundations Theory (MFT) measurement. It systematically evaluates machine translation, bilingual lexicons, multilingual language models (mBERT/XLM-R), and large language models (LLMs; e.g., Qwen, ChatGLM) for automated MFT identification in Chinese. As the first work to conduct a cross-method, cross-lingual performance comparison, it reveals that translation-based and lexicon-based approaches incur 47% and 39% cultural semantic loss, respectively. LLMs achieve a 32% accuracy gain over baselines while requiring only one-fifth of annotated data, demonstrating superior cultural fidelity and data efficiency. The study underscores the necessity of human-in-the-loop validation and establishes a reproducible methodological framework and empirical benchmark for computational MFT measurement in non-English contexts.

Technology Category

Application Category

📝 Abstract

This study explores computational approaches for measuring moral foundations (MFs) in non-English corpora. Since most resources are developed primarily for English, cross-linguistic applications of moral foundation theory remain limited. Using Chinese as a case study, this paper evaluates the effectiveness of applying English resources to machine translated text, local language lexicons, multilingual language models, and large language models (LLMs) in measuring MFs in non-English texts. The results indicate that machine translation and local lexicon approaches are insufficient for complex moral assessments, frequently resulting in a substantial loss of cultural information. In contrast, multilingual models and LLMs demonstrate reliable cross-language performance with transfer learning, with LLMs excelling in terms of data efficiency. Importantly, this study also underscores the need for human-in-the-loop validation of automated MF assessment, as the most advanced models may overlook cultural nuances in cross-language measurements. The findings highlight the potential of LLMs for cross-language MF measurements and other complex multilingual deductive coding tasks.

Problem

Research questions and friction points this paper is trying to address.

Measuring moral foundations in non-English texts

Evaluating computational methods for Chinese discourse

Assessing cross-language performance of language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses multilingual models for cross-language tasks

Applies large language models efficiently

Incorporates human-in-the-loop validation

🔎 Similar Papers

A Survey on Moral Foundation Theory and Pre-Trained Language Models: Current Advances and Challenges