🤖 AI Summary
This work addresses the limited ability of current large language models to recognize deep-seated values in cross-lingual content safety evaluation and their lack of systematic assessment across diverse global value systems. To bridge this gap, we introduce X-Value, the first large-scale cross-lingual benchmark for value alignment evaluation, comprising over 5,000 question-answer pairs across 18 languages. Grounded in Schwartz's theory of basic human values, X-Value organizes value dimensions into seven core categories and features a novel two-stage human annotation framework that integrates both global consensus and cultural pluralism. Experimental results reveal that state-of-the-art models achieve less than 77% accuracy on X-Value, with cross-lingual performance gaps exceeding 20%, highlighting significant limitations in their capacity to understand values across cultures.
📄 Abstract
While large language models (LLMs) have become pivotal to content safety, current evaluation paradigms primarily focus on detecting explicit harms (e.g., violence or hate speech), neglecting the subtler value dimensions conveyed in digital content. To bridge this gap, we introduce X-Value, a novel Cross-lingual Values Assessment Benchmark designed to evaluate LLMs' ability to assess deep-level values of content from a global perspective. X-Value consists of more than 5,000 QA pairs across 18 languages, systematically organized into 7 core domains grounded in Schwartz's Theory of Basic Human Values and categorized into easy and hard levels for discriminative evaluation. We further propose a unique two-stage annotation framework that first identifies whether an issue falls under global consensus (e.g., human rights) or pluralism (e.g., religion), and subsequently conducts a multi-party evaluation of the latent values embedded within the content. Systematic evaluations on X-Value reveal that current SOTA LLMs exhibit deficiencies in cross-lingual values assessment ($Acc < 77\%$), with significant performance disparities across different languages ($\Delta Acc > 20\%$). This work highlights the urgent need to improve the nuanced, values-aware content assessment capability of LLMs. Our X-Value is available at: https://huggingface.co/datasets/Whitolf/X-Value.