🤖 AI Summary
This study addresses the urgent need to systematically evaluate the human rights compliance of large language models (LLMs) in high-stakes human–AI interactions, where they may inadvertently violate principles enshrined in the Universal Declaration of Human Rights. The authors construct a novel benchmark of 1,152 synthetically generated scenarios spanning 24 human rights provisions across eight languages, enabling the first cross-lingual, multidimensional quantitative analysis of mainstream LLMs' behavioral tendencies in human rights trade-offs. Combining synthetic data generation, multilingual prompt engineering, and a mixed-method evaluation that pairs Likert-scale ratings with open-ended responses, the study finds that models are more permissive toward restrictions on economic, social, and cultural rights than toward civil and political rights; endorse rights-limiting actions significantly more often in Chinese and Hindi than in English or Romanian; and are highly sensitive to prompt formulation. This work establishes a scalable, cross-cultural framework for AI ethics assessment.
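The paper itself does not publish code, but the evaluation setup it describes can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the `Scenario` fields, the Likert instruction wording, the `parse_likert` heuristic, and the `query_llm` callable are hypothetical stand-ins, not the authors' implementation.

```python
import re
from dataclasses import dataclass

# Hypothetical record mirroring the benchmark's described structure:
# 24 UDHR articles x 8 languages x several scenarios = 1,152 items.
@dataclass
class Scenario:
    udhr_article: int   # e.g. 19 (freedom of opinion and expression)
    language: str       # e.g. "en", "zh", "hi", "ro"
    text: str           # synthetically generated rights trade-off scenario

# Assumed closed-form probe; the actual instruction text is not given in the abstract.
LIKERT_INSTRUCTION = (
    "On a scale of 1 (strongly oppose) to 5 (strongly support), how acceptable "
    "is the rights-limiting action described above? Answer with a single number."
)

def likert_prompt(scenario: Scenario) -> str:
    """Assemble a Likert-scale probe for one scenario."""
    return f"{scenario.text}\n\n{LIKERT_INSTRUCTION}"

def parse_likert(response: str) -> int | None:
    """Pull the first digit 1-5 out of a free-text reply, if any."""
    match = re.search(r"[1-5]", response)
    return int(match.group()) if match else None

def evaluate(scenarios: list[Scenario], query_llm) -> dict:
    """Score each scenario; `query_llm` is a stand-in for any chat-model API."""
    ratings = {}
    for s in scenarios:
        reply = query_llm(likert_prompt(s))
        ratings[(s.udhr_article, s.language)] = parse_likert(reply)
    return ratings
```

The mixed-method design reported in the study would pair this closed-form pass with a second, open-ended pass over the same scenarios, precisely because the two elicitation formats can yield diverging preferences.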
📝 Abstract
As Large Language Models (LLMs) increasingly mediate global information access and hold the potential to shape public discourse, their alignment with universal human rights principles becomes critical to ensuring that these rights are upheld in high-stakes AI-mediated interactions. In this paper, we evaluate how LLMs navigate trade-offs involving the Universal Declaration of Human Rights (UDHR), leveraging 1,152 synthetically generated scenarios across 24 rights articles and eight languages. Our analysis of eleven major LLMs reveals systematic biases: models (1) accept limiting Economic, Social, and Cultural rights more often than Civil and Political rights; (2) demonstrate significant cross-linguistic variation, with elevated endorsement rates of rights-limiting actions in Chinese and Hindi compared to English or Romanian; (3) show substantial susceptibility to prompt-based steering; and (4) exhibit noticeable differences between Likert-scale and open-ended responses, highlighting critical challenges in LLM preference assessment.
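Finding (2) is an aggregate comparison across languages. A hedged sketch of how such per-language endorsement rates could be computed from the ratings above follows; the `threshold=4` endorsement cutoff and the input format are assumptions, not the paper's stated method.

```python
from collections import defaultdict
from statistics import mean

def endorsement_rate_by_language(ratings: dict, threshold: int = 4) -> dict:
    """Share of scenarios per language in which the model endorses the
    rights-limiting action (Likert score >= threshold). `ratings` maps
    (udhr_article, language) -> 1-5 score or None, as produced above."""
    by_lang = defaultdict(list)
    for (_, lang), score in ratings.items():
        if score is not None:  # skip unparseable replies
            by_lang[lang].append(score >= threshold)
    return {lang: mean(flags) for lang, flags in by_lang.items()}

# Illustrative only: the abstract's finding (2) would surface as, e.g.,
# rates["zh"] > rates["en"] and rates["hi"] > rates["ro"] for a given model.
```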