🤖 AI Summary
This study investigates fairness in political speech translation by multilingual large language models (LLMs), focusing on disparities in translation quality between mainstream and fringe political parties across the 21 languages of the European Parliament corpus. We introduce the first multilingual EuroParl dataset annotated with each speaker's political orientation (left/center/right) and marginality status. Using machine translation evaluation metrics and statistical modeling of party affiliation, we systematically quantify translation bias. Results reveal significantly higher translation accuracy for mainstream parties, while fringe-party speeches suffer systematic quality degradation, exposing latent political bias in LLMs. Our key contribution is a novel "political-linguistic" dual-dimension fairness evaluation framework that enables rigorous, reproducible assessment of political neutrality in multilingual AI systems. This work establishes a benchmark for auditing sociopolitical fairness in cross-lingual NLP models.
📝 Abstract
The political biases of Large Language Models (LLMs) are usually assessed by simulating their answers to English surveys. In this work, we propose an alternative framing of political biases, relying on principles of fairness in multilingual translation. We systematically compare the translation quality of speeches in the European Parliament (EP) and observe systematic differences: speeches by majority parties from the left, center, and right are translated better than those by outsider parties. This study is made possible by a new, 21-way multiparallel version of EuroParl, the parliamentary proceedings of the EP, which includes the political affiliation of each speaker. The dataset consists of 1.5M sentences, for a total of 40M words and 249M characters. It covers three years, 1000+ speakers, 7 countries, 12 EU parties, 25 EU committees, and hundreds of national parties.
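The comparison described above, scoring each translated sentence and contrasting group means by party status, can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the quality function here is a crude surface-similarity stand-in (the study would use established MT metrics such as chrF or COMET), and the records are hypothetical toy examples rather than EuroParl data.

```python
# Sketch of a group-level translation-quality comparison by party status.
# proxy_quality is an illustrative stand-in for a real MT metric; the toy
# records below are invented for demonstration only.
from difflib import SequenceMatcher
from statistics import mean, stdev
from math import sqrt

def proxy_quality(hypothesis: str, reference: str) -> float:
    """Crude surface-similarity proxy (in [0, 1]) for an MT quality metric."""
    return SequenceMatcher(None, hypothesis, reference).ratio()

def group_scores(records):
    """Group per-sentence scores by party status.

    records: iterable of (party_status, hypothesis, reference) triples.
    """
    groups = {}
    for status, hyp, ref in records:
        groups.setdefault(status, []).append(proxy_quality(hyp, ref))
    return groups

def welch_t(a, b):
    """Welch's t statistic for the difference in group means."""
    va, vb = stdev(a) ** 2 / len(a), stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)

# Toy data: hypothetical system outputs paired with reference translations.
records = [
    ("mainstream", "the council adopted the budget", "the council adopted the budget"),
    ("mainstream", "we fully support the proposal", "we fully support this proposal"),
    ("outsider", "parliament rejects amendment the", "the parliament rejects the amendment"),
    ("outsider", "this law harm farmers", "this law harms small farmers"),
]
groups = group_scores(records)
gap = mean(groups["mainstream"]) - mean(groups["outsider"])
print(f"quality gap (mainstream - outsider): {gap:.3f}")
```

In the actual study, the per-sentence scores would be regressed on party affiliation and other speaker covariates rather than compared with a single two-group statistic.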