PartisanLens: A Multilingual Dataset of Hyperpartisan and Conspiratorial Immigration Narratives in European Media

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the lack of systematic analyses of hyperpartisanship and the “great replacement” conspiracy theory in multilingual European political discourse, where existing research has focused predominantly on English. The authors construct the first multilingual dataset of its kind, comprising 1,617 news headlines in Spanish, Italian, and Portuguese annotated across multiple dimensions of political discourse. They evaluate widely used large language models (LLMs) both as classifiers and as automatic annotators compared against human annotation, establishing baselines and showing both the potential and the current limitations of LLMs on these narratives. They also condition LLMs on socio-economic and ideological profiles to test whether the models can emulate annotation patterns from different annotator perspectives. The dataset and evaluation are released to support research on partisan and conspiratorial narratives in European contexts.

📝 Abstract

Detecting hyperpartisan narratives and Population Replacement Conspiracy Theories (PRCT) is essential to addressing the spread of misinformation. These complex narratives pose a significant threat: hyperpartisanship drives political polarisation and institutional distrust, while PRCTs directly motivate real-world extremist violence, making their identification critical for social cohesion and public safety. However, existing resources are scarce, predominantly English-centric, and often analyse hyperpartisanship, stance, and rhetorical bias in isolation rather than as interrelated aspects of political discourse. To bridge this gap, we introduce PartisanLens, the first multilingual dataset of 1,617 hyperpartisan news headlines in Spanish, Italian, and Portuguese, annotated for multiple aspects of political discourse. We first evaluate the classification performance of widely used Large Language Models (LLMs) on this dataset, establishing robust baselines for the classification of hyperpartisan and PRCT narratives. In addition, we assess the viability of using LLMs as automatic annotators for this task, analysing their ability to approximate human annotation. Results highlight both their potential and current limitations. Next, moving beyond standard judgments, we explore whether LLMs can emulate human annotation patterns by conditioning them on socio-economic and ideological profiles that simulate annotator perspectives. Finally, we release our resources and evaluation so that PartisanLens can support future research on detecting partisan and conspiratorial narratives in European contexts.
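As a rough illustration of the two ideas in the abstract (conditioning an LLM on an annotator profile, and checking how closely LLM labels approximate human ones), the following minimal Python sketch may help. It is not the authors' pipeline: the profile fields, the prompt wording, the label names, and the choice of Cohen's kappa as the agreement measure are all assumptions made for this example, and no actual model call is shown.

```python
from collections import Counter


def build_profile_prompt(headline: str, profile: dict) -> str:
    """Build a persona-conditioned annotation prompt.

    The profile keys and the prompt wording are hypothetical; the abstract
    does not specify the prompts used in the paper.
    """
    persona = ", ".join(f"{key}: {value}" for key, value in profile.items())
    return (
        f"You are an annotator with this profile ({persona}).\n"
        f"Headline: {headline}\n"
        "Is this headline hyperpartisan? Answer 'hyperpartisan' or 'neutral'."
    )


def cohen_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa between two annotators over the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(counts_a) | set(counts_b)
    )
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotator profile and toy labels, for illustration only.
profile = {"age": 45, "occupation": "teacher", "ideology": "centre-left"}
prompt = build_profile_prompt("Example headline text", profile)

human_labels = ["hyperpartisan", "neutral", "hyperpartisan", "neutral"]
llm_labels = ["hyperpartisan", "neutral", "neutral", "neutral"]
kappa = cohen_kappa(human_labels, llm_labels)
```

In this toy case the two annotators agree on three of four items, and chance agreement is 0.5, so kappa comes out at 0.5. With a real model, `llm_labels` would be collected by sending each prompt to the LLM and parsing its answer.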

Problem

Research questions and friction points this paper is trying to address.

hyperpartisan narratives

Population Replacement Conspiracy Theories

misinformation

political polarisation

multilingual dataset

Innovation

Methods, ideas, or system contributions that make the work stand out.

multilingual dataset

hyperpartisan narratives

conspiracy theories