PartisanLens: A Multilingual Dataset of Hyperpartisan and Conspiratorial Immigration Narratives in European Media

📅 2026-01-07
🏛️ arXiv.org
🤖 AI Summary
This study addresses the scarcity of systematic analyses of hyperpartisanship and the “great replacement” conspiracy theory in multilingual European political discourse, where existing research has predominantly focused on English. The authors construct the first multilingual dataset of 1,617 news headlines in Spanish, Italian, and Portuguese, annotated across multiple dimensions of political discourse. They evaluate widely used large language models (LLMs) both as classifiers and as automatic annotators, establishing baselines for detecting hyperpartisan and conspiratorial narratives. They also condition LLMs on socio-economic and ideological profiles to test whether the models can emulate human annotation patterns from different perspectives. The dataset and evaluation are released to support research on political discourse in European contexts.

📝 Abstract
Detecting hyperpartisan narratives and Population Replacement Conspiracy Theories (PRCT) is essential to addressing the spread of misinformation. These complex narratives pose a significant threat: hyperpartisanship drives political polarisation and institutional distrust, while PRCTs directly motivate real-world extremist violence, making their identification critical for social cohesion and public safety. However, existing resources are scarce, predominantly English-centric, and often analyse hyperpartisanship, stance, and rhetorical bias in isolation rather than as interrelated aspects of political discourse. To bridge this gap, we introduce PartisanLens, the first multilingual dataset of 1,617 hyperpartisan news headlines in Spanish, Italian, and Portuguese, annotated for multiple aspects of political discourse. We first evaluate the classification performance of widely used Large Language Models (LLMs) on this dataset, establishing robust baselines for the classification of hyperpartisan and PRCT narratives. In addition, we assess the viability of using LLMs as automatic annotators for this task, analysing their ability to approximate human annotation. Results highlight both their potential and current limitations. Next, moving beyond standard judgments, we explore whether LLMs can emulate human annotation patterns by conditioning them on socio-economic and ideological profiles that simulate annotator perspectives. Finally, we release our resources and evaluation. PartisanLens supports future research on detecting partisan and conspiratorial narratives in European contexts.
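The abstract does not say how agreement between LLM and human annotation is measured. As a minimal sketch, assuming Cohen's kappa as one common chance-corrected agreement measure (not necessarily the one the authors use), the comparison could look like this. The function name `cohen_kappa` and the label values are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(human, model):
    """Cohen's kappa between two equal-length sequences of labels."""
    if len(human) != len(model) or not human:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(human)
    # Fraction of items on which the two annotators agree.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    human_freq = Counter(human)
    model_freq = Counter(model)
    expected = sum(human_freq[label] * model_freq[label] for label in human_freq) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four headlines (illustrative only, not real data).
human_labels = ["hyperpartisan", "neutral", "hyperpartisan", "neutral"]
llm_labels = ["hyperpartisan", "neutral", "neutral", "neutral"]
kappa = cohen_kappa(human_labels, llm_labels)
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which makes it easier to compare an LLM annotator against human annotators than raw accuracy when label distributions are skewed.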
Problem

Research questions and friction points this paper is trying to address.

hyperpartisan narratives
Population Replacement Conspiracy Theories
misinformation
political polarisation
multilingual dataset
Innovation

Methods, ideas, or system contributions that make the work stand out.

multilingual dataset
hyperpartisan narratives
conspiracy theories
large language models
automatic annotation
M. Maggini
Centro Singular de Investigación en Tecnoloxías Intelixentes da USC
Paloma Piot
MSCA PhD Candidate
Hate Speech detection, NLP
Anxo Pérez
IRLab, CITIC Research Centre, Universidade da Coruña
Erik Bran Marino
Universidade de Évora
Lúa Santamaría Montesinos
Universidad de La Rioja
Ana Lisboa
GESIS Leibniz Institute for the Social Sciences
Marta Vázquez Abuín
Centro Singular de Investigación en Tecnoloxías Intelixentes da USC
Javier Parapar
Information Retrieval Lab - CITIC - University of A Coruña
Information Retrieval, Recommender Systems, Text Mining
P. Gamallo
Centro Singular de Investigación en Tecnoloxías Intelixentes da USC