mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

📅 2026-05-04

🤖 AI Summary
This study addresses the need for early detection of online polarization, which can escalate into hate speech and social fragmentation in multilingual and multicultural contexts. The authors finetune mid-size multilingual large language models with QLoRA, a parameter-efficient finetuning technique, and frame the problem as sequence classification. The training sets are augmented with anonymized, lower-cased, upper-cased, and homoglyph-substituted copies of each example to make detection more robust. The resulting system covers 22 languages and addresses the three subtasks of SemEval-2026 Task 9: detection, type, and manifestation of polarization.
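As a rough illustration of what QLoRA finetuning for sequence classification can look like, here is a minimal sketch using the Hugging Face `transformers` and `peft` libraries. The summary does not name the base models, the toolkit, or any hyperparameters, so the function name `build_qlora_classifier`, the `QLORA_DEFAULTS` values, and the choice of libraries are all assumptions made for illustration, not the authors' configuration.

```python
# Hypothetical hyperparameters; the source does not report the actual values.
QLORA_DEFAULTS = {
    "quant_type": "nf4",   # 4-bit NormalFloat quantization of the frozen base weights
    "r": 16,               # rank of the low-rank adapter matrices
    "lora_alpha": 32,      # scaling factor applied to the adapter update
    "lora_dropout": 0.05,  # dropout on the adapter input
}


def build_qlora_classifier(model_name: str, num_labels: int):
    """Load a 4-bit quantized base model and attach LoRA adapters for classification.

    `model_name` is any checkpoint id supplied by the caller; none is assumed here.
    """
    # Imported lazily so this file can be read without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
    from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type=QLORA_DEFAULTS["quant_type"],
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    # The base weights stay frozen and quantized; a classification head is added on top.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,
        quantization_config=bnb_config,
    )
    model = prepare_model_for_kbit_training(model)

    # No target_modules given: PEFT falls back to its per-architecture defaults.
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_CLS,
        r=QLORA_DEFAULTS["r"],
        lora_alpha=QLORA_DEFAULTS["lora_alpha"],
        lora_dropout=QLORA_DEFAULTS["lora_dropout"],
    )
    # Only the adapter weights and the classification head remain trainable.
    return get_peft_model(model, lora_config)
```

Calling `build_qlora_classifier(name, num_labels)` needs a GPU with `bitsandbytes` installed and downloads the chosen checkpoint. Decoder-only checkpoints typically also need a padding token set on both the tokenizer and `model.config.pad_token_id` before batched training.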
📝 Abstract
SemEval-2026 Task 9 focuses on multilingual polarization detection. Specifically, it covers the identification of multilingual, multicultural, and multievent polarization along three axes (one per subtask): detection, type, and manifestation. Online polarization is a concern because it is often followed by hate speech, offensive discourse, and social fragmentation. Detecting it before it escalates is therefore crucial for a safer and more inclusive online space. We addressed this SemEval task by finetuning mid-size LLMs for sequence classification using the QLoRA parameter-efficient finetuning technique. We augmented the multilingual (22 languages) training sets with anonymized, lower-cased, upper-cased, and homoglyph-substituted counterparts, making the detection more robust.
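The four augmentation variants named in the abstract can be sketched in plain Python as below. The abstract does not say how anonymization or homoglyph substitution were implemented, so the regular expressions, the `[USER]` and `[URL]` placeholders, the small Latin-to-Cyrillic `HOMOGLYPHS` map, and the function names are hypothetical choices for illustration only.

```python
import re

# Hypothetical, deliberately small map of Latin letters to Cyrillic look-alikes.
HOMOGLYPHS = {
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445", "y": "\u0443",
}
HOMOGLYPH_TABLE = str.maketrans(HOMOGLYPHS)

# Assumed anonymization targets: URLs and @-mentions.
URL_RE = re.compile(r"https?://\S+")
USER_RE = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Replace URLs and @-mentions with placeholder tokens."""
    text = URL_RE.sub("[URL]", text)
    return USER_RE.sub("[USER]", text)


def homoglyph(text: str) -> str:
    """Swap selected Latin letters for visually similar Cyrillic ones."""
    return text.translate(HOMOGLYPH_TABLE)


def augment(text: str) -> list[str]:
    """Return the original text followed by its distinct augmented variants."""
    variants = [text]
    for candidate in (anonymize(text), text.lower(), text.upper(), homoglyph(text)):
        # Skip variants that add nothing new (e.g. lower-casing lower-case text).
        if candidate not in variants:
            variants.append(candidate)
    return variants


if __name__ == "__main__":
    for variant in augment("Hello @bob see https://example.org now"):
        print(variant)
```

Each variant would keep the label of the example it was derived from. Scripts without letter case, or without Latin letters, produce fewer distinct variants, which is why duplicates are dropped instead of being added to the training set.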
Problem

Research questions and friction points this paper is trying to address.

multilingual polarization detection
online polarization
hate speech
social fragmentation
polarization manifestation
Innovation

Methods, ideas, or system contributions that make the work stand out.

QLoRA
parameter-efficient finetuning
multilingual polarization detection
data augmentation
large language models