🤖 AI Summary
This study addresses the need for early detection of online polarization, which can escalate into hate speech and social fragmentation in multilingual and multicultural contexts. The authors fine-tune multilingual large language models for sequence classification using QLoRA, a parameter-efficient fine-tuning approach. The training data is augmented with anonymized, case-transformed, and homoglyph-substituted variants of each example to improve model robustness and cross-lingual generalization. The resulting polarization detection system supports 22 languages and demonstrates strong performance across all three subtasks of SemEval-2026 Task 9, offering an effective solution for multilingual polarization analysis.
📝 Abstract
SemEval-2026 Task 9 focuses on multilingual polarization detection. Specifically, it covers the identification of multilingual, multicultural, and multi-event polarization along three axes, one per subtask: detection, type, and manifestation. Online polarization is a concern because it is often followed by hate speech, offensive discourse, and social fragmentation. Detecting it before it escalates is therefore crucial for a safer and more inclusive online space. We addressed this task by fine-tuning mid-size LLMs for sequence classification using the QLoRA parameter-efficient fine-tuning technique. We augmented the multilingual training sets (22 languages) with anonymized, lower-cased, upper-cased, and homoglyph-substituted counterparts of each example, making the detection more robust.
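
The augmentation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how anonymization or homoglyph substitution was performed, so the placeholder tokens (`@USER`, `[URL]`), the regular expressions, the small Latin-to-Cyrillic `HOMOGLYPHS` table, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical homoglyph table: Latin letters mapped to visually similar
# Cyrillic letters. The mapping actually used in the paper is not stated.
HOMOGLYPHS = {"a": "а", "c": "с", "e": "е", "o": "о", "p": "р", "x": "х", "y": "у"}
HOMOGLYPH_TABLE = str.maketrans(HOMOGLYPHS)

# Assumed anonymization targets: URLs and @-mentions.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Replace URLs and user mentions with placeholder tokens."""
    text = URL_RE.sub("[URL]", text)
    return MENTION_RE.sub("@USER", text)


def homoglyph(text: str) -> str:
    """Swap selected lowercase Latin letters for look-alike characters."""
    return text.translate(HOMOGLYPH_TABLE)


def augment(text: str, label: int) -> list[tuple[str, int]]:
    """Return the original example plus its distinct augmented variants.

    Every variant keeps the label of the original example.
    """
    variants = [text, anonymize(text), text.lower(), text.upper(), homoglyph(text)]
    seen: set[str] = set()
    examples = []
    for variant in variants:
        if variant not in seen:  # skip variants identical to an earlier one
            seen.add(variant)
            examples.append((variant, label))
    return examples


if __name__ == "__main__":
    for variant, label in augment("@Bob see https://x.io now", 1):
        print(label, variant)
```

Duplicates are dropped so that text an augmentation leaves unchanged (for example, an already lower-cased sentence, or a script without case distinctions) does not get extra weight in the training set. Whether the original system deduplicated in this way is not stated in the abstract.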