mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

📅 2026-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses early detection of online polarization, which is often followed by hate speech and social fragmentation in multilingual and multicultural contexts. The authors finetune mid-size large language models with QLoRA, a parameter-efficient finetuning technique, and frame the problem as sequence classification. The training data are augmented with anonymized, lower-cased, upper-cased, and homoglyph-substituted copies of the original texts to make detection more robust. The system was built for SemEval-2026 Task 9, which covers 22 languages and three subtasks: polarization detection, type, and manifestation.
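To make the finetuning setup concrete, the following is a minimal sketch of how a QLoRA sequence classifier is commonly assembled with the Hugging Face `transformers` and `peft` libraries. It is not the authors' configuration: the checkpoint name is a placeholder, and the label count, rank, scaling factor, and dropout are assumed values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QLoRASettings:
    # Every value here is an illustrative assumption, not taken from the paper.
    model_name: str = "your-org/your-multilingual-llm"  # placeholder checkpoint id
    num_labels: int = 2    # e.g. polarized vs. not polarized
    rank: int = 16         # LoRA rank r
    alpha: int = 32        # LoRA scaling factor
    dropout: float = 0.05  # dropout applied inside the adapters


def build_qlora_classifier(cfg: QLoRASettings):
    """Load a 4-bit quantized base model and attach trainable LoRA adapters."""
    # Heavy dependencies are imported lazily. Running this function requires
    # torch, transformers, peft, bitsandbytes, and a CUDA-capable GPU.
    import torch
    from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
    from peft import (
        LoraConfig,
        TaskType,
        get_peft_model,
        prepare_model_for_kbit_training,
    )

    # Quantize the frozen base weights to 4-bit NF4; compute in bfloat16.
    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForSequenceClassification.from_pretrained(
        cfg.model_name,
        num_labels=cfg.num_labels,
        quantization_config=quant,
    )
    model = prepare_model_for_kbit_training(model)

    # target_modules is omitted so that PEFT falls back to its defaults for
    # architectures it knows; unknown architectures need it set explicitly.
    lora = LoraConfig(
        task_type=TaskType.SEQ_CLS,
        r=cfg.rank,
        lora_alpha=cfg.alpha,
        lora_dropout=cfg.dropout,
    )
    return get_peft_model(model, lora)
```

Only the adapter weights and the classification head are trained in this arrangement, while the quantized base model stays frozen, which is what keeps the memory footprint small enough for mid-size models.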

📝 Abstract

SemEval-2026 Task 9 focuses on multilingual polarization detection. Specifically, it covers the identification of multilingual, multicultural, and multi-event polarization along three axes, one per subtask: detection, type, and manifestation. Online polarization is a concern because it is often followed by hate speech, offensive discourse, and social fragmentation. Detecting it before it escalates is therefore crucial for a safer and more inclusive online space. We approached this SemEval task by finetuning mid-size LLMs for sequence classification using the QLoRA parameter-efficient finetuning technique. We augmented the multilingual training sets (22 languages) with anonymized, lower-cased, upper-cased, and homoglyph-substituted counterparts of the original texts, making the detection more robust.
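The augmentation described in the abstract can be sketched as follows. The abstract does not say how anonymization is done or which homoglyphs are used, so both are assumptions here: anonymization is taken to mean masking user mentions and URLs, and the homoglyph table is a small hypothetical Latin-to-Cyrillic map.

```python
import re

# Hypothetical look-alike map (Latin -> Cyrillic); the paper's table is not given.
HOMOGLYPHS = {
    "a": "\u0430", "c": "\u0441", "e": "\u0435",
    "o": "\u043e", "p": "\u0440", "x": "\u0445",
}

URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")


def anonymize(text: str) -> str:
    """Mask URLs and user mentions (an assumed reading of 'anonymized')."""
    text = URL.sub("[URL]", text)
    return MENTION.sub("[USER]", text)


def homoglyph(text: str) -> str:
    """Swap selected Latin letters for visually similar Cyrillic ones."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def augment(text: str, label: int) -> list[tuple[str, int]]:
    """Return the original example plus its distinct augmented copies."""
    variants = [text, anonymize(text), text.lower(), text.upper(), homoglyph(text)]
    unique: list[str] = []
    for variant in variants:
        if variant not in unique:  # drop copies the transform left unchanged
            unique.append(variant)
    return [(variant, label) for variant in unique]


if __name__ == "__main__":
    for sample, sample_label in augment("@Bob says NO to https://example.com now", 1):
        print(sample_label, sample)
```

Every variant keeps the label of its source text, so the classifier sees the same target under surface changes in casing, identifiers, and character encoding. Dropping unchanged copies avoids duplicating examples in scripts without case distinctions.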

Problem

Research questions and friction points this paper is trying to address.

multilingual polarization detection

online polarization

hate speech

social fragmentation

polarization manifestation

Innovation

Methods, ideas, or system contributions that make the work stand out.

QLoRA

parameter-efficient finetuning

multilingual polarization detection

data augmentation

large language models
