🤖 AI Summary
This study addresses the challenge of identifying polarizing content in multilingual, multicultural social media through three subtasks (binary polarization detection, target classification, and manifestation identification) across 22 languages, as part of SemEval-2026 Task 9. The authors propose a heterogeneous ensemble of multilingual pretrained models (XLM-RoBERTa-large and mDeBERTa-v3-base) and investigate multi-task learning, translation-based data augmentation, and class weighting as ways to cope with severe label imbalance. They report that modeling each task independently, combined with class weighting, is the more effective configuration.
📝 Abstract
This paper presents our system for SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization, which identifies polarized social media content in 22 languages through three subtasks: binary detection, target classification, and manifestation identification. We propose a heterogeneous ensemble of multilingual pretrained models, combining XLM-RoBERTa-large and mDeBERTa-v3-base. We investigate multi-task learning, translation-based data augmentation, and class weighting as techniques for improving classification performance under severe label imbalance. Our findings indicate that modeling each task independently, combined with class weighting, is more effective than the alternatives we investigated.
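The abstract does not say how the class weights are computed or how the two models' outputs are combined, so the following is only a minimal sketch under stated assumptions: inverse-frequency weights (the same heuristic scikit-learn calls "balanced") and an unweighted average of per-model class probabilities. The function names, the toy labels, and the probability values are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Sequence


def inverse_frequency_weights(labels: Sequence[int], num_classes: int) -> list[float]:
    """Assumed scheme: weight_c = n_samples / (num_classes * count_c)."""
    counts = Counter(labels)
    total = len(labels)
    # Classes absent from the data get weight 0.0 to avoid division by zero.
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def average_probabilities(model_probs: Sequence[Sequence[float]]) -> list[float]:
    """Assumed combination rule: unweighted mean of per-model probabilities."""
    num_models = len(model_probs)
    return [sum(column) / num_models for column in zip(*model_probs)]


# Toy, made-up labels: 8 non-polarized (0) and 2 polarized (1) examples.
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels, num_classes=2)

# Hypothetical outputs of two models for one input: [P(class 0), P(class 1)].
probs_model_a = [0.60, 0.40]
probs_model_b = [0.30, 0.70]
ensemble = average_probabilities([probs_model_a, probs_model_b])

# Predicted class is the index of the highest averaged probability.
prediction = max(range(len(ensemble)), key=ensemble.__getitem__)
```

In this toy setup the minority class receives four times the weight of the majority class, so a weighted loss penalizes errors on polarized examples more heavily. Weights of this kind are typically passed to the loss function during fine-tuning, and each subtask would get its own weights when the tasks are modeled independently.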