🤖 AI Summary
This study addresses the challenge of identifying polarization in online texts within multilingual and multicultural contexts. It presents the first systematic definition and annotation of polarization at scale across multiple languages, covering its presence, types, and manifestations. The authors construct a multilingual, multi-label annotated dataset of over 110,000 instances spanning 22 languages and formulate three subtasks to evaluate computational approaches. They organized an international shared task that attracted over 1,000 participants worldwide and more than 10,000 submissions, with final submissions from 67 teams. The released dataset and baseline results establish foundational resources for future research on computational modeling of polarization across cultures.
📝 Abstract
We present SemEval-2026 Task 9, a shared task on online polarization detection, covering 22 languages and comprising over 110K annotated instances. Each data instance is multi-labeled with the presence of polarization, polarization type, and polarization manifestation. Participants were asked to predict labels in three subtasks: (1) detecting the presence of polarization, (2) identifying the type of polarization, and (3) recognizing the polarization manifestation. The three subtasks attracted over 1,000 participants worldwide and more than 10K submissions on Codabench. We received final submissions from 67 teams and 73 system description papers. We report the baseline results and analyze the performance of the best-performing systems, highlighting the most common approaches and the most effective methods across different subtasks and languages. The dataset of this task is publicly available.
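To make the three-subtask formulation concrete, the sketch below shows one way a single annotated instance could be mapped to prediction targets. It is an illustration only: the `Instance` class, its field names, the `subtask_targets` helper, and the placeholder label inventories are all hypothetical and are not taken from the released dataset. It also assumes that type and manifestation can each carry more than one label per instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One annotated text. Field names are hypothetical, not the dataset schema."""
    text: str
    lang: str
    polarized: bool
    types: frozenset = frozenset()
    manifestations: frozenset = frozenset()


def subtask_targets(inst, type_labels, manifestation_labels):
    """Return the targets for subtasks 1, 2, and 3 as 0/1 values.

    Subtask 1 is a single binary label; subtasks 2 and 3 are encoded here
    as multi-hot vectors over a fixed label order (an assumption).
    """
    presence = int(inst.polarized)
    type_vec = [int(label in inst.types) for label in type_labels]
    manif_vec = [int(label in inst.manifestations) for label in manifestation_labels]
    return presence, type_vec, manif_vec


# Placeholder label inventories, NOT the task's actual label sets.
TYPE_LABELS = ["type_a", "type_b", "type_c"]
MANIFESTATION_LABELS = ["manifestation_x", "manifestation_y"]

example = Instance(
    text="...",
    lang="en",
    polarized=True,
    types=frozenset({"type_b"}),
    manifestations=frozenset({"manifestation_x", "manifestation_y"}),
)
targets = subtask_targets(example, TYPE_LABELS, MANIFESTATION_LABELS)
```

Under these assumptions, a system for subtask 1 would predict only the first element, while systems for subtasks 2 and 3 would predict the corresponding vectors.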