🤖 AI Summary
To address the rapid spread of misinformation in multilingual (especially low-resource) social media, this paper introduces the first multilingual and cross-lingual claim retrieval task and evaluation framework for fact-checking. Methodologically, it integrates multilingual pretrained encoders, cross-lingual alignment of representations, retrieval-oriented fine-tuning, and zero-shot transfer techniques to enable unsupervised or weakly supervised cross-lingual matching. Its core contributions include: (1) the first systematically constructed low-resource-friendly multilingual fact-checking retrieval benchmark, advancing fact-checking from an English-centric paradigm toward global language coverage; (2) the organization of an international shared task, attracting 179 registered participants and 52 test submissions; and (3) state-of-the-art performance of 68.3% Recall@10 in multilingual settings, substantially outperforming all baselines.
📝 Abstract
The rapid spread of online disinformation presents a global challenge, and machine learning has been widely explored as a potential solution. However, multilingual settings and low-resource languages are often neglected in this field. To address this gap, we conducted a shared task on multilingual claim retrieval at SemEval 2025, aimed at identifying fact-checked claims that match newly encountered claims expressed in social media posts across different languages. The task includes two subtracks: (1) a monolingual track, where social media posts and claims are in the same language, and (2) a crosslingual track, where social media posts and claims might be in different languages. A total of 179 participants registered for the task, contributing 52 test submissions, and 23 of the 31 teams submitted system papers. In this paper, we report the best-performing systems as well as the most common and the most effective approaches across both subtracks. This shared task, along with its dataset and participating systems, provides valuable insights into multilingual claim retrieval and automated fact-checking, supporting future research in this field.
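
To make the retrieval setup concrete, the sketch below shows one way a system's ranked output could be scored at a cutoff k. It is an illustration only, not the task's official scorer: the function names, the post and fact-check IDs, and the per-post recall definition are all assumptions introduced here, and the shared task may define its metric differently.

```python
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant fact-check IDs found in the top-k ranked IDs."""
    if not relevant:
        raise ValueError("each post needs at least one relevant fact-check")
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(
    rankings: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Average recall@k over every post that has gold labels.

    A post missing from `rankings` is scored as an empty ranking (recall 0).
    """
    scores = [
        recall_at_k(rankings.get(post_id, []), relevant, k)
        for post_id, relevant in gold.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical toy data: each post ID maps to fact-check IDs, best match first.
rankings = {
    "post_es_1": ["fc_12", "fc_07", "fc_33"],
    "post_hi_2": ["fc_90", "fc_41", "fc_05"],
}
# Hypothetical gold labels: the fact-checks that truly match each post.
gold = {
    "post_es_1": {"fc_07"},
    "post_hi_2": {"fc_05", "fc_66"},
}

print(mean_recall_at_k(rankings, gold, k=2))
print(mean_recall_at_k(rankings, gold, k=3))
```

In the monolingual track the candidate fact-checks would share the post's language, while in the crosslingual track they might not; the scoring logic in this sketch is the same in both cases because it only compares IDs.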