🤖 AI Summary
This paper addresses SemEval 2025 Task 7, multilingual retrieval of previously fact-checked claims: given a social media claim, the system retrieves semantically matching, human-verified fact-checks from the multilingual MultiClaim dataset. To overcome the limits of zero-shot cross-lingual retrieval, we propose a supervised fine-tuning framework that integrates machine translation. Using mBERT or XLM-R as backbones, we introduce back-translation-enhanced cross-lingual alignment and bilingual collaborative optimization. The method improves retrieval in both monolingual and cross-lingual settings, reaching 92% and 85% accuracy, respectively, which substantially outperforms the baselines and ranks among the top systems on the official leaderboard. The core contribution is to model translation fidelity explicitly within retrieval fine-tuning, unifying semantic alignment and domain adaptation in a single optimization objective.
📝 Abstract
This paper describes our system for SemEval 2025 Task 7: Previously Fact-Checked Claim Retrieval. The task requires retrieving relevant fact-checks for a given input claim from the extensive, multilingual MultiClaim dataset, which comprises social media posts and fact-checks in several languages. To address this challenge, we first evaluated zero-shot performance using state-of-the-art English and multilingual retrieval models and then fine-tuned the most promising systems, leveraging machine translation to enhance crosslingual retrieval. Our best model achieved an accuracy of 85% on crosslingual data and 92% on monolingual data.
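The retrieval-and-scoring loop described in the abstract can be illustrated with a minimal sketch. It assumes that posts and fact-checks have already been embedded as vectors by some retrieval model, that ranking uses cosine similarity, and that "accuracy" means the share of posts with at least one relevant fact-check among the top k results. The abstract does not define the metric or the ranking function, so these are assumptions, and the vectors, identifiers, and helper names (`retrieve`, `success_at_k`) are hypothetical toy values rather than the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: Vector, fact_checks: Dict[str, Vector], k: int) -> List[str]:
    """Return the ids of the k fact-checks most similar to the query."""
    ranked = sorted(
        fact_checks,
        key=lambda fc_id: cosine(query, fact_checks[fc_id]),
        reverse=True,
    )
    return ranked[:k]


def success_at_k(
    posts: Dict[str, Vector],
    fact_checks: Dict[str, Vector],
    gold: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of posts with at least one gold fact-check in the top k."""
    if not posts:
        return 0.0
    hits = 0
    for post_id, vector in posts.items():
        top_k = set(retrieve(vector, fact_checks, k))
        if gold[post_id] & top_k:
            hits += 1
    return hits / len(posts)


# Toy embeddings (hypothetical); a real system would produce these with
# a zero-shot or fine-tuned multilingual encoder.
FACT_CHECKS = {
    "fc1": [1.0, 0.0, 0.0],
    "fc2": [0.0, 1.0, 0.0],
    "fc3": [0.0, 0.0, 1.0],
}
POSTS = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.1, 0.2, 0.9],
}
GOLD = {"p1": {"fc1"}, "p2": {"fc2"}}

if __name__ == "__main__":
    for k in (1, 2):
        print(f"success@{k} =", success_at_k(POSTS, FACT_CHECKS, GOLD, k))
```

Under this reading, the monolingual and crosslingual figures would come from running the same loop on two subsets of posts: those whose relevant fact-checks share the post's language and those whose fact-checks are in a different language.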