🤖 AI Summary
This work addresses multilingual claim normalization for automated fact-checking: converting informal, noisy social media posts into concise, self-contained statements. The system, submitted to CLEF-2025 CheckThat! Task 2, pairs fine-tuned small language models (SLMs) for languages that have training data with large language model (LLM) prompting for languages that have none. It is evaluated on 20 languages (13 supervised, 7 zero-shot). By METEOR score, it ranks in the top three in 15 of the 20 languages, including eight second-place finishes, five of which are in zero-shot languages. For Portuguese, the initial development language, it places third with an average METEOR score of 0.5290. The zero-shot results indicate that LLM prompting is an effective strategy for claim normalization when no training data is available.
📝 Abstract
Claim normalization, the transformation of informal social media posts into concise, self-contained statements, is a crucial step in automated fact-checking pipelines. This paper details our submission to the CLEF-2025 CheckThat! Task 2, which challenges systems to perform claim normalization across twenty languages, divided into thirteen supervised (high-resource) and seven zero-shot (no training data) tracks.
Our approach, leveraging fine-tuned Small Language Models (SLMs) for supervised languages and Large Language Model (LLM) prompting for zero-shot scenarios, achieved podium positions (top three) in fifteen of the twenty languages. Notably, this included second-place rankings in eight languages, five of which were among the seven designated zero-shot languages, underscoring the effectiveness of our LLM-based zero-shot strategy. For Portuguese, our initial development language, our system achieved an average METEOR score of 0.5290, ranking third. All implementation artifacts, including inference, training, evaluation scripts, and prompt configurations, are publicly available at https://github.com/ju-resplande/checkthat2025_normalization.
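The split between supervised and zero-shot languages described above can be pictured as a simple dispatch step. The following is a minimal sketch under stated assumptions, not the authors' implementation: `build_router`, `fake_slm`, `fake_llm`, and the language codes `aa` and `bb` are hypothetical placeholders, and the two back ends are stubs standing in for a fine-tuned SLM and a prompted LLM.

```python
from typing import Callable, Iterable

# A normalizer maps a raw social media post to a normalized claim.
Normalizer = Callable[[str], str]


def build_router(
    supervised_langs: Iterable[str],
    slm_normalize: Normalizer,
    llm_normalize: Normalizer,
) -> Callable[[str, str], str]:
    """Return a function that picks a back end based on the language code.

    Languages listed in `supervised_langs` go to the fine-tuned SLM;
    every other language falls back to zero-shot LLM prompting.
    """
    supervised = frozenset(code.lower() for code in supervised_langs)

    def normalize(post: str, lang: str) -> str:
        backend = slm_normalize if lang.lower() in supervised else llm_normalize
        return backend(post.strip())

    return normalize


# Stub back ends (assumptions for illustration only).
def fake_slm(post: str) -> str:
    return f"[SLM] {post}"


def fake_llm(post: str) -> str:
    return f"[LLM] {post}"


# "aa" is a placeholder for a language with training data.
route = build_router({"aa"}, fake_slm, fake_llm)
```

Calling `route(post, "aa")` would use the SLM stub, while any code outside the supervised set, such as the placeholder `"bb"`, would use the LLM stub. Keeping the back ends as plain callables lets real models replace the stubs without changing the routing logic.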