AKCIT-FN at CheckThat! 2025: Switching Fine-Tuned SLMs and LLM Prompting for Multilingual Claim Normalization

📅 2025-09-14

🤖 AI Summary

This work addresses multilingual claim normalization for automated fact-checking: converting informal, noisy social media posts into concise, self-contained statements. The system switches between two strategies, using fine-tuned small language models (SLMs) for languages with training data and large language model (LLM) prompting for zero-shot languages. It is evaluated on the 20 languages of CLEF-2025 CheckThat! Task 2 (13 supervised, 7 zero-shot). The system ranks in the top three in 15 of the 20 languages, with second place in eight of them, five of which are zero-shot languages. For Portuguese, the initial development language, it ranks third with an average METEOR score of 0.5290. These results indicate that LLM prompting is an effective strategy for claim normalization when no training data is available.

📝 Abstract

Claim normalization, the transformation of informal social media posts into concise, self-contained statements, is a crucial step in automated fact-checking pipelines. This paper details our submission to the CLEF-2025 CheckThat! Task 2, which challenges systems to perform claim normalization across twenty languages, divided into thirteen supervised (high-resource) and seven zero-shot (no training data) tracks. Our approach, leveraging fine-tuned Small Language Models (SLMs) for supervised languages and Large Language Model (LLM) prompting for zero-shot scenarios, achieved podium positions (top three) in fifteen of the twenty languages. Notably, this included second-place rankings in eight languages, five of which were among the seven designated zero-shot languages, underscoring the effectiveness of our LLM-based zero-shot strategy. For Portuguese, our initial development language, our system achieved an average METEOR score of 0.5290, ranking third. All implementation artifacts, including inference, training, evaluation scripts, and prompt configurations, are publicly available at https://github.com/ju-resplande/checkthat2025_normalization.

Problem

Research questions and friction points this paper is trying to address.

Normalizing informal social media claims into concise statements

Performing multilingual claim normalization across twenty languages

Addressing zero-shot scenarios with no training data available

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned SLMs for supervised languages

LLM prompting for zero-shot scenarios

Multilingual claim normalization system
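
The switching idea listed above can be sketched as a small router that sends a post to a fine-tuned model when one exists for its language and falls back to a prompted LLM otherwise. This is only an illustrative sketch, not the authors' implementation: the class name, the `build_prompt` wording, the language codes, and the stub models are all hypothetical, and a real system would plug in actual SLM and LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical type aliases; the real system's interfaces may differ.
FineTunedModel = Callable[[str], str]        # post -> normalized claim
PromptedModel = Callable[[str, str], str]    # (post, language) -> normalized claim


def build_prompt(post: str, language: str) -> str:
    """Assumed prompt wording, for illustration only."""
    return (
        f"Rewrite the following {language} social media post as one concise, "
        f"self-contained claim in the same language.\n\nPost: {post}\nClaim:"
    )


@dataclass
class SwitchingNormalizer:
    """Route each post by whether its language has a fine-tuned model."""
    fine_tuned: Dict[str, FineTunedModel]  # supervised languages only
    zero_shot: PromptedModel               # fallback for all other languages

    def strategy(self, language: str) -> str:
        # Report which branch would handle this language.
        return "fine_tuned_slm" if language in self.fine_tuned else "llm_prompt"

    def normalize(self, post: str, language: str) -> str:
        model = self.fine_tuned.get(language)
        if model is not None:
            return model(post)
        return self.zero_shot(post, language)


# Stub models standing in for real SLM / LLM inference (assumptions).
def stub_slm(post: str) -> str:
    return "SLM:" + post.strip()


def stub_llm(post: str, language: str) -> str:
    # A real implementation would send build_prompt(...) to an LLM.
    _ = build_prompt(post, language)
    return "LLM[" + language + "]:" + post.strip()


# "pt" is used as an example supervised language; "xx" is a placeholder
# for any language without training data.
normalizer = SwitchingNormalizer(fine_tuned={"pt": stub_slm}, zero_shot=stub_llm)
print(normalizer.strategy("pt"), normalizer.strategy("xx"))
```

Keeping the routing table separate from the models makes it straightforward to move a language from the zero-shot branch to the supervised branch once training data becomes available.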
