Large Language Models for Multilingual Previously Fact-Checked Claim Detection

📅 2025-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To reduce the redundant fact-checking effort caused by the cross-lingual spread of misinformation, this paper presents the first comprehensive evaluation of large language models (LLMs) for multilingual previously fact-checked claim detection. Seven state-of-the-art LLMs are assessed across 20 languages in both monolingual and cross-lingual settings, using a framework that combines prompt engineering, zero- and few-shot inference, and machine translation of original texts into English. Results show that LLMs perform well for high-resource languages but struggle with low-resource ones, and that translating low-resource-language inputs into English substantially improves detection performance. This work provides a scalable, language-agnostic foundation for global multilingual fact-checking.
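The translate-then-match strategy described above can be sketched as follows. This is a minimal illustration, not the paper's code: `translate` and `llm_judge` are hypothetical callables standing in for a machine-translation service and an LLM, and the prompt wording is an assumption modeled on typical zero-shot claim-matching prompts.

```python
def build_match_prompt(input_claim: str, fact_checked_claim: str) -> str:
    """Zero-shot prompt asking an LLM whether two claims describe the same fact."""
    return (
        "Do the following two claims describe the same fact?\n"
        f"Claim A: {input_claim}\n"
        f"Claim B: {fact_checked_claim}\n"
        "Answer Yes or No."
    )


def detect_previously_fact_checked(claim, database, translate, llm_judge):
    """Return fact-checked claims that the LLM judges to match the input claim.

    `translate` and `llm_judge` are hypothetical stand-ins for an MT system
    and an LLM call; the paper found translating low-resource-language
    inputs into English beneficial, which the first step mirrors.
    """
    claim_en = translate(claim)  # low-resource input -> English
    hits = []
    for fact_checked in database:
        answer = llm_judge(build_match_prompt(claim_en, fact_checked))
        if answer.strip().lower().startswith("yes"):
            hits.append(fact_checked)
    return hits
```

In practice the candidate set would first be narrowed by a multilingual retrieval step before the pairwise LLM judgments, since scoring every database entry with an LLM is expensive.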

📝 Abstract
In our era of widespread false information, human fact-checkers often face the challenge of duplicating efforts when verifying claims that may have already been addressed in other countries or languages. As false information transcends linguistic boundaries, the ability to automatically detect previously fact-checked claims across languages has become an increasingly important task. This paper presents the first comprehensive evaluation of large language models (LLMs) for multilingual previously fact-checked claim detection. We assess seven LLMs across 20 languages in both monolingual and cross-lingual settings. Our results show that while LLMs perform well for high-resource languages, they struggle with low-resource languages. Moreover, translating original texts into English proved to be beneficial for low-resource languages. These findings highlight the potential of LLMs for multilingual previously fact-checked claim detection and provide a foundation for further research on this promising application of LLMs.
Problem

Research questions and friction points this paper is trying to address.

Detecting previously fact-checked claims across multiple languages
Evaluating large language models for multilingual claim detection
Addressing challenges in low-resource language performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluates LLMs for multilingual claim detection
Tests seven LLMs across 20 languages
Uses translation to improve low-resource language performance
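The few-shot inference mentioned above amounts to prepending labeled claim pairs to the query. A minimal sketch, assuming a simple "Claim A / Claim B / Answer" prompt format (the exact wording used in the paper is not given here):

```python
def build_fewshot_prompt(examples, claim_a: str, claim_b: str) -> str:
    """Few-shot prompt: labeled demonstration pairs followed by the query pair.

    `examples` is a list of (claim_a, claim_b, label) tuples; the label is
    "Yes" or "No". The format is an illustrative assumption.
    """
    parts = ["Decide whether each pair of claims refers to the same fact-checked claim."]
    for ex_a, ex_b, label in examples:
        parts.append(f"Claim A: {ex_a}\nClaim B: {ex_b}\nAnswer: {label}")
    # The query pair is left unanswered for the LLM to complete.
    parts.append(f"Claim A: {claim_a}\nClaim B: {claim_b}\nAnswer:")
    return "\n\n".join(parts)
```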