UrduFactCheck: An Agentic Fact-Checking Framework for Urdu with Evidence Boosting and Benchmarking

📅 2025-05-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
Urdu, a low-resource language with more than 200 million speakers, poses two challenges for automated fact-checking: scarce high-quality evidence and low factual consistency in large language model (LLM) outputs. To address both, this paper introduces the first modular, Urdu-specific fact-checking framework. Methodologically, it proposes a dynamic multi-strategy evidence retrieval mechanism that integrates monolingual Urdu retrieval with translation-augmented cross-lingual evidence transfer, and designs an agent-based verification pipeline combining LLM orchestration with human-in-the-loop validation. Key contributions include: (1) releasing two high-quality, manually annotated benchmarks, UrduFactBench (claim-level verification) and UrduFactQA (question-answer-level factuality); (2) open-sourcing all code, datasets, and models; and (3) systematically evaluating twelve state-of-the-art LLMs, demonstrating that translation-augmented variants significantly outperform baselines and leading open-source models in Urdu fact-checking accuracy.
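The summary's verification stage can be pictured as mapping retrieved evidence to a verdict. The sketch below is a minimal, hypothetical illustration of such a step, not the paper's actual pipeline: `judge` stands in for an LLM call that labels one claim-evidence pair, and the three-way labels and majority-vote aggregation are illustrative assumptions.

```python
from collections import Counter

def verify_claim(claim, evidence_docs, judge):
    """Aggregate per-document verdicts into one claim-level verdict.

    judge(claim, doc) is assumed to return one of:
    'SUPPORTED', 'REFUTED', or 'NOT_ENOUGH_INFO'.
    Majority vote decides; no evidence or a tie yields NOT_ENOUGH_INFO.
    """
    if not evidence_docs:
        return "NOT_ENOUGH_INFO"
    votes = Counter(judge(claim, doc) for doc in evidence_docs)
    ranked = votes.most_common()
    # A tie between the top two labels is treated as inconclusive.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "NOT_ENOUGH_INFO"
    return ranked[0][0]
```

In a real agentic pipeline, `judge` would be an orchestrated LLM prompt and low-confidence or tied cases could be escalated to the human-in-the-loop validation the summary mentions.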

📝 Abstract
The rapid adoption of large language models (LLMs) has raised critical concerns regarding the factual reliability of their outputs, especially in low-resource languages such as Urdu. Existing automated fact-checking solutions overwhelmingly focus on English, leaving a significant gap for the 200+ million Urdu speakers worldwide. In this work, we introduce UrduFactCheck, the first comprehensive, modular fact-checking framework specifically tailored for Urdu. Our system features a dynamic, multi-strategy evidence retrieval pipeline that combines monolingual and translation-based approaches to address the scarcity of high-quality Urdu evidence. We curate and release two new hand-annotated benchmarks: UrduFactBench for claim verification and UrduFactQA for evaluating LLM factuality. Extensive experiments demonstrate that UrduFactCheck, particularly its translation-augmented variants, consistently outperforms baselines and open-source alternatives on multiple metrics. We further benchmark twelve state-of-the-art (SOTA) LLMs on factual question answering in Urdu, highlighting persistent gaps between proprietary and open-source models. UrduFactCheck's code and datasets are open-sourced and publicly available at https://github.com/mbzuai-nlp/UrduFactCheck.
Problem

Research questions and friction points this paper is trying to address.

Addressing factual reliability of LLM outputs in Urdu
Bridging the gap in automated Urdu fact-checking solutions
Enhancing evidence retrieval for low-resource Urdu language
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic multi-strategy evidence retrieval pipeline
Combines monolingual and translation-based approaches
Introduces UrduFactBench and UrduFactQA benchmarks
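The multi-strategy retrieval named in the bullets above can be sketched as a fallback: try monolingual Urdu search first, and only translate the claim and search cross-lingually when too little Urdu evidence comes back. This is a minimal illustration under assumed interfaces; the function names (`search_urdu`, `translate`, `search_english`) and the `min_hits` threshold are hypothetical placeholders, not the paper's API.

```python
def retrieve_evidence(claim_ur, search_urdu, translate, search_english, min_hits=3):
    """Dynamic two-strategy evidence retrieval (illustrative sketch).

    1. Monolingual: search with the Urdu claim directly.
    2. Translation-augmented fallback: if fewer than min_hits documents
       are found, translate the claim and add English search results.
    Returns (evidence_docs, strategy_used).
    """
    evidence = list(search_urdu(claim_ur))
    if len(evidence) >= min_hits:
        return evidence, "monolingual"
    claim_en = translate(claim_ur)
    evidence.extend(search_english(claim_en))
    return evidence, "translation-augmented"
```

The design choice sketched here is that translation is a cost incurred only when the monolingual pass is evidence-poor, which matches the scarcity motivation stated in the abstract.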