DiVA: Fine-grained Factuality Verification with Agentic-Discriminative Verifier

📅 2026-01-07

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing factuality verification methods rely mostly on binary judgments, which cannot capture the severity of factual errors and therefore limit their usefulness for fine-grained evaluation and preference optimization. To address this limitation, this work proposes DiVA (Agentic Discriminative Verifier), a hybrid agentic-discriminative framework for end-to-end fine-grained factuality verification. DiVA combines the agentic search capabilities of a generative large language model with a discriminative scoring model that assesses factuality at a granular level. The authors also construct FGVeriBench, a new benchmark for fine-grained factuality verification. Experimental results on FGVeriBench show that DiVA significantly outperforms existing methods on both general and multi-hop questions.

📝 Abstract

Despite the significant advancements of Large Language Models (LLMs), their factuality remains a critical challenge, fueling growing interest in factuality verification. Existing research on factuality verification primarily conducts binary judgments (e.g., correct or incorrect), which fails to distinguish varying degrees of error severity. This limits its utility for applications such as fine-grained evaluation and preference optimization. To bridge this gap, we propose the Agentic Discriminative Verifier (DiVA), a hybrid framework that synergizes the agentic search capabilities of generative models with the precise scoring aptitude of discriminative models. We also construct a new benchmark, FGVeriBench, as a robust testbed for fine-grained factuality verification. Experimental results on FGVeriBench demonstrate that our DiVA significantly outperforms existing methods on factuality verification for both general and multi-hop questions.
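
The abstract describes the hybrid design only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation: a search stage gathers evidence, and a separate scorer returns a graded score instead of a binary label. Every name here (`gather_evidence`, `verify`, `toy_search`, `toy_scorer`, `TOY_CORPUS`) is hypothetical, the search loop is a naive stand-in for an LLM-driven agent, and the word-overlap scorer is a stand-in for a trained discriminative model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    score: float          # graded factuality score in [0, 1], not a binary label
    evidence: List[str]   # snippets collected by the search stage


def gather_evidence(question: str, answer: str,
                    search: Callable[[str], List[str]],
                    max_steps: int = 3) -> List[str]:
    """Hypothetical stand-in for the agentic search stage."""
    evidence: List[str] = []
    query = f"{question} {answer}"
    for _ in range(max_steps):
        new = [hit for hit in search(query) if hit not in evidence]
        if not new:
            break  # nothing new was retrieved, so stop searching
        evidence.extend(new)
        # Naive follow-up query; a real agent would let an LLM write it.
        query = new[-1]
    return evidence


def verify(question: str, answer: str,
           search: Callable[[str], List[str]],
           scorer: Callable[[str, str, List[str]], float],
           max_steps: int = 3) -> Verdict:
    """Gather evidence, then ask the scorer for a graded score."""
    evidence = gather_evidence(question, answer, search, max_steps)
    raw = scorer(question, answer, evidence)
    return Verdict(score=min(1.0, max(0.0, raw)), evidence=evidence)


# --- Toy components, assumptions for illustration only ---

TOY_CORPUS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is in Paris.",
]


def _content_words(text: str) -> set:
    """Lowercase alphabetic words longer than three characters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3}


def toy_search(query: str) -> List[str]:
    """Return corpus snippets sharing at least one content word with the query."""
    q = _content_words(query)
    return [s for s in TOY_CORPUS if q & _content_words(s)]


def toy_scorer(question: str, answer: str, evidence: List[str]) -> float:
    """Fraction of the answer's content words that appear in the evidence."""
    claims = _content_words(answer)
    if not claims:
        return 0.0
    support = _content_words(" ".join(evidence))
    return len(claims & support) / len(claims)


if __name__ == "__main__":
    q = "What is the capital of France?"
    for a in ["Paris", "Paris, founded in 1999", "Lyon"]:
        print(a, "->", verify(q, a, toy_search, toy_scorer).score)
```

In this toy setup a partially wrong answer receives an intermediate score rather than being collapsed to "incorrect", which is the kind of severity distinction that binary judgments cannot express.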

Problem

Research questions and friction points this paper is trying to address.

factuality verification

fine-grained evaluation

Large Language Models

error severity

binary judgment

Innovation

Methods, ideas, or system contributions that make the work stand out.

fine-grained factuality verification

agentic-discriminative verifier

hybrid verification framework