DiVA: Fine-grained Factuality Verification with Agentic-Discriminative Verifier

📅 2026-01-07
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing factuality verification methods predominantly rely on binary judgments, which cannot capture the severity of factual errors and therefore have limited use in fine-grained evaluation and preference optimization. To address this limitation, this work proposes DiVA (Agentic Discriminative Verifier), a hybrid agentic-discriminative framework for fine-grained factuality verification. DiVA combines the agentic search capabilities of a generative LLM with the precise scoring of a discriminative model. The authors also construct FGVeriBench, a new benchmark for fine-grained factuality verification. Experimental results on FGVeriBench show that DiVA significantly outperforms existing methods on both general and multi-hop questions.

📝 Abstract
Despite the significant advancements of Large Language Models (LLMs), their factuality remains a critical challenge, fueling growing interest in factuality verification. Existing research on factuality verification primarily conducts binary judgments (e.g., correct or incorrect), which fails to distinguish varying degrees of error severity. This limits its utility for applications such as fine-grained evaluation and preference optimization. To bridge this gap, we propose the Agentic Discriminative Verifier (DiVA), a hybrid framework that synergizes the agentic search capabilities of generative models with the precise scoring aptitude of discriminative models. We also construct a new benchmark, FGVeriBench, as a robust testbed for fine-grained factuality verification. Experimental results on FGVeriBench demonstrate that our DiVA significantly outperforms existing methods on factuality verification for both general and multi-hop questions.
Problem

Research questions and friction points this paper is trying to address.

factuality verification
fine-grained evaluation
Large Language Models
error severity
binary judgment
Innovation

Methods, ideas, or system contributions that make the work stand out.

fine-grained factuality verification
agentic-discriminative verifier
hybrid verification framework
FGVeriBench
LLM factuality evaluation
Hui Huang
Harbin Institute of Technology
Large Language Model
Muyun Yang
Harbin Institute of Technology
Yuki Arase
Institute of Science Tokyo