NCV: A Node-Wise Consistency Verification Approach for Low-Cost Structured Error Localization in LLM Reasoning

πŸ“… 2025-10-03
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To address imprecise error localization and high verification costs in multi-step LLM reasoning, this paper proposes the Node-wise Consistency Verification (NCV) framework. NCV decomposes reasoning chains into fine-grained, verifiable nodes and recasts verification as a zero-shot, training-free, lightweight binary classification task: each node is checked for consistency, enabling precise error localization. Unlike conventional chain-level evaluation or costly multi-sampling approaches, NCV avoids long-sequence generation and mitigates attention dilution. On public benchmarks, NCV improves F1 scores by 10% to 25% over baselines while using only 1/6 to 1/58 of the tokens required by standard methods. This yields substantial gains in verification efficiency, accuracy, and interpretability, offering a scalable and principled solution for validating LLM reasoning.
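
For intuition, the loop below sketches what node-wise binary consistency verification could look like in code. The line-based node splitting, the `ask_llm` callable, and the yes/no prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable, List, Optional

def split_into_nodes(chain_of_thought: str) -> List[str]:
    """Assumed decomposition: treat each non-empty line of the chain as one verifiable node."""
    return [step.strip() for step in chain_of_thought.splitlines() if step.strip()]

def verify_chain(
    question: str,
    chain_of_thought: str,
    ask_llm: Callable[[str], str],  # hypothetical LLM call returning "yes"/"no"
) -> Optional[int]:
    """Return the index of the first inconsistent node, or None if every node passes."""
    context = question
    for i, node in enumerate(split_into_nodes(chain_of_thought)):
        # Lightweight binary check: is this node consistent with the verified context so far?
        prompt = (
            f"Context:\n{context}\n\n"
            f"Step:\n{node}\n\n"
            "Is this step consistent with the context? Answer yes or no."
        )
        if not ask_llm(prompt).strip().lower().startswith("yes"):
            return i  # precise error localization: the first failing node
        context += "\n" + node  # verified nodes become context for later checks
    return None
```

Each check is a short binary judgment over one node plus its verified context, which is how the framework avoids generating long verification rationales.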

πŸ“ Abstract
Verifying multi-step reasoning in large language models is difficult due to imprecise error localization and high token costs. Existing methods either assess entire reasoning chains, suffering attention dilution, or rely on expensive multi-sampling. We introduce Node-wise Consistency Verification (NCV), a training-free framework that recasts verification as lightweight binary consistency checks at the node level. By decomposing the chain of thought into interconnected verification nodes, NCV precisely localizes errors and avoids unnecessary long-form generation. Experiments demonstrate that our approach enhances interpretability and efficiency, presenting a scalable solution for reliable LLM reasoning verification. On public datasets, NCV achieves a 10% to 25% improvement in F1 scores over baselines while utilizing $6\times$ to $58\times$ fewer tokens than traditional methods like CoT-based verifiers.
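
To make the claim of precise error localization concrete, here is how the hypothetical `verify_chain` sketch above might be exercised on a toy chain with an arithmetic slip; the `toy_checker` stub stands in for a real LLM call and is purely illustrative.

```python
# Toy stand-in for an LLM consistency judge (illustrative only).
def toy_checker(prompt: str) -> str:
    return "no" if "4 * 3 = 14" in prompt else "yes"

chain = (
    "Each pen costs 3 dollars.\n"
    "Buying 4 pens costs 4 * 3 = 14 dollars.\n"
    "So the total is 14 dollars."
)

first_bad = verify_chain(
    question="How much do 4 pens cost at 3 dollars each?",
    chain_of_thought=chain,
    ask_llm=toy_checker,
)
print(first_bad)  # -> 1: the second node (the faulty multiplication) is flagged
```

Because every node is judged with a short yes/no response rather than a generated critique, this style of verification is where the claimed token savings over CoT-based verifiers would come from.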
Problem

Research questions and friction points this paper is trying to address.

Verifying multi-step reasoning in large language models is difficult
Existing methods localize errors imprecisely and incur high token costs
Assessing entire reasoning chains suffers from attention dilution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Node-wise binary consistency checks for verification
Training-free framework avoiding long-form generation
Lightweight decomposition of reasoning chains into nodes
Authors

Yulong Zhang (Google)
Li Wang (Ant Group)
Wei Du (Shanghai Jiao Tong University)
Peilin Li (National University of Singapore)
Yuqin Dai (Tsinghua University)
Zhiyuan Zhao (Ant Group)
Lingyong Fang (Shanghai Jiao Tong University)
Ziniu Liu (Ant Group)
Ru Zhang (Beijing University of Posts and Telecommunications)
Huijia Zhu (Ant Group)
Gongshen Liu (Inner Mongolia Research Institute of SJTU)