🤖 AI Summary
Large language models (LLMs) frequently generate hallucinated or factually incorrect content. Existing mitigation strategies either overlook the model's intrinsic self-correction capability or rely on costly post-hoc verification. To address this, we propose Dynamic Self-Verify Decoding (DSVD), a novel decoding framework that detects and rectifies hallucinations *during* generation. DSVD combines a parallel self-verification architecture for continuous, token-level quality assessment with a dynamic rollback mechanism for targeted error recovery, embedding verification directly into the decoding process itself. Crucially, DSVD requires no additional training, operates without external verifiers, and is compatible with existing faithful decoding methods. Evaluated across five benchmark datasets, DSVD significantly improves truthfulness in question answering and factual accuracy as measured by FActScore, achieving a favorable trade-off between reliability and inference efficiency.
📝 Abstract
The reliability of large language models remains a critical challenge, particularly due to their susceptibility to hallucinations and factual inaccuracies during text generation. Existing solutions either underutilize a model's intrinsic capacity for self-correction through preemptive strategies or rely on costly post-hoc verification. To further explore the potential of real-time self-verification and correction, we present Dynamic Self-Verify Decoding (DSVD), a novel decoding framework that enhances generation reliability through real-time hallucination detection and efficient error correction. DSVD integrates two key components: (1) a parallel self-verification architecture for continuous quality assessment, and (2) a dynamic rollback mechanism for targeted error recovery. Extensive experiments across five benchmarks demonstrate DSVD's effectiveness, achieving significant improvements in truthfulness (Question-Answering) and factual accuracy (FActScore). Results show that DSVD can be combined with existing faithful decoding methods to achieve stronger performance. Our work establishes that real-time self-verification during generation offers a viable path toward more trustworthy language models without sacrificing practical deployability.
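The decode–verify–rollback loop the abstract describes can be pictured with a minimal toy sketch. Everything below is an illustrative assumption, not the paper's actual method: the "model", the token-level verifier, and the correction step are stand-in functions, with a simulated flawed token standing in for a hallucination.

```python
def toy_next_token(context):
    """Toy 'model': emits the next integer, but a flawed token at step 3
    (simulating a hallucinated token)."""
    step = len(context)
    return -1 if step == 3 else step

def toy_verify(token):
    """Toy token-level verifier: flags negative tokens as hallucinated.
    In DSVD this role is played by the parallel self-verification branch."""
    return token >= 0

def toy_correct(context):
    """Toy recovery: re-decode the rejected position (here, deterministically)."""
    return len(context)

def dsvd_style_decode(max_len=6):
    """Generate tokens one by one, verifying each; on failure, roll back
    the rejected token and re-decode that position."""
    seq = []
    while len(seq) < max_len:
        tok = toy_next_token(seq)
        if toy_verify(tok):
            seq.append(tok)
        else:
            # Rollback: discard the rejected token, recover a replacement.
            seq.append(toy_correct(seq))
    return seq

print(dsvd_style_decode())  # → [0, 1, 2, 3, 4, 5]
```

The key design point the sketch mirrors is that verification happens *inside* the generation loop, so an error is caught and repaired at the token where it occurs rather than by a post-hoc pass over the finished text.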