OSCAR: Orchestrated Self-verification and Cross-path Refinement

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the hallucination problem in diffusion language models, which arises from premature commitment to unreliable content during generation. The authors propose a training-free, inference-time framework that leverages the native denoising trajectories of diffusion models to detect high-uncertainty regions in an unsupervised manner. Specifically, it exploits multi-path parallel denoising and cross-chain Shannon entropy to identify potentially erroneous segments, then dynamically re-masks these suspicious spans and incorporates external evidence retrieval for correction. Experimental results demonstrate that this approach significantly reduces hallucination rates and improves factual accuracy across multiple question-answering benchmarks, outperforming specialized trained hallucination detectors without requiring any model fine-tuning.
📝 Abstract
Diffusion language models (DLMs) expose their denoising trajectories, offering a natural handle for inference-time control; accordingly, an ideal hallucination mitigation framework should intervene during generation using this model-native signal rather than relying on an externally trained hallucination classifier. To this end, we formulate commitment uncertainty localization: given a denoising trajectory, identify token positions whose cross-chain entropy exceeds an unsupervised threshold before factually unreliable commitments propagate into self-consistent but incorrect outputs. We introduce a suite of trajectory-level assessments, including a cross-chain divergence-at-hallucination (CDH) metric, for principled comparison of localization methods. We also introduce OSCAR, a training-free inference-time framework operationalizing this formulation. OSCAR runs N parallel denoising chains with randomized reveal orders, computes cross-chain Shannon entropy to detect high-uncertainty positions, and then performs targeted remasking conditioned on retrieved evidence. Ablations confirm that localization and correction contribute complementary gains, robust across N in {4, 8, 16}. On TriviaQA, HotpotQA, RAGTruth, and CommonsenseQA using LLaDA-8B and Dream-7B, OSCAR enhances generation quality by significantly reducing hallucinated content and improving factual accuracy through uncertainty-guided remasking, which also facilitates more effective integration of retrieved evidence. Its native entropy-based uncertainty signal surpasses that of specialized trained detectors, highlighting an inherent capacity of diffusion language models to identify factual uncertainty that is not present in the sequential token commitment structure of autoregressive models. We are releasing the codebase to support future research on localization and uncertainty-aware generation in DLMs.
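The localization step described in the abstract can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes each of the N parallel denoising chains yields one sampled token id per position, and flags positions whose empirical cross-chain Shannon entropy exceeds an unsupervised threshold as candidates for remasking. The function names, shapes, and threshold convention are illustrative assumptions.

```python
# Minimal sketch of cross-chain entropy localization (assumed interface,
# not the paper's code): each chain is a list of token ids of equal length.
from collections import Counter
import math

def cross_chain_entropy(chains):
    """Per-position Shannon entropy (bits) of the empirical distribution
    of tokens sampled at that position across the N chains."""
    n = len(chains)
    entropies = []
    for pos in range(len(chains[0])):
        counts = Counter(chain[pos] for chain in chains)
        entropies.append(-sum((c / n) * math.log2(c / n)
                              for c in counts.values()))
    return entropies

def flag_uncertain_positions(chains, threshold):
    """Positions exceeding the entropy threshold become candidates for
    targeted remasking and evidence-conditioned regeneration."""
    return [i for i, h in enumerate(cross_chain_entropy(chains))
            if h > threshold]

# Toy example with N = 4 chains: all chains agree except at position 2,
# where three distinct tokens appear (entropy 1.5 bits > 0.5 threshold).
chains = [
    [7, 3, 11, 5],
    [7, 3, 42, 5],
    [7, 3, 19, 5],
    [7, 3, 42, 5],
]
print(flag_uncertain_positions(chains, threshold=0.5))  # → [2]
```

In the full framework, the flagged spans would be re-masked and re-denoised conditioned on retrieved evidence; this sketch covers only the unsupervised detection signal.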
Problem

Research questions and friction points this paper is trying to address.

hallucination mitigation
diffusion language models
uncertainty localization
inference-time control
factual accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion language models
hallucination mitigation
uncertainty localization
cross-chain entropy
inference-time control