PruneCD: Contrasting Pruned Self Model to Improve Decoding Factuality

📅 2025-09-20
🤖 AI Summary
To address hallucination in large language models (LLMs), this paper proposes a layer-pruning-based contrastive decoding method. Instead of relying on conventional early-exit mechanisms, it constructs a lightweight, domain-agnostic “contrastive model” by pruning the top layers of the Transformer architecture; this pruned model runs in parallel with the full-parameter model during inference, and their logits are dynamically weighted and fused. Leveraging the smoother, fact-distribution-aligned outputs of the pruned model, the approach strengthens discriminative contrastive signals. Experiments demonstrate substantial improvements in factual accuracy—e.g., +3.2–7.8 points on FactScore and FEVER—while incurring only ~12% additional latency, ensuring practical inference overhead. The core contribution is the first use of structured layer pruning to instantiate a contrastive model, enabling efficient, plug-and-play factual consistency enhancement.


📝 Abstract
To mitigate the hallucination problem in large language models, DoLa exploits early exit logits from the same model as a contrastive prior. However, we found that these early exit logits tend to be flat, low in magnitude, and fail to reflect meaningful contrasts. To address this, we propose PruneCD, a novel contrastive decoding method that constructs the amateur model via layer pruning rather than early exit. This design leads to more informative and well-aligned logits, enabling more effective contrastive decoding. Through qualitative and quantitative analyses, we demonstrate that PruneCD consistently improves factuality with minimal inference overhead, offering a robust and practical approach to mitigating hallucinations in LLMs.
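The abstract describes contrasting the full (expert) model's logits with those of a layer-pruned amateur model at each decoding step. A minimal sketch of that contrastive step, following the generic contrastive-decoding formulation (expert-minus-amateur scores under an expert plausibility constraint); the `alpha` and `beta` parameters are assumptions from that general framework, not values taken from the paper:

```python
import numpy as np

def contrastive_decode_step(expert_logits, amateur_logits, alpha=1.0, beta=0.1):
    """One greedy contrastive-decoding step (illustrative sketch).

    expert_logits:  next-token logits from the full model.
    amateur_logits: next-token logits from the pruned amateur model.
    alpha scales the contrast; beta sets the plausibility threshold
    (only tokens the expert assigns enough probability are eligible).
    """
    # Softmax over expert logits (shifted for numerical stability).
    expert_probs = np.exp(expert_logits - expert_logits.max())
    expert_probs /= expert_probs.sum()
    # Plausibility mask: keep tokens with prob >= beta * max prob.
    mask = expert_probs >= beta * expert_probs.max()
    # Contrastive score: penalize tokens the amateur also favors.
    scores = np.where(mask, expert_logits - alpha * amateur_logits, -np.inf)
    return int(np.argmax(scores))
```

Intuitively, tokens that the shallow amateur model already rates highly carry less factual signal, so subtracting its logits promotes tokens that rely on the knowledge encoded in the pruned-away upper layers.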
Problem

Research questions and friction points this paper is trying to address.

Mitigating hallucination problems in large language models
Improving decoding factuality through contrastive methods
Addressing flat early exit logits in existing approaches
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses layer pruning for amateur model construction
Creates informative aligned logits for contrast
Improves factuality with minimal inference overhead
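The key contrast with DoLa's early exit is where the amateur's logits come from: early exit reads hidden states mid-network, while layer pruning drops the top blocks but still runs the model's output head, yielding better-aligned logits. A toy sketch of that distinction, with layers and head modeled as plain functions (names and structure are illustrative, not the paper's implementation):

```python
def run_model(layers, head, x):
    # Apply each transformer block in order, then the LM head.
    for layer in layers:
        x = layer(x)
    return head(x)

def pruned_amateur_logits(layers, head, x, n_prune=2):
    """Amateur logits via layer pruning (sketch): drop the top
    n_prune blocks but keep the shared LM head. Early exit, by
    contrast, would read the hidden state after an intermediate
    block without the calibration the head provides."""
    assert 0 < n_prune < len(layers)
    return run_model(layers[:-n_prune], head, x)
```

Because the pruned amateur reuses the full model's weights, it adds no training cost and can run in parallel with the expert during inference.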
Byeongho Yu
URobotics Corp. (Field Robotics, Legged Robots, State Estimation, Visual-Inertial SLAM, Visual-Inertial Odometry)
Changhun Lee
Department of Convergence IT Engineering, Pohang University of Science and Technology (POSTECH)
Jungyu Jin
Graduate School of Artificial Intelligence, Pohang University of Science and Technology (POSTECH)
Eunhyeok Park
POSTECH (neural network optimization, energy-efficient hardware design)