PRISM: Probability Reallocation with In-Span Masking for Knowledge-Sensitive Alignment

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses hallucination in multi-sentence generation caused by overconfident imitation of hard-labeled targets that lack grounding in factual evidence during supervised fine-tuning. To mitigate this, the authors propose PRISM, a framework that extends standard supervised fine-tuning with a differentiable risk-gating mechanism. PRISM integrates sentence-level factual risk labels and inter-sentence dependency cues to probabilistically redistribute confidence at fact-critical positions, thereby suppressing high-confidence erroneous predictions. The approach combines span-level risk weights with model-aware gating to achieve localized, lightweight knowledge-sensitive alignment. Experiments demonstrate that PRISM significantly improves factual accuracy across multiple fact-sensitive benchmarks while preserving overall generation quality, and ablation studies confirm the effectiveness and complementarity of its core components.
📝 Abstract
Supervised fine-tuning (SFT) with token-level hard labels can amplify overconfident imitation of factually unsupported targets, causing hallucinations that propagate in multi-sentence generation. We study an augmented SFT setting in which training instances include coarse sentence-level factuality risk labels and inter-sentence dependency annotations, providing structured signals about where factual commitments are weakly supported. We propose **PRISM**, a differentiable risk-gated framework that modifies learning only at fact-critical positions. PRISM augments standard SFT with a lightweight, model-aware probability reallocation objective that penalizes high-confidence predictions on risky target tokens, with its scope controlled by span-level risk weights and model-aware gating. Experiments on hallucination-sensitive factual benchmarks and general evaluations show that PRISM improves aggregate factuality metrics across backbones while maintaining a competitive overall capability profile. Ablations further show that the auxiliary signal is most effective when used conservatively, and that knowledge masking and model-aware reallocation play complementary roles in balancing factual correction and capability preservation.
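The abstract describes an SFT objective augmented with a penalty on high-confidence predictions at risky target tokens, scoped by span-level risk weights and a model-aware gate. The sketch below is a hypothetical rendering of that idea, not the paper's actual loss: the functional form of the penalty, the threshold `tau`, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def risk_gated_loss(log_probs, targets, risk_weights, tau=0.9, lam=0.5):
    """Illustrative PRISM-style objective (hypothetical form).

    log_probs    : (T, V) token log-probabilities from the model
    targets      : (T,)   hard-label target token ids
    risk_weights : (T,)   span-level factual-risk weights in [0, 1]
                   (1 = token lies in a risky sentence span, 0 = safe)
    tau          : confidence threshold for the model-aware gate (assumed)
    lam          : strength of the reallocation penalty (assumed)
    """
    T = len(targets)
    tgt_logp = log_probs[np.arange(T), targets]   # log p(y_t | context)
    ce = -tgt_logp                                # standard SFT cross-entropy

    conf = np.exp(tgt_logp)                       # model confidence p(y_t)
    gate = (conf > tau).astype(float)             # model-aware gate: fires only
                                                  # on overconfident tokens
    # Penalize confidence above tau, but only inside risky spans and only
    # when the model is already overconfident there.
    penalty = risk_weights * gate * (conf - tau)
    return np.mean(ce + lam * penalty)
```

The key property this sketch captures is locality: on tokens with `risk_weights == 0`, or wherever confidence stays below `tau`, the objective reduces exactly to standard SFT, so the correction touches only fact-critical positions.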
Problem

Research questions and friction points this paper is trying to address.

hallucination
supervised fine-tuning
factuality
knowledge-sensitive alignment
overconfident imitation
Innovation

Methods, ideas, or system contributions that make the work stand out.

probability reallocation
in-span masking
risk-gated learning
factuality alignment
hallucination mitigation
Chenning Xu
Large Language Model Department, Tencent, China
Mao Zheng
Large Language Model Department, Tencent, China
Mingyang Song
Tencent Inc.