🤖 AI Summary
This work addresses the granularity mismatch in supervised fine-tuning (SFT), where fine-grained autoregressive generation is constrained by coarse-grained or uniform supervision signals, hindering precise alignment with human utility. To resolve this, the paper introduces Token Priority as a foundational mechanism, reframing SFT as a distribution reshaping process oriented toward the alignment manifold. It gives the first formal definition of Token Priority, distinguishing Positive Priority (employed to filter noise) from Signed Priority (designed to unlearn harmful patterns), and provides a unified interpretation of recent advances in SFT. Building on this framework, the study systematically analyzes the limitations of existing approaches, elucidates the potential of Token Priority to enhance both alignment efficacy and model safety, and outlines promising directions for future research.
📝 Abstract
The transition from fitting empirical data to achieving true human utility is fundamentally constrained by a granularity mismatch, where fine-grained autoregressive generation is often supervised by coarse or uniform signals. This position paper advocates Token Priority as the essential bridge, formalizing Supervised Fine-Tuning (SFT) not as simple optimization but as a precise distribution reshaping process that aligns raw data with the ideal alignment manifold. We analyze recent breakthroughs through this unified lens, categorizing them into two distinct regimes: Positive Priority for filtering noise and Signed Priority for unlearning toxic modes. We revisit existing progress and limitations, identify key challenges, and suggest directions for future research.
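To make the two regimes concrete, here is a minimal sketch of a per-token weighted SFT objective. This is a hypothetical illustration, not the paper's exact formulation: the function name, the clipping rule for the positive regime, and the use of raw per-token log-probabilities are all assumptions made for the example.

```python
import math

def weighted_sft_loss(token_logprobs, priorities, signed=False):
    """Illustrative token-priority SFT loss (a sketch, not the paper's method).

    token_logprobs: log p(y_t | y_<t, x) for each target token.
    priorities: per-token weights w_t expressing Token Priority.
      - Positive Priority (signed=False): w_t is clipped to [0, 1], so noisy
        tokens are merely down-weighted or dropped (noise filtration).
      - Signed Priority (signed=True): w_t may be negative, flipping the sign
        of that token's NLL term so gradient descent pushes probability mass
        away from it (unlearning toxic modes).
    Returns the mean weighted negative log-likelihood over the sequence.
    """
    terms = []
    for lp, w in zip(token_logprobs, priorities):
        if not signed:
            # Positive regime: weights act only as a filter, never as repulsion.
            w = min(max(w, 0.0), 1.0)
        terms.append(-w * lp)
    return sum(terms) / len(terms)
```

Under this view, uniform SFT is the special case w_t = 1 for all tokens, and the two regimes differ only in whether the weights are allowed to change sign.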