Rotation-Preserving Supervised Fine-Tuning

📅 2026-05-08
📈 Citations: 0
Influential: 0

🤖 AI Summary
Supervised fine-tuning improves in-domain performance but can degrade out-of-domain generalization, a degradation that prior work links to changes in the dominant singular subspaces of pretrained weight matrices. Because identifying loss-sensitive directions with Hessian or Fisher information is expensive at LLM scale, this work uses projected rotation in the pretrained singular subspaces as an efficient proxy for Fisher-sensitive directions. The proposed method, Rotation-Preserving Supervised Fine-Tuning (RPSFT), penalizes changes in the projected top-k singular-vector block of each pretrained weight matrix, limiting unnecessary rotation while still allowing task adaptation. Across model families and sizes trained on math reasoning data, RPSFT improves the trade-off between in-domain and out-of-domain performance over standard SFT and strong SFT baselines, better preserves pretrained representations, and provides stronger initializations for downstream reinforcement learning.
📝 Abstract
Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight matrices. However, directly identifying loss-sensitive directions with Hessian or Fisher information is computationally expensive at LLM scale. In this work, we propose Rotation-Preserving Supervised Fine-Tuning (RPSFT), which preserves projected rotations in pretrained singular subspaces as an efficient proxy for Fisher-sensitive directions. RPSFT penalizes changes in the projected top-$k$ singular-vector block of each pretrained weight matrix, limiting unnecessary rotation while preserving task adaptation. Across model families and sizes trained on math reasoning data, RPSFT improves the in-domain/OOD trade-off over standard SFT and strong SFT baselines, better preserves pretrained representations, and provides stronger initializations for downstream RL fine-tuning. Code is available at https://github.com/jinhangzhan/RPSFT.
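The abstract does not give the exact form of the penalty, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the "projected top-$k$ singular-vector block" is the k x k matrix U_k^T W V_k, where U_k and V_k are the top-k singular vectors of the pretrained matrix, and that the penalty is its squared Frobenius distance from its pretrained value diag(s_k). The function names, the matrix sizes, and the choice of norm are all hypothetical.

```python
import numpy as np


def topk_singular_basis(w0, k):
    """Return the top-k left vectors, singular values, and right vectors of w0."""
    u, s, vt = np.linalg.svd(w0, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T


def projected_block_penalty(w, u_k, s_k, v_k):
    """Squared Frobenius drift of the k x k block U_k^T W V_k from diag(s_k).

    Hypothetical penalty form, chosen for illustration only.
    """
    block = u_k.T @ w @ v_k
    return float(np.sum((block - np.diag(s_k)) ** 2))


rng = np.random.default_rng(0)
k = 2
w0 = rng.standard_normal((8, 6))  # stand-in for a pretrained weight matrix
u_k, s_k, v_k = topk_singular_basis(w0, k)

# At the pretrained weights the block equals diag(s_k), so the penalty vanishes.
penalty_at_init = projected_block_penalty(w0, u_k, s_k, v_k)

# An update confined to the top-k subspace changes the block and is penalized.
m = rng.standard_normal((k, k))
delta_in = 0.1 * u_k @ m @ v_k.T
penalty_in = projected_block_penalty(w0 + delta_in, u_k, s_k, v_k)

# An update projected away from the top-k subspace on both sides is free.
raw = rng.standard_normal(w0.shape)
left = np.eye(w0.shape[0]) - u_k @ u_k.T
right = np.eye(w0.shape[1]) - v_k @ v_k.T
delta_out = left @ raw @ right
penalty_out = projected_block_penalty(w0 + delta_out, u_k, s_k, v_k)

print(penalty_at_init, penalty_in, penalty_out)
```

Under these assumptions, a training objective would add a weighted sum of this penalty over the regularized weight matrices to the usual SFT loss, with the weight and k as hyperparameters. Updates outside the pretrained top-k subspace stay unpenalized, which is one way to read "limiting unnecessary rotation while preserving task adaptation".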
Problem

Research questions and friction points this paper is trying to address.

supervised fine-tuning
out-of-domain generalization
pretrained representations
LLM adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Rotation-Preserving SFT
singular subspaces
OOD generalization
Fisher-sensitive directions
supervised fine-tuning