Rotation-Preserving Supervised Fine-Tuning

📅 2026-05-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Supervised fine-tuning improves in-domain performance but often degrades out-of-domain generalization, a degradation that prior work links to changes in the dominant singular subspaces of pretrained weights. This work proposes Rotation-Preserving Supervised Fine-Tuning (RPSFT), which penalizes changes in the projected top-k singular-vector block of each pretrained weight matrix, using projected rotation as an efficient proxy for Fisher-sensitive directions that would be too expensive to identify directly with Hessian or Fisher information at LLM scale. The penalty limits unnecessary rotation while still allowing adaptation to the downstream task. Across multiple model families and sizes trained on math reasoning data, the approach improves the trade-off between in-domain and out-of-domain performance over standard SFT and strong SFT baselines, better preserves pretrained representations, and yields stronger initializations for downstream reinforcement learning.

📝 Abstract

Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight matrices. However, directly identifying loss-sensitive directions with Hessian or Fisher information is computationally expensive at LLM scale. In this work, we propose preserving projected rotations in pretrained singular subspaces as an efficient proxy for Fisher-sensitive directions, which we call Rotation-Preserving Supervised Fine-Tuning (RPSFT). RPSFT penalizes changes in the projected top-$k$ singular-vector block of each pretrained weight matrix, limiting unnecessary rotation while preserving task adaptation. Across model families and sizes trained on math reasoning data, RPSFT improves the in-domain/OOD trade-off over standard SFT and strong SFT baselines, better preserves pretrained representations, and provides stronger initializations for downstream RL fine-tuning. Code is available at https://github.com/jinhangzhan/RPSFT.
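
The abstract does not spell out the exact penalty, so the following is only a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes the "projected top-$k$ singular-vector block" of a weight matrix W is the k x k matrix U_k^T W V_k, where U_k and V_k hold the top-k left and right singular vectors of the pretrained weights, and it penalizes the squared Frobenius distance between that block and its pretrained value diag(s_k). The function names `top_k_subspace` and `projected_block_penalty` are hypothetical, and this simplified penalty also reacts to changes in the top singular values, not only to rotation.

```python
import numpy as np

def top_k_subspace(w0, k):
    """Top-k left/right singular vectors and singular values of the pretrained matrix w0."""
    u, s, vt = np.linalg.svd(w0, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :].T

def projected_block_penalty(w, u_k, s_k, v_k):
    """Assumed penalty: squared Frobenius change of the projected k x k block.

    At the pretrained weights the block U_k^T W V_k equals diag(s_k),
    so the penalty is zero there.
    """
    block = u_k.T @ w @ v_k
    return float(np.sum((block - np.diag(s_k)) ** 2))

# Toy "pretrained" weight matrix (random, for illustration only).
rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 6))
u_k, s_k, v_k = top_k_subspace(w0, k=2)

# Build two updates of equal size from the full SVD basis.
u_full, _, vt_full = np.linalg.svd(w0, full_matrices=False)
# Update confined to a low-ranked direction, outside the top-2 subspace.
delta_outside = 0.5 * np.outer(u_full[:, 4], vt_full[4, :])
# Update that mixes the first and second top singular directions.
delta_inside = 0.5 * np.outer(u_full[:, 0], vt_full[1, :])

penalty_at_init = projected_block_penalty(w0, u_k, s_k, v_k)
penalty_outside = projected_block_penalty(w0 + delta_outside, u_k, s_k, v_k)
penalty_inside = projected_block_penalty(w0 + delta_inside, u_k, s_k, v_k)
```

Under these assumptions, an update that stays outside the top-k subspace leaves the penalty at (numerically) zero, while an update of the same norm that mixes the top singular directions is penalized. In training, a term like this would presumably be summed over weight matrices and added to the SFT loss with a weighting coefficient, which the abstract does not specify.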

Problem

Research questions and friction points this paper is trying to address.

supervised fine-tuning

out-of-domain generalization

pretrained representations

LLM adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Rotation-Preserving SFT

singular subspaces

OOD generalization

Fisher-sensitive directions