Subspace Control: Turning Constrained Model Steering into Controllable Spectral Optimization

📅 2026-04-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses interference between primary and constraint objectives when adapting foundation models under safety, privacy, or task-specific constraints. The authors propose SIFT (Spectral Interference-Free Training), a framework that integrates subspace orthogonalization from model merging with gradient orthogonalization. By analyzing cross-task interference in the spectral domain and introducing a localized intervention mechanism, SIFT enables selective, interference-free constrained optimization. Evaluated on four tasks (machine unlearning, safety alignment, text-to-speech adaptation, and hallucination mitigation), SIFT consistently outperforms both constrained and unconstrained baselines.
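As a toy illustration of the gradient-orthogonalization idea summarized above (not the paper's exact procedure), a constraint gradient can be projected onto the orthogonal complement of the primary-objective gradient, so a step along the result leaves the primary loss unchanged to first order:

```python
import numpy as np

# Hypothetical "gradients" of a primary loss and a constraint loss
# with respect to the same weights.
g_primary = np.array([2.0, 0.0, 1.0])
g_constraint = np.array([1.0, 3.0, -1.0])

# Remove the component of the constraint gradient that lies along the
# primary gradient; what remains is orthogonal to g_primary.
proj = (g_constraint @ g_primary) / (g_primary @ g_primary) * g_primary
g_constraint_perp = g_constraint - proj

print(g_constraint_perp @ g_primary)  # ~0: first-order interference removed
```

SIFT operates in the spectral domain on weight matrices rather than on flat gradient vectors, but the conflict-removal intuition is the same.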
📝 Abstract
Foundation models, such as large language models (LLMs), are powerful but often require customization before deployment to satisfy practical constraints such as safety, privacy, and task-specific requirements, leading to "constrained" optimization problems for model steering and adaptation. However, solving such problems remains largely underexplored and is particularly challenging due to interference between the primary objective and constraint objectives during optimization. In this paper, we propose a subspace control framework for constrained model training. Specifically, (i) we first analyze, from a model merging perspective, how spectral cross-task interference arises and show that it can be resolved via a one-shot solution that orthogonalizes the merged subspace; (ii) we establish a connection between this solution and gradient orthogonalization in the spectral optimizer Muon; and (iii) building on these insights, we introduce SIFT (spectral interference-free training), which leverages a localization scheme to selectively intervene during optimization, enabling controllable updates that mitigate objective-constraint conflicts. We evaluate SIFT across four representative applications: (a) machine unlearning, (b) safety alignment, (c) text-to-speech adaptation, and (d) hallucination mitigation. Compared to both control-based and control-free baselines, SIFT consistently achieves substantial and robust performance improvements across all tasks. Code is available at https://github.com/OPTML-Group/SIFT.
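The spectral orthogonalization that the abstract connects to the Muon optimizer can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it uses the simple cubic Newton-Schulz iteration, whereas Muon uses a tuned quintic variant for speed. Both approximate the semi-orthogonal factor U Vᵀ of a gradient matrix's SVD without computing the SVD explicitly:

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=30):
    """Approximate the nearest semi-orthogonal matrix U V^T to G
    (U, V from G's SVD) via the cubic Newton-Schulz iteration
    X <- 1.5 X - 0.5 X X^T X.

    Normalizing by the Frobenius norm puts every singular value in
    (0, 1], where the iteration converges monotonically to 1.
    """
    X = G / np.linalg.norm(G)  # spectral norm <= Frobenius norm <= 1
    for _ in range(steps):
        X = 1.5 * X - 0.5 * (X @ X.T) @ X
    return X

rng = np.random.default_rng(0)
G = rng.normal(size=(8, 4))  # a toy "gradient" matrix
O = newton_schulz_orthogonalize(G)

# O is (approximately) semi-orthogonal: O^T O = I_4, i.e. all
# singular values have been pushed to 1 while U and V are preserved.
print(np.allclose(O.T @ O, np.eye(4), atol=1e-3))  # True
```

Replacing a gradient with this orthogonalized form equalizes its singular values, which is the "spectral" update the paper builds its interference analysis on.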
Problem

Research questions and friction points this paper is trying to address.

constrained optimization
model steering
spectral interference
foundation models
objective-constraint conflict
Innovation

Methods, ideas, or system contributions that make the work stand out.

subspace control
spectral optimization
gradient orthogonalization
constrained model steering
SIFT