🤖 AI Summary
This work addresses “over-adaptation” in supervised fine-tuning (SFT) of language models—where task-specific performance improves at the expense of severe forgetting of pretraining-acquired general knowledge. To mitigate this, we systematically demonstrate, for the first time in LMs, that simple ensembling of pretrained and fine-tuned models not only restores general capabilities but also significantly outperforms the fine-tuned model alone on downstream tasks. In an over-parameterized linear setting, we develop a bias–variance theoretical framework that provides the first interpretable statistical account of over-adaptation. We validate our theory empirically via weight interpolation and bias–variance decomposition. Experiments align closely with theoretical predictions, revealing that model ensembling jointly optimizes bias and variance—thereby reconciling generalization ability with task specificity.
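The ensembling studied here operates in weight space: interpolating each parameter tensor between the pretrained and fine-tuned checkpoints. A minimal sketch of that operation (the function name, the toy "state dicts", and the mixing coefficient `alpha` are illustrative, not from the paper):

```python
import numpy as np

def interpolate_weights(pretrained, finetuned, alpha):
    """Weight-space ensemble: theta = (1 - alpha) * theta_pre + alpha * theta_ft,
    applied parameter-by-parameter over matching state-dict keys."""
    return {name: (1.0 - alpha) * pretrained[name] + alpha * finetuned[name]
            for name in pretrained}

# Toy stand-ins for two checkpoints of the same architecture.
rng = np.random.default_rng(0)
pre = {"w": rng.normal(size=(4, 4)), "b": np.zeros(4)}
ft = {"w": pre["w"] + 0.1 * rng.normal(size=(4, 4)), "b": pre["b"] + 0.05}

# alpha = 0 recovers the pretrained model, alpha = 1 the fine-tuned one.
ens = interpolate_weights(pre, ft, alpha=0.5)
```

At `alpha = 0.5` each ensembled tensor is simply the element-wise average of the two checkpoints; in practice `alpha` would be swept to trade off general and task-specific performance.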
📝 Abstract
Supervised fine-tuning (SFT) on domain-specific data is the dominant approach for adapting foundation models to specialized tasks. However, it has been observed that SFT models tend to forget knowledge acquired during pretraining. In vision models, ensembling a pretrained model with its fine-tuned counterpart has been shown to mitigate this issue. In this work, we demonstrate that the same holds for language models, and, more strikingly, we observe an over-adaptation phenomenon: the ensemble model not only retains general knowledge from the foundation model but also outperforms the fine-tuned model even on the fine-tuning domain itself. Despite the empirical success of ensembling, a theoretical understanding of its benefits remains underexplored. We develop a formal theoretical analysis of over-adaptation, attributing it to two primary sources of error: bias, caused by insufficient fine-tuning, and variance, introduced by overfitting to the fine-tuning data. Ensembling mitigates over-adaptation by balancing these two errors; while regularization techniques aim at the same trade-off, we show that ensembling provides a more effective solution. We analyze this phenomenon in over-parameterized linear settings and demonstrate that interpolating between pretrained and fine-tuned weights significantly improves performance. These findings offer theoretical justification for the observed advantages of model ensembling, supported by empirical experiments consistent with our analysis.