🤖 AI Summary
This study investigates the mechanistic impact of supervised fine-tuning (SFT) on the knowledge structure of large language models (LLMs). Addressing the lack of clarity and controllability in SFT-induced knowledge evolution, we systematically analyze performance changes across varying fine-tuning scales and knowledge mastery levels on LLaMA-2/3 models, using closed-book question answering as the evaluation task. We combine token-level output analysis with parameter-level update tracking to characterize knowledge dynamics. Our findings reveal that up to 90% of parameter updates during SFT contribute negligibly to knowledge enhancement. Crucially, selectively restoring all but the most critical updates yields substantial gains: few-shot fine-tuning outperforms full-data tuning by 14%, while fluctuations in knowledge mastery induce over 12% performance variance. This work is the first to uncover the non-uniform coupling between parameter updates and knowledge evolution in SFT, establishing a novel paradigm for controllable, targeted knowledge editing in LLMs.
📄 Abstract
Large language models (LLMs) acquire substantial world knowledge during pre-training, which is further shaped by post-training techniques such as supervised fine-tuning (SFT). However, the impact of SFT on a model's knowledge remains underexplored, limiting our ability to control how knowledge changes in fine-tuned models. To address this gap, we evaluate closed-book question answering (CBQA) performance across five LLMs from the LLaMA-2 and LLaMA-3 families. Surprisingly, models fine-tuned on 1,920 samples perform up to 14% worse than those fine-tuned on only 240 samples. Furthermore, varying the level of knowledge mastery in the fine-tuning data leads to performance fluctuations of over 12%. To investigate these effects, we analyze model behavior at both the token and parameter levels. Our analysis reveals that up to 90% of parameter updates during SFT do not contribute to knowledge enhancement. Restoring these updates can improve performance on the CBQA task, depending on the characteristics of the fine-tuning data. These insights offer practical guidance for developing fine-tuning strategies that more effectively strengthen model knowledge.
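The core intervention described above, reverting the parameter updates that do not contribute to knowledge enhancement, can be sketched as follows. This is a minimal illustrative example, not the paper's actual implementation: the function name `restore_minor_updates`, the magnitude-ranking criterion, and the scalar "parameters" are all assumptions made for demonstration.

```python
def restore_minor_updates(base, sft, keep_fraction=0.1):
    """Keep only the top `keep_fraction` of SFT updates (by absolute delta);
    restore every other parameter to its pre-fine-tuning value.

    `base` and `sft` map parameter names to values before and after SFT.
    """
    deltas = {name: sft[name] - base[name] for name in base}
    # Rank parameters by update magnitude, largest update first.
    ranked = sorted(deltas, key=lambda n: abs(deltas[n]), reverse=True)
    keep = set(ranked[: max(1, int(len(ranked) * keep_fraction))])
    return {name: sft[name] if name in keep else base[name] for name in base}

# Toy example with four scalar "parameters": only w2 received a large update.
base = {"w1": 0.10, "w2": -0.20, "w3": 0.05, "w4": 0.30}
sft  = {"w1": 0.11, "w2": -0.90, "w3": 0.06, "w4": 0.31}

# Keeping the top 25% of updates retains w2's change and reverts the rest.
restored = restore_minor_updates(base, sft, keep_fraction=0.25)
```

In a real LLM this selection would operate over tensors rather than named scalars, but the principle is the same: most updates can be undone while the few that carry the knowledge gain are preserved.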