Context-Parametric Inversion: Why Instruction Finetuning Can Worsen Context Reliance

📅 2024-10-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper identifies a counterintuitive phenomenon in instruction tuning, which the authors name "context-parametric inversion": when user-provided context conflicts with the model's pretrained knowledge, reliance on the input context first increases and then gradually decreases as finetuning progresses, degrading context-following even while performance on standard benchmarks keeps improving. The phenomenon is systematically validated across diverse model families (Llama, Mistral, Pythia) and instruction-tuning datasets (TULU, Alpaca, UltraChat), confirming its cross-architecture and cross-dataset generality. Through controlled ablation studies and theoretical analysis grounded in the distributional bias of instruction data, the authors trace the effect to finetuning examples whose input context aligns with the model's parametric knowledge. They further propose targeted mitigation strategies that improve context-following rates in a limited yet interpretable manner, without sacrificing benchmark performance, thereby advancing the design of robust instruction-tuning paradigms.

📝 Abstract
A standard practice when using large language models is for users to supplement their instruction with an input context containing new information for the model to process. However, models struggle to reliably follow the input context, especially when it conflicts with their parametric knowledge from pretraining. In principle, one would expect models to adapt to the user context better after instruction finetuning, particularly when handling knowledge conflicts. However, we observe a surprising failure mode: during instruction tuning, the context reliance under knowledge conflicts initially increases as expected, but then gradually decreases as instruction finetuning progresses. This happens while the performance on standard benchmarks keeps increasing far after this drop. We call this phenomenon context-parametric inversion and observe it across multiple general-purpose instruction-tuning datasets such as TULU, Alpaca, and UltraChat, and across different model families such as Llama, Mistral, and Pythia. We perform various controlled studies and theoretical analysis to show that context-parametric inversion occurs due to examples in the instruction finetuning data where the input context provides information that aligns with the model's parametric knowledge. Our analysis suggests some natural mitigation strategies with limited but insightful gains, and serves as a useful starting point in addressing this deficiency in instruction finetuning.
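The abstract's notion of "context reliance under knowledge conflicts" can be made concrete with a small measurement sketch. The code below is not the authors' evaluation code; it is a minimal illustration, assuming a hypothetical `answer_fn(question, context)` interface to a model, of how one might score the fraction of knowledge-conflict probes where the model follows the (counterfactual) context rather than its parametric answer.

```python
# Hedged sketch (not the paper's code): scoring context reliance on
# knowledge-conflict probes. Each probe pairs a question with a context whose
# stated answer contradicts the model's parametric (pretraining) answer;
# context reliance is the fraction of responses that follow the context.

def context_reliance(probes, answer_fn):
    """probes: dicts with 'question', 'context', 'context_answer',
    'parametric_answer'. answer_fn(question, context) -> answer string
    (hypothetical model interface). Returns fraction following the context."""
    follows_context = 0
    for p in probes:
        answer = answer_fn(p["question"], p["context"])
        # Simple substring match; real evaluations would use stricter scoring.
        if p["context_answer"].lower() in answer.lower():
            follows_context += 1
    return follows_context / len(probes)

# Toy probe with a counterfactual context, plus a stand-in "model" that
# always trusts its parametric knowledge and ignores the context.
probes = [
    {
        "question": "What is the capital of France?",
        "context": "According to the updated atlas, the capital of France is Lyon.",
        "context_answer": "Lyon",
        "parametric_answer": "Paris",
    },
]
parametric_only = lambda q, c: "Paris"
print(context_reliance(probes, parametric_only))  # → 0.0 (never follows context)
```

Tracking this score over finetuning checkpoints, alongside standard benchmark accuracy, is the kind of curve on which the paper's first-increase-then-decrease inversion would appear.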
Problem

Research questions and friction points this paper is trying to address.

Models fail to follow input context conflicting with pretrained knowledge
Instruction finetuning reduces context reliance despite benchmark improvements
Context-parametric inversion occurs across datasets and model families
Innovation

Methods, ideas, or system contributions that make the work stand out.

Identifies and formally names the context-parametric inversion phenomenon
Traces the cause to finetuning examples whose context aligns with parametric knowledge, via controlled studies and theoretical analysis
Proposes mitigation strategies for knowledge conflicts with limited but insightful gains
🔎 Similar Papers
No similar papers found.