Generalization vs. Specialization under Concept Shift

📅 2024-09-23
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the impact of concept shift, a change in the input-label relationship between training and test time, on the generalization performance of machine learning models. To address this form of distribution shift, the authors develop a high-dimensional asymptotic framework and derive a closed-form expression for the prediction risk of ridge regression under concept shift. Key theoretical findings are: (1) concept shift induces data-dependent, nonmonotonic generalization curves, even in settings without double descent; and (2) robust and nonrobust features contribute to test error in opposing directions and with different magnitudes. Methodologically, the study combines high-dimensional statistical analysis with a formal model of concept shift. Experiments on MNIST and FashionMNIST confirm that nonmonotonic generalization behavior also arises in classification problems and show close agreement between theoretical predictions and empirical results. The work clarifies mechanisms of out-of-distribution generalization failure and provides interpretable, theoretically grounded tools for improving model robustness.

📝 Abstract
Machine learning models are often brittle under distribution shift, i.e., when data distributions at test time differ from those during training. Understanding this failure mode is central to identifying and mitigating safety risks of mass adoption of machine learning. Here we analyze ridge regression under concept shift -- a form of distribution shift in which the input-label relationship changes at test time. We derive an exact expression for prediction risk in the high-dimensional limit. Our results reveal nontrivial effects of concept shift on generalization performance, depending on the properties of robust and nonrobust features of the input. We show that test performance can exhibit a nonmonotonic data dependence, even when double descent is absent. Finally, our experiments on MNIST and FashionMNIST suggest that this intriguing behavior is present also in classification problems.
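The setup in the abstract can be illustrated with a minimal simulation (an assumed toy version, not the paper's exact model): labels are generated from one coefficient vector during training, while at test time the mapping changes on a subset of "nonrobust" coordinates, and ridge regression pays the price of having specialized to the training-time relationship.

```python
# Toy sketch of concept shift in ridge regression (illustrative assumptions:
# isotropic Gaussian inputs, sign-flipped "nonrobust" coordinates at test time).
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, sigma = 200, 20, 1e-2, 0.1

beta_train = rng.standard_normal(d)
beta_test = beta_train.copy()
beta_test[: d // 2] *= -1.0  # concept shift: flip half the coordinates at test time

X = rng.standard_normal((n, d))
y = X @ beta_train + sigma * rng.standard_normal(n)

# Ridge estimator: (X^T X + lam I)^{-1} X^T y
beta_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# For isotropic inputs, prediction risk reduces to ||beta_hat - beta||^2 + sigma^2
risk_iid = np.sum((beta_hat - beta_train) ** 2) + sigma**2
risk_shift = np.sum((beta_hat - beta_test) ** 2) + sigma**2
print(f"in-distribution risk: {risk_iid:.3f}  concept-shift risk: {risk_shift:.3f}")
```

With ample data the estimator tracks the training-time coefficients closely, so in-distribution risk is small while risk under the shifted input-label map is dominated by the changed coordinates; the paper's analysis characterizes this trade-off exactly in the high-dimensional limit.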
Problem

Research questions and friction points this paper is trying to address.

Analyzes ridge regression under concept shift
Examines effects of concept shift on generalization
Investigates performance in classification problems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzes ridge regression under concept shift in the high-dimensional limit
Derives an exact closed-form expression for prediction risk
Validates the predicted nonmonotonic generalization behavior on MNIST and FashionMNIST
Authors
Alex Nguyen (Princeton University)
David J. Schwab (CUNY Graduate Center)
Vudtiwat Ngampruetikorn (University of Sydney)