Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize

📅 2024-06-05
🏛️ International Conference on Machine Learning
📈 Citations: 2
Influential: 0
🤖 AI Summary
This paper addresses the sharp degradation in the generalization performance of deep neural networks under distribution shift, identifying its root cause as “feature contamination”: the simultaneous learning of predictive and non-predictive (irrelevant or spurious) features during nonlinear representation learning, which undermines cross-distribution generalization. Diverging from dominant accounts centered on spurious correlations, the authors give the first formal characterization of this mechanism. Using a two-layer ReLU network and a structured feature model, they carry out a theoretical analysis grounded in SGD optimization dynamics, complemented by empirical validation. The theory shows that even when a student network perfectly recovers the generalizable representations of a teacher model, generalization failure persists due to feature contamination. Moreover, the authors derive necessary and sufficient conditions for the onset of contamination and establish tight bounds on the generalization performance achievable under it.
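
As a rough illustration of the setting the summary describes (not the authors' construction: the dimensions, feature distributions, and hyperparameters below are all illustrative assumptions), the following sketch trains a two-layer ReLU network f(x) = vᵀReLU(Wx) with SGD on a toy structured feature model whose input concatenates label-predictive “core” coordinates with label-independent “background” coordinates, then compares the first-layer weight mass on the two subspaces:

```python
import torch

torch.manual_seed(0)
n, d_core, d_bg, width = 2048, 10, 10, 64
d = d_core + d_bg

# Toy structured feature model: the first d_core coordinates carry the
# label signal; the remaining d_bg coordinates are independent of it.
y = torch.randint(0, 2, (n,)).float() * 2 - 1          # labels in {-1, +1}
core = y[:, None] * torch.randn(n, d_core).abs()       # label-predictive features
bg = torch.randn(n, d_bg)                              # label-independent features
x = torch.cat([core, bg], dim=1)

# Two-layer ReLU network f(x) = v^T ReLU(Wx), trained by plain SGD.
W = (0.1 * torch.randn(width, d)).requires_grad_()
v = (0.1 * torch.randn(width)).requires_grad_()
opt = torch.optim.SGD([W, v], lr=0.1)

bg_norm_init = W[:, d_core:].norm().item()
for step in range(500):
    logits = torch.relu(x @ W.T) @ v
    loss = torch.nn.functional.softplus(-y * logits).mean()  # logistic loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# Feature contamination predicts that weight mass on the uncorrelated
# background subspace typically grows beyond its initial value, even though
# those coordinates carry no label signal.
print(f"core-subspace weight norm:       {W[:, :d_core].norm().item():.3f}")
print(f"background-subspace weight norm: {W[:, d_core:].norm().item():.3f} "
      f"(init {bg_norm_init:.3f})")
```

The printed norms only gesture at the mechanism on one finite run; the paper's formal results concern the expected SGD dynamics rather than a single simulation.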

📝 Abstract
Learning representations that generalize under distribution shifts is critical for building robust machine learning models. However, despite significant efforts in recent years, algorithmic advances in this direction have been limited. In this work, we seek to understand the fundamental difficulty of out-of-distribution generalization with deep neural networks. We first empirically show that perhaps surprisingly, even allowing a neural network to explicitly fit the representations obtained from a teacher network that can generalize out-of-distribution is insufficient for the generalization of the student network. Then, by a theoretical study of two-layer ReLU networks optimized by stochastic gradient descent (SGD) under a structured feature model, we identify a fundamental yet unexplored feature learning proclivity of neural networks, feature contamination: neural networks can learn uncorrelated features together with predictive features, resulting in generalization failure under distribution shifts. Notably, this mechanism essentially differs from the prevailing narrative in the literature that attributes the generalization failure to spurious correlations. Overall, our results offer new insights into the non-linear feature learning dynamics of neural networks and highlight the necessity of considering inductive biases in out-of-distribution generalization.
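
The teacher-fitting experiment mentioned in the abstract can be pictured with the minimal sketch below, in which a student is trained to regress a frozen teacher's hidden representations on in-distribution data and the representation match is then re-measured on shifted inputs. The teacher, data distribution, and mean shift are placeholder assumptions for illustration, not the paper's experimental setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d, h, n = 20, 32, 4096

# Frozen teacher encoder standing in for a model that generalizes OOD.
teacher = nn.Sequential(nn.Linear(d, h), nn.ReLU())
for p in teacher.parameters():
    p.requires_grad_(False)

student = nn.Sequential(nn.Linear(d, h), nn.ReLU())
opt = torch.optim.SGD(student.parameters(), lr=0.1)

x = torch.randn(n, d)                                   # in-distribution inputs
for step in range(1000):
    loss = ((student(x) - teacher(x)) ** 2).mean()      # fit teacher representations
    opt.zero_grad()
    loss.backward()
    opt.step()

# Re-measure the representation match on mean-shifted inputs: a near-perfect
# in-distribution fit does not guarantee the match transfers under shift.
x_ood = torch.randn(n, d) + 3.0
gap = ((student(x_ood) - teacher(x_ood)) ** 2).mean()
print(f"in-distribution fit: {loss.item():.4f}   "
      f"OOD representation gap: {gap.item():.4f}")
```

A small in-distribution fitting loss alongside a large OOD gap mirrors the abstract's observation that explicitly fitting a generalizable teacher's representations need not make the student generalize.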
Problem

Research questions and friction points this paper is trying to address.

Neural networks learn uncorrelated features
Generalization failure under distribution shifts
Feature contamination in deep learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Neural network feature contamination
Structured feature model analysis
Non-linear feature learning dynamics
Tianren Zhang
Tsinghua University
Representation learning, Generalization, Learning theory, Reinforcement learning, Machine learning
Chujie Zhao
Department of Automation, Tsinghua University, Beijing, China
Guanyu Chen
Department of Automation, Tsinghua University, Beijing, China
Yizhou Jiang
Department of Automation, Tsinghua University, Beijing, China
Feng Chen
Department of Automation, Tsinghua University, Beijing, China; LSBDPA Beijing Key Laboratory, Beijing, China