🤖 AI Summary
Deep neural networks suffer from degraded generalization under label noise. Method: This work establishes a theoretical framework for feature learning under label noise, modeling the data distribution via signal-noise separation and rigorously analyzing the two-stage training dynamics of a two-layer convolutional network. Contribution/Results: The analysis proves that (i) in early training, the model fits the clean samples and learns the generalizable signal features; (ii) in later stages, gradients along the noise directions dominate, causing the model to memorize noisy samples and degrading generalization. This formally explains the efficacy of early stopping and noise-aware sample selection. Empirical validation on both synthetic and real-world benchmarks confirms the predicted feature-learning trajectory under label noise, providing an interpretable, theoretically grounded foundation for robust deep learning.
📝 Abstract
Deep learning with noisy labels presents significant challenges. In this work, we theoretically characterize the role of label noise from a feature learning perspective. Specifically, we consider a signal-noise data distribution, in which each sample comprises a label-dependent signal and label-independent noise, and rigorously analyze the training dynamics of a two-layer convolutional neural network trained on this data in the presence of label noise. Our analysis identifies two key stages. In Stage I, the model perfectly fits all the clean samples (i.e., samples without label noise) while ignoring the noisy ones (i.e., samples with noisy labels). During this stage, the model learns the signal from the clean samples, which generalizes well to unseen data. In Stage II, as the training loss converges, the gradient along the noise direction surpasses that along the signal direction, leading to overfitting on the noisy samples. Eventually, the model memorizes the noise in the noisy samples, and its generalization ability degrades. Furthermore, our analysis provides a theoretical basis for two widely used techniques for tackling label noise: early stopping and sample selection. Experiments on both synthetic and real-world setups validate our theory.
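The signal-noise data setup described above can be sketched concretely. The following is a minimal illustrative example, not the paper's exact construction: it assumes two patches per sample (one carrying a label-dependent signal along a fixed direction `mu`, one carrying label-independent Gaussian noise) and symmetric label flipping at rate `flip_rate`; all parameter names and values are hypothetical choices for illustration.

```python
import numpy as np

def make_signal_noise_data(n=100, d=50, signal_strength=2.0, noise_std=1.0,
                           flip_rate=0.2, seed=0):
    """Toy signal-noise dataset with symmetric label noise.

    Each sample has two patches: a label-dependent signal patch y * mu
    and a label-independent Gaussian noise patch. A random fraction
    `flip_rate` of the observed labels is flipped. This is an
    illustrative parameterization, not the paper's exact setup.
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(d)
    mu[0] = signal_strength                 # fixed signal direction
    y_true = rng.choice([-1, 1], size=n)    # clean labels
    signal = y_true[:, None] * mu           # label-dependent patch, shape (n, d)
    noise = rng.normal(0.0, noise_std, size=(n, d))  # label-independent patch
    X = np.stack([signal, noise], axis=1)   # shape (n, 2 patches, d)
    flip = rng.random(n) < flip_rate
    y_obs = np.where(flip, -y_true, y_true) # observed (possibly noisy) labels
    return X, y_obs, y_true, flip

X, y_obs, y_true, flip = make_signal_noise_data()
print(X.shape)  # (100, 2, 50)
```

Under this setup, "clean samples" are those with `y_obs == y_true`, and a noisy sample can only be fit by memorizing its individual noise patch, which is the mechanism the Stage II analysis formalizes.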