An Analytical Model for Overparameterized Learning Under Class Imbalance

📅 2025-03-07
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work investigates the generalization behavior of linear classification under class imbalance in high-dimensional Gaussian mixture models within the overparameterized regime. We develop the first rigorous, analytically tractable closed-form approximation of the test error, derived via high-dimensional asymptotic analysis and random matrix theory. Our framework unifies the bias-correction mechanisms and delineates the precise applicability boundaries of calibration strategies—including logit adjustment and class-dependent temperature scaling. The theoretical analysis yields explicit analytical expressions for the optimal adjustment bias and temperature, revealing how these corrections mitigate the systematic bias of standard cross-entropy loss in imbalanced settings. Extensive validation on synthetic data and real-world imbalanced benchmarks (CIFAR-10, MNIST, Fashion-MNIST) demonstrates that our error approximation achieves absolute prediction errors below 2%, significantly outperforming existing empirical tuning approaches.

Technology Category

Application Category

📝 Abstract
We study class-imbalanced linear classification in a high-dimensional Gaussian mixture model. We develop a tight, closed form approximation for the test error of several practical learning methods, including logit adjustment and class dependent temperature. Our approximation allows us to analytically tune and compare these methods, highlighting how and when they overcome the pitfalls of standard cross-entropy minimization. We test our theoretical findings on simulated data and imbalanced CIFAR10, MNIST and FashionMNIST datasets.
Problem

Research questions and friction points this paper is trying to address.

Class-imbalanced linear classification in high-dimensional Gaussian mixture model.
Closed-form approximation for test error of practical learning methods.
Analytical tuning and comparison of methods to overcome cross-entropy pitfalls.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Closed form approximation for test error
Analytical tuning of learning methods
Comparison of methods overcoming cross-entropy pitfalls
🔎 Similar Papers
2024-05-15International Conference on Machine LearningCitations: 6
2024-07-16European Conference on Artificial IntelligenceCitations: 0
E
Eliav Mor
Department of Computer Science, Tel Aviv University
Yair Carmon
Yair Carmon
Tel Aviv University
Machine LearningOptimizationStatistics