🤖 AI Summary
To address class imbalance in multi-class classification with long-tailed distributions, where existing methods (e.g., resampling, cost-sensitive learning, loss modification) often lack theoretical grounding (the paper shows, for instance, that cost-sensitive methods are not Bayes consistent), this work proposes a learning framework that is both theoretically principled and practically effective. The method introduces: (1) a class-imbalanced margin-based loss function for binary and multi-class settings, with a proof of strong $H$-consistency; (2) a class-sensitive notion of Rademacher complexity, yielding generalization bounds based on empirical loss; and (3) the IMMAX (Imbalanced Margin Maximization) algorithm, which incorporates confidence margins and applies to a variety of hypothesis sets. Empirically, IMMAX improves over existing baselines across multiple benchmark datasets, supporting its effectiveness and generalization ability in long-tailed classification.
📝 Abstract
Class imbalance remains a major challenge in machine learning, especially in multi-class problems with long-tailed distributions. Existing methods, such as data resampling, cost-sensitive techniques, and logistic loss modifications, though popular and often effective, lack solid theoretical foundations. As an example, we demonstrate that cost-sensitive methods are not Bayes consistent. This paper introduces a novel theoretical framework for analyzing generalization in imbalanced classification. We propose a new class-imbalanced margin loss function for both binary and multi-class settings, prove its strong $H$-consistency, and derive corresponding learning guarantees based on empirical loss and a new notion of class-sensitive Rademacher complexity. Leveraging these theoretical results, we devise a novel and general learning algorithm, IMMAX (Imbalanced Margin Maximization), which incorporates confidence margins and is applicable to various hypothesis sets. While our focus is theoretical, we also present extensive empirical results demonstrating the effectiveness of our algorithm compared to existing baselines.
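The abstract does not spell out the loss itself, but the general idea of a class-imbalanced margin loss can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's definition: the function names are hypothetical, and the $n_y^{-1/4}$ margin schedule is borrowed from prior margin-based imbalanced-learning work (LDAM-style) purely to show how rarer classes can be assigned larger required margins.

```python
import numpy as np

def class_margins(class_counts, scale=1.0):
    # Assign larger margins to rarer classes; the n_y^(-1/4) schedule is an
    # illustrative choice from prior work, not necessarily the paper's.
    counts = np.asarray(class_counts, dtype=float)
    return scale * counts ** -0.25

def imbalanced_margin_loss(scores, label, margins):
    # Hinge-style multi-class margin loss with a class-dependent margin:
    # penalize when the confidence margin (true-class score minus the best
    # competing score) falls below the margin assigned to the true class.
    scores = np.asarray(scores, dtype=float)
    competing = np.max(np.delete(scores, label))
    confidence_margin = scores[label] - competing
    return max(0.0, 1.0 - confidence_margin / margins[label])
```

For example, with counts `[1000, 10]` the tail class receives a larger margin, so the same raw confidence margin incurs a higher loss when the true class is rare, which is the qualitative behavior a class-imbalanced margin loss is meant to produce.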