🤖 AI Summary
Label noise severely degrades the performance of boosting methods; existing robust approaches, though theoretically grounded, exhibit poor adaptability to realistic mixed noise patterns and limited sample sizes, often compromising accuracy on clean data.
Method: We propose robust minimax boosting (RMBoost), which minimizes the worst-case misclassification probability over an uncertainty set of distributions consistent with the noisy training data. This minimax formulation yields robustness to general, heterogeneous types of label noise rather than to a single prescribed noise model.
Contribution/Results: This work establishes, for the first time under finite-sample settings, theoretical guarantees that simultaneously ensure both clean-data error bounds and convergence to the Bayes risk. Empirical evaluations demonstrate that our method significantly outperforms state-of-the-art robust boosting algorithms across diverse noise scenarios—including symmetric, asymmetric, and instance-dependent noise—while maintaining high classification accuracy on noise-free data.
📝 Abstract
Boosting methods often achieve excellent classification accuracy, but can experience notable performance degradation in the presence of label noise. Existing robust methods for boosting provide theoretical robustness guarantees for certain types of label noise, and can exhibit only moderate performance degradation. However, previous theoretical results do not account for realistic types of noise and finite training sizes, and existing robust methods can provide unsatisfactory accuracies, even without noise. This paper presents methods for robust minimax boosting (RMBoost) that minimize worst-case error probabilities and are robust to general types of label noise. In addition, we provide finite-sample performance guarantees for RMBoost with respect to the error obtained without noise and with respect to the best possible error (Bayes risk). The experimental results corroborate that RMBoost is not only resilient to label noise but can also provide strong classification accuracy.
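The worst-case objective described in the abstract can be read as a standard minimax risk program. The sketch below is a generic rendering of that idea, not the paper's exact formulation; in particular, the uncertainty set $\mathcal{U}$ and how it is built from the noisy training sample are assumptions here, since the abstract does not specify them:

```latex
% Generic minimax formulation: the classification rule h is chosen to
% minimize the worst-case expected 0-1 loss over an uncertainty set U
% of distributions consistent with the (possibly noisy) training data.
% (U and its construction are assumptions for illustration.)
\min_{h} \; \max_{\mathrm{p} \in \mathcal{U}} \;
  \mathbb{E}_{(x, y) \sim \mathrm{p}}
  \left[ \ell_{0\text{-}1}\!\left(h(x), y\right) \right]
```

Under this reading, the finite-sample guarantees relate the value of this worst-case objective to the error achievable without noise and to the Bayes risk.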