🤖 AI Summary
Online AUC optimization is hard because the pairwise 0/1 loss is non-convex and discontinuous, and because pairwise formulations that store past instances incur O(T) memory overhead. Method: We propose a novel second-order statistical surrogate loss that approximates the global pairwise risk directly via first- and second-order moments, eliminating per-sample pairwise computations within an online learning framework. The surrogate is built on the pairwise hinge loss and is compatible with both linear and kernelized nonlinear mappings. Contribution/Results: Our method attains an O(ln T) regret bound for online AUC maximization, tighter than the O(√T) bound typical of existing online AUC algorithms. It reduces memory complexity from O(T) to O(d), speeds up training severalfold, and consistently attains superior AUC across multiple benchmark datasets.
📝 Abstract
The Area Under the Curve (AUC) is an important performance metric for classification tasks, particularly in class-imbalanced scenarios. However, maximizing the AUC presents significant challenges: the pairwise 0/1 losses are non-convex and discontinuous, which makes them difficult to optimize, and instance-wise storage has a substantial memory cost, which creates bottlenecks in large-scale applications. To overcome these challenges, we propose a novel second-order surrogate loss based on the pairwise hinge loss, and develop an efficient online algorithm. Unlike conventional approaches that approximate each individual pairwise 0/1 loss term with an instance-wise surrogate function, our approach introduces a new paradigm that directly substitutes the entire aggregated pairwise loss with a surrogate loss function constructed from the first- and second-order statistics of the training data. Theoretically, while existing online AUC optimization algorithms typically achieve an $\mathcal{O}(\sqrt{T})$ regret bound, our method attains a tighter $\mathcal{O}(\ln T)$ bound. Furthermore, we extend the proposed framework to nonlinear settings through a kernel-based formulation. Extensive experiments on multiple benchmark datasets demonstrate the superior efficiency and effectiveness of the proposed second-order surrogate loss in optimizing online AUC performance.
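To make concrete the idea of replacing an aggregated pairwise loss with class-level statistics, here is a minimal sketch. It deliberately uses the pairwise squared loss for a linear scorer, not the hinge-based surrogate proposed in the paper, because for the squared loss the average over all positive-negative pairs is exactly an expression in the class means and covariances. The names `pairwise_squared_loss`, `moment_squared_loss`, and `RunningMoments` are hypothetical and chosen for illustration only.

```python
import numpy as np


def pairwise_squared_loss(w, X_pos, X_neg):
    """Brute force: mean of (1 - w.(x_pos - x_neg))^2 over all pos/neg pairs."""
    # Broadcasting builds the full (n_pos, n_neg) matrix of score gaps.
    margins = (X_pos @ w)[:, None] - (X_neg @ w)[None, :]
    return float(np.mean((1.0 - margins) ** 2))


def moment_squared_loss(w, mu_pos, mu_neg, cov_pos, cov_neg):
    """Same quantity from first- and second-order statistics only.

    Uses E[(1 - z)^2] = (1 - E[z])^2 + Var[z], with z the score gap of a
    random pos/neg pair; the covariances are the biased (divide-by-n) ones.
    """
    gap = w @ (mu_pos - mu_neg)
    return float((1.0 - gap) ** 2 + w @ (cov_pos + cov_neg) @ w)


class RunningMoments:
    """Streaming mean and biased covariance of one class (Welford-style).

    Illustrative assumption: one d-vector and one d x d matrix per class;
    the paper's actual storage scheme may differ.
    """

    def __init__(self, d):
        self.n = 0
        self.mean = np.zeros(d)
        self.scatter = np.zeros((d, d))  # sum of outer products of deviations

    def update(self, x):
        self.n += 1
        delta = x - self.mean            # deviation from the old mean
        self.mean += delta / self.n
        self.scatter += np.outer(delta, x - self.mean)  # old x new deviation

    @property
    def cov(self):
        return self.scatter / self.n     # biased covariance; requires n >= 1
```

In this sketch each arriving sample updates only the statistics of its own class, so no past sample is stored and the loss for any `w` can be evaluated without enumerating pairs. How the paper builds its hinge-based surrogate from such statistics is not specified in the abstract and is not reproduced here.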