🤖 AI Summary
This work addresses a previously unexplored issue: the severe imbalance in worst-class certified robustness exhibited by smoothed classifiers. The authors derive the first PAC-Bayesian bound tailored to worst-class error, revealing that the largest eigenvalue of the smoothed confusion matrix is the fundamental performance bottleneck. Building on this insight, they propose a dominant-eigenvalue regularization method that integrates PAC-Bayesian analysis, randomized smoothing, and spectral regularization of the confusion matrix. Extensive experiments across multiple datasets and architectures demonstrate that the method significantly improves worst-class certified robust accuracy, achieving an average gain of 12.7%, while preserving overall certified robustness and standard (clean) accuracy. The core contributions are twofold: (i) the first theoretical characterization of worst-class robustness via a PAC-Bayesian framework, and (ii) a principled, interpretable, and optimization-friendly spectral regularization mechanism grounded in the eigenstructure of the confusion matrix.
📝 Abstract
Recent studies have identified a critical challenge in deep neural networks (DNNs) known as "robust fairness", where models exhibit significant disparities in robust accuracy across different classes. While prior work has attempted to address this issue in adversarial robustness, the study of worst-class certified robustness for smoothed classifiers remains unexplored. Our work bridges this gap by developing a PAC-Bayesian bound for the worst-class error of smoothed classifiers. Through theoretical analysis, we demonstrate that the largest eigenvalue of the smoothed confusion matrix fundamentally influences the worst-class error of smoothed classifiers. Based on this insight, we introduce a regularization method that optimizes the largest eigenvalue of the smoothed confusion matrix to enhance the worst-class accuracy of the smoothed classifier and further improve its worst-class certified robustness. We provide extensive experimental validation across multiple datasets and model architectures to demonstrate the effectiveness of our approach.
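To make the spectral quantity concrete, the following is a minimal NumPy sketch of the kind of penalty the abstract describes: build a soft (row-normalized) confusion matrix from smoothed class probabilities, then compute the spectral radius of its off-diagonal error part. The exact estimator of the smoothed confusion matrix and the precise eigenvalue term the paper regularizes are not specified here, so this construction (including the choice to zero out the diagonal so a perfect classifier incurs zero penalty) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def smoothed_confusion_matrix(probs, labels, num_classes):
    """Row-normalized soft confusion matrix.

    probs: (n, k) array of class probabilities from the smoothed
           classifier (hypothetical input format; the paper's exact
           Monte Carlo estimator may differ).
    labels: (n,) array of ground-truth class indices.
    """
    C = np.zeros((num_classes, num_classes))
    for p, y in zip(probs, labels):
        C[y] += p  # accumulate predicted mass per true class
    row_sums = C.sum(axis=1, keepdims=True)
    return C / np.maximum(row_sums, 1e-12)  # guard empty classes

def largest_eigenvalue_penalty(C):
    """Spectral radius of the off-diagonal (misclassification) part
    of the confusion matrix -- an assumed stand-in for the paper's
    dominant-eigenvalue regularizer."""
    E = C - np.diag(np.diag(C))  # keep only error mass
    return float(np.max(np.abs(np.linalg.eigvals(E))))
```

For a perfectly accurate classifier the error part is zero and the penalty vanishes; with two mutually confused classes (each predicted as the other half the time) the penalty is 0.5, reflecting concentrated cross-class error mass. In training, a differentiable analogue (e.g., a power-iteration estimate in an autodiff framework) would be added to the loss with a weighting coefficient.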