Principal Eigenvalue Regularization for Improved Worst-Class Certified Robustness of Smoothed Classifiers

📅 2025-03-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a previously unexplored issue: severe imbalance in worst-class certified robustness across classes in smoothed classifiers. We derive the first PAC-Bayesian bound tailored to the worst-class error, showing that the largest eigenvalue of the smoothed confusion matrix is the fundamental performance bottleneck. Building on this insight, we propose a spectral regularization method that penalizes this dominant eigenvalue, integrating PAC-Bayesian analysis, randomized-smoothing modeling, and spectral regularization of the confusion matrix. Extensive experiments across multiple datasets and architectures demonstrate that our method significantly improves worst-class certified robust accuracy—an average gain of 12.7%—while preserving overall certified robustness and standard (non-robust) accuracy. Our core contributions are twofold: (i) the first theoretical characterization of worst-class robustness via a PAC-Bayesian framework, and (ii) a principled, interpretable, and optimization-friendly spectral regularization mechanism grounded in the eigenstructure of the confusion matrix.

📝 Abstract
Recent studies have identified a critical challenge in deep neural networks (DNNs) known as "robust fairness", where models exhibit significant disparities in robust accuracy across different classes. While prior work has attempted to address this issue in adversarial robustness, worst-class certified robustness for smoothed classifiers remains unexplored. Our work bridges this gap by developing a PAC-Bayesian bound for the worst-class error of smoothed classifiers. Through theoretical analysis, we demonstrate that the largest eigenvalue of the smoothed confusion matrix fundamentally influences the worst-class error of smoothed classifiers. Based on this insight, we introduce a regularization method that optimizes the largest eigenvalue of the smoothed confusion matrix to enhance the worst-class accuracy of the smoothed classifier and further improve its worst-class certified robustness. We provide extensive experimental validation across multiple datasets and model architectures to demonstrate the effectiveness of our approach.
Problem

Research questions and friction points this paper is trying to address.

Addresses worst-class certified robustness in smoothed classifiers
Links the largest eigenvalue of the smoothed confusion matrix to worst-class error
Proposes eigenvalue regularization to improve worst-class accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

PAC-Bayesian bound for worst-class error
Regularization optimizes the largest eigenvalue of the smoothed confusion matrix
Improves worst-class certified robustness
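To make the core idea concrete, here is a minimal NumPy sketch of eigenvalue regularization on a confusion matrix. It is an illustration under assumptions, not the paper's implementation: the paper works with the *smoothed* confusion matrix (predictions averaged under Gaussian noise via randomized smoothing) inside a differentiable training loop, whereas this sketch builds a soft confusion matrix from plain predicted probabilities and estimates its dominant eigenvalue by power iteration. The helper names (`soft_confusion_matrix`, `dominant_eigenvalue`, `regularized_loss`) and the penalty weight `beta` are hypothetical.

```python
import numpy as np

def soft_confusion_matrix(probs, labels, num_classes):
    """C[i, j]: mean predicted probability of class j over samples whose true class is i."""
    C = np.zeros((num_classes, num_classes))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            C[c] = probs[mask].mean(axis=0)
    return C

def dominant_eigenvalue(M, iters=200, seed=0):
    """Estimate the dominant eigenvalue of a (nonnegative) matrix by power iteration.

    For an entrywise-positive confusion matrix, Perron-Frobenius guarantees a
    simple real dominant eigenvalue, so the Rayleigh quotient converges to it.
    """
    rng = np.random.default_rng(seed)
    v = np.abs(rng.standard_normal(M.shape[0]))  # start in the positive orthant
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return float(v @ M @ v)

def regularized_loss(task_loss, probs, labels, num_classes, beta=0.1):
    """Task loss plus a penalty on the largest eigenvalue of the confusion matrix."""
    C = soft_confusion_matrix(probs, labels, num_classes)
    return task_loss + beta * dominant_eigenvalue(C)
```

In an actual training setup, the eigenvalue term would need to be differentiable with respect to model parameters (e.g., by backpropagating through a few unrolled power-iteration steps), and `probs` would be the noise-averaged class probabilities of the smoothed classifier.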