FERD: Fairness-Enhanced Data-Free Robustness Distillation

📅 2025-09-25
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing data-free robust distillation methods (e.g., DFRD) neglect inter-class robustness fairness, resulting in significant disparities in student model robustness across classes and attack types. To address this, we propose the first data-free distillation framework explicitly designed to achieve class-wise robustness equity. Our method comprises two key components: (1) a robustness-guided class reweighting strategy that dynamically increases the sampling weight for vulnerable classes during adversarial example generation; and (2) a fairness-aware adversarial generation mechanism that jointly optimizes uniformity constraints on feature-level predictions and uniform-target attacks to produce more balanced targeted adversarial examples. Evaluated on CIFAR-10, our approach improves the worst-class robust accuracy of a MobileNet-V2 student model by 15.1% under FGSM and 6.4% under AutoAttack, demonstrating substantial gains in both robustness fairness and stability.
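The class reweighting idea in the summary can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `class_sampling_weights`, the softmax-over-vulnerability form, and the `temperature` parameter are all assumptions; the paper only states that sampling weight is increased for less robust classes.

```python
import numpy as np

def class_sampling_weights(per_class_robust_acc, temperature=1.0):
    """Hypothetical sketch: give less robust classes a larger share of
    the synthesized adversarial examples.

    per_class_robust_acc: robust accuracy in [0, 1] for each class.
    Returns a probability vector for sampling class labels during
    adversarial example generation.
    """
    acc = np.asarray(per_class_robust_acc, dtype=float)
    # Vulnerability score: lower robust accuracy -> higher score.
    vulnerability = 1.0 - acc
    # Softmax with temperature controls how strongly sampling is skewed.
    logits = vulnerability / temperature
    logits -= logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Class 2 is the weakest (20% robust accuracy), so it gets the
# largest sampling weight.
weights = class_sampling_weights([0.60, 0.50, 0.20, 0.55])
```

Lowering `temperature` concentrates generation almost entirely on the weakest classes; raising it approaches uniform sampling.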

📝 Abstract
Data-Free Robustness Distillation (DFRD) aims to transfer robustness from the teacher to the student without accessing the training data. While existing methods focus on overall robustness, they overlook robust fairness issues, leading to severe disparities in robustness across categories. In this paper, we identify two key problems: (1) a student model distilled with equal class-proportion data behaves significantly differently across categories; and (2) the robustness of the student model is not stable across different attack targets. To bridge these gaps, we present the first Fairness-Enhanced data-free Robustness Distillation (FERD) framework, which adjusts the proportion and distribution of adversarial examples. For the proportion, FERD adopts a robustness-guided class reweighting strategy to synthesize more samples for the less robust categories, thereby improving their robustness. For the distribution, FERD generates complementary data samples for advanced robustness distillation. It generates Fairness-Aware Examples (FAEs) by enforcing a uniformity constraint on feature-level predictions, which suppresses the dominance of class-specific non-robust features and provides a more balanced representation across all categories. Then, FERD constructs Uniform-Target Adversarial Examples (UTAEs) from FAEs by applying a uniform target class constraint to avoid biased attack directions, which distributes the attack targets across all categories and prevents overfitting to specific vulnerable categories. Extensive experiments on three public datasets show that FERD achieves state-of-the-art worst-class robustness under all adversarial attacks (e.g., the worst-class robustness under FGSM and AutoAttack is improved by 15.1% and 6.4% using MobileNet-V2 on CIFAR-10), demonstrating superior performance in both robustness and fairness aspects.
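The FAE uniformity constraint described in the abstract amounts to penalizing how far the model's predictive distribution is from uniform. A minimal sketch, assuming a KL-divergence form of the constraint (the abstract names the constraint but not its exact formulation, so `uniformity_loss` and its KL shape are assumptions):

```python
import numpy as np

def uniformity_loss(logits):
    """Hypothetical sketch of the FAE uniformity constraint:
    KL(p || uniform) averaged over the batch, where p = softmax(logits).
    The loss is zero only when predictions are uniform over classes,
    so minimizing it suppresses any single class's dominance.
    """
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=1, keepdims=True)
    exp_z = np.exp(z)
    p = exp_z / exp_z.sum(axis=1, keepdims=True)
    k = logits.shape[1]
    # KL(p || u) = sum_i p_i * log(p_i * k); eps guards log(0).
    return float(np.mean(np.sum(p * np.log(p * k + 1e-12), axis=1)))
```

Uniform logits give a loss near zero, while a sharply peaked prediction (one class dominating) gives a loss approaching log(k), which is the gradient signal that pushes synthesized samples toward class-balanced representations.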
Problem

Research questions and friction points this paper is trying to address.

Addresses robust fairness disparity across categories in data-free distillation
Solves unstable robustness across different adversarial attack targets
Improves worst-class robustness by balancing adversarial example distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Robustness-guided class reweighting for sample synthesis
Generating fairness-aware examples with uniformity constraints
Constructing uniform-target adversarial examples to balance attacks
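The uniform-target idea in the last bullet can be illustrated with a simple target-assignment rule. This is a hypothetical sketch: `uniform_target_assignment` and its round-robin scheme are assumptions; the paper only specifies that attack targets are distributed across all categories rather than collapsing onto a few vulnerable ones.

```python
def uniform_target_assignment(true_labels, num_classes):
    """Hypothetical sketch: assign a target class for each targeted
    adversarial example so that targets are spread over all classes
    and never equal the true label.
    """
    targets = []
    for i, y in enumerate(true_labels):
        # Cycle the offset through 1..num_classes-1 so the target
        # histogram stays flat instead of favoring one weak class.
        offset = 1 + (i % (num_classes - 1))
        targets.append((y + offset) % num_classes)
    return targets

labels = [0, 1, 2, 3, 0, 1, 2, 3]
targets = uniform_target_assignment(labels, num_classes=4)
```

Each target then seeds a targeted attack (e.g., targeted PGD) on the corresponding FAE, so no single vulnerable class absorbs all attack directions during distillation.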
Zhengxiao Li
Nanjing University of Science and Technology
Liming Lu
Nanjing University of Science and Technology
Xu Zheng
HKUST(GZ); INSAIT, Sofia University St. Kliment Ohridski
Siyuan Liang
College of Computing and Data Science, Nanyang Technological University
Trustworthy Foundation Model
Zhenghan Chen
STCA, Microsoft
Yongbin Zhou
Nanjing University of Science and Technology
Shuchao Pang
University of New South Wales
Medical image analysis, deep learning