Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?

📅 2026-05-19

🤖 AI Summary
Existing robust dataset distillation methods often show a poor trade-off between clean accuracy and adversarial robustness because they weight all adversarial examples equally and do not explicitly separate classes near the decision boundary, even though small-margin samples dominate robust error. This work proposes the C²R framework, which derives a perturbation score approximating each sample's robust hinge and uses it in an attack-aware curriculum that prioritizes the smallest-margin adversarial examples. A class-balanced contrastive robustness loss enforces adversarial invariance and explicitly widens inter-class separation. On CIFAR-10/100, Tiny-ImageNet, and multiple ImageNet-1K subsets under six attacks, C²R achieves the best robust accuracy, outperforming prior robust dataset distillation methods by 2.8% on average.
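The margin-based curriculum described above can be illustrated with a small sketch. This is not the paper's perturbation score, which the summary does not spell out; it assumes a plain multiclass hinge on the logit margin as a stand-in, and the names `robust_margin`, `hinge_score`, `curriculum_select`, `gamma`, and `keep_fraction` are hypothetical.

```python
import math

def robust_margin(logits, label):
    """True-class logit minus the largest other-class logit (assumed margin definition)."""
    best_other = max(v for i, v in enumerate(logits) if i != label)
    return logits[label] - best_other

def hinge_score(logits, label, gamma=1.0):
    """Hinge on the margin: max(0, gamma - margin).

    Larger scores mean the adversarial example sits closer to, or past,
    the decision boundary. gamma is a hypothetical target margin.
    """
    return max(0.0, gamma - robust_margin(logits, label))

def curriculum_select(adv_logits, labels, keep_fraction):
    """Return indices of the highest-scoring (smallest-margin) adversarial examples."""
    scores = [hinge_score(l, y) for l, y in zip(adv_logits, labels)]
    k = max(1, math.ceil(keep_fraction * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores

# Toy logits for three adversarially perturbed samples, all with true label 0.
adv_logits = [[3.0, 0.5, 0.1],   # large margin: confidently correct
              [1.0, 0.9, 0.2],   # near-zero margin: barely correct
              [0.2, 1.5, 0.3]]   # negative margin: already misclassified
labels = [0, 0, 0]
selected, scores = curriculum_select(adv_logits, labels, keep_fraction=0.5)
```

In this toy setup the confidently correct sample gets a score of zero and is skipped, so the optimization budget goes to the samples nearest the boundary. A real curriculum would presumably change `keep_fraction` over training, which is left out here.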
📝 Abstract
Dataset distillation (DD) compresses a large training set into a small synthetic set for efficient training, but most DD methods optimize only clean accuracy and leave robustness uncontrolled. Recent robust DD methods improve robustness, yet they often suffer from a poor accuracy-robustness trade-off because they (i) treat all adversarially perturbed examples uniformly, despite robust risk being dominated by near-zero robust margins, and (ii) do not explicitly increase inter-class separation in the decision boundary where attacks concentrate. We present Contrastive Curriculum for Robust Dataset Distillation (C$^2$R), a framework that couples an attack-aware curriculum with a contrastive robustness objective. From a robust-margin perspective, we derive a perturbation score that approximates each sample's robust hinge, enabling a curriculum that prioritizes the smallest-margin adversaries that most directly drive robust error. In parallel, a class-balanced contrastive robustness loss enforces adversarial invariance while explicitly widening boundary separation across classes. Experiments on CIFAR-10/100, Tiny-ImageNet, and multiple ImageNet-1K subsets under six attacks show that C$^2$R achieves the best robust accuracy, outperforming prior robust DD by $2.8\%$ on average.
Problem

Research questions and friction points this paper is trying to address.

dataset distillation
robustness
adversarial examples
decision boundary
accuracy-robustness trade-off
Innovation

Methods, ideas, or system contributions that make the work stand out.

robust dataset distillation
robust margin
contrastive robustness
attack-aware curriculum
decision boundary separation
Authors
Muquan Li
The Laboratory of Intelligent Collaborative Computing of UESTC, Chengdu, China
Yingyi Ma
The Laboratory of Intelligent Collaborative Computing of UESTC, Chengdu, China
Yihong Huang
The Laboratory of Intelligent Collaborative Computing of UESTC, Chengdu, China
Hang Gou
The Laboratory of Intelligent Collaborative Computing of UESTC, Chengdu, China
Ke Qin
The Laboratory of Intelligent Collaborative Computing of UESTC, Chengdu, China
Ming Li
Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China
Yuan-Fang Li
Oracle | Monash University
Large language model, Knowledge graphs, natural language processing
Tao He
UESTC
Image Retrieval, Computer Vision