FD$^2$: A Dedicated Framework for Fine-Grained Dataset Distillation

📅 2026-03-26
🤖 AI Summary
Existing decoupled dataset distillation methods suffer from insufficient intra-class diversity and weak inter-class discriminability in fine-grained scenarios due to their reliance on coarse-grained labels. To address this limitation, this work proposes FD², the first framework specifically designed for fine-grained dataset distillation. FD² leverages a counterfactual attention mechanism to localize discriminative regions, dynamically updates class prototypes, and incorporates fine-grained feature alignment along with an intra-class diversity constraint during distillation. This approach effectively preserves local discriminative cues while significantly enhancing both the discriminability and diversity of distilled samples. Extensive experiments demonstrate that FD² consistently outperforms existing decoupled distillation methods across multiple fine-grained and general-purpose datasets, exhibiting strong compatibility and generalization capability.

📝 Abstract
Dataset distillation (DD) compresses a large training set into a small synthetic set, reducing storage and training cost, and has shown strong results on general benchmarks. Decoupled DD further improves efficiency by splitting the pipeline into pretraining, sample distillation, and soft-label generation. However, existing decoupled methods largely rely on coarse class-label supervision and optimize samples within each class in a nearly identical manner. On fine-grained datasets, this often yields distilled samples that (i) retain large intra-class variation with subtle inter-class differences and (ii) become overly similar within the same class, limiting localized discriminative cues and hurting recognition. To address these problems, we propose FD$^{2}$, a dedicated framework for Fine-grained Dataset Distillation. FD$^{2}$ localizes discriminative regions and constructs fine-grained representations for distillation. During pretraining, counterfactual attention learning aggregates discriminative representations to update class prototypes. During distillation, a fine-grained characteristic constraint aligns each sample with its class prototype while repelling others, and a similarity constraint diversifies attention across same-class samples. Experiments on multiple fine-grained and general datasets show that FD$^{2}$ integrates seamlessly with decoupled DD and improves performance in most settings, indicating strong transferability.
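The two distillation-time constraints described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the temperature `tau`, and the cosine-similarity formulation are all assumptions about how "align with its class prototype while repelling others" and "diversify attention across same-class samples" might be realized.

```python
import numpy as np

def prototype_alignment_loss(feats, labels, prototypes, tau=0.1):
    """Fine-grained characteristic constraint (sketch).

    feats:      (N, D) L2-normalized sample features
    prototypes: (C, D) L2-normalized class prototypes
    Cross-entropy over cosine similarities pulls each sample toward
    its own class prototype and repels it from the other prototypes.
    """
    logits = feats @ prototypes.T / tau           # (N, C) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def intra_class_diversity_loss(attn, labels):
    """Similarity constraint (sketch).

    attn: (N, H*W) flattened attention maps. Penalizing pairwise cosine
    similarity between same-class attention maps pushes distilled
    samples of one class to attend to different regions.
    """
    attn = attn / (np.linalg.norm(attn, axis=1, keepdims=True) + 1e-8)
    total, pairs = 0.0, 0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        if len(idx) < 2:
            continue
        sims = attn[idx] @ attn[idx].T            # pairwise cosine sims
        iu = np.triu_indices(len(idx), k=1)       # upper triangle only
        total += sims[iu].sum()
        pairs += len(iu[0])
    return total / max(pairs, 1)
```

In a decoupled pipeline, both terms would be added to the per-class distillation objective; the relative weighting of the two losses is a hyperparameter the abstract does not specify.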
Problem

Research questions and friction points this paper is trying to address.

fine-grained dataset distillation
dataset distillation
intra-class variation
inter-class differences
discriminative cues
Innovation

Methods, ideas, or system contributions that make the work stand out.

fine-grained dataset distillation
counterfactual attention learning
class prototype alignment
discriminative region localization
decoupled distillation