🤖 AI Summary
Existing dataset distillation methods generalize poorly under extreme compression ratios (e.g., IPC = 1 or IPC = 10), largely because they optimize over a single fixed latent space and therefore cannot exploit multi-level semantic features. To address this, we propose Hierarchical Parameterization Distillation (H-PD), the first framework to enable collaborative optimization across multiple layers of a pre-trained GAN's latent spaces. H-PD uses hierarchical, progressive parameterization to model features at varying granularities and introduces a class-relevant feature distance metric that drastically reduces evaluation overhead. Experiments show that at IPC = 1 and IPC = 10, H-PD surpasses state-of-the-art diffusion-based distillation methods, achieving leading performance on image classification and transfer learning tasks at comparable computational cost. These results validate the effectiveness and practicality of systematically exploring hierarchical feature spaces for dataset distillation.
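To make the idea of "collaborative optimization across multiple layers" concrete, here is a deliberately minimal toy sketch: each "layer" is a frozen random linear map, the synthetic sample is the sum of per-layer contributions, and all layers' latent codes are updated jointly by gradient descent toward a target. The linear generator, the squared-error loss, and all variable names are our illustrative assumptions; H-PD's actual generator is a pre-trained GAN and its guidance signal is far richer.

```python
import numpy as np

# Toy stand-in for a frozen multi-layer generator: three fixed random
# linear maps of increasing latent dimensionality (coarse -> fine).
rng = np.random.default_rng(0)
layers = [rng.normal(size=(16, d)) for d in (2, 4, 8)]  # frozen "layers"
codes = [np.zeros(W.shape[1]) for W in layers]          # learnable latent codes

# Stand-in for the guidance derived from the original dataset.
target = rng.normal(size=16)

lr = 0.01
for _ in range(500):
    # Synthetic sample = sum of all layers' contributions.
    img = sum(W @ z for W, z in zip(layers, codes))
    err = img - target
    # Gradient of 0.5 * ||img - target||^2 w.r.t. each layer's code:
    # every layer's latent is updated in the same step (collaboratively),
    # rather than optimizing one fixed latent space in isolation.
    codes = [z - lr * (W.T @ err) for W, z in zip(layers, codes)]
```

The point of the sketch is only the update structure: gradients flow into all latent levels simultaneously, so coarse and fine codes share the burden of matching the guidance signal.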
📝 Abstract
Dataset distillation is an emerging dataset reduction method that condenses large-scale datasets while maintaining task accuracy. Current parameterization methods achieve strong performance under extremely high compression ratios by optimizing the synthetic dataset in an informative feature domain. However, they restrict distillation to a single fixed optimization space, neglecting the diverse guidance available across different informative latent spaces. To overcome this limitation, we propose a novel parameterization method dubbed Hierarchical Parameterization Distillation (H-PD), which systematically explores hierarchical features within a given feature space (e.g., the layers of a pre-trained generative adversarial network). We verify our insight by applying the hierarchical optimization strategy to a GAN-based parameterization method. In addition, we introduce a novel class-relevant feature distance metric that alleviates the computational burden of synthetic dataset evaluation while bridging the gap between synthetic and original datasets. Experimental results demonstrate that H-PD achieves significant performance improvements across various settings with equivalent time consumption, and even surpasses current generative distillation methods based on diffusion models under the extreme compression ratios IPC = 1 and IPC = 10.
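One plausible reading of a "class-relevant feature distance" is comparing per-class statistics of extracted features rather than re-training networks on each candidate synthetic set. The sketch below implements that reading with per-class feature means; the function name, the mean-based formulation, and the Euclidean norm are our assumptions for illustration, not necessarily the paper's exact metric.

```python
import numpy as np

def class_feature_distance(real_feats, real_labels, syn_feats, syn_labels):
    """Average per-class distance between mean features of the original
    and synthetic sets. Features are assumed to come from some fixed,
    pre-trained extractor; comparing class-wise statistics avoids the
    cost of full training runs when evaluating a synthetic dataset."""
    classes = np.unique(real_labels)
    total = 0.0
    for c in classes:
        mu_real = real_feats[real_labels == c].mean(axis=0)
        mu_syn = syn_feats[syn_labels == c].mean(axis=0)
        total += np.linalg.norm(mu_real - mu_syn)
    return total / len(classes)
```

A metric of this shape is cheap (one pass over features per class) and is zero exactly when the synthetic set reproduces every class's mean feature, which is the sense in which it "bridges the gap" between synthetic and original data.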