Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation

📅 2024-06-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing dataset distillation methods generalize poorly under extreme compression ratios (e.g., IPC = 1 and IPC = 10), largely because they optimize over a single fixed latent space, which prevents effective exploitation of multi-level semantic features. To address this, the paper proposes Hierarchical Parameterization Distillation (H-PD), a framework that jointly optimizes across multiple layers of a pre-trained GAN's latent spaces. H-PD employs hierarchical, progressive parameterization to model features at varying granularities and introduces a class-relevant feature distance metric that sharply reduces the cost of evaluating the synthetic dataset. Experiments show that under IPC = 1 and IPC = 10, H-PD surpasses state-of-the-art diffusion-based distillation methods on image classification and transfer learning tasks at comparable computational cost, validating the effectiveness of systematically exploring hierarchical feature spaces for dataset distillation.

📝 Abstract
Dataset distillation is an emerging dataset reduction method that condenses large-scale datasets while maintaining task accuracy. Current parameterization methods achieve enhanced performance under extremely high compression ratios by optimizing the synthetic dataset in an informative feature domain. However, they restrict themselves to a fixed optimization space for distillation, neglecting the diverse guidance offered by different informative latent spaces. To overcome this limitation, we propose a novel parameterization method, dubbed Hierarchical Parameterization Distillation (H-PD), which systematically explores hierarchical features within a given feature space (e.g., the layers of a pre-trained generative adversarial network). We verify the correctness of our insight by applying the hierarchical optimization strategy to a GAN-based parameterization method. In addition, we introduce a novel class-relevant feature distance metric that alleviates the computational burden of synthetic dataset evaluation, bridging the gap between synthetic and original datasets. Experimental results demonstrate that the proposed H-PD achieves significant performance improvements under various settings with equivalent time consumption, and even surpasses current generative distillation methods based on diffusion models under the extreme compression ratios IPC = 1 and IPC = 10.
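One way to picture a class-relevant feature distance is a per-class comparison of feature statistics between the synthetic and original datasets. The sketch below is illustrative only, assuming a simple matching of class-wise mean features; the function name and formulation are hypothetical stand-ins, not the paper's exact metric.

```python
import numpy as np

def class_relevant_feature_distance(feat_syn, labels_syn, feat_real, labels_real):
    # Hypothetical sketch: compare per-class mean features of the
    # synthetic and real batches, averaged over classes. Comparing
    # only within matching classes keeps the metric cheap relative
    # to pairwise comparisons over the full datasets.
    dist, classes = 0.0, np.unique(labels_real)
    for c in classes:
        mu_s = feat_syn[labels_syn == c].mean(axis=0)  # synthetic class mean
        mu_r = feat_real[labels_real == c].mean(axis=0)  # real class mean
        dist += float(np.sum((mu_s - mu_r) ** 2))
    return dist / len(classes)
```

A sanity check: the distance is zero when synthetic and real features coincide, and grows as the per-class means drift apart.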
Problem

Research questions and friction points this paper is trying to address.

Parameterization-based distillation methods confine optimization to a single fixed latent space, neglecting guidance from other informative latent spaces.
Hierarchical features across the layers of pre-trained generators remain unexplored for dataset distillation.
Evaluating synthetic datasets during distillation carries a heavy computational burden.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Parameterization Distillation (H-PD) method
Class-relevant feature distance metric
Optimization in hierarchical feature spaces
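The hierarchical optimization idea in the bullets above can be sketched as a coarse-to-fine training-loop skeleton. This is a schematic only: `progressive_distill`, `update`, and the staging schedule are hypothetical illustrations, not the paper's implementation.

```python
def progressive_distill(code, layers, steps_per_stage, update):
    """Illustrative skeleton of progressive parameterization: optimize
    the latent code feeding each generator stage in turn, from coarse
    (early layers) to fine (late layers). `update` stands in for any
    gradient-based step that matches synthetic features to real ones
    through the remaining generator layers."""
    for depth, layer in enumerate(layers):
        for _ in range(steps_per_stage):
            code = update(code, layers[depth:])  # refine code at this granularity
        code = layer(code)  # commit this stage; descend one level deeper
    return code
```

The point of the structure is that each level of the generator hierarchy gets its own optimization phase, so guidance from coarse semantic features and fine details is applied in sequence rather than through one fixed latent space.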
Xinhao Zhong
Harbin Institute of Technology, Shenzhen
Data-centric AI, Efficient AI
Hao Fang
Tsinghua University
Bin Chen
Harbin Institute of Technology, Shenzhen, Peng Cheng Laboratory
Xulin Gu
Harbin Institute of Technology, Shenzhen
dataset distillation
Meikang Qiu
Computer and Cyber Sci., Augusta University, Augusta, GA, USA
Shu-Tao Xia
SIGS, Tsinghua University
coding and information theory, machine learning, computer vision, AI security