Dataset Distillation via Relative Distribution Matching and Cognitive Heritage

📅 2026-02-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the high computational and memory costs of conventional dataset distillation when applied to self-supervised pre-trained models by proposing an efficient distillation framework based on statistical flow matching. The method optimizes synthetic images via statistical flows between class centers of the original data, and combines single-step data augmentation, a lightweight linear projector, and reuse of the pre-trained classifier to substantially cut resource consumption. Experiments show the approach matches or exceeds state-of-the-art methods while using 10x less GPU memory and running 4x faster.

📝 Abstract
Dataset distillation seeks to synthesize a highly compact dataset that achieves performance comparable to the original dataset on downstream tasks. For classification tasks that use pre-trained self-supervised models as backbones, prior linear gradient matching methods optimize synthetic images by encouraging them to mimic the gradient updates that real images induce on the linear classifier. However, this batch-level formulation requires loading thousands of real images and applying multiple rounds of differentiable augmentations to the synthetic images at each distillation step, leading to substantial computational and memory overhead. In this paper, we introduce statistical flow matching, a stable and efficient supervised learning framework that optimizes synthetic images by aligning them with constant statistical flows from target class centers to non-target class centers in the original data. Our approach loads raw statistics only once and performs a single augmentation pass on the synthetic data, achieving performance comparable to or better than state-of-the-art methods with 10x lower GPU memory usage and 4x shorter runtime. Furthermore, we propose a classifier inheritance strategy that reuses the classifier trained on the original dataset for inference, requiring only an extremely lightweight linear projector and marginal storage while achieving substantial performance gains.
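The abstract's core idea can be illustrated with a minimal sketch: precompute class centers from the real data once, then optimize synthetic samples so that the "flows" from each sample to the non-target class centers match the constant real-data flows between class centers. This is a hedged, simplified reading of the paper, not the authors' implementation; all function names (`class_centers`, `flow_matching_loss`) and the exact loss form are illustrative assumptions.

```python
import numpy as np

def class_centers(features, labels, num_classes):
    # Mean feature vector per class; in the paper's setting these raw
    # statistics would be loaded once rather than recomputed per step.
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def flow_matching_loss(syn_features, syn_labels, real_centers):
    # Illustrative loss: for each synthetic sample with target class y,
    # align its flow toward every non-target center c (center_c - f)
    # with the constant real-data flow (center_c - center_y).
    num_classes = real_centers.shape[0]
    loss = 0.0
    for f, y in zip(syn_features, syn_labels):
        for c in range(num_classes):
            if c == y:
                continue
            real_flow = real_centers[c] - real_centers[y]  # constant statistic
            syn_flow = real_centers[c] - f                 # depends on synthetic sample
            loss += np.sum((syn_flow - real_flow) ** 2)
    return loss / len(syn_features)
```

Under this formulation the loss is zero exactly when each synthetic sample's features coincide with its target class center, and gradients with respect to the synthetic features involve only the precomputed centers, which is consistent with the abstract's claim of loading raw statistics only once.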
Problem

Research questions and friction points this paper is trying to address.

Dataset Distillation
Computational Overhead
Memory Efficiency
Self-supervised Models
Classification Task
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical Flow Matching
Dataset Distillation
Classifier Inheritance
Relative Distribution Matching
Efficient Synthetic Data
Qianxin Xia
Department of XXX, University of YYY, Location, Country
Jiawei Du
National Taiwan University; ex-Intern @ Samsung Research
Speech processing · Neural coding · Generative AI · AI security
Yuhan Zhang
School of ZZZ, Institute of WWW, Location, Country
Jielei Wang
Department of XXX, University of YYY, Location, Country
Guoming Lu
Department of XXX, University of YYY, Location, Country