Path-Guided Flow Matching for Dataset Distillation

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing diffusion-based dataset distillation methods rely on heuristic guidance or prototype assignment, leading to low sampling efficiency, unstable trajectories, and poor generalization. This work proposes Path-Guided Flow Matching (PGFM), a novel framework that introduces flow matching into generative dataset distillation for the first time. Operating in the latent space of a frozen VAE, PGFM enables rapid deterministic synthesis with only a few ODE steps and incorporates a continuous path-to-prototype guidance mechanism that balances diversity, efficiency, and stability. Experimental results demonstrate that PGFM achieves competitive performance on high-resolution benchmarks, surpassing existing diffusion-based approaches with a 7.6× faster sampling speed and a mode coverage rate of 78%.

📝 Abstract
Dataset distillation compresses large datasets into compact synthetic sets that train models to comparable performance. Despite recent progress on diffusion-based distillation, methods of this type typically depend on heuristic guidance or prototype assignment, which incurs time-consuming sampling and trajectory instability and thus hurts downstream generalization, especially under strong control or low images-per-class (IPC). We propose \emph{Path-Guided Flow Matching (PGFM)}, the first flow matching-based framework for generative distillation, which enables fast deterministic synthesis by solving an ODE in a few steps. PGFM conducts flow matching in the latent space of a frozen VAE to learn class-conditional transport from Gaussian noise to the data distribution. In particular, we develop a continuous path-to-prototype guidance algorithm for ODE-consistent path control, which allows trajectories to reliably land on assigned prototypes while preserving diversity and efficiency. Extensive experiments across high-resolution benchmarks demonstrate that PGFM matches or surpasses prior diffusion-based distillation approaches with fewer sampling steps while delivering competitive performance with remarkably improved efficiency, e.g., 7.6$\times$ more efficient than diffusion-based counterparts with 78\% mode coverage.
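The paper's implementation is not shown here, but the core sampling idea in the abstract, integrating a learned velocity field with a few deterministic ODE steps so that the trajectory lands near an assigned prototype, can be illustrated with a minimal sketch. The Euler integrator, the toy linear velocity field, and all names below are hypothetical stand-ins, not the authors' method:

```python
import numpy as np

def sample_few_step_ode(velocity_field, z0, num_steps=4):
    """Integrate dz/dt = v(z, t) from t=0 (noise) to t=1 with Euler steps.

    `velocity_field` stands in for a learned class-conditional flow model;
    here any callable (z, t) -> dz/dt works. Few steps suffice when the
    learned transport paths are nearly straight.
    """
    z = z0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        z = z + dt * velocity_field(z, t)
    return z

# Toy velocity field: straight-line transport toward a fixed "prototype"
# latent, mimicking path-to-prototype guidance in spirit (illustrative only).
prototype = np.ones(4)
v = lambda z, t: prototype - z  # pulls the state toward the prototype

z0 = np.zeros(4)  # stand-in for a Gaussian-noise latent, fixed for reproducibility
z1 = sample_few_step_ode(v, z0, num_steps=4)
print(z1)  # each coordinate has moved most of the way toward the prototype
```

With this linear field, each Euler step contracts the gap to the prototype by a factor of 0.75, so four steps recover about 68% of the distance; a trained flow model would instead produce diverse samples whose endpoints are steered toward, rather than collapsed onto, their prototypes.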
Problem

Research questions and friction points this paper is trying to address.

dataset distillation
diffusion-based distillation
trajectory instability
heuristic guidance
low IPC
Innovation

Methods, ideas, or system contributions that make the work stand out.

flow matching
dataset distillation
path-guided synthesis
latent ODE
prototype guidance
Xuhui Li
Department of Machine Learning, MBZUAI, Abu Dhabi, UAE
Zhengquan Luo
Department of Machine Learning, MBZUAI, Abu Dhabi, UAE
Xiwei Liu
Department of Computer Vision, MBZUAI, Abu Dhabi, UAE
Yongqiang Yu
Department of Computer Vision, MBZUAI, Abu Dhabi, UAE
Zhiqiang Xu
Professor, Academy of Mathematics and Systems Science, Chinese Academy of Sciences
approximation theory
compressed sensing
splines
frame theory
quantization