FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection

📅 2025-11-22
🤖 AI Summary
Existing coreset selection methods are tied to specific models, lack theoretical guarantees, or fail to explicitly enforce distributional equivalence; moreover, conventional metrics (e.g., MSE, KL, MMD, CE) cannot accurately capture higher-order moment discrepancies. This paper proposes FAST, which the authors describe as the first DNN-free distribution-matching coreset selection framework. It uses the Characteristic Function Distance (CFD) to compare distributions in the frequency domain, introduces an Attenuated Phase-Decoupled CFD to counter the vanishing phase gradient in medium- and high-frequency regions, and adds a Progressive Discrepancy-Aware Sampling strategy that schedules frequency selection from low to high. Coreset selection is formulated as a graph-constrained optimization problem grounded in spectral graph theory, so that global structure is preserved before local details are refined. Across the evaluated benchmarks, the method reports an average accuracy gain of 9.12%, a 96.57% reduction in power consumption, and a 2.2× average speedup compared with baseline coreset methods.

📝 Abstract
Coreset selection compresses large datasets into compact, representative subsets, reducing the energy and computational burden of training deep neural networks. Existing methods are either: (i) DNN-based, which are tied to model-specific parameters and introduce architectural bias; or (ii) DNN-free, which rely on heuristics lacking theoretical guarantees. Neither approach explicitly constrains distributional equivalence, largely because continuous distribution matching is considered inapplicable to discrete sampling. Moreover, prevalent metrics (e.g., MSE, KL, MMD, CE) cannot accurately capture higher-order moment discrepancies, leading to suboptimal coresets. In this work, we propose FAST, the first DNN-free distribution-matching coreset selection framework that formulates the coreset selection task as a graph-constrained optimization problem grounded in spectral graph theory and employs the Characteristic Function Distance (CFD) to capture full distributional information in the frequency domain. We further discover that naive CFD suffers from a "vanishing phase gradient" issue in medium and high-frequency regions; to address this, we introduce an Attenuated Phase-Decoupled CFD. Furthermore, for better convergence, we design a Progressive Discrepancy-Aware Sampling strategy that progressively schedules frequency selection from low to high, preserving global structure before refining local details and enabling accurate matching with fewer frequencies while avoiding overfitting. Extensive experiments demonstrate that FAST significantly outperforms state-of-the-art coreset selection methods across all evaluated benchmarks, achieving an average accuracy gain of 9.12%. Compared to other baseline coreset methods, it reduces power consumption by 96.57% and achieves a 2.2x average speedup, underscoring its high performance and energy efficiency.
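As a rough illustration of frequency-domain matching, the sketch below computes a plain empirical characteristic function and a squared CF discrepancy averaged over a handful of frequency vectors. It is not the paper's Attenuated Phase-Decoupled CFD or its graph-constrained optimization; the function names `ecf` and `cfd`, the fixed frequency list, and the synthetic Gaussian data are all assumptions made for this example.

```python
import cmath
import random


def ecf(points, t):
    """Empirical characteristic function: mean of exp(i * <t, x>) over points."""
    total = 0j
    for x in points:
        phase = sum(ti * xi for ti, xi in zip(t, x))
        total += cmath.exp(1j * phase)
    return total / len(points)


def cfd(full, subset, freqs):
    """Squared CF discrepancy averaged over the frequency vectors in `freqs`."""
    return sum(abs(ecf(full, t) - ecf(subset, t)) ** 2 for t in freqs) / len(freqs)


rng = random.Random(0)  # fixed seed, illustrative synthetic data only
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]
freqs = [(0.5, 0.0), (0.0, 0.5), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

random_subset = rng.sample(data, 50)
biased_subset = [x for x in data if x[0] > 0][:50]  # right half-plane only

print("random subset CFD:", cfd(data, random_subset, freqs))
print("biased subset CFD:", cfd(data, biased_subset, freqs))
```

A subset that covers only part of the data shifts the characteristic function at low frequencies, so its discrepancy is clearly nonzero, whereas comparing the dataset with itself gives zero.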
Problem

Research questions and friction points this paper is trying to address.

Selecting representative subsets from large datasets without architectural bias
Capturing full distributional information using frequency-domain matching
Addressing vanishing phase gradient issues in characteristic function distance
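For context on the last point, a characteristic function can be written in polar form, and the squared difference of two such functions splits into an amplitude term and a phase term. The identity below is standard complex arithmetic; reading it as the origin of the vanishing phase gradient is an interpretation rather than the paper's stated derivation, and the symbols A and θ are notation introduced here.

```latex
\varphi_P(t) = A_P(t)\, e^{i\theta_P(t)}, \qquad
\varphi_Q(t) = A_Q(t)\, e^{i\theta_Q(t)}

\left|\varphi_P(t) - \varphi_Q(t)\right|^2
  = \bigl(A_P(t) - A_Q(t)\bigr)^2
  + 2\, A_P(t)\, A_Q(t)\,\bigl(1 - \cos(\theta_P(t) - \theta_Q(t))\bigr)

\frac{\partial}{\partial \theta_Q}\left|\varphi_P - \varphi_Q\right|^2
  = -2\, A_P A_Q \sin(\theta_P - \theta_Q)
```

Because the phase gradient is scaled by the product of the two amplitudes, it shrinks wherever those amplitudes are small, which for many smooth distributions happens as the frequency grows.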
Innovation

Methods, ideas, or system contributions that make the work stand out.

DNN-free graph-constrained optimization using spectral theory
Attenuated Phase-Decoupled CFD that counters the vanishing phase gradient at medium and high frequencies
Progressive frequency scheduling from low to high
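The low-to-high scheduling idea can be sketched with a simple greedy selector that matches only the lowest frequencies early in the selection budget and all frequencies at the end. This is a hypothetical stand-in, not the paper's Progressive Discrepancy-Aware Sampling: the greedy rule, the three-stage schedule, the helper names (`features`, `active_count`, `progressive_greedy`), and the synthetic data are assumptions, and the graph constraint is omitted.

```python
import cmath
import random


def features(x, freqs):
    """Per-sample frequency features exp(i * <t, x>), one per frequency."""
    return [cmath.exp(1j * sum(ti * xi for ti, xi in zip(t, x))) for t in freqs]


def active_count(step, k, m, stages):
    """How many of the m sorted frequencies are active at this selection step."""
    stage = min(stages, step * stages // k + 1)  # stage index in 1..stages
    return max(1, m * stage // stages)


def progressive_greedy(data, freqs, k, stages=3):
    """Greedily pick k indices, matching low frequencies first, then all."""
    freqs = sorted(freqs, key=lambda t: sum(ti * ti for ti in t))  # low -> high
    feats = [features(x, freqs) for x in data]
    m = len(freqs)
    # Target: empirical characteristic function of the full dataset.
    target = [sum(f[j] for f in feats) / len(data) for j in range(m)]

    chosen, sums = [], [0j] * m
    for step in range(k):
        active = active_count(step, k, m, stages)
        best_i, best_cost = None, float("inf")
        for i, f in enumerate(feats):
            if i in chosen:
                continue
            # Discrepancy on the active frequencies if sample i were added.
            cost = sum(
                abs(target[j] - (sums[j] + f[j]) / (step + 1)) ** 2
                for j in range(active)
            )
            if cost < best_cost:
                best_i, best_cost = i, cost
        chosen.append(best_i)
        sums = [s + fj for s, fj in zip(sums, feats[best_i])]
    return chosen


rng = random.Random(0)  # fixed seed, illustrative synthetic data only
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
grid = (0.0, 0.5, 1.0, 2.0)
freqs = [(a, b) for a in grid for b in grid if (a, b) != (0.0, 0.0)]
coreset = progressive_greedy(data, freqs, k=10)
print(coreset)
```

Restricting early picks to low frequencies fixes coarse properties such as the mean and spread before finer structure is considered, which mirrors the "global structure before local details" ordering described in the abstract.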
Jin Cui
Principal Engineer
Embedded System, OS Kernel & Driver, Hypervisor & Virtualization, Computer uArch modelling, FPGA & EDA
Boran Zhao
School of Software Engineering, Xi’an Jiaotong University
Jiajun Xu
School of Software Engineering, Xi’an Jiaotong University
Jiaqi Guo
School of Mathematical Sciences, Nankai University
Shuo Guan
School of Software Engineering, Xi’an Jiaotong University
Pengju Ren
Professor, Xi'an Jiaotong University