Contrastive Learning-Enhanced Trajectory Matching for Small-Scale Dataset Distillation

πŸ“… 2025-05-21
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Existing trajectory-matching data distillation methods suffer from low semantic fidelity and insufficient diversity under extreme data scarcity (e.g., only 10–50 images). Method: This paper proposes the first end-to-end differentiable, lightweight distillation framework that integrates SimCLR-style contrastive learning into gradient-matching trajectory optimization. It jointly optimizes synthetic samples using a lightweight CNN encoder, differentiable image synthesis, and contrastive constraints. Results: Evaluated on CIFAR-10/100, the method significantly enhances the discriminability and visual quality of synthetic data: it improves classification accuracy over state-of-the-art methods by 5–12%, reduces FrΓ©chet Inception Distance (FID) by 37%, and increases training trajectory similarity by 2.1Γ—. The resulting compact, high-fidelity synthetic datasets enable efficient model deployment on resource-constrained edge devices and accelerate prototyping in low-data regimes.

Technology Category

Application Category

πŸ“ Abstract
Deploying machine learning models in resource-constrained environments, such as edge devices or rapid prototyping scenarios, increasingly demands distillation of large datasets into significantly smaller yet informative synthetic datasets. Current dataset distillation techniques, particularly Trajectory Matching methods, optimize synthetic data so that the model's training trajectory on synthetic samples mirrors that on real data. While demonstrating efficacy on medium-scale synthetic datasets, these methods fail to adequately preserve semantic richness under extreme sample scarcity. To address this limitation, we propose a novel dataset distillation method integrating contrastive learning during image synthesis. By explicitly maximizing instance-level feature discrimination, our approach produces more informative and diverse synthetic samples, even when dataset sizes are significantly constrained. Experimental results demonstrate that incorporating contrastive learning substantially enhances the performance of models trained on very small-scale synthetic datasets. This integration not only guides more effective feature representation but also significantly improves the visual fidelity of the synthesized images. Experimental results demonstrate that our method achieves notable performance improvements over existing distillation techniques, especially in scenarios with extremely limited synthetic data.
Problem

Research questions and friction points this paper is trying to address.

Distilling large datasets into small informative synthetic sets
Preserving semantic richness under extreme sample scarcity
Enhancing model performance on very small-scale synthetic datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates contrastive learning for image synthesis
Enhances feature discrimination in synthetic samples
Improves visual fidelity with limited dataset sizes
πŸ”Ž Similar Papers
No similar papers found.
W
Wenmin Li
Graduate School of Engineering, University of Fukui
Shunsuke Sakai
Shunsuke Sakai
University of Fukui
Computer VisionNeural NetworksAnomaly Detection
T
Tatsuhito Hasegawa
Graduate School of Engineering, University of Fukui