🤖 AI Summary
This work proposes a foreground-aware dynamic distillation method to address key limitations in existing dataset distillation approaches, which often suffer from high computational costs, distorted synthetic images, or loss of critical foreground information due to fixed image patch strategies. Leveraging Grounded SAM2 to estimate foreground proportions, the method introduces a class-adaptive dynamic selection mechanism that intelligently decides between preserving full images and sampling informative patches. Evaluated across multiple benchmarks, the approach significantly outperforms current state-of-the-art techniques, yielding compact synthetic datasets that are more informative and exhibit stronger generalization. Moreover, the distilled data demonstrate enhanced robustness in cross-architecture transfer and complex image composition scenarios.
📝 Abstract
In this paper, we propose a foreground-aware dataset distillation method that enhances patch selection in a content-adaptive manner. With the rising computational cost of training large-scale deep models, dataset distillation has emerged as a promising approach for constructing compact synthetic datasets that retain the knowledge of their large original counterparts. However, traditional optimization-based methods often suffer from high computational overhead, memory constraints, and the generation of unrealistic, noise-like images with limited architectural generalization. Recent non-optimization methods alleviate some of these issues by constructing distilled data from real image patches, but their rigid patch selection strategies can still discard critical information about the main objects. To address this, we first leverage Grounded SAM2 to identify foreground objects and compute per-image foreground occupancy, from which we derive a category-wise patch decision threshold. Guided by these thresholds, we design a dynamic patch selection strategy that, for each image, either selects the most informative patch from multiple candidates or directly resizes the full image when the foreground dominates. This dual-path mechanism preserves more key information about the main objects while reducing redundant background content. Extensive experiments on multiple benchmarks show that the proposed method consistently improves distillation performance over existing approaches, producing more informative and representative distilled datasets and enhancing robustness across different architectures and image compositions.
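The dual-path selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the foreground mask is assumed to come from a segmenter such as Grounded SAM2 (here it is simply passed in as a binary array), the category-wise threshold is taken as a hypothetical mean of per-image occupancies, and patch candidates are drawn from a simple non-overlapping grid.

```python
import numpy as np

def foreground_occupancy(mask):
    """Fraction of pixels marked as foreground in a binary mask
    (in the paper, masks would come from Grounded SAM2)."""
    return float(mask.mean())

def class_threshold(occupancies):
    """Hypothetical category-wise decision threshold: the mean
    foreground occupancy over all images of the class."""
    return float(np.mean(occupancies))

def select_patch_or_resize(image, mask, threshold, patch_size):
    """Dual-path selection: keep the full image when the foreground
    dominates (occupancy >= threshold); otherwise return the candidate
    patch with the highest foreground coverage."""
    if foreground_occupancy(mask) >= threshold:
        # Full-image path: the whole image would be resized to patch_size.
        return "resize_full", image
    h, w = mask.shape
    ph, pw = patch_size
    best_xy, best_fg = (0, 0), -1.0
    # Patch path: scan non-overlapping grid candidates (an assumption;
    # the paper may use a different candidate scheme).
    for y in range(0, h - ph + 1, ph):
        for x in range(0, w - pw + 1, pw):
            fg = mask[y:y + ph, x:x + pw].mean()
            if fg > best_fg:
                best_fg, best_xy = fg, (y, x)
    y, x = best_xy
    return "patch", image[y:y + ph, x:x + pw]
```

For an image whose foreground occupies only one corner, the occupancy falls below the threshold and the patch path fires, returning the corner patch; for a foreground-dominated image, the full image is kept for resizing.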