Foreground-Aware Dataset Distillation via Dynamic Patch Selection

📅 2026-01-06
🏛️ arXiv.org
🤖 AI Summary
This work proposes a foreground-aware dataset distillation method that addresses key limitations of existing approaches: optimization-based methods incur high computational cost and can produce distorted synthetic images, while patch-based methods with fixed selection strategies can lose critical foreground information. Using Grounded SAM2 to estimate per-image foreground occupancy, the method derives class-wise thresholds and dynamically chooses, for each image, between resizing the full image and selecting the most informative patch. Across multiple benchmarks, the approach consistently improves on existing methods, yielding compact synthetic datasets that are more informative and representative. The distilled data are also more robust across architectures and image compositions.

📝 Abstract
In this paper, we propose a foreground-aware dataset distillation method that enhances patch selection in a content-adaptive manner. With the rising computational cost of training large-scale deep models, dataset distillation has emerged as a promising approach for constructing compact synthetic datasets that retain the knowledge of their large original counterparts. However, traditional optimization-based methods often suffer from high computational overhead, memory constraints, and the generation of unrealistic, noise-like images with limited architectural generalization. Recent non-optimization methods alleviate some of these issues by constructing distilled data from real image patches, but their rigid patch selection strategies can still discard critical information about the main objects. To solve this problem, we first leverage Grounded SAM2 to identify foreground objects and compute per-image foreground occupancy, from which we derive a category-wise patch decision threshold. Guided by these thresholds, we design a dynamic patch selection strategy that, for each image, either selects the most informative patch from multiple candidates or directly resizes the full image when the foreground dominates. This dual-path mechanism preserves more key information about the main objects while reducing redundant background content. Extensive experiments on multiple benchmarks show that the proposed method consistently improves distillation performance over existing approaches, producing more informative and representative distilled datasets and enhancing robustness across different architectures and image compositions.
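The dual-path decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes binary foreground masks are already available (in the paper they come from Grounded SAM2), that the category-wise threshold is the mean occupancy of the class, and that a patch's informativeness is its foreground fraction. The abstract specifies neither the threshold rule nor the patch score, so both are placeholders, and the function names `foreground_occupancy`, `class_thresholds`, and `select_region` are hypothetical.

```python
import numpy as np


def foreground_occupancy(mask):
    """Fraction of pixels marked as foreground in a binary mask."""
    return float(np.asarray(mask, dtype=bool).mean())


def class_thresholds(masks, labels):
    """Per-class decision threshold.

    Assumption: the threshold is the mean foreground occupancy of the
    class; the abstract does not state the actual rule.
    """
    occupancy_by_class = {}
    for mask, label in zip(masks, labels):
        occupancy_by_class.setdefault(label, []).append(foreground_occupancy(mask))
    return {label: float(np.mean(v)) for label, v in occupancy_by_class.items()}


def select_region(mask, threshold, patch_size, stride):
    """Choose between the full image and the best candidate patch.

    Returns ("full", None) when the foreground dominates, otherwise
    ("patch", (top, left)) for the candidate with the highest score.
    Assumption: a patch is scored by its foreground fraction.
    """
    mask = np.asarray(mask, dtype=bool)
    if foreground_occupancy(mask) >= threshold:
        return "full", None
    height, width = mask.shape
    best_score, best_pos = -1.0, (0, 0)
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            score = mask[top:top + patch_size, left:left + patch_size].mean()
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return "patch", best_pos


# Toy example: two 8x8 masks from the same (hypothetical) class.
small_object = np.zeros((8, 8), dtype=bool)
small_object[4:8, 4:8] = True            # object fills the bottom-right quarter
large_object = np.ones((8, 8), dtype=bool)  # object fills the whole image

thresholds = class_thresholds([small_object, large_object], ["dog", "dog"])
print(select_region(small_object, thresholds["dog"], patch_size=4, stride=4))
print(select_region(large_object, thresholds["dog"], patch_size=4, stride=4))
```

In this toy case the small-object image falls below the class threshold and is cropped to the patch that contains the object, while the image dominated by its object is kept whole for resizing.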
Problem

Research questions and friction points this paper is trying to address.

dataset distillation
foreground-aware
patch selection
synthetic dataset
object preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

dataset distillation
foreground-aware
dynamic patch selection
Grounded SAM2
content-adaptive
Longzhen Li
Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan; Education and Research Center for Mathematical and Data Science, Hokkaido University, N-12, W-7, Kita-Ku, Sapporo, 060-0812, Japan
Guang Li
Assistant Professor, Hokkaido University
Dataset Distillation, Self-Supervised Learning, Data-Centric AI, Medical Image Analysis
Ren Togo
Hokkaido University
AI, deep learning, machine learning, computer vision, medical image analysis
Keisuke Maeda
Hokkaido University
AI, Deep learning, Multimedia signal processing, image processing, machine learning
Takahiro Ogawa
Hokkaido University
Multimedia Processing, AI, IoT, Big Data Analysis
M. Haseyama
Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-Ku, Sapporo, 060-0814, Japan