DAM: Domain-Aware Module for Multi-Domain Dataset Condensation

📅 2025-05-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing dataset condensation (DC) methods largely overlook the multi-domain heterogeneity of modern datasets. This paper introduces Multi-Domain Dataset Condensation (MDDC), a setting in which condensed data must generalize across both single-domain and multi-domain settings. It makes two contributions: (1) a Domain-Aware Module (DAM), a training-time module that embeds domain-related features into each synthetic image via learnable spatial masks; and (2) a pseudo-domain labeling method based on low-frequency amplitude statistics, which removes the need for ground-truth domain annotations. Because DAM is active only during condensation, the images-per-class (IPC) budget matches that of prior methods. In experiments, DAM consistently improves in-domain, out-of-domain, and cross-architecture performance over baseline condensation methods.

📝 Abstract
Dataset Condensation (DC) has emerged as a promising solution to mitigate the computational and storage burdens associated with training deep learning models. However, existing DC methods largely overlook the multi-domain nature of modern datasets, which are increasingly composed of heterogeneous images spanning multiple domains. In this paper, we extend DC and introduce Multi-Domain Dataset Condensation (MDDC), which aims to condense data that generalizes across both single-domain and multi-domain settings. To this end, we propose the Domain-Aware Module (DAM), a training-time module that embeds domain-related features into each synthetic image via learnable spatial masks. As explicit domain labels are mostly unavailable in real-world datasets, we employ frequency-based pseudo-domain labeling, which leverages low-frequency amplitude statistics. DAM is active only during the condensation process, so the images-per-class (IPC) budget stays the same as in prior methods. Experiments show that DAM consistently improves in-domain, out-of-domain, and cross-architecture performance over baseline dataset condensation methods.
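
The frequency-based pseudo-domain labeling mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions, not the paper's procedure: the abstract says the labels come from low-frequency amplitude statistics but does not say which statistic or grouping rule is used. Here the statistic is assumed to be the mean DFT amplitude over a small window around the zero frequency, and the grouping rule is a stand-in 1-D k-means. The names `low_freq_amplitude`, `pseudo_domain_labels`, `radius`, and `num_domains` are hypothetical.

```python
import cmath
import math


def low_freq_amplitude(img, radius=1):
    """Mean DFT amplitude over frequencies (u, v) with |u|, |v| <= radius.

    `img` is a grayscale image given as a list of rows of floats.
    The window size `radius` is an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for u in range(-radius, radius + 1):
        for v in range(-radius, radius + 1):
            coeff = 0j
            for y in range(h):
                for x in range(w):
                    phase = -2j * math.pi * (u * y / h + v * x / w)
                    coeff += img[y][x] * cmath.exp(phase)
            total += abs(coeff) / (h * w)  # normalise by image size
            count += 1
    return total / count


def pseudo_domain_labels(images, num_domains=2, iters=20):
    """Group images by their low-frequency statistic (stand-in: 1-D k-means)."""
    feats = [low_freq_amplitude(im) for im in images]
    lo, hi = min(feats), max(feats)
    # Deterministic initialisation: centres evenly spaced between min and max.
    centers = [lo + (hi - lo) * k / max(num_domains - 1, 1)
               for k in range(num_domains)]
    labels = [0] * len(feats)
    for _ in range(iters):
        # Assignment step: nearest centre.
        labels = [min(range(num_domains), key=lambda k: abs(f - centers[k]))
                  for f in feats]
        # Update step: each centre moves to the mean of its members.
        for k in range(num_domains):
            members = [f for f, lab in zip(feats, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels
```

Calling `pseudo_domain_labels(images, num_domains=4)` on a list of grayscale images would return one integer pseudo-domain label per image, with no ground-truth domain annotation involved.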

Problem

Research questions and friction points this paper is trying to address.

Addresses multi-domain dataset condensation challenges
Proposes Domain-Aware Module for cross-domain generalization
Enhances performance without explicit domain labels

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces Domain-Aware Module for multi-domain condensation
Uses frequency-based pseudo-domain labeling without explicit labels
Embeds domain features via learnable spatial masks
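
As a rough illustration of the learnable spatial masks listed above, the sketch below attaches one mask per pseudo-domain to a single synthetic image. The page does not describe how the masks are parameterised or combined with the image, so two assumptions are made here: the masks are stored as logits and normalised with a per-pixel softmax across domains, and a domain-specific view is the elementwise product of image and mask. The function names are hypothetical, and the optimisation of the logits (which would normally be handled by an autodiff framework alongside the synthetic image) is left out.

```python
import math


def domain_weights(mask_logits):
    """Per-pixel softmax across domains.

    `mask_logits[d][y][x]` holds the logit of domain d at pixel (y, x);
    the returned weights sum to 1 over d at every pixel.
    """
    num_domains = len(mask_logits)
    height, width = len(mask_logits[0]), len(mask_logits[0][0])
    weights = [[[0.0] * width for _ in range(height)]
               for _ in range(num_domains)]
    for y in range(height):
        for x in range(width):
            logits = [mask_logits[d][y][x] for d in range(num_domains)]
            peak = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            for d in range(num_domains):
                weights[d][y][x] = exps[d] / total
    return weights


def domain_view(image, mask_logits, domain):
    """Elementwise product of the synthetic image and one domain's mask."""
    w = domain_weights(mask_logits)[domain]
    return [[image[y][x] * w[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]
```

Under these assumptions the views of all domains sum back to the original image, and only the image itself would need to be stored once condensation ends, which is in line with the abstract's statement that DAM is active only during condensation and leaves the IPC budget unchanged.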