🤖 AI Summary
Single-domain generalization (SDG) suffers from performance degradation caused by the distribution shift between synthetically generated data and the real target domain. To address this, we propose Discriminative Domain Reassembly and Soft-Fusion (DRSF): first, diverse pseudo-target-domain samples are synthesized with latent diffusion models (LDMs); second, an entropy-guided channel-decoupling mechanism suppresses synthetic noise, while latent-space adversarial interpolation enables continuous soft fusion across multiple pseudo-domains. Crucially, the method operates without access to real target-domain data, mitigating the synthetic-to-real distribution discrepancy. Evaluated on SDG benchmarks for object detection and semantic segmentation, it achieves substantial generalization gains with low computational overhead and transfers plug-and-play to unsupervised domain adaptation. The core contribution is being the first to introduce discriminative feature reassembly and latent-space soft fusion into SDG, establishing a paradigm for robust, synthetic-data-driven generalization.
📝 Abstract
Single Domain Generalization (SDG) aims to train models that perform consistently across diverse scenarios using data from a single source. While latent diffusion models (LDMs) show promise for augmenting limited source data, we demonstrate that using synthetic data directly can be detrimental: significant feature-distribution discrepancies between synthetic and real target domains lead to performance degradation. To address this, we propose Discriminative Domain Reassembly and Soft-Fusion (DRSF), a training framework that leverages synthetic data to improve model generalization. We employ LDMs to produce diverse pseudo-target-domain samples and introduce two key modules to handle distribution bias. First, the Discriminative Feature Decoupling and Reassembly (DFDR) module uses entropy-guided attention to recalibrate channel-level features, suppressing synthetic noise while preserving semantic consistency. Second, the Multi-pseudo-domain Soft Fusion (MDSF) module uses adversarial training with latent-space feature interpolation, creating continuous feature transitions between domains. Extensive SDG experiments on object detection and semantic segmentation demonstrate that DRSF achieves substantial performance gains with only marginal computational overhead. Notably, DRSF's plug-and-play architecture enables seamless integration with unsupervised domain adaptation paradigms, underscoring its broad applicability to diverse real-world domain-shift challenges.
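The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the entropy measure (Shannon entropy over each channel's spatial activation distribution, used as a noise proxy), the softmax-style reweighting, and the mixup-style interpolation are assumptions for illustration, and all function names are hypothetical.

```python
import numpy as np

def channel_entropy(feat, eps=1e-8):
    """Per-channel Shannon entropy of a (C, H, W) feature map.

    Each channel's activations are normalized into a probability
    distribution over spatial positions; high entropy is treated here
    as a proxy for a noisy, non-discriminative channel (assumption).
    """
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    flat = flat - flat.min(axis=1, keepdims=True)   # shift to non-negative
    p = flat / (flat.sum(axis=1, keepdims=True) + eps)
    return -(p * np.log(p + eps)).sum(axis=1)

def entropy_guided_recalibration(feat, temperature=1.0):
    """DFDR-style idea: down-weight high-entropy channels.

    Channel weights decay exponentially with entropy and are rescaled
    so the mean channel gain stays near 1.
    """
    h = channel_entropy(feat)
    w = np.exp(-h / temperature)
    w = w / w.sum() * feat.shape[0]
    return feat * w[:, None, None]

def soft_fusion(feat_src, feat_pseudo, alpha=0.4, rng=None):
    """MDSF-style idea: continuous latent interpolation between domains.

    A Beta-sampled coefficient mixes source and pseudo-domain features,
    as in mixup; the paper additionally trains this adversarially.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * feat_src + (1.0 - lam) * feat_pseudo
```

A peaked (discriminative) channel gets a low entropy score and is amplified relative to a spatially uniform (noisy) one, while `soft_fusion` yields features lying on the line segment between the two domains' representations.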