🤖 AI Summary
Cross-domain few-shot segmentation (CD-FSS) generalizes poorly and adapts slowly to novel domains because encoder features entangle domain-relevant and category-relevant information. To address this, we propose a feature disentanglement paradigm in two steps: first, adversarial and contrastive learning separate category-relevant private features from domain-relevant shared ones; second, a matrix-guided dynamic fusion mechanism adaptively integrates base, shared, and private features under spatial guidance, preserving structural coherence. During fine-tuning, a cross-adaptive modulation step lets shared features guide private features before fusion. Evaluated on four benchmarks (PASCAL-5i, COCO-20i, FC4, and ISIC), our method outperforms existing CD-FSS methods, setting a new state of the art in both cross-domain generalization and few-shot adaptation.
📝 Abstract
Cross-domain few-shot segmentation (CD-FSS) aims to tackle the dual challenge of recognizing novel classes and adapting to unseen domains with limited annotations. However, encoder features often entangle domain-relevant and category-relevant information, limiting both generalization and rapid adaptation to new domains. To address this issue, we propose a Divide-and-Conquer Decoupled Network (DCDNet). In the training stage, the Adversarial-Contrastive Feature Decomposition (ACFD) module decouples backbone features into category-relevant private representations and domain-relevant shared representations via contrastive learning and adversarial learning. Then, to mitigate the potential degradation caused by this disentanglement, the Matrix-Guided Dynamic Fusion (MGDF) module adaptively integrates base, shared, and private features under spatial guidance, maintaining structural coherence. In the fine-tuning stage, to enhance model generalization, the Cross-Adaptive Modulation (CAM) module is placed before the MGDF module; there, shared features guide private features via modulation, ensuring effective integration of domain-relevant information. Extensive experiments on four challenging datasets show that DCDNet outperforms existing CD-FSS methods, setting a new state of the art for cross-domain generalization and few-shot adaptation.
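The abstract does not give the formulas behind CAM or MGDF, so the following is only a toy illustration of the two ideas it names: shared features modulating private features, followed by a spatially guided fusion of base, shared, and private features. Everything here is an assumption made for illustration: the function names `modulate` and `fuse`, the tanh gating, the per-position softmax weights, and the use of single-channel maps stored as nested lists. It is not DCDNet's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def modulate(private, shared, scale=1.0):
    """Hypothetical CAM-like step (assumed form, not from the paper).

    Each private value is rescaled by 1 + scale * tanh(shared value)
    at the same spatial position, so shared features gate private ones.
    """
    return [
        [p * (1.0 + scale * math.tanh(s)) for p, s in zip(p_row, s_row)]
        for p_row, s_row in zip(private, shared)
    ]


def fuse(base, shared, private, guidance):
    """Hypothetical MGDF-like step (assumed form, not from the paper).

    guidance[i][j] holds three logits (base, shared, private) for
    position (i, j). They are turned into convex weights, so each
    output position is a weighted average of the three branches.
    """
    fused = []
    for i, row in enumerate(base):
        fused_row = []
        for j, b in enumerate(row):
            w_b, w_s, w_p = softmax(guidance[i][j])
            fused_row.append(w_b * b + w_s * shared[i][j] + w_p * private[i][j])
        fused.append(fused_row)
    return fused


# Tiny 1x2 example: zero shared features leave private features unchanged,
# and equal guidance logits give each branch a weight of 1/3.
base = [[1.0, 2.0]]
shared = [[0.0, 0.0]]
private = [[3.0, 4.0]]
guidance = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]

modulated = modulate(private, shared)
out = fuse(base, shared, modulated, guidance)
```

In this sketch the guidance logits are fixed inputs; in a learned model they would presumably be predicted from the features themselves, which is what would make the fusion adaptive per position.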