🤖 AI Summary
Existing Novel Class Discovery (NCD) methods suffer significant performance degradation in cross-domain settings due to distribution shift between labeled source data and unlabeled novel classes. This work identifies style information as a critical confounder in cross-domain NCD and theoretically establishes style disentanglement as a necessary condition. To address this, we propose an “exclusive style removal” mechanism, implemented via a plug-and-play feature disentanglement module jointly optimized with contrastive learning–driven semantic consistency constraints. Our approach explicitly decouples domain-specific stylistic variations while preserving discriminative semantic features. We further construct a multi-backbone fair benchmark and evaluate on three standard cross-domain NCD datasets, achieving average accuracy improvements of 5.2%–9.7%. Results demonstrate that style removal enhances the effectiveness, robustness, and generalizability of out-of-distribution novel class clustering.
📝 Abstract
As a promising field in open-world learning, extit{Novel Class Discovery} (NCD) is usually a task to cluster unseen novel classes in an unlabeled set based on the prior knowledge of labeled data within the same domain. However, the performance of existing NCD methods could be severely compromised when novel classes are sampled from a different distribution with the labeled ones. In this paper, we explore and establish the solvability of NCD in cross domain setting with the necessary condition that style information must be removed. Based on the theoretical analysis, we introduce an exclusive style removal module for extracting style information that is distinctive from the baseline features, thereby facilitating inference. Moreover, this module is easy to integrate with other NCD methods, acting as a plug-in to improve performance on novel classes with different distributions compared to the seen labeled set. Additionally, recognizing the non-negligible influence of different backbones and pre-training strategies on the performance of the NCD methods, we build a fair benchmark for future NCD research. Extensive experiments on three common datasets demonstrate the effectiveness of our proposed module.