🤖 AI Summary
Generalized Category Discovery (GCD) aims to cluster unlabeled data containing both known and novel classes by leveraging labeled data from the known classes; a core challenge is the accuracy imbalance that arises when old and new classes are handled separately. This paper proposes ProtoGCD, a unified and unbiased prototype learning framework that (1) models known and novel classes with joint prototypes and shared learning objectives to remove classifier bias; (2) introduces a dual-level adaptive pseudo-labeling mechanism to mitigate confirmation bias; and (3) adds two regularization terms that align feature representations with the clustering objective. For practicality, ProtoGCD also provides a criterion for estimating the number of novel classes and extends to unseen-outlier detection, achieving task-level unification. Evaluated on both generic and fine-grained benchmarks, ProtoGCD achieves state-of-the-art performance, improving balanced accuracy across known and novel classes while enhancing representation discriminability.
📝 Abstract
Generalized category discovery (GCD) is a pragmatic but underexplored problem, which requires models to automatically cluster and discover novel categories by leveraging the labeled samples from old classes. The challenge is that unlabeled data contain both old and new classes. Early works leveraging pseudo-labeling with parametric classifiers handle old and new classes separately, which brings about imbalanced accuracy between them. Recent methods employing contrastive learning neglect potential positives and are decoupled from the clustering objective, leading to biased representations and sub-optimal results. To address these issues, we introduce a unified and unbiased prototype learning framework, namely ProtoGCD, wherein old and new classes are modeled with joint prototypes and unified learning objectives. Specifically, we propose a dual-level adaptive pseudo-labeling mechanism to mitigate confirmation bias, together with two regularization terms that collectively help learn representations better suited to GCD. Moreover, for practical considerations, we devise a criterion to estimate the number of new classes. Furthermore, we extend ProtoGCD to detect unseen outliers, achieving task-level unification. Comprehensive experiments show that ProtoGCD achieves state-of-the-art performance on both generic and fine-grained datasets.
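To make the joint-prototype idea concrete, here is a minimal sketch of prototype-based assignment with confidence-filtered pseudo-labels. This is an illustration, not the paper's actual dual-level mechanism: the single fixed `threshold`, the `temperature`, and the function name are assumptions for this example; ProtoGCD's adaptive scheme and its two regularization terms are not reproduced here.

```python
import numpy as np

def prototype_pseudo_labels(feats, prototypes, temperature=0.1, threshold=0.9):
    """Soft assignment to joint prototypes plus confidence-filtered pseudo-labels.

    feats:      (N, D) L2-normalized sample features.
    prototypes: (K, D) L2-normalized prototypes covering old AND new classes
                jointly, so both groups share one classifier head.
    Hypothetical sketch: the fixed `threshold` stands in for the paper's
    adaptive (dual-level) pseudo-labeling mechanism.
    """
    # Cosine similarity to every prototype, sharpened by a temperature.
    logits = feats @ prototypes.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

    labels = probs.argmax(axis=1)   # hard prototype assignment
    conf = probs.max(axis=1)
    mask = conf >= threshold        # only confident samples get pseudo-labels
    return labels, mask, probs
```

Because old and new classes sit in the same prototype bank, a single softmax scores every sample against all categories, which is the sense in which the modeling is "unified" rather than split into separate heads.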