🤖 AI Summary
Generalized Category Discovery (GCD) aims to recognize both known and novel classes in partially labeled data, without prior knowledge of the number of unknown categories. Conventional approaches rely on rigid assumptions, such as a pre-specified number of classes, which limits their adaptability in open-world scenarios. This paper proposes AdaGCD, a framework centered on Adaptive Slot Attention (AdaSlot), which dynamically infers and allocates clustering capacity without requiring a fixed number of categories. It also introduces a cluster-centric contrastive learning paradigm that jointly models instance-specific semantics and spatially local structural features. Extensive experiments on standard and fine-grained GCD benchmarks demonstrate significant improvements over state-of-the-art methods, validating AdaGCD’s dual strengths: flexible open-set clustering and enhanced local representation learning.
📝 Abstract
Generalized Category Discovery (GCD) tackles the challenging problem of categorizing unlabeled images into both known and novel classes within a partially labeled dataset, without prior knowledge of the number of unknown categories. Traditional methods often rely on rigid assumptions, such as predefining the number of classes, which limits their ability to handle the inherent variability and complexity of real-world data. To address these shortcomings, we propose AdaGCD, a cluster-centric contrastive learning framework that incorporates Adaptive Slot Attention (AdaSlot) into GCD. AdaSlot dynamically determines the number of slots based on data complexity, removing the need for predefined slot counts. This adaptive mechanism allows unlabeled data to be clustered flexibly into known and novel categories by allocating representational capacity on demand. By integrating adaptive representation with dynamic slot allocation, our method captures both instance-specific and spatially clustered features, improving class discovery in open-world scenarios. Extensive experiments on public and fine-grained datasets validate the effectiveness of our framework and highlight the advantages of leveraging spatially local information for category discovery in unlabeled image datasets.
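The abstract does not spell out how AdaSlot arrives at a slot count, so the following is only a rough illustration of the general idea of starting from a maximum number of candidate slots and keeping a data-dependent subset. It is a minimal sketch under stated assumptions, not the paper's method: the slots are fixed toy vectors rather than learned ones, there are no iterative slot updates or learned projections, and the learned selection step is replaced by a hand-set attention-share threshold. The names `attention_shares`, `select_slots`, and `min_share` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def attention_shares(features, slots):
    """Fraction of the total attention that each candidate slot wins.

    Every feature spreads one unit of attention across the slots
    (softmax over slots), so the slots compete for features.
    """
    scale = 1.0 / math.sqrt(len(slots[0]))
    totals = [0.0] * len(slots)
    for feat in features:
        logits = [scale * sum(f * s for f, s in zip(feat, slot)) for slot in slots]
        for k, weight in enumerate(softmax(logits)):
            totals[k] += weight
    return [t / len(features) for t in totals]


def select_slots(shares, min_share=0.05):
    """Indices of the slots that are kept.

    `min_share` is a hypothetical hand-set threshold standing in for
    whatever selection rule the actual method learns.
    """
    return [k for k, share in enumerate(shares) if share >= min_share]


# Toy data: two groups of 2-D features and four candidate slots (K_max = 4).
features = [[5, 0], [4, 1], [5, -1], [0, 5], [1, 4], [-1, 5]]
candidate_slots = [[1, 0], [0, 1], [-1, 0], [0, -1]]

shares = attention_shares(features, candidate_slots)
kept = select_slots(shares)
print(kept)  # [0, 1]: only the two slots aligned with the data survive
```

In this toy setting the number of surviving slots follows the structure of the data instead of being fixed in advance, which is the property the abstract attributes to AdaSlot; a trained model would presumably learn both the slots and the keep-or-drop decision.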