🤖 AI Summary
Generalized Category Discovery (GCD) aims to recognize known categories and discover novel ones jointly, yet existing methods mostly focus on optimizing objective functions rather than modeling how humans decompose unfamiliar objects. This paper proposes ConGCD, a GCD framework inspired by human cognition: it decomposes images into visual primitives and builds primitive-oriented representations through high-level semantic reconstruction, binding attributes shared within a class; introduces dominant and contextual consensus units, which capture class-discriminative patterns and inherent distributional invariants, respectively; and uses a consensus scheduler to dynamically optimize activation pathways, with final predictions produced by multiplex consensus integration. Evaluated on both coarse- and fine-grained benchmarks, ConGCD is reported to be effective as a consensus-aware learning paradigm for GCD.
📝 Abstract
Human perceptual systems excel at inducing and recognizing objects across both known and novel categories, a capability far beyond current machine learning frameworks. While generalized category discovery (GCD) aims to bridge this gap, existing methods predominantly focus on optimizing objective functions. We present an orthogonal solution, inspired by the human cognitive process for novel object understanding: decomposing objects into visual primitives and establishing cross-knowledge comparisons. We propose ConGCD, which establishes primitive-oriented representations through high-level semantic reconstruction, binding intra-class shared attributes via deconstruction. Mirroring human preference diversity in visual processing, where distinct individuals leverage dominant or contextual cues, we implement dominant and contextual consensus units to capture class-discriminative patterns and inherent distributional invariants, respectively. A consensus scheduler dynamically optimizes activation pathways, with final predictions emerging through multiplex consensus integration. Extensive evaluations across coarse- and fine-grained benchmarks demonstrate ConGCD's effectiveness as a consensus-aware paradigm. Code is available at github.com/lytang63/ConGCD.
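The consensus integration described in the abstract can be pictured with a small sketch. This is not the authors' implementation (see the linked repository for that); it only assumes, for illustration, that each consensus unit emits per-class logits and that the scheduler emits two scores that are normalized into mixing weights. The function names `softmax` and `integrate_consensus` and all numeric values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def integrate_consensus(dominant_logits, contextual_logits, scheduler_logits):
    """Blend two class distributions with scheduler-derived weights.

    Assumption for illustration only: the scheduler outputs one score per
    unit, and the final prediction is a convex combination of the two
    units' class probabilities.
    """
    w_dom, w_ctx = softmax(scheduler_logits)
    p_dom = softmax(dominant_logits)
    p_ctx = softmax(contextual_logits)
    return [w_dom * d + w_ctx * c for d, c in zip(p_dom, p_ctx)]


def predict(probs):
    """Index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits for a three-class problem.
dominant = [2.0, 0.5, -1.0]    # the dominant unit favors class 0
contextual = [0.1, 1.5, 0.3]   # the contextual unit favors class 1

balanced = integrate_consensus(dominant, contextual, [0.0, 0.0])
context_heavy = integrate_consensus(dominant, contextual, [-2.0, 2.0])

print(predict(balanced), predict(context_heavy))
```

With equal scheduler scores the blended prediction follows the more confident dominant unit; shifting the scheduler toward the contextual unit flips the prediction to the class that unit prefers. This is the sense in which a scheduler can change which pathway drives the output.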