Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction

📅 2025-08-14
🤖 AI Summary
Generalized Category Discovery (GCD) aims to recognize both known and novel categories, yet existing methods predominantly optimize objective functions without modeling how humans decompose unfamiliar objects. This paper proposes ConGCD, a GCD framework grounded in cognitive principles. It uses a self-deconstruction mechanism to factor images into visual primitives and builds primitive-oriented representations through high-level semantic reconstruction; introduces dominant and contextual consensus units, coordinated by a dynamic consensus scheduler, for multiplex consensus integration; and binds intra-class shared attributes while capturing class-discriminative patterns and distributional invariants. Evaluated on coarse- and fine-grained benchmarks, ConGCD demonstrates the effectiveness of consensus-aware learning for GCD.

📝 Abstract
Human perceptual systems excel at inducing and recognizing objects across both known and novel categories, a capability far beyond current machine learning frameworks. While generalized category discovery (GCD) aims to bridge this gap, existing methods predominantly focus on optimizing objective functions. We present an orthogonal solution, inspired by the human cognitive process for novel object understanding: decomposing objects into visual primitives and establishing cross-knowledge comparisons. We propose ConGCD, which establishes primitive-oriented representations through high-level semantic reconstruction, binding intra-class shared attributes via deconstruction. Mirroring human preference diversity in visual processing, where distinct individuals leverage dominant or contextual cues, we implement dominant and contextual consensus units to capture class-discriminative patterns and inherent distributional invariants, respectively. A consensus scheduler dynamically optimizes activation pathways, with final predictions emerging through multiplex consensus integration. Extensive evaluations across coarse- and fine-grained benchmarks demonstrate ConGCD's effectiveness as a consensus-aware paradigm. Code is available at github.com/lytang63/ConGCD.

Problem

Research questions and friction points this paper is trying to address.

Bridges the gap between human and machine category discovery
Decomposes objects into visual primitives for understanding
Captures class-discriminative patterns via consensus units

Innovation

Methods, ideas, or system contributions that make the work stand out.

Primitive-oriented representations via semantic reconstruction
Integration of dominant and contextual consensus units
Dynamic consensus scheduler optimizes activation pathways
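
The integration step named in the list above can be illustrated with a toy sketch. This is not the authors' implementation: the function names `softmax` and `integrate_consensus`, the softmax gate standing in for the consensus scheduler, and all numeric values are assumptions made only to show how predictions from two units might be combined with learned weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def integrate_consensus(dominant_logits, contextual_logits, gate_scores):
    """Combine two units' class predictions with scheduler weights.

    Hypothetical stand-in for multiplex consensus integration:
    gate_scores holds one score per unit (dominant, contextual),
    and the final prediction is a convex mix of the two units'
    class probabilities.
    """
    w_dom, w_ctx = softmax(gate_scores)      # scheduler weights sum to 1
    p_dom = softmax(dominant_logits)         # dominant unit's class probabilities
    p_ctx = softmax(contextual_logits)       # contextual unit's class probabilities
    return [w_dom * d + w_ctx * c for d, c in zip(p_dom, p_ctx)]


# Illustrative values only: the two units disagree, and the gate favors the dominant unit.
probs = integrate_consensus([2.0, 0.5, 0.1], [0.3, 1.8, 0.2], [1.0, 0.0])
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

Because the weights form a convex combination, the mixed output remains a valid probability distribution over classes; a real scheduler would presumably compute the gate scores per sample rather than take them as fixed inputs.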

Authors

Luyao Tang
HKU
Machine Learning, Open-World Learning, Generalized Category Discovery, Medical AI

Kunze Huang
Xiamen University
Machine Learning

Chaoqi Chen
Shenzhen University
Machine Learning, Computer Vision, Trustworthy AI, Data-centric AI

Yuxuan Yuan
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University; School of Informatics, Xiamen University

Chenxin Li
The Chinese University of Hong Kong
Multimodal LLM, Agent, World Model

Xiaotong Tu
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University; School of Informatics, Xiamen University

Xinghao Ding
Unknown affiliation

Yue Huang
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University; School of Informatics, Xiamen University