🤖 AI Summary
This paper addresses Generalized Category Discovery (GCD): given a partially labelled dataset, categorize all unlabelled images, whether they belong to known or unknown classes. Existing approaches generalize and scale poorly because they rely on either single-level semantics or manually designed abstract hierarchies. To overcome this, the authors propose SEAL, a SEmantic-aware hierArchical Learning framework guided by naturally occurring, easily accessible hierarchical structures. Its key contributions are: (1) Hierarchical Semantic-Guided Soft Contrastive Learning, which uses hierarchical similarity to generate informative soft negatives instead of treating all negatives equally; and (2) a Cross-Granularity Consistency (CGC) module that aligns predictions made at different levels of granularity. SEAL achieves state-of-the-art performance on fine-grained benchmarks, including SSB, Oxford-Pet, and Herbarium19, and also generalizes to coarse-grained datasets.
📝 Abstract
This paper investigates the problem of Generalized Category Discovery (GCD). Given a partially labelled dataset, GCD aims to categorize all unlabelled images, regardless of whether they belong to known or unknown classes. Existing approaches typically depend on either single-level semantics or manually designed abstract hierarchies, which limit their generalizability and scalability. To address these limitations, we introduce a SEmantic-aware hierArchical Learning framework (SEAL), guided by naturally occurring and easily accessible hierarchical structures. Within SEAL, we propose a Hierarchical Semantic-Guided Soft Contrastive Learning approach that exploits hierarchical similarity to generate informative soft negatives, addressing the limitations of conventional contrastive losses that treat all negatives equally. Furthermore, a Cross-Granularity Consistency (CGC) module is designed to align the predictions from different levels of granularity. SEAL consistently achieves state-of-the-art performance on fine-grained benchmarks, including the SSB benchmark, Oxford-Pet, and the Herbarium19 dataset, and further demonstrates generalization on coarse-grained datasets. Project page: https://visual-ai.github.io/seal/
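To make the idea of soft negatives concrete, the following is a minimal sketch of one possible reading, not the SEAL implementation. It assumes a two-level hierarchy (fine class and coarse parent), a hypothetical similarity of 1.0 for the same fine class, 0.5 for the same parent, and 0.0 otherwise, and it weights each negative by one minus that similarity. The function names, the similarity values, and the weighting rule are all illustrative assumptions; the abstract does not specify them.

```python
import math


def hierarchical_similarity(fine_a, fine_b, parent):
    """Toy similarity (assumed): 1.0 same fine class, 0.5 same coarse parent, else 0.0."""
    if fine_a == fine_b:
        return 1.0
    if parent[fine_a] == parent[fine_b]:
        return 0.5
    return 0.0


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def soft_contrastive_loss(embeddings, fine_labels, parent, temperature=0.1):
    """Supervised-contrastive-style loss with hierarchy-weighted negatives.

    Negatives that share a coarse parent with the anchor contribute with
    weight 0.5 to the denominator, unrelated negatives with weight 1.0.
    Anchors without any positive in the batch are skipped.
    """
    losses = []
    for i, anchor in enumerate(embeddings):
        pos_mass, neg_mass = 0.0, 0.0
        for j, other in enumerate(embeddings):
            if i == j:
                continue
            score = math.exp(cosine(anchor, other) / temperature)
            sim = hierarchical_similarity(fine_labels[i], fine_labels[j], parent)
            if sim == 1.0:
                pos_mass += score
            else:
                neg_mass += (1.0 - sim) * score  # soft negative weight (assumed rule)
        if pos_mass > 0.0:
            losses.append(-math.log(pos_mass / (pos_mass + neg_mass)))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical toy batch: two sparrows, one finch (same parent), one oak.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
fine_labels = ["sparrow", "sparrow", "finch", "oak"]
taxonomy = {"sparrow": "bird", "finch": "bird", "oak": "plant"}
print(soft_contrastive_loss(embeddings, fine_labels, taxonomy))
```

Replacing `taxonomy` with a mapping in which every class has its own parent recovers the conventional behaviour where all negatives count equally, which gives a simple baseline for comparison.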